Description
Preconditions and environment
- Magento 2.4-develop branch (commit 0a3b703)
- Built-in Full Page Cache enabled (non-Varnish mode)
- Default \Magento\Framework\App\PageCache\Identifier implementation from lib/internal/Magento/Framework/App/PageCache
- No custom plugins/extensions overriding the Identifier class
Steps to reproduce
- Enable built-in Full Page Cache mode.
- Visit any storefront page with marketing/tracking parameters included in the URL, e.g.:
  https://example.com/?utm_source=test&utm_medium=cpc&gclid=TEST123&foo=bar
- Observe that Magento applies regex stripping patterns inside Magento\Framework\App\PageCache\Identifier::getValue() to remove marketing parameters.
- However, inspect the final cache key (e.g. via debugging Identifier::getValue() or enabling cache debug mode):
  - The sanitized URL (with tracking params removed) is used only for generating the base URL.
  - The $query portion of the FPC identifier is rebuilt using the original request's query array.
- As a result, the cache identifier still contains all original query parameters, including marketing parameters that were intended to be stripped.
- This leads to different FPC entries for equivalent URLs that differ only by marketing/tracking parameters.
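
The flow described in these steps can be modelled outside Magento. The following is a minimal Python sketch, not the actual PHP implementation: the pattern list, the function names, and the use of SHA-256 are assumptions made for illustration, and the real identifier may include additional request data that is omitted here.

```python
import hashlib
import re
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical patterns for illustration; the real list is returned by
# getMarketingParameterPatterns() and may differ.
MARKETING_PATTERNS = [r"utm_[a-z_]+", r"gclid", r"fbclid", r"msclkid"]


def strip_marketing_params(url: str) -> str:
    """Remove marketing parameters from the URL string using regexes."""
    for name in MARKETING_PATTERNS:
        url = re.sub(rf"(?<=[?&]){name}=[^&]*&?", "", url)
    return url.rstrip("?&")


def buggy_cache_key(original_url: str) -> str:
    """Model of the reported behavior (names are hypothetical)."""
    sanitized = strip_marketing_params(original_url)
    # Only the base URL is taken from the sanitized string,
    # comparable to strtok($url, '?').
    base_url = sanitized.split("?", 1)[0]
    # The reported bug: the query is rebuilt from the ORIGINAL request,
    # so the stripped parameters come back.
    query = dict(parse_qsl(urlsplit(original_url).query))
    # Sorting by parameter name only normalizes order, like ksort().
    normalized = urlencode(sorted(query.items()))
    return hashlib.sha256(f"{base_url}?{normalized}".encode()).hexdigest()


key_a = buggy_cache_key("https://example.com/?utm_source=a&foo=bar")
key_b = buggy_cache_key("https://example.com/?utm_source=b&foo=bar")
print(key_a == key_b)  # False: two cache entries for the same page
```

In this model, reordering the parameters does not change the key, but changing the value of utm_source does, which matches the fragmentation described above.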
Expected result
- Marketing/tracking parameters defined in getMarketingParameterPatterns() (e.g., utm_*, gclid, fbclid, msclkid, etc.) should be removed before building the Full Page Cache identifier.
- The query string used in the identifier should be reconstructed from the sanitized URL (after marketing params are removed), not from the original request's query array.
- URLs that differ only by ignored marketing parameters should produce the same built-in FPC cache key.
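
The expected behavior can be sketched as follows. This is an illustrative Python model under stated assumptions, not the proposed Magento patch: the regex, the function names `sanitize` and `cache_key`, and the SHA-256 hash are all hypothetical. The point it shows is that both the base URL and the query are derived from the sanitized URL.

```python
import hashlib
import re
from urllib.parse import parse_qsl, urlencode

# Hypothetical combined pattern; the real patterns come from
# getMarketingParameterPatterns() and may differ.
MARKETING_RE = re.compile(
    r"(?<=[?&])(?:utm_[a-z_]+|gclid|fbclid|msclkid)=[^&]*&?"
)


def sanitize(url: str) -> str:
    """Strip marketing parameters and any dangling '?' or '&'."""
    return MARKETING_RE.sub("", url).rstrip("?&")


def cache_key(url: str) -> str:
    """Build the key from the sanitized URL only (hypothetical model)."""
    sanitized = sanitize(url)
    base_url, _, query_string = sanitized.partition("?")
    # The query is parsed from the SANITIZED string, then sorted so that
    # parameter order does not affect the key.
    pairs = sorted(parse_qsl(query_string, keep_blank_values=True))
    identity = base_url + ("?" + urlencode(pairs) if pairs else "")
    return hashlib.sha256(identity.encode()).hexdigest()


tracked = "https://example.com/?utm_source=test&utm_medium=cpc&gclid=TEST123&foo=bar"
plain = "https://example.com/?foo=bar"
print(cache_key(tracked) == cache_key(plain))  # True: one shared cache entry
```

Parameters that are not matched by the patterns, such as foo in the example URL, still contribute to the key, so pages that genuinely differ keep separate cache entries.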
Actual result
- Marketing parameters (e.g., utm_*, gclid, fbclid, etc.) are stripped by regex into a sanitized $url, but then reintroduced during cache identifier generation.
- In reconstructUrl(), $query is rebuilt from:
  $this->request->getUri()->getQueryAsArray()
  which contains the full original query string, including all marketing parameters.
- Therefore, the resulting cache key still includes marketing parameters, causing cache fragmentation and defeating the purpose of the new marketing-parameter-stripping logic.
- Only the order of query parameters is normalized (via ksort()), but the actual marketing parameters are not removed.
Additional information
In Identifier::getValue(), marketing parameters are stripped from the URL
using preg_replace(). The resulting sanitized URL is passed to reconstructUrl().
Inside reconstructUrl(), the base URL is derived from the sanitized URL
(using strtok($url, '?')). However, the query portion is rebuilt from:
$this->request->getUri()->getQueryAsArray()
This array is constructed from the original request URI, not the sanitized URL.
Therefore, all marketing parameters that were removed by regex are added
back into the query string used to generate the cache key.
The intended behavior (mirroring Magento's Varnish VCL) is to exclude these
parameters entirely from cache key generation.
A pull request addressing this issue has been submitted:
Release note
No response
Triage and priority
- Severity: S0 - Affects critical data or functionality and leaves users without workaround.
- Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
- Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
- Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
- Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.