Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow stripping query parameters in the reverse proxy #1624

Merged

Conversation

Toflar
Copy link
Member

@Toflar Toflar commented Apr 2, 2020

Q A
Fixed issues -
Docs PR or issue TODO once merged

This PR will allow to strip query parameters that are not required for the application but only for e.g. some tracking tools in the front end. It increases the cache hit rate and prevents these parameters from flooding the cache.

While implementing this I thought there's also the use case where people trust the Contao developers to maintain a proper blacklist but only need one exception to that. Like e.g. you need the utm_* parameters in PHP but you don't want to whitelist all of the allowed query parameters.
Same might apply for cookies so I unified that :)

@Toflar Toflar added the feature label Apr 2, 2020
@Toflar Toflar added this to the 4.10 milestone Apr 2, 2020
@Toflar Toflar self-assigned this Apr 2, 2020
@Toflar
Copy link
Member Author

Toflar commented Apr 2, 2020

CS issue unrelated, master is failing @leofeyer :)

ausi
ausi previously approved these changes Apr 5, 2020
manager-bundle/src/HttpKernel/ContaoCache.php Outdated Show resolved Hide resolved
manager-bundle/src/HttpKernel/ContaoCache.php Outdated Show resolved Hide resolved
@Toflar Toflar force-pushed the feature/strip-query-params-subscriber branch from 28dde96 to 2c0ad69 Compare April 6, 2020 08:00
@Toflar Toflar requested a review from ausi April 6, 2020 08:01
@@ -82,7 +94,7 @@ public function preHandle(CacheEvent $event): void
if (0 !== \count($this->whitelist)) {
$this->filterCookies($request, $this->whitelist, true);
} else {
$this->filterCookies($request, self::BLACKLIST);
$this->filterCookies($request, array_diff(self::BLACKLIST, $this->disabledFromBlacklist));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using array_diff here could lead to unexpected results I think. If someone wants to allow the _utma cookie and calls disableFromBlacklist(['_utma']) the cookie would still get stripped because the regex for this cookie is __utm.+.

I think we should extend the filterCookies() method and remove every name from $removeCookies that matches the disableFromBlacklist array.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually wanted to explicitly disallow this use case. If you want to remove something from the blacklist, specify it correctly. You can still use the whitelist if you want to have full control.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what if we change one of the regexes in the blacklist in a future update of Contao? The disableFromBlacklist() would no longer work than for that regex and this would be BC break.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, would be easy to solve with an alias in the disableFromBlacklist() method 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what should I do then if I want to remove just the _utma (not all _utm*) cookie from the blacklist?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not a use case imho. Either you need all of the _utm query parameters or none.
But you're free trying to optimize this in a later version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could also think of something like a bimodal_transport cookie which is currently blacklisted by (.*)?modal(.*)?. But you are right, I will provide a separate pull request for this 👍

@leofeyer leofeyer merged commit bd43b42 into contao:master Apr 8, 2020
@leofeyer
Copy link
Member

leofeyer commented Apr 8, 2020

Thank you @Toflar.

@Toflar Toflar deleted the feature/strip-query-params-subscriber branch April 8, 2020 12:51
@aschempp
Copy link
Member

aschempp commented May 5, 2020

we should probably blacklist the very common UTM parameters as well?
https://en.wikipedia.org/wiki/UTM_parameters

@leofeyer
Copy link
Member

leofeyer commented May 5, 2020

What do you mean? The regex utm_[a-z]+ already blocks all UTM parameters, doesn't it?

@aschempp
Copy link
Member

aschempp commented May 5, 2020

🤦 haven't seen that

leofeyer pushed a commit that referenced this pull request Jun 15, 2020
Description
-----------

As discussed in #1624 (comment)

This pull request probably needs a reabase once #1798 got resolved.

Commits
-------

ba8a8f8 Improve disabled cookie deny list
52ca008 Improve disabled query parameter deny list
94178bc Merge branch 'master' into feature/improve-disable-deny-list
leofeyer pushed a commit to contao/core-bundle that referenced this pull request Jun 15, 2020
Description
-----------

As discussed in contao/contao#1624 (comment)

This pull request probably needs a reabase once #1798 got resolved.

Commits
-------

ba8a8f8c Improve disabled cookie deny list
52ca0082 Improve disabled query parameter deny list
94178bcc Merge branch 'master' into feature/improve-disable-deny-list
@leofeyer leofeyer changed the title Allow to strip query parameters in reverse proxy Allow stripping query parameters in the reverse proxy Jul 21, 2020
AlexejKossmann pushed a commit to AlexejKossmann/contao that referenced this pull request Apr 6, 2021
Description
-----------

| Q                | A
| -----------------| ---
| Fixed issues     | -
| Docs PR or issue | TODO once merged

This PR will allow to strip query parameters that are not required for the application but only for e.g. some tracking tools in the front end. It increases the cache hit rate and prevents these parameters from flooding the cache.

While implementing this I thought there's also the use case where people trust the Contao developers to maintain a proper blacklist but only need one exception to that. Like e.g. you need the `utm_*` parameters in PHP but you don't want to whitelist all of the allowed query parameters.
Same might apply for cookies so I unified that :)

Commits
-------

2c0ad69 Allow to strip query parameters in reverse proxy
1d957d0 Merge branch 'master' into feature/strip-query-params-subscriber
AlexejKossmann pushed a commit to AlexejKossmann/contao that referenced this pull request Apr 6, 2021
Description
-----------

As discussed in contao#1624 (comment)

This pull request probably needs a reabase once contao#1798 got resolved.

Commits
-------

ba8a8f8 Improve disabled cookie deny list
52ca008 Improve disabled query parameter deny list
94178bc Merge branch 'master' into feature/improve-disable-deny-list
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants