Skip to content

[vcl] Avoid conditional hashing when possible #29360

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

gquintard
Copy link
Contributor

This depends on #28928 and extends the vcl_hash clean up started there.

Conditional hashing creates opportunities for collisions and should be avoided. It it's not possible, like in the case of the X-Magento-Vary cookie, a balancing hash_data call is added.

Note that it's not necessary to balance the calls in the /graphql since the URL is already part of the hash (via the built-in vcl) so we operate in a restricted subspace and collisions are not possible.

Contribution checklist (*)

  • Pull request has a meaningful description of its purpose
  • All commits are accompanied by meaningful commit messages
  • All new or changed code is covered with unit/integration tests (if applicable)
  • All automated tests passed successfully (all builds are green)

@m2-assistant
Copy link

m2-assistant bot commented Aug 2, 2020

Hi @gquintard. Thank you for your contribution
Here is some useful tips how you can test your changes using Magento test environment.
Add the comment under your pull request to deploy test or vanilla Magento instance:

  • @magento give me test instance - deploy test instance based on PR changes
  • @magento give me 2.4-develop instance - deploy vanilla Magento instance

❗ Automated tests can be triggered manually with an appropriate comment:

  • @magento run all tests - run or re-run all required tests against the PR changes
  • @magento run <test-build(s)> - run or re-run specific test build(s)
    For example: @magento run Unit Tests

<test-build(s)> is a comma-separated list of build names. Allowed build names are:

  1. Database Compare
  2. Functional Tests CE
  3. Functional Tests EE,
  4. Functional Tests B2B
  5. Integration Tests
  6. Magento Health Index
  7. Sample Data Tests CE
  8. Sample Data Tests EE
  9. Sample Data Tests B2B
  10. Static Tests
  11. Unit Tests
  12. WebAPI Tests

You can find more information about the builds here

ℹ️ Please run only needed test builds instead of all when developing. Please run all test builds before sending your PR for review.

For more details, please, review the Magento Contributor Guide documentation.

⚠️ According to the Magento Contribution requirements, all Pull Requests must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.

🕙 You can find the schedule on the Magento Community Calendar page.

📞 The triage of Pull Requests happens in the queue order. If you want to speed up the delivery of your contribution, please join the Community Contributions Triage session to discuss the appropriate ticket.

🎥 You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

✏️ Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

@ghost ghost assigned ihor-sviziev Aug 5, 2020
@ihor-sviziev ihor-sviziev added Progress: pending review Severity: S3 Affects non-critical data or functionality and does not force users to employ a workaround. and removed Progress: ready for testing labels Aug 5, 2020
@ihor-sviziev
Copy link
Contributor

@magento run all tests

@magento-engcom-team
Copy link
Contributor

Hi @ihor-sviziev, thank you for the review.
ENGCOM-7953 has been created to process this Pull Request

@sdzhepa sdzhepa added the Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it label Aug 11, 2020
@sidolov sidolov added Risk: high Priority: P3 May be fixed according to the position in the backlog. labels Sep 10, 2020
@sidolov
Copy link
Contributor

sidolov commented Sep 10, 2020

Risk: high since the changes to Varnish configuration may affect the caching mechanism.

@engcom-Bravo
Copy link
Contributor

Dev experience is required for testing of this PR. Please note that Manual testing has not been performed.

@engcom-Foxtrot engcom-Foxtrot changed the title Avoid conditional hashing when possible [vcl] Avoid conditional hashing when possible Oct 6, 2020
@engcom-Charlie
Copy link
Contributor

@gquintard could you please describe how can I check it manually?
Thank you.

@gquintard
Copy link
Contributor Author

Hi @engcom-Charlie ,

This PR plugs the exploit of this snippet:

    if (req.http.Store) {
        hash_data(req.http.Store);
    }
    if (req.http.Content-Currency) {
        hash_data(req.http.Content-Currency);
    }

To see it, you will need a page /graphql that returns two different pages for:

  • a request that has a store header with value XXX and no content-currency header
  • a request that has a content-currency header with value XXX and no store header

With the old code, you should never be able to have both pages at the same time because their hashes will collide.

Let me know if that's not clear, I'll be happy to discuss it in greater details

@engcom-Charlie
Copy link
Contributor

@gquintard I try to check on 2.4-develop by the next scenario:

  1. Set Catalog Price Scope to Website
  2. Create second website/store/storeview
  3. Set custom product price on the second website
  4. Clean cache
  5. Make products query first time only with Store header and the second time only with Content-Currency

AR:
Responses are different with correct prices as expected

Could you please take a look?
Thank you.

@gquintard
Copy link
Contributor Author

@engcom-Charlie , alright, can you do please do this please:

  • clean the cache
  • let run varnishlog -w /tmp/log.bin -g request
  • send the two requests
  • ctrl+C varnishlog and share /tmp/log.bin, for example of filebin.varnish-software.com?

It could be that magento is adding a vary header to the response, which is the right thing to do. Meaning the VCL changes wouldn't be a bug fix, but just a best practice edit.

@engcom-Alfa
Copy link
Contributor

Hi @gquintard .

There are varnish logs on two requests:

  1. With Content-Currency header
    Content-Currency.zip

  2. With Store header:
    Store.zip

@gquintard
Copy link
Contributor Author

alright, you are getting the right response because you are sending a POST and the VCL contains these:

    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    if (req.url ~ "/graphql" && req.http.Authorization ~ "^Bearer") {
        return (pass);
    }

So, the vcl_hash code only comes into play if:

  • you are sending a GET/HEAD request
  • to /graphql
  • without a authorization header

And the question is: does that ever happens, and if it does, is Magento's response different based on the store and content-currency headers?

If yes, then you can test with this to confirm the VCL works, if no, then we can simplify the VCL because we are hashing useless data.

@engcom-Charlie
Copy link
Contributor

@gquintard just checked on 2.4-develop by the next scenario:

  1. Set Catalog Price Scope to Website
  2. Create second website/store/storeview
  3. Set custom product price on the second website
  4. Clean entire cache
  5. Make GET products queries
    -- first time only with Store header
    -- second time only with Content-Currency

AR:
Responses are different with correct prices as expected

@gquintard
Copy link
Contributor Author

can show me some logs for those requests? (same varnishlog -g request -w log.bin command as before)

@engcom-Alfa
Copy link
Contributor

@gquintard Yes of course

logs.zip

@gquintard
Copy link
Contributor Author

Hi,

So, looking at the logs, you are still getting passes, so Varnish is not caching anything:

-   ReqMethod      GET
-   ReqURL         /graphql?query=%7B+++products(filter%3A+%7Bsku%3A+%7Bin%3A+%5B%22Simple+1%22%5D%7D%7D%2C+sort%3A+%7Bname%3A+ASC%7D)+%7B+++++items+%7B+++++++name+++++++sku+++++++price_range+%7B+++++++++minimum_price+%7B+++++++++++regular_price+%7B+++++++++++++value++++++
-   ReqProtocol    HTTP/1.1
-   ReqHeader      Host: magento24.loc
-   ReqHeader      Connection: keep-alive
-   ReqHeader      User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Safari/537.36
-   ReqHeader      Store: second_store_view
-   ReqHeader      Authorization: Bearer 4w3x53wad1g2g679q72hkkwgzwb8frvc

and this is because of this VCL snippet:

    if (req.url ~ "/graphql" && req.http.Authorization ~ "^Bearer") {
        return (pass);
    }

Are you able to send requests without authorization headers?

Alternatively, is it actually useful to try and cache /graphql requests at all?

@engcom-Alfa
Copy link
Contributor

engcom-Alfa commented Nov 23, 2020

Hi,
here are requests without authorization headers
log.zip

@gquintard
Copy link
Contributor Author

Hi,

We are almost there, promise. On this run, you used Content-Currency: USD and Store: second_store_view which will cause them to have a different hash. The bug we have here is that if the headers have the same value, we can get a hash collision.

To recap, we need:

  • to create the hash collision with two requests:
    • one with Content-Currency: XXX and no Store header, and the other with Store: XXX and no Content-Currency, but the XXX must be the same
    • if there's a cookie, it's must be the same for both requests
  • to make sure the response will be cached:
    • path should be /graphql
    • GET method
    • no Authorization header

@engcom-Alfa
Copy link
Contributor

Hi, @gquintard .
Here are logs with:

First request:
Content-Currency: USD

Second request:
Store: USD

log.zip

@gquintard
Copy link
Contributor Author

There we go! The first request is a miss, and the second one is a hit, even though it shouldn't.

This PR should take care of that

@gquintard
Copy link
Contributor Author

hi, we are almost at the one-year anniversary of that PR, is there something I can do to make it happen?

@mrtuvn
Copy link
Contributor

mrtuvn commented Sep 4, 2023

@gquintard can you rebase the code so we can proceed this one

Conditional hashing can lead to collisions and should be avoided. As an
example, this code:

``` vcl
sub vcl_hash {
	if (req.http.a) {
		hash_data(req.http.a);
	}
	if (req.http.b) {
		hash_data(req.http.b);
	}
}
```

will return the same hash for these two requests:

```
GET / HTTP/1.1
a: foo
```

and

```
GET / HTTP/1.1
b: foo
```

whereas

``` vcl
sub vcl_hash {
	hash_data(req.http.a);
	hash_data(req.http.b);
}
```

is correct and simpler.
@gquintard
Copy link
Contributor Author

@mrtuvn , here you go

@mrtuvn
Copy link
Contributor

mrtuvn commented Sep 7, 2023

@magento run all tests

@magento-automated-testing
Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@gquintard
Copy link
Contributor Author

Could somebody look at the test I added and point at where it would show up in the test results?

@gquintard
Copy link
Contributor Author

I'm closing this and will reopen one, WITH TESTS!

@engcom-Charlie
Copy link
Contributor

Hi @gquintard,

Thank you for your contribution!

As per this, should we close this PR. Let us know when you are ready with the test coverage, will reopen it.

Thank you!

@gquintard
Copy link
Contributor Author

yes, please close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Auto-Tests: Not Required Changes in Pull Request does not require coverage by auto-tests Award: category of expertise Component: PageCache has dependency Priority: P3 May be fixed according to the position in the backlog. Release Line: 2.4 Risk: high Severity: S3 Affects non-critical data or functionality and does not force users to employ a workaround. Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants