Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix HTTP cache revalidation bugs #24388

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Fix HTTP cache revalidation bugs #24388

wants to merge 4 commits into from

Conversation

jdm
Copy link
Member

@jdm jdm commented Oct 7, 2019

This fixes two cache related issues:

  • old cached content could be viewed as a result of revalidating a newer cached response
  • the expiry of a cached response refreshed through a revalidation could be calculated without any of the original response's headers


This change is Reviewable

@highfive
Copy link

highfive commented Oct 7, 2019

Heads up! This PR modifies the following files:

  • @KiChjang: components/net/http_loader.rs, components/net/http_cache.rs

@highfive highfive added the S-awaiting-review There is new code that needs to be reviewed. label Oct 7, 2019
@highfive
Copy link

highfive commented Oct 7, 2019

warning Warning warning

  • These commits modify net code, but no tests are modified. Please consider adding a test!

@jdm jdm changed the title Cache fun Fix HTTP cache revalidation bugs Oct 7, 2019
@jdm
Copy link
Member Author

jdm commented Oct 7, 2019

r? @gterzian

@highfive highfive assigned gterzian and unassigned SimonSapin Oct 7, 2019
Copy link
Member

@gterzian gterzian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with one change, and I'm less sure about the overwriting of existing cached resource. I'll do a wpt run since I expect at least one test to fail.

let mut stored_headers = cached_resource.data.metadata.headers.lock().unwrap();
stored_headers.extend(response.headers);
constructed_response.headers = stored_headers.clone();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change makes a lot of sense, it seemed like it was a mistake to calculate the expiry without first replacing the headers.

{
// Ensure that any existing complete response is overwritten by the new
// complete response.
let existing_complete_response = entry.iter().position(|response| {
Copy link
Member

@gterzian gterzian Oct 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm less sure about this, since in theory different complete responses could be cached and used to satisfy different requests, for example in the case of responses with different Vary headers. We can try running this test with those changes:

name: "HTTP cache doesn't invalidate existing Vary response",

@gterzian
Copy link
Member

gterzian commented Oct 8, 2019

@bors-servo try=wpt

Expectation: failure of "HTTP cache doesn't invalidate existing Vary response"

@bors-servo
Copy link
Contributor

⌛ Trying commit 2313f7f with merge daa7a3a...

bors-servo pushed a commit that referenced this pull request Oct 8, 2019
Fix HTTP cache revalidation bugs

This fixes two cache related issues:
* old cached content could be viewed as a result of revalidating a newer cached response
* the expiry of a cached response refreshed through a revalidation could be calculated without any of the original response's headers

---
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [x] These changes fix #24385
- [ ] There are tests for these changes

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24388)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

💔 Test failed - linux-rel-wpt

@highfive highfive added the S-tests-failed The changes caused existing tests to fail. label Oct 8, 2019
@CYBAI
Copy link
Member

CYBAI commented Oct 8, 2019

I think this is what @gterzian expected 👀

{
    "status": "FAIL", 
    "group": "default", 
    "message": "assert_less_than: Response 3 does not come from cache expected a number less than 3 but got 3", 
    "stack": "checkResponse@http://web-platform.test:8000/fetch/http-cache/http-cache.js:136:9\n", 
    "subtest": "HTTP cache doesn't invalidate existing Vary response", 
    "test": "/fetch/http-cache/vary.html", 
    "line": 168808, 
    "action": "test_result", 
    "expected": "PASS"
}

"there are {} cached responses for {:?}",
cached_resources.len(),
request.url()
);
for cached_resource in cached_resources.iter_mut() {
Copy link
Member

@gterzian gterzian Oct 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple, and perhaps effective, fix for the "refresh of older resource", for now, might be to sort those by their original expiry. Then we can leave a TODO regarding the full "selection" logic as described at https://tools.ietf.org/html/rfc7234#section-4.3.4

Copy link
Member

@gterzian gterzian Oct 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, I think it might be best to start with the strictest case, and default to None for the less certain ones, and then over time move down the ladder of selection.

The weak validator and no validator case mentioned at https://tools.ietf.org/html/rfc7234#section-4.3.4 are softer and require us to handle all other "strong validator" cases correctly, so we probably shouldn't even try to do those for now, since the risk of erroneous caching is not worth it.

As described at https://tools.ietf.org/html/rfc7232#section-2.2.2 , if the stored resource has a Date header, and the Last-Modified header of the 304 response is at least 60 seconds earlier than the stored date, the Last-Modified header becomes a strong validator.

And per the "strong validator" case at https://tools.ietf.org/html/rfc7234#section-4.3.4:

All of the stored responses with the
same strong validator are selected. If none of the stored
responses contain the same strong validator, then the cache MUST
NOT use the new response to update any stored responses.

So if we don't handle this case first, we are not following this MUST NOT.

Then we would need to cover all other "strong validator" cases, and only then can we start to try to use weak or no validators.

@@ -331,8 +333,12 @@ fn create_cached_response(
// TODO: take must-revalidate into account <https://tools.ietf.org/html/rfc7234#section-5.2.2.1>
// TODO: if this cache is to be considered shared, take proxy-revalidate into account
// <https://tools.ietf.org/html/rfc7234#section-5.2.2.7>
let has_expired =
(adjusted_expires < time_since_validated) || (adjusted_expires == time_since_validated);
let has_expired = adjusted_expires <= time_since_validated;
Copy link
Member

@gterzian gterzian Oct 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here, we could be returning None if has_expired is true, and the stored resource doesn't contain a Date and Last-Modified header, since we need those to do a successful refresh, and if we can't, we should not construct a response that requires validation, and instead just go to the network "normally".

We could even purge such an expired resource that can't be successfully refreshed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmh, one issue here is that fetch doesn't have a hook to deal with the "could not select a stored response for update", as Step 7.4 seems to assume that a refresh is always a succesful operation.

I've filed whatwg/fetch#950

@bors-servo
Copy link
Contributor

☔ The latest upstream changes (presumably #24318) made this pull request unmergeable. Please resolve the merge conflicts.

@highfive highfive added the S-needs-rebase There are merge conflict errors. label Oct 8, 2019
@gterzian gterzian mentioned this pull request Jun 20, 2020
5 tasks
Copy link
Member

@gterzian gterzian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the easiest way to unblock this one, is to simply refresh the most recent entry and use it to construct a response.

@@ -721,6 +727,11 @@ impl HttpCache {
assert_eq!(response.status.map(|s| s.0), Some(StatusCode::NOT_MODIFIED));
let entry_key = CacheKey::new(&request);
if let Some(cached_resources) = self.entries.get_mut(&entry_key) {
trace!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I would say the easy way to fix the problem, is to choose the cached resources that is the most recent, and then select that one for update and construct a response from it.

This also matches the "weak validator" part of https://httpwg.org/http-core/draft-ietf-httpbis-cache-latest.html#rfc.section.4.3.4

We can do the strong validator checking and so on in a different PR, since the problem here was the cache "going back in time", choosing the most recent entry should fix it.

.as_ref()
.map_or(false, |s| s.0 == StatusCode::OK)
{
// Ensure that any existing complete response is overwritten by the new
Copy link
Member

@gterzian gterzian Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not overwrite entries, since it will make it harder to have a spec compliant cache(needs to support multiple entries, matching on Etag and so on).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-awaiting-review There is new code that needs to be reviewed. S-needs-rebase There are merge conflict errors. S-tests-failed The changes caused existing tests to fail.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reload goes back in time (http cache issue?)
6 participants