Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upHttp cache: introduce cache entries, prevent cache misses when resource will be fetched #24203
Conversation
highfive
commented
Sep 12, 2019
|
Heads up! This PR modifies the following files:
|
7986748
to
d6ea5dc
|
@bors-servo try=wpt |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
|
|
|
The big comment is that this looks like it's structured differently from the spec, which (IIRC) leaves all caching to Also, it would be nice if a |
| @@ -224,6 +225,13 @@ pub fn main_fetch( | |||
| // Step 11. | |||
| // Not applicable: see fetch_async. | |||
|
|
|||
| // Get the cache entry corresponding to the url, after potential hsts switch. | |||
| let cache_entry = if let Ok(mut http_cache) = context.state.http_cache.write() { | |||
| http_cache.get_entry(&request) | |||
This comment has been minimized.
This comment has been minimized.
asajeffrey
Sep 12, 2019
Member
Hmm, is this spec-complaint? Shouldn't anything that touches the cache be in http_network_or_cache_fetch?
This comment has been minimized.
This comment has been minimized.
gterzian
Sep 13, 2019
•
Author
Member
Strictly-speaking, I think it's ok, since at this point we're not actually running any specified cache related logic, we're just getting the entry corresponding to the current request, later to be used to check the cache. So basically the "entry" is the interface to the "cache" from the spec perspective, and it will be used later in http_network_or_cache_fetch.
I think the spec in this case is not really providing the full algorithm, rather describing the outcome from a black-box perspective.
For example, see the diagram for the state-machine of the Chromium cache, little of which shows up in the spec: https://www.chromium.org/developers/design-documents/network-stack/http-cache
Same thing goes for the Gecko cache: https://developer.mozilla.org/en-US/docs/Mozilla/HTTP_cache#Implementation (more specifically, see their CacheEntry description).
If anything, ours matches the spec most closely, and yet ours seems to be the most naive implementation of them all.
This comment has been minimized.
This comment has been minimized.
asajeffrey
Sep 16, 2019
Member
I'd still prefer a solution that kept all the caching logic inside http_network_or_cache_fetch, I find it confusing moving some bits of caching out. What's the motivation for doing that?
| response: response, | ||
| needs_validation: has_expired, | ||
| }; | ||
| if cached_resource.aborted.load(Ordering::Relaxed) { |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
gterzian
Sep 13, 2019
•
Author
Member
The other parts of this workflow also use Relaxed, and we could change that in this PR as well in another commit, see
servo/components/net/fetch/methods.rs
Line 480 in fb5a121
This comment has been minimized.
This comment has been minimized.
| response: response, | ||
| needs_validation: has_expired, | ||
| }; | ||
| if cached_resource.aborted.load(Ordering::Relaxed) { | ||
| return None; |
This comment has been minimized.
This comment has been minimized.
asajeffrey
Sep 12, 2019
Member
Shouldn't we construct a new CachedResponse at this point and return it, rather than returning None?
This comment has been minimized.
This comment has been minimized.
gterzian
Sep 13, 2019
Author
Member
This is used inside a while loop at https://github.com/servo/servo/pull/24203/files#diff-e980a3e3bf7e4cb8a8d3dda60ce7a408R770
This comment has been minimized.
This comment has been minimized.
gterzian
Sep 13, 2019
Author
Member
Actually I realize this change was left over from #22813, and I'm not sure if its' really necessary. Basically there is no need to make this return an Option<CacheResource>, based on the check for aborted, or at least it's not solving any known problem.
This comment has been minimized.
This comment has been minimized.
gterzian
Sep 13, 2019
•
Author
Member
On second-thoughts, I think this is a change that makes sense, since we shouldn't construct an aborted response, and I should have extracted it from the pervious PR at the time.
So I switched to release/acquire for that flow.
| @@ -563,6 +580,132 @@ impl HttpCache { | |||
| } | |||
| } | |||
|
|
|||
| /// Get the cache entry corresponding to this request | |||
| pub fn get_entry(&mut self, request: &Request) -> Option<HttpCacheEntry> { | |||
This comment has been minimized.
This comment has been minimized.
asajeffrey
Sep 12, 2019
Member
Slightly annoyingly that means we need a write lock just to read the cache :( We may as well make it a Mutex rather than a RWLock.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
gterzian
Sep 13, 2019
Author
Member
How about we just keep it a RwLock for now, since everything else on HttpState uses one?
servo/components/net/http_loader.rs
Line 64 in 9138d1d
This comment has been minimized.
This comment has been minimized.
| let entry = self | ||
| .entries | ||
| .entry(entry_key.clone()) | ||
| .or_insert(HttpCacheEntry::new(entry_key)); |
This comment has been minimized.
This comment has been minimized.
asajeffrey
Sep 12, 2019
Member
Interesting that invalidation can end up inserting an entry into the cache.
This comment has been minimized.
This comment has been minimized.
gterzian
Sep 13, 2019
Author
Member
Yes, I think we actually want to return if there is no corresponding entry :)
| /// if a pending network fetch has already begun, | ||
| /// and a pending response has been stored in the cache, | ||
| /// or if there is no pending network fetch. | ||
| is_ready_to_construct: Arc<(Mutex<bool>, Condvar)>, |
This comment has been minimized.
This comment has been minimized.
asajeffrey
Sep 12, 2019
Member
Can we use futures rather than a condvar here? Essentially we're building a Future<Option<CachedResource>> here.
This comment has been minimized.
This comment has been minimized.
gterzian
Sep 13, 2019
•
Author
Member
I think that would require moving all of fetch to futures/tasks, since a fetch currently runs on a thread, and in such a setup its more reliable to do this with condvars and locks.
There is an issue at #22813
|
Ok lots of |
On that, with the current changes, there are in fact two separate "wait" workflows:
If by "joining" you mean actually sending the intermediary steps on the |
|
@bors-servo try=wpt |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
|
|
|
@bors-servo try=wpt |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
|
|
|
@bors-servo try=wpt (100ms timeout) |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
|
|
|
@bors-servo try=wpt (increase timeout from 100 to 500ms) |
Http cache: introduce cache entries, prevent cache misses when resource will be fetched <!-- Please describe your changes on the following line: --> FIX #24166 --- <!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: --> - [ ] `./mach build -d` does not report any errors - [ ] `./mach test-tidy` does not report any errors - [ ] These changes fix #___ (GitHub issue number if applicable) <!-- Either: --> - [ ] There are tests for these changes OR - [ ] These changes do not require tests because ___ <!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.--> <!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. --> <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/24203) <!-- Reviewable:end -->
|
Ok now its really ready for another round of review @asajeffrey |
|
Oh One way we could deal with that is change how prefetching is done, so that rather than having it just treated like a regular cache access, we treat the first request for the resource as being a cache hit, even if the resource didn't have cache metadata? This does worry me a bit, as it's not clear whether that's spec compiliant, but it would get this test to pass, and would improve pref in cases where we prefetched a resource that turned out to have no cache metadata. I'll read over the latest version. |
|
OK, I'm still confused about why we're moving caching logic out of |
| @@ -224,6 +225,13 @@ pub fn main_fetch( | |||
| // Step 11. | |||
| // Not applicable: see fetch_async. | |||
|
|
|||
| // Get the cache entry corresponding to the url, after potential hsts switch. | |||
| let cache_entry = if let Ok(mut http_cache) = context.state.http_cache.write() { | |||
| http_cache.get_entry(&request) | |||
This comment has been minimized.
This comment has been minimized.
asajeffrey
Sep 16, 2019
Member
I'd still prefer a solution that kept all the caching logic inside http_network_or_cache_fetch, I find it confusing moving some bits of caching out. What's the motivation for doing that?
| @@ -563,6 +580,132 @@ impl HttpCache { | |||
| } | |||
| } | |||
|
|
|||
| /// Get the cache entry corresponding to this request | |||
| pub fn get_entry(&mut self, request: &Request) -> Option<HttpCacheEntry> { | |||
This comment has been minimized.
This comment has been minimized.
I understand your concern, and I think the proposed changes actually fit in with your thinking, for the following reasons:
In terms of the motivations of the proposed approach, they are:
Please let me know what you think. |
…r clean-ups
|
OK, we appear to be at a bit of a stalemate here, about whether all caching logic should be in |
|
In favor of #24318 |

gterzian commentedSep 12, 2019
•
edited by SimonSapin
FIX #24166
./mach build -ddoes not report any errors./mach test-tidydoes not report any errorsThis change is