Skip to content
This repository has been archived by the owner on Dec 3, 2019. It is now read-only.

Revalidation is "EXPIRED" when there is no precondition #15

Open
SleeplessByte opened this issue Jul 14, 2018 · 6 comments
Open

Revalidation is "EXPIRED" when there is no precondition #15

SleeplessByte opened this issue Jul 14, 2018 · 6 comments

Comments

@SleeplessByte
Copy link

Hi again,

When you hit the revalidation code, the following rules are applied:
https://github.com/aw/CacheRules/blob/master/lib/cache_rules.rb#L127-L133

Which internally calls:
https://github.com/aw/CacheRules/blob/master/lib/helpers.rb#L365-L369

Now I could not find such mandatory. This might be a design choice in your library, but I don't think a Gateway Timeout is appropriate for all cases here.

Let's consider the case of a simple must-revalidate request.

< Cache-Control: must-revalidate, max-age=60
< Date: Fri, 13 Jul 2018 16:40:00 +0000
< HTTP 200 Ok

Meaning, fresh for 60 seconds, MUST NOT use stale when it has expired. If requested past 16:41, it SHOULD just retry the request. In this case, because no ETag or Last-Modified is present in the cached response, nor is there a If-None-Match in the request, it gives us a 504, but we have not even tried to reach the origin server.

I think you implemented it as such because of https://tools.ietf.org/html/rfc7234#section-4.3.1 where it says

   When sending a conditional request for cache validation, a cache
   sends one or more precondition header fields containing validator
   metadata from its stored response(s), which is then compared by
   recipients to determine whether a stored response is equivalent to a
   current representation of the resource.

However, when you don't have these headers, you would not send a conditional request, but a regular one. This is how both Chrome and Firefox have implemented it. It is mentioned in the mozilla docs: It is either validated or fetched again.

Because of the careful wording in the RFC, and not using a capitalized MUST/SHOULD in this paragraph, I believe you must always try to revalidate in the flow, regardless of the presence of the preconditions. It becomes, semantically, a conditional request if one of the headers is present, but otherwise it's a regular fetch request (and will always return a non-304 result).


Posted the RFC entry just for ease. The other mentions are only "triggering" extra invalidation/rules, but nothing says anything about an ETag / Last-Modified being mandatory.

https://tools.ietf.org/html/rfc7234#section-5.2.2.1   

   The "must-revalidate" response directive indicates that once it has
   become stale, a cache MUST NOT use the response to satisfy subsequent
   requests without successful validation on the origin server.

   The must-revalidate directive is necessary to support reliable
   operation for certain protocol features.  In all circumstances a
   cache MUST obey the must-revalidate directive; in particular, if a
   cache cannot reach the origin server for any reason, it MUST generate
   a 504 (Gateway Timeout) response.

   The must-revalidate directive ought to be used by servers if and only
   if failure to validate a request on the representation could result
   in incorrect operation, such as a silently unexecuted financial
   transaction.
@SleeplessByte
Copy link
Author

Related to #13, which I think is incorrect based on the spec and browser implementation :)

@aw
Copy link
Owner

aw commented Jul 15, 2018

So basically, i'm just missing a check for must-revalidate in the Cache-Control header?

I appreciate the amount of detail you provide in your issues, but it confuses me even more. Can you just tell me what's the problem? Thanks.

@aw
Copy link
Owner

aw commented Jul 16, 2018

I believe you must always try to revalidate in the flow, regardless of the presence of the preconditions

I think this is incorrect. There's no way to revalidate if you don't have either a strong or weak validator, i.e: If-None-Match, ETag, or Last-Modified. How can the server return a cached response if there's nothing to revalidate from? It's missing information. I think this applies even if must-revalidate is supplied when there's no strong/weak validator.

Even the Mozilla docs state:

"Validation can only occur if the server provided either a strong validator or a weak validator."

@SleeplessByte
Copy link
Author

SleeplessByte commented Jul 16, 2018

@aw of course. This one confused me a lot as well.

The issue here is that you can not [re]validate without preconditions and we both agree on this and the docs do as well.

The directive "must-revalidate", I believe, means something different than the action "revalidate". The former means, as per the spec, you may not use a stale response and is must validate. Now, when it can't send a conditional request (no if-none-match / if-not-modified since), implementors actually fetch, instead of 504. Therefore I think that the following from Jake Archibald is correct:

First he talks about the wording. You have actually implemented no-cache, no-store as he describes:

Note: no-cache doesn't mean "don't cache", it means it must check (or "revalidate" as it calls 
it) with the server before using the cached resource. no-store tells the browser not to cache it 
at all. Also must-revalidate doesn't mean "must revalidate", it means the local resource can 
be used if it's younger than the provided max-age, otherwise it must revalidate. Yeah. I know.

Then he goes on to explain what "must revalidate" actually is:

In this pattern you can add an ETag (a version ID of your choosing) or Last-Modified date header
to the response. Next time the client fetches the resource, it echoes the value for the content
it already has via If-None-Match and If-Modified-Since respectively, allowing the server to say
"Just use what you've already got, it's up to date", or as it spells it, "HTTP 304".

If sending ETag/Last-Modified isn't possible, the server always sends the full content.

https://jakearchibald.com/2016/caching-best-practices/

Note the can and not must, as well as the final line, which is exactly how Chromium, Mozilla handle it. If it finds must-revalidate and it doesn't have a precondition, it doesn't return 504. It just acts as if it was not cached at all.

Then we get back to your final statement: it only occurs provided a validator. This is then true. It does a "GET" to the origin otherwise. Therefore I propose to always make that request, as a GET, adding the preconditions, when you can (switching to a HEAD if you'd like, but this is just an optimization of course :)).

@aw
Copy link
Owner

aw commented Jul 16, 2018

@SleeplessByte

The directive "must-revalidate", I believe, means something different than the action "revalidate"

Ahhhh, well.. yeah! It's nice to have multiple pairs of eyes looking at this HTTP caching. Thanks for the link!

@SleeplessByte
Copy link
Author

tbh I think you did a stellar job. It's much easier to look for discrepancies than figuring out a proper implementation.

That said, I'll keep going :p

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants