Status codes and caching #120

Closed
mnot opened this issue Jul 5, 2018 · 16 comments · Fixed by #241

Comments

@mnot
Member

@mnot mnot commented Jul 5, 2018

7234 says that a response can't be stored if the cache doesn't understand its status code.

Need to dig around in the history, but I wonder if that's right.

@mnot mnot added the caching label Jul 5, 2018
@MikeBishop
Contributor

@MikeBishop MikeBishop commented Jul 5, 2018

Particular status codes are defined as cacheable by default or not. If the status code is unknown, you can't know which it is. I could see an argument for permitting storage if the response included explicit cacheability information.

Status codes which are statements about the connection and not the resource (421, 502, etc.) should never be cacheable, but it appears that those simply omit the statement of cacheability by default and hope that servers don't include an explicit lifetime on the response.

We could probably craft something tighter here (MUST NOT include freshness on an uncacheable response, MUST only cache unknown status codes with explicit freshness), but avoiding the quagmire seems like a reasonable choice for past-us to have made.

@mnot
Member Author

@mnot mnot commented Jul 6, 2018

"cacheable by default" means that heuristic freshness can be applied, that's all. The intent was that any response with an extension status code can be cached if it has explicit freshness information, but the referenced text above leads the reader to believe otherwise.
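A minimal sketch of that intent, not taken from any implementation: explicit freshness makes a response storable whatever its status code, and the "cacheable by default" list only matters when no explicit freshness is present. The names `may_store`, `directive_names`, and `HEURISTICALLY_CACHEABLE` are hypothetical, header field names are assumed to be lowercased already, and other storage conditions (`private`, `Authorization`, method checks) are deliberately left out.

```python
from typing import Mapping

# Status codes that RFC 7231, Section 6.1 lists as cacheable by default,
# i.e. eligible for heuristic freshness.
HEURISTICALLY_CACHEABLE = frozenset(
    {200, 203, 204, 206, 300, 301, 404, 405, 410, 414, 501}
)


def directive_names(cache_control: str) -> set:
    """Return the lowercased directive names in a Cache-Control value."""
    return {
        part.partition("=")[0].strip().lower()
        for part in cache_control.split(",")
        if part.strip()
    }


def may_store(status: int, headers: Mapping[str, str]) -> bool:
    """Decide storability; `headers` is assumed to use lowercase field names."""
    names = directive_names(headers.get("cache-control", ""))
    if "no-store" in names:
        return False
    explicit = "expires" in headers or bool(names & {"max-age", "s-maxage"})
    # Explicit freshness is enough for any status code, known or not;
    # without it, only the "cacheable by default" codes qualify.
    return explicit or status in HEURISTICALLY_CACHEABLE
```

Under this reading, an unrecognised status such as 299 is stored when it arrives with `Cache-Control: max-age=60` and skipped when it arrives with no freshness information at all.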

@mnot
Member Author

@mnot mnot commented Jul 6, 2018

Background thread from bis.

Reflecting on this, I think we went the wrong way; deployed implementations do cache responses with explicit freshness, even without understanding the status code.

Furthermore, requiring deployed caches to understand a new status code before it gets cached creates a disincentive to use HTTP properly; new status codes don't get cached, so people will try to work around this by abusing existing ones, etc.

I'm starting to believe a better solution would be to require new status codes to be designed with this in mind (option a in the linked thread); i.e., they can't carry freshness information if they can't be cached. That prevents us from designing a new e.g., 304 or 206, but I tend to think that tradeoff is OK.

@mnot
Member Author

@mnot mnot commented Sep 5, 2018

Also affects here in 7231.

@mnot mnot added the discuss label Oct 9, 2018
@royfielding
Member

@royfielding royfielding commented Nov 5, 2018

I guess I fail to see the value in new status codes that can be cached by ignorant caches. I have no problem with discouraging the definition of new status codes, in general, and even less problem with caches being conservative when receiving the unknown.

What's the problem we are trying to fix?

@mcmanus

@mcmanus mcmanus commented Nov 6, 2018

in bkk: no conclusion reached

@mnot
Member Author

@mnot mnot commented Dec 18, 2018

7234 8.2.2 says:

The definition of a new status code ought to specify whether or not it is cacheable. Note that all status codes can be cached if the response they occur in has explicit freshness information; however, status codes that are defined as being cacheable are allowed to be cached without explicit freshness information.

(emphasis added)

... so right now the specification conflicts with itself.

The problem that we're trying to solve here is that many caches have a long deployment cycle, and having them align their behaviour upon a list of specified status codes rather than an in-protocol signal isn't great.

The only reason not to do this, AFAICT, is that a new status code with operation similar to 206 Partial Content would be cached incorrectly if the cache directives in it don't apply to the response it occurs within.

My reaction to that is that we shouldn't be designing new mechanisms like partial content that mess with HTTP so deeply; I hope we've learned by now that they don't work well.

However, if we really want to support that, we could do something like:

Cache-Control: max-age=60, ignore="206 499"

which says "this response is fresh for 60 seconds, but not if it occurs in a 206 or 499 response."

In the (unlikely) event that we want to introduce a new partial-like response status in the future, we just need to have those responses include this cache-control directive to explicitly mark that response as different.

That way, a generic cache could just look at cache-control to determine how to handle a response, without having to have knowledge of particular status codes.
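A rough sketch of how a generic cache might evaluate such a header, assuming the `ignore` directive works exactly as described above (it is only a proposal in this thread, not a registered directive). The helper names are hypothetical, and the parser splits on commas, so it would mishandle a quoted value that itself contains a comma.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive name: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = arg.strip().strip('"') if arg else None
    return directives


def freshness_lifetime(status: int, cache_control: str) -> Optional[int]:
    """Return max-age in seconds, or None if it does not apply to this status."""
    directives = parse_cache_control(cache_control)
    # Proposed semantics: status codes listed in `ignore` opt out of the
    # freshness information carried in the same header.
    ignored = set((directives.get("ignore") or "").split())
    if str(status) in ignored:
        return None
    max_age = directives.get("max-age")
    if max_age is not None and max_age.isdigit():
        return int(max_age)
    return None
```

With the header from the example, a 200 response would get a 60-second lifetime, while a 206 or 499 response would get none, all without the cache knowing what 499 means.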

@mnot
Member Author

@mnot mnot commented Dec 19, 2018

FWIW, current behaviours tested at:
https://cache-tests.fyi/#status

@mcmanus

@mcmanus mcmanus commented Mar 25, 2019

ietf104: discussion is around how quickly new status codes will be cacheable. Roy might be agreeable to allowing unknown status codes to be cached with a mechanism like the ignore directive above.

@mnot
Member Author

@mnot mnot commented Jul 5, 2019

Proposal:

All response status codes are cacheable if they have explicit freshness information. However, if a response carries Cache-Control: must-understand, a cache MUST NOT store a response unless it understands and implements the relevant semantics of the status code.

(It's not necessary to enumerate the status codes it applies to, because the party generating the status code -- whether the origin or an intermediary -- can append must-understand as appropriate.)
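One way to read the proposal as a storage check, sketched under assumptions: `UNDERSTOOD_STATUS_CODES` is a hypothetical stand-in for whatever status codes a given cache actually implements, `can_store` is an illustrative name, and only `max-age` / `s-maxage` are treated as explicit freshness (`Expires` and every other storage condition are left out for brevity).

```python
from typing import AbstractSet

# Hypothetical: the status codes this particular cache implements.
UNDERSTOOD_STATUS_CODES = frozenset(
    {200, 203, 204, 206, 300, 301, 404, 405, 410, 414, 501}
)


def can_store(
    status: int,
    cache_control: str,
    understood: AbstractSet[int] = UNDERSTOOD_STATUS_CODES,
) -> bool:
    """Apply the proposed rule to a single Cache-Control header value."""
    names = {
        part.partition("=")[0].strip().lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    # must-understand: refuse to store unless the status code is implemented.
    if "must-understand" in names and status not in understood:
        return False
    # Otherwise any status code is storable given explicit freshness.
    return bool(names & {"max-age", "s-maxage"})
```

Here a 499 response with `max-age=60` is stored by a cache that has never heard of 499, but the same response with `max-age=60, must-understand` is not; a 206 with `must-understand` is still stored because this cache lists 206 as understood.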

@mnot mnot added the discuss label Jul 5, 2019
@reschke
Contributor

@reschke reschke commented Jul 5, 2019

You mean "MUST NOT" store?

@mnot
Member Author

@mnot mnot commented Jul 5, 2019

Yes. Edited.

@reschke
Contributor

@reschke reschke commented Jul 5, 2019

So, in theory, this sounds good. But this is a new normative requirement, no?

@mnot
Member Author

@mnot mnot commented Jul 5, 2019

Yes. That's where discussion took us.

@reschke
Contributor

@reschke reschke commented Jul 5, 2019

Understood. FWIW, if we're ready to make new normative requirements here, then I'd argue that the discussion about handling quoted cache directive values is misleading; we could raise the bar for that as well (instead of weakening it).

Yes, meta-discussion.

@mnot
Member Author

@mnot mnot commented Sep 2, 2019

@reschke, that's a very different situation. Here, we only need to get a few implementations to change their behaviour (respecting the MUST NOT above), and there isn't any risk to making that change; it's not going to break any existing content, because sites won't yet be using the new protocol element, and it isn't going to be used until there is a new 206-like status code that needs it. The biggest potential downside here is that implementations might not respect must-understand, making the future deployment of a 206-like status code difficult (whenever, if ever, that happens).

OTOH, the quoted cache directives discussion is about getting many, many implementations to change their behaviour in a way that will change parsing several billion to trillion times a day. That has a fair amount of inherent risk attached to it, for very little reward (other than conforming to a model of parsing that doesn't match the deployed reality of the Web).


5 participants