Caching considerations #1552

Closed
mnot opened this issue Jun 16, 2021 · 4 comments · Fixed by #1737

@mnot
Member

mnot commented Jun 16, 2021

We should figure out/document:

  1. Establishing the cache key - a cache needs to know how to compute a cache key for the request. The obvious (and very inefficient) way to do this is to use the entire request body as part of the key.

  2. Request canonicalisation - we should allow caches with knowledge of canonicalisation algorithms for a given format to use them.

  3. Indicating canonical form in requests - if a client (UA or intermediary) canonicalises, it would be nice if they could assert that in the request, so a downstream cache can assume it's there and just look for a byte-for-byte match, rather than re-canonicalising. This would allow a client to manufacture cache misses, but that's already possible in URLs...

  4. Indicating cache key in requests - E.g., a hash of the request body. Needs security analysis, though.

  5. Indicating cache key in responses - If the server can instruct caches as to what future requests would match it, that would be very advantageous. This necessarily requires knowledge of the request format, but we could define common ones for URL-encoded forms, XML, and JSON.
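
To make points (1) and (2) concrete, here is a minimal sketch of a cache key that covers the request body, with an optional canonicalisation step applied first. The names `cache_key` and `canonicalise_json` are hypothetical, and the JSON normalisation shown (sorted keys, no insignificant whitespace) is only an illustrative stand-in, not a standardised canonical form.

```python
import hashlib
import json
from typing import Callable, Optional


def canonicalise_json(body: bytes) -> bytes:
    """Naive JSON normalisation: sorted keys, compact separators.

    Assumption: good enough to illustrate the idea; it does not handle
    number formatting or duplicate keys the way a real scheme would.
    """
    return json.dumps(
        json.loads(body), sort_keys=True, separators=(",", ":")
    ).encode("utf-8")


def cache_key(
    url: str,
    content_type: str,
    body: bytes,
    canonicalise: Optional[Callable[[bytes], bytes]] = None,
) -> str:
    """Derive a cache key from the URL, media type and request body."""
    if canonicalise is not None:
        body = canonicalise(body)
    h = hashlib.sha256()
    for part in (url.encode("utf-8"), content_type.lower().encode("utf-8"), body):
        # Length-prefix each part so different splits cannot collide.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()
```

Without a canonicaliser, two bodies that differ only in key order or whitespace produce different keys (a cache miss); with one, they collapse to the same key. Hashing the body keeps the key small, but the cache still has to read the whole body to compute it.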

@martinthomson
Contributor

I don't think that (4) is generally feasible. There might be cases where a cache trusts other entities sufficiently that a hash (Digest field?) can be used, but it will need to work this out for itself. This is probably the big cost associated with caching this method.

As for the general case of c14n, it's unfortunate that this is necessary, but it is probably better than relying on having some shared understanding of an abstract model for the resource representation.

I would not build this on any assumption that XML c14n is possible. A complete solution there is likely impossible. I know less about the warts of JSON, but even that is entering into dangerous territory. You might get more traction with schema-aware c14n with the understanding that it is not generally applicable to any document. That is, define tools so that specific XML- or JSON-based formats can opt in to the use of c14n. People might then apply that c14n if the format allows it; people might also apply the c14n if the format does not, but we have not promised them anything and so guarantees might not apply.
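
One way to picture the opt-in approach is a registry in which only formats that have declared a canonical form get canonicalised, and everything else is compared byte for byte. This is a rough sketch under that assumption; the media type `application/example-query+json`, the registry and the function names are all hypothetical.

```python
import json
from typing import Callable, Dict

# Media types that have explicitly opted in to canonicalisation.
CANONICALISERS: Dict[str, Callable[[bytes], bytes]] = {}


def opt_in(media_type: str):
    """Decorator that registers a canonicaliser for one specific format."""
    def register(fn: Callable[[bytes], bytes]) -> Callable[[bytes], bytes]:
        CANONICALISERS[media_type.lower()] = fn
        return fn
    return register


@opt_in("application/example-query+json")  # hypothetical JSON-based format
def _canon_example_query(body: bytes) -> bytes:
    # Assumption: this format declares key order and whitespace insignificant.
    doc = json.loads(body)
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def normalise_for_cache(content_type: str, body: bytes) -> bytes:
    """Canonicalise only opted-in formats; otherwise keep the exact bytes."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    fn = CANONICALISERS.get(media_type)
    if fn is None:
        return body  # no promise made for this format
    try:
        return fn(body)
    except ValueError:
        return body  # malformed content: fall back to byte-for-byte matching
```

Generic `application/json` is deliberately left unregistered here: a cache that canonicalises it anyway does so without any guarantee from the format.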

@mnot
Member Author

mnot commented Jun 16, 2021

Nod. There's going to be a tradeoff between the amount of knowledge/processing necessary for c14n and cache efficiency. It might be good to explore what conventions a format needs to follow to make it easily canonical; I can't help but think that most use cases for this are going to be defining a new query language, not reusing an existing document or even data format.

@ioggstream
Contributor

ioggstream commented Jun 23, 2021

I think that caching is a "must have", though we could probably just provide minimal hints and leave clients the onus of hitting the cache (e.g. clients c14n-izing requests in some way may increase cache hits).

  1. I wouldn't mess with the request media type (JSON? XML? x-www-form-urlencoded? ...) and would just use the content checksum (e.g. Content-Digest).

  2. I think that supporting media-type-aware caches is really hard to implement (e.g. see RFC 8785 and RFC 7493, and that is only for JSON): there's room for experimental specs though...

  3. That's OK, though see (2): supporting media types is hard.

  4. Agree wrt Security Considerations

  5. If we want to support caching based on a response cache key, this could be implemented, with some caveats, without re-sending the content at all:

    • authentication information should be processed;
    • content MUST NOT contain authnz information.
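
The checksum approach from (1) could look roughly like the sketch below. It assumes a SHA-256 Content-Digest value in the `sha-256=:<base64>:` form; the function names and the `trusted_sender` flag are hypothetical. It also reflects the concern raised earlier in the thread: unless the cache trusts the sender, it has to recompute the digest before using it as a key.

```python
import base64
import hashlib
import hmac


def content_digest(body: bytes) -> str:
    """Format a SHA-256 digest of the content as a Content-Digest value."""
    digest = hashlib.sha256(body).digest()
    return "sha-256=:" + base64.b64encode(digest).decode("ascii") + ":"


def key_from_digest(claimed: str, body: bytes, trusted_sender: bool = False) -> str:
    """Use the claimed digest as a cache key component.

    Assumption: an untrusted sender's digest is always re-verified, which
    means the cache still has to read and hash the full content.
    """
    if not trusted_sender:
        expected = content_digest(body).encode("ascii")
        if not hmac.compare_digest(claimed.encode("utf-8"), expected):
            raise ValueError("Content-Digest does not match the request content")
    return claimed
```

Skipping verification for a trusted sender is where the efficiency gain would come from, and also where a poisoned cache entry could come from, so that choice belongs in the security analysis.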

mnot self-assigned this Aug 4, 2021
@reschke
Contributor

reschke commented Aug 23, 2021

Caching will be tricky for "similar" requests sent by different clients.

On the other hand, caching could be simple for requests that are repeated by the same client (like in a refresh operation). For this use case, we just need a way for the server to also point to a GETtable resource (maybe with a lifetime). If we solved that problem, would we still need full cacheability?
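
As a rough in-memory sketch of that same-client case: the server stores the result of a query under a freshly generated path, hands that path back with the response, and answers later GETs from the store until the lifetime runs out. The class name, the `/results/` prefix and the default lifetime are all hypothetical, and how the path would be conveyed to the client is left open here.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class ResultStore:
    """Holds query results at GETtable paths for a limited lifetime."""

    def __init__(self, lifetime_seconds: float = 60.0) -> None:
        self.lifetime = lifetime_seconds
        # path -> (expiry timestamp, stored result)
        self._results: Dict[str, Tuple[float, bytes]] = {}

    def publish(self, result: bytes, now: Optional[float] = None) -> str:
        """Store a result and return the path the client can GET later."""
        now = time.time() if now is None else now
        path = "/results/" + secrets.token_urlsafe(16)
        self._results[path] = (now + self.lifetime, result)
        return path

    def get(self, path: str, now: Optional[float] = None) -> Optional[bytes]:
        """Return the stored result, or None if unknown or expired."""
        now = time.time() if now is None else now
        entry = self._results.get(path)
        if entry is None or entry[0] <= now:
            self._results.pop(path, None)  # drop expired entries
            return None
        return entry[1]
```

A refresh then becomes an ordinary GET on the returned path, which existing caches already know how to handle; once the entry expires, the client falls back to re-sending the original request.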
