Provisions for regional caching #93

Closed
skef opened this issue May 17, 2022 · 7 comments

@skef
Contributor

skef commented May 17, 2022

In many cases a cache hit on an augmentation set will be very unlikely. However, there are cases where caching, and in particular regional caching (e.g. Akamai), becomes not just relevant but important to performance: for example, when the initial subset is for the home page of a media service and the first augmentation is for the top article, or when the initial subset is for a company's home page and the first augmentation is for a popular tab or for the first area added dynamically when scrolling.

Whether a request can be cached by existing services often depends on the request method (GET may be required) and the URL length. At present the spec is flexible about the request method, and while the parameters are compressed there are no length guarantees, and in some cases the codepoints and indices will be fragmented in a way that takes more bytes to specify.

Additionally, any requirements for caching will typically be understood by the server rather than the client.

One way of addressing this need is to allow the server to respond to a GET or POST request with a different, cache-compatible URL to be used for the actual download of subset or augmentation data. This is similar to a temporary redirect, and in fact one of the existing redirect codes may suffice for this purpose (I unfortunately don't know off-hand).

(This alternative URL could contain a hash value generated from a canonicalized representation of the request, with the map stored on the server side so that the response can be regenerated on a cache miss. That possibility helps illustrate how the mechanism could work but I see no need to constrain the server-side implementation in the spec. Adobe's system has, at times, used a system like this to remain regional-cache-compatible.)
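
As a rough sketch of how such a mapping might work on the server side (the helper names, cache host, and parameter layout below are purely illustrative, not part of the proposal):

```python
# Illustrative sketch only: derive a stable, cache-friendly URL from a
# canonicalized representation of the request and remember the mapping so
# the origin can regenerate the response on a cache miss. All names here
# (canonicalize, REQUEST_MAP, the cache host) are hypothetical.
import hashlib
import json

REQUEST_MAP = {}  # hypothetical server-side store: digest -> canonical request


def canonicalize(request_params: dict) -> str:
    # Sort keys and normalize codepoint lists so that equivalent requests
    # produce byte-identical representations.
    norm = dict(request_params)
    if "codepoints" in norm:
        norm["codepoints"] = sorted(set(norm["codepoints"]))
    return json.dumps(norm, sort_keys=True, separators=(",", ":"))


def cacheable_url(request_params: dict) -> str:
    canonical = canonicalize(request_params)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]
    REQUEST_MAP[digest] = canonical  # allows regeneration on a cache miss
    return f"https://cache.example.com/ift/{digest}"
```

The server would answer the original request with a temporary redirect to a URL like this; the regional cache then serves repeat requests for the same canonical parameters without touching the origin.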

I suggest adding a section to the specification that addresses regional caching along the lines above. If we're confident that one of the existing redirect codes suffices then that section might just be informational. If we are not confident then a modest extension to the protocol may be needed.

@garretrieger
Contributor

garretrieger commented May 21, 2022

Thanks, this is a use case that we haven't put a lot of thought into yet. I have a concern with the redirection approach, though, as it introduces an additional round trip, which can be quite costly for latency. Also, we've taken great pains to make the protocol completely stateless (that is, a server isn't required to store state to provide a conforming implementation, though it may keep state if it wants). I'll need to think about this some more and see if there's a way to better accommodate first-request caching without introducing a redirect.

If we do decide to allow for caching via a redirect, then I think we could just update the specification to say that it's allowable for the server to return an HTTP redirect as long as the URL it redirects to contains a valid response, but otherwise make no requirements on how the redirected URL works (for example, the identifiers used in the cache URL).

@garretrieger
Contributor

Ah, the spec already talks about how redirects are handled in https://w3c.github.io/IFT/Overview.html#invalid-server-response. That is, it allows redirects from the server.

So doing caching by redirecting to a cacheable URL is currently supported.

@skef skef mentioned this issue Jun 28, 2022
@skef
Contributor Author

skef commented Jun 28, 2022

I was asked to add more detail to this.

Support for caching may be more important for larger-scale services than some might expect. Adobe's system has been overhauled several times specifically to increase the amount of caching and probably will be overhauled again in the future.

I think there are three general areas of support for caching, the first being what has already been discussed:

  1. Redirects to cached or cacheable URLs. A facility built on redirects has the advantage of server-side logic, which may include "super-setting" (including more code points, and perhaps more features, than strictly requested to reduce both the frequency of subsequent requests and the number of cached files). It has the disadvantages of an extra round trip and server-side state of some duration.
  2. Canonical hash of the parameters as an extra URL parameter: Some caching services can respond directly based on a subset of URL parameters specified by their "client" -- which in this case is the operator of the server. If a hash of the subset parameters were added to every request, it could be used as the cache key (see the sketch after this list). This has the advantage of avoiding the round trip but the disadvantage of only partial support for "super-setting" -- the server side can reduce the frequency of subsequent requests, but requests with "common supersets" are not folded under the same cache key. I'm also not sure it's currently possible to do URL-key-based caching of POST requests, even if the key is always in the URL.
  3. A-la-carte IFT: Let the server influence the client with JavaScript code to pick which IFT features it wants to use and which it wants to do itself, or to influence the request (e.g. with client-side "super-setting"). I've filed issue #103 (A-la-carte IFT) on this subject.
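
To make option 2 concrete, here is a minimal sketch of the kind of client-side hashing I have in mind; the "pk" parameter name and the helpers are hypothetical, not something from the spec:

```python
# Minimal sketch (hypothetical): append a canonical hash of the subset
# parameters to the request URL so a caching service can be configured to
# key on that single query parameter.
import hashlib
import json
from urllib.parse import urlencode


def params_key(subset_params: dict) -> str:
    canonical = json.dumps(subset_params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def request_url(base: str, subset_params: dict) -> str:
    # The server operator configures the CDN to use only "pk" as the cache key.
    return f"{base}?{urlencode({'pk': params_key(subset_params)})}"
```

Two requests with identical parameters share a cache key, but a request for a superset hashes differently, which is the "common supersets are not folded" limitation mentioned above.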

@garretrieger
Contributor

Some good news on this front: I was recently pointed towards a new HTTP method, QUERY (https://httpwg.org/http-extensions/draft-ietf-httpbis-safe-method-w-body.html), which works much better for caching than GET/POST. It encodes the request data like POST but retains the cacheability of GET requests.

I'd like to investigate it a bit more, but I'm strongly considering changing the specification to either use QUERY exclusively or at least make it the recommended method, with GET/POST as fallbacks.

The QUERY spec specifically recommends that normalization be applied to the request body to generate the cache key (https://httpwg.org/http-extensions/draft-ietf-httpbis-safe-method-w-body.html#section-2.1). I think this would solve many of the problems you mentioned. For example, normalization could upgrade the requested codepoint set to a cacheable superset that the backend server is known to serve.
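
As a rough illustration of that kind of normalization (the superset table and helper below are invented for the example, not taken from the QUERY draft or the IFT spec):

```python
# Hypothetical normalization step: round a requested codepoint set up to one
# of a small number of precomputed supersets that the backend is known to
# serve, so that many distinct requests share one cache entry.
SUPERSETS = [
    frozenset(range(0x0020, 0x0080)),                                     # Basic Latin
    frozenset(range(0x0020, 0x0080)) | frozenset(range(0x00A0, 0x0100)),  # + Latin-1
]


def normalize_codepoints(requested: set) -> frozenset:
    for superset in SUPERSETS:  # ordered from smallest to largest
        if requested <= superset:
            return superset
    return frozenset(requested)  # fall back to the exact request
```

Requests for {0x61, 0x62, 0x63} and {0x78, 0x79, 0x7A} would both normalize to the Basic Latin superset and therefore hit the same cache entry even though the raw request bodies differ.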

For the three options you mentioned:

  1. Should be supported in the spec currently.
  2. Not currently supported by the spec, but something that we could definitely consider adding. If we switch to QUERY, that will solve the POST problem.
  3. As mentioned in the a-la-carte issue, a JavaScript API is out of scope for this spec, but it is likely something we want as a separate specification.

@skef
Contributor Author

skef commented Jul 1, 2022

Some good news on this front: I was recently pointed towards a new HTTP method, QUERY

That does look good!

@svgeesus
Contributor

Tagging @martinthomson

@garretrieger
Contributor

The current plan for enabling caching for patch subset will be either:

One related development is the investigation I made into using precompressed brotli metablocks to partially cache portions of the font.

I'm going to close this issue for now as the remaining work will be tracked in the QUERY issue (#127), but please reopen if you think additional changes are needed.
