feat: add cache max-age of 5 seconds to ?live requests so request collapsing works #1656
I get the live clients connecting at the same time. Because they block on data and then reconnect after it arrives.
Trying to wrap my head around the scenario here. New data arrives within 5 seconds, the live request for one (or many) clients returns. A subsequent request gets that response, even if between the initial response and their request additional data has also arrived? So if you miss being in the collapse gang and data is arriving in sequence you could end up with 5 seconds latency (for weird / hard to debug reasons from a front end POV)?
If I understand correctly, the 5s TTL starts when the response is returned (either due to a message or the timeout), so if it were possible to have a 1ms TTL, that would also allow all connections within the 20s long poll to be collapsed. What's the shortest possible TTL that triggers request collapsing? This is very cool!
No, because you'd get a cache hit for the initial response and then the client would immediately join the new gang at the new offset and get collapsed into that request. It's possible that they missed several responses within a five second TTL window, but that would just mean they get the individual messages as a series of cached HTTP responses, which is very fast, so it's not the worst thing in the world (in a pretty unlikely edge case). In general, this follows our normal scheme where we trade off the possibility of needing to do a few HTTP requests in sequence so that we can have longer caches and increase the cache hit ratio.
I'm a bit fuzzy on this but I think Fastly, et al. have heuristics around request collapsing, so if they've seen similar URLs that have very short caches, they might pretty quickly stop collapsing requests. So e.g. a 1s cache might lead to later requests not being collapsed. Maybe. We could probably do more testing to figure out exactly how this works but it doesn't seem to matter too much as long as it satisfies the two considerations I outlined.
I see -- so the idea is that the live response is triggered when there is data, which means it always contains a new offset, so the client then makes a new request with the new offset. Just to sanity check in case it's a bug that breaks this assumption, do we only set a cache header on a non-empty live request? What happens in the event of a 20s timeout response?
We are but... it's already expired by then 😆 so it doesn't have any effect (other than to tell the CDN that they should keep collapsing requests to the origin).
Trying to twist my brain around that statement. I've also been reading the MDN max-age docs (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control#max-age) and the Fastly Age header docs (https://www.fastly.com/documentation/reference/http/http-headers/Age). Is it that the response has an age added because the CDN monitored the time it took to generate the response? My naive mental model is that the 20s timeout is our business: the origin generates a response with a max-age, and the CDN/client gets that with an age of 0 and thus caches it for 5 seconds. (On a separate note, I trust you that it's not, but if an empty live response were to be cached it would also trigger zero delay polling between the client and the CDN.)
Yeah, my mental model is that the max-age starts at the time the request is received at the CDN, not when the response finishes. That makes sense, as the origin might take 500ms to transmit the data, but the staleness clock starts ticking as soon as the origin starts transmitting the data, not when it happens to finish. Also nothing broke dramatically, i.e. no clients started furiously looping on the same cached response. I'll be doing more testing today so we'll see.
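The two mental models in this exchange (age counted from when the request starts versus age 0 when the response finishes) can be compared with a small sketch. This is only an illustration under assumptions: `LivePoll`, `is_fresh`, and the `count_from` switch are hypothetical names, and real CDNs may compute age differently.

```python
from dataclasses import dataclass


@dataclass
class LivePoll:
    started_at: float   # seconds: when the proxy forwarded the request to the origin
    finished_at: float  # seconds: when the origin response completed
    max_age: float      # seconds: value of the max-age directive


def is_fresh(poll: LivePoll, now: float, count_from: str) -> bool:
    """Return True if a proxy would still serve the cached response at `now`.

    count_from="start": age is measured from the start of the request.
    count_from="end":   age is measured from the end of the response.
    Both are simplified, hypothetical models of proxy behavior.
    """
    reference = poll.started_at if count_from == "start" else poll.finished_at
    return (now - reference) < poll.max_age


# A long poll that hit the 20s timeout, with a 5s max-age.
timeout_poll = LivePoll(started_at=0.0, finished_at=20.0, max_age=5.0)
```

With these numbers, a client re-polling just after the timeout (say at 20.1s) would miss the cache under the "start" model, since the response is already 20.1s old, but would hit it under the "end" model, where it is only 0.1s old.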
… to use for cache-busting (#1826)

This PR is a fix for inconsistencies in caching in HTTP proxying while clients are long-polling. It also adds `public` to our `cache-control` header, as that's required by some HTTP proxies in order to cache.

HTTP proxies don't treat the max-age in cache-control exactly the same way. Some start counting the age of the cache from the *beginning* of the request while others count from the *end* of the request. This inconsistency makes it difficult to reliably control caching and request collapsing behavior for long-polling requests.

My previous PR in this area, #1656, made request collapsing work nicely with proxies with the first behavior, as they'd collapse all requests between the start of a long-poll and the end of the max-age. And when the client went to request again after the long-poll had ended, the previous request cache had expired already, so a new request would get sent to the origin.

However, this approach caused issues with proxies with the second behavior: request collapsing would work, but when the client re-polled, the cache hadn't yet expired, so the client would go into an infinite loop requesting the same cached response over and over.

So this PR adds a `cursor` generated by the server that clients use as part of `live` requests. This skips past any cached responses from the previous live request (which, on proxies with the first behavior, would have expired already).

The cursor is generated by finding the next alignment boundary. I.e. if the timeout is 20 seconds (which it is now but this could change), we calculate the alignment boundary by taking the current unix timestamp, subtracting the Electric Epoch of October 9th, 2024, dividing by 20, rounding up, and then multiplying by 20 again. In practice this partitions caches for live requests for a given offset into 20 second windows.

---------

Co-authored-by: Stefanos Mousafeiris <msfstef@gmail.com>
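The alignment-boundary arithmetic described in this commit message might look roughly like the following. It's a sketch, not the server's actual code: `next_cursor` and `ELECTRIC_EPOCH` are made-up names, and midnight UTC is assumed for the epoch since the message only gives the date.

```python
from datetime import datetime, timezone

# Assumption: the Electric Epoch is taken as 2024-10-09 00:00:00 UTC.
# The commit message states only the date, not the time of day.
ELECTRIC_EPOCH = int(datetime(2024, 10, 9, tzinfo=timezone.utc).timestamp())


def next_cursor(now_unix: int, timeout_s: int = 20) -> int:
    """Round the seconds elapsed since the epoch up to a multiple of timeout_s."""
    elapsed = now_unix - ELECTRIC_EPOCH
    # Integer ceiling division: -(-a // b) equals ceil(a / b) for positive b.
    windows = -(-elapsed // timeout_s)
    return windows * timeout_s
```

Under these assumptions, any two live requests made inside the same 20 second window compute the same cursor, so they would share a cache key, while a re-poll in the next window gets a different cursor and bypasses the previous cached response.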
…ollapsing works (#1656)

We need a short max-age cache on `?live` responses so HTTP proxies will collapse long-polling requests.

The time is a bit arbitrary but two considerations:

- it's shorter than 20 seconds (our long-polling timeout), which is necessary as otherwise clients would just poll the cached response over and over until the cache expired.
- it's long enough to ensure the vast majority of clients all request within the same five second window. Live clients all get responses at the same time so all request again at the same time. So even accounting for world-wide spread of clients, five seconds should collect pretty much everyone.

There's a very slight chance that a new message could be returned within the 5 second timeout and then someone with `?live` gets a cached response. But that's not a big deal as then their next response gets collapsed again with other clients.
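The first consideration (max-age must stay below the long-polling timeout) can be expressed as a guard when building the header value. This is a hypothetical helper, not code from the PR: `live_cache_control` and its parameters are illustrative names, and the optional `public` flag reflects the directive that #1826 later added.

```python
LONG_POLL_TIMEOUT_S = 20  # long-polling timeout stated in the commit message
LIVE_MAX_AGE_S = 5        # max-age chosen in this PR


def live_cache_control(
    max_age: int = LIVE_MAX_AGE_S,
    timeout: int = LONG_POLL_TIMEOUT_S,
    public: bool = False,
) -> str:
    """Build a Cache-Control value for a ?live response (illustrative only)."""
    # A max-age at or above the timeout would let clients re-poll a cached
    # response repeatedly until it expired, which the PR wants to avoid.
    if not 0 < max_age < timeout:
        raise ValueError("max_age must be positive and shorter than the long-poll timeout")
    directives = ["public"] if public else []
    directives.append(f"max-age={max_age}")
    return ", ".join(directives)
```

Calling `live_cache_control()` with the defaults yields `max-age=5`; raising the max-age to 20 or more fails the guard.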