feat: add cache max-age of 5 seconds to ?live requests so request collapsing works #1656

Merged: KyleAMathews merged 3 commits into main from live-cache on Sep 10, 2024


@KyleAMathews KyleAMathews commented Sep 9, 2024

We need a short max-age cache on ?live responses so that HTTP proxies will collapse long-polling requests.

The exact value is a bit arbitrary, but it satisfies two considerations:

  • It's shorter than 20 seconds (our long-polling timeout). This is necessary because otherwise clients would just poll the cached response over and over until the cache expired.
  • It's long enough to ensure that the vast majority of clients request within the same five-second window. Live clients all get responses at the same time, so they all request again at the same time. Even accounting for a worldwide spread of clients, five seconds should collect pretty much everyone.

There's a very slight chance that a new message could be returned within the 5 second timeout and then someone with ?live gets a cached response. But that's not a big deal as then their next response gets collapsed again with other clients.
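
A minimal sketch of the header choice described above, assuming a hypothetical helper `cache_control_for` and a placeholder `default_max_age_s` for non-live requests (neither name comes from this PR, and the real header may carry other directives). It only illustrates the constraint that the live max-age has to stay below the long-poll timeout:

```python
LONG_POLL_TIMEOUT_S = 20  # long-polling timeout stated in the PR description
LIVE_MAX_AGE_S = 5        # max-age chosen in this PR


def cache_control_for(is_live: bool, default_max_age_s: int = 60) -> str:
    """Return a Cache-Control value for a response.

    `default_max_age_s` is a placeholder: the PR does not say what
    non-live responses use.
    """
    if is_live:
        # The live max-age must stay below the long-poll timeout, otherwise
        # clients would keep re-reading the cached response until it expired.
        if LIVE_MAX_AGE_S >= LONG_POLL_TIMEOUT_S:
            raise ValueError("live max-age must be shorter than the long-poll timeout")
        return f"max-age={LIVE_MAX_AGE_S}"
    return f"max-age={default_max_age_s}"


print(cache_control_for(is_live=True))
```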

netlify Bot commented Sep 9, 2024

Deploy Preview for electric-next ready!

Name Link
🔨 Latest commit 9f5bc19
🔍 Latest deploy log https://app.netlify.com/sites/electric-next/deploys/66df5acf4889160008e2dc67
😎 Deploy Preview https://deploy-preview-1656--electric-next.netlify.app
@KyleAMathews KyleAMathews changed the title feat: add cache max-age of 5 seconds to ?live requests so request collapsing works feat: add cache max-age of 5 seconds to ?live requests so request collapsing works Sep 9, 2024

thruflo commented Sep 10, 2024

I get the live clients connecting at the same time. Because they block on data and then reconnect after it arrives.

> There's a very slight chance that a new message could be returned within the 5 second timeout and then someone with ?live gets a cached response.

Trying to wrap my head around the scenario here. New data arrives within 5 seconds, the live request for one (or many) clients returns. A subsequent request gets that response, even if between the initial response and their request additional data has also arrived?

So if you miss being in the collapse gang and data is arriving in sequence you could end up with 5 seconds latency (for weird / hard to debug reasons from a front end POV)?

samwillis commented

If I understand correctly the 5s TTL starts when the response is returned (either due to a message or the timeout), and so if it was possible to have a 1ms TTL that would also allow all connections within the 20s long poll to be collapsed.

what's the shortest possible TTL that triggers request collapsing?

this is very cool!

KyleAMathews commented

> So if you miss being in the collapse gang and data is arriving in sequence you could end up with 5 seconds latency

No because you'd get a cache hit for the initial response and then the client would immediately join the new gang at the new offset and get collapsed into that request. It's possible that they missed several responses within a five-second TTL window, but that would just mean they fetch those messages one at a time from a series of cached HTTP responses, which is very fast. So it's not the worst thing in the world (and a pretty unlikely edge case).

In general, this follows our normal scheme, where we accept the possibility of a few sequential HTTP requests so that we can have longer caches and a higher cache hit ratio.
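
The catch-up behavior described here can be sketched as a small simulation. This is an illustration under assumptions, not Electric's client code: `CachedResponse`, `catch_up`, and the offset strings are hypothetical, and the proxy cache is modeled as a plain dict keyed by offset. Cache hits are treated as instantaneous, so `now` stays fixed while the client walks forward.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_AGE_S = 5  # max-age from this PR


@dataclass
class CachedResponse:
    next_offset: str  # offset the client should request next
    stored_at: float  # time (seconds) the proxy stored the response


def catch_up(cache: dict[str, CachedResponse], offset: str, now: float) -> tuple[str, int]:
    """Follow fresh cached live responses until an offset has none.

    Returns the offset at which the client would join the in-flight
    (collapsed) long-poll, and how many cache hits it took to get there.
    """
    hits = 0
    while True:
        entry = cache.get(offset)
        if entry is None or now - entry.stored_at >= MAX_AGE_S:
            return offset, hits
        offset = entry.next_offset
        hits += 1


# A client that missed two quick updates gets two fast cache hits,
# then joins the current long-poll at offset "12".
cache = {
    "10": CachedResponse(next_offset="11", stored_at=100.0),
    "11": CachedResponse(next_offset="12", stored_at=101.5),
}
print(catch_up(cache, "10", now=102.0))
```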

@KyleAMathews KyleAMathews merged commit f18fa57 into main Sep 10, 2024
@KyleAMathews KyleAMathews deleted the live-cache branch September 10, 2024 13:35
KyleAMathews commented

> If I understand correctly the 5s TTL starts when the response is returned (either due to a message or the timeout), and so if it was possible to have a 1ms TTL that would also allow all connections within the 20s long poll to be collapsed.

I'm a bit fuzzy on this, but I think Fastly, et al. have heuristics around request collapsing, so if they've seen similar URLs with very short caches, they might quickly stop collapsing requests. So e.g. a 1s cache might lead to later requests not being collapsed. Maybe. We could probably do more testing to figure out exactly how this works, but it doesn't seem to matter too much as long as the value satisfies the two considerations I outlined.

KyleAMathews added a commit that referenced this pull request Sep 10, 2024

thruflo commented Sep 10, 2024

> No because you'd get a cache hit for the initial response and then the client would immediately join the new gang at the new offset

I see -- so the idea is that the live response is triggered when there is data, which means it always contains a new offset, so the client then makes a new request with the new offset.

Just to sanity check in case it's a bug that breaks this assumption, do we only set a cache header on a non-empty live request? What happens in the event of a 20s timeout response?

KyleAMathews commented

> Just to sanity check in case it's a bug that breaks this assumption, do we only set a cache header on a non-empty live request? What happens in the event of a 20s timeout response?

We are but... it's already expired by then 😆 so it doesn't have any effect (other than to tell the CDN that they should keep collapsing requests to the origin)

KyleAMathews added a commit that referenced this pull request Sep 10, 2024

thruflo commented Sep 10, 2024

> We are but... it's already expired by then 😆

Trying to twist my brain around that statement. Also reading https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control#max-age and https://www.fastly.com/documentation/reference/http/http-headers/Age

Is it that the response has an age added because the CDN monitored the time it took to generate the response? My naive mental model is that the 20s timeout response is our business: the origin generates a response with a max-age, and the CDN/client gets it with an age of 0 and thus caches it for 5 seconds.

(On a separate note, I trust you that it's not but if an empty live response were to be cached it would also trigger zero delay polling between the client and the CDN.)

KyleAMathews commented

Yeah, my mental model is that the max-age clock starts when the request is received at the CDN, not when the response finishes. That makes sense, as the origin might take 500ms to transmit the data, but the staleness clock starts ticking as soon as the origin starts transmitting the data, not when it happens to finish.

Also, nothing broke dramatically, i.e. clients didn't start furiously looping on the same cached response. I'll be doing more testing today, so we'll see.
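
The two mental models in this exchange differ only in when the freshness clock starts. The sketch below compares them for a long-poll that times out after 20 seconds. The function name and the `count_from` parameter are hypothetical, and it makes no claim about which real proxy or CDN follows which convention.

```python
MAX_AGE_S = 5  # max-age from this PR


def is_fresh_on_repoll(request_start: float, response_end: float,
                       repoll_at: float, count_from: str) -> bool:
    """Would a proxy still serve the previous live response on re-poll?

    `count_from` selects the convention being modeled:
      "start": age is counted from when the proxy received the request
      "end":   age is counted from when the response completed
    All times are in seconds on the same clock.
    """
    if count_from == "start":
        clock_start = request_start
    elif count_from == "end":
        clock_start = response_end
    else:
        raise ValueError("count_from must be 'start' or 'end'")
    return repoll_at - clock_start < MAX_AGE_S


# Long-poll starts at t=0, times out at t=20, client re-polls 50ms later.
print(is_fresh_on_repoll(0.0, 20.0, 20.05, "start"))  # already expired
print(is_fresh_on_repoll(0.0, 20.0, 20.05, "end"))    # still fresh
```

Under the "end" convention the re-poll hits the cached timeout response, which is the zero-delay polling scenario raised earlier in the thread.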

KyleAMathews added a commit that referenced this pull request Oct 10, 2024
… to use for cache-busting (#1826)

This PR is a fix for inconsistencies in caching in http proxying while
clients are long-polling. It also adds `public` to our `cache-control`
header as that's required by some http proxies in order to cache.

HTTP Proxies don't treat the max-age in cache-control exactly the same
way. Some start counting the age of the cache from the *beginning* of
the request while others count from the *end* of the request.

This inconsistency makes it difficult to reliably control caching and
request collapsing behavior for long-polling requests.

My previous PR in this area
#1656 made request
collapsing work nicely with proxies with the first behavior as they'd
collapse all requests within the time from the start of a long-poll and
the end of the max-age. And when the client went to request again after
the long-poll had ended, the previous request cache had expired already
so a new request would get sent to the origin.

However, this approach caused issues with proxies with the second
behavior as request collapsing would work but when the client re-polled,
the cache hadn't yet expired so the client would go into an infinite
loop requesting the same cached response over and over.

So this PR adds a `cursor` generated by the server that clients use as
part of `live` requests. This skips past any caches from the previous live
request (which, on proxies with the first behavior, would have expired
already).

The cursor is generated by finding the next alignment boundary. I.e. if
the timeout is 20 seconds (which it is now but this could change) then
we calculate the alignment boundary by taking the current unix timestamp
and subtracting the Electric Epoch of October 9th, 2024, then dividing by
20, rounding up, and multiplying by 20 again.

In practice this partitions caches for live requests for a given offset
into 20 second windows.

---------

Co-authored-by: Stefanos Mousafeiris <msfstef@gmail.com>
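
The cursor arithmetic in the commit message above can be sketched as follows. This is an illustration rather than the server's implementation: `next_cursor` is a hypothetical name, and the epoch is assumed to be midnight UTC on October 9th, 2024, since the commit message gives only the date.

```python
import math
from datetime import datetime, timezone

# Assumption: midnight UTC on the stated date; the exact instant is not
# given in the commit message.
ELECTRIC_EPOCH_S = int(datetime(2024, 10, 9, tzinfo=timezone.utc).timestamp())
LONG_POLL_TIMEOUT_S = 20  # current timeout per the commit message


def next_cursor(now_unix_s: float, timeout_s: int = LONG_POLL_TIMEOUT_S) -> int:
    """Return the next alignment boundary, in seconds since the epoch above.

    Subtract the epoch from the current unix time, divide by the timeout,
    round up, and multiply by the timeout again.
    """
    elapsed = now_unix_s - ELECTRIC_EPOCH_S
    return math.ceil(elapsed / timeout_s) * timeout_s


# Requests made 1s and 19s into a window share a cursor; 21s lands in the next.
for seconds_in in (1, 19, 21):
    print(seconds_in, next_cursor(ELECTRIC_EPOCH_S + seconds_in))
```

Every request inside the same 20-second window gets the same cursor, so those requests still share a cache key and can be collapsed, while a re-poll in a later window gets a different key.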
KyleAMathews added a commit that referenced this pull request Nov 1, 2024
…ollapsing works (#1656)

KyleAMathews added a commit that referenced this pull request Nov 1, 2024
KyleAMathews added a commit that referenced this pull request Nov 1, 2024
… to use for cache-busting (#1826)
