fix: during live requests, the server returns a cursor for the client to use for cache-busting by KyleAMathews · Pull Request #1826 · electric-sql/electric

KyleAMathews · 2024-10-09T20:01:07Z

This PR fixes inconsistencies in how HTTP proxies cache responses while clients are long-polling. It also adds `public` to our `cache-control` header, as some HTTP proxies require it before they will cache a response.

HTTP Proxies don't treat the max-age in cache-control exactly the same way. Some start counting the age of the cache from the beginning of the request while others count from the end of the request.

This inconsistency makes it difficult to reliably control caching and request collapsing behavior for long-polling requests.

My previous PR in this area #1656 made request collapsing work nicely with proxies with the first behavior as they'd collapse all requests within the time from the start of a long-poll and the end of the max-age. And when the client went to request again after the long-poll had ended, the previous request cache had expired already so a new request would get sent to the origin.

However, this approach caused issues with proxies with the second behavior as request collapsing would work but when the client re-polled, the cache hadn't yet expired so the client would go into an infinite loop requesting the same cached response over and over.
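To make the difference concrete, here is a minimal sketch of the two behaviors. It is a toy model, not taken from any real proxy: the `AgeOrigin` type, the `isFresh` helper and the 5 second max-age are all assumptions chosen for illustration.

```typescript
// Hypothetical model of the two proxy behaviors described above.
type AgeOrigin = "request-start" | "request-end";

interface CachedResponse {
  requestStart: number; // seconds: when the proxy forwarded the request
  requestEnd: number; // seconds: when the origin finished responding
  maxAge: number; // seconds: from the cache-control header
}

// A cached entry is fresh while its age, counted from the proxy's chosen
// starting point, is below max-age.
function isFresh(entry: CachedResponse, origin: AgeOrigin, now: number): boolean {
  const countedFrom = origin === "request-start" ? entry.requestStart : entry.requestEnd;
  return now - countedFrom < entry.maxAge;
}

// A 20s long-poll that timed out, cached with an assumed max-age of 5s.
const entry: CachedResponse = { requestStart: 0, requestEnd: 20, maxAge: 5 };
const repollAt = 20.1; // the client re-polls right after the long-poll ends

// First behavior: the entry expired during the long-poll, so the re-poll reaches the origin.
const servedFromCacheA = isFresh(entry, "request-start", repollAt);
// Second behavior: the entry is still fresh, so the client gets the same cached response again.
const servedFromCacheB = isFresh(entry, "request-end", repollAt);
```

Under these assumptions `servedFromCacheA` is `false` and `servedFromCacheB` is `true`, which is the loop described above.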

So this PR adds a `cursor` generated by the server that clients use as part of `live` requests. This skips past any caches from the previous live request (which, on proxies with the first behavior, would have expired already).

The cursor is generated by finding the next alignment boundary. For example, if the timeout is 20 seconds (which it is now, but this could change), we calculate the alignment boundary by taking the current Unix timestamp, subtracting the Electric Epoch of October 9th, 2024, dividing by 20, rounding up, and then multiplying by 20 again.

In practice this partitions caches for live requests for a given offset into 20 second windows.
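The calculation can be sketched in TypeScript as follows. This mirrors the description above rather than the actual server code (which is Elixir); the names `ELECTRIC_EPOCH_MS` and `nextLiveCursor` are made up for this example, and midnight UTC is assumed for the epoch, as in the Elixir snippet in the review.

```typescript
// Electric Epoch: October 9th, 2024, assumed to be midnight UTC.
// Date.UTC months are 0-indexed, so 9 means October.
const ELECTRIC_EPOCH_MS = Date.UTC(2024, 9, 9);

// Returns the next alignment boundary, in seconds since the Electric Epoch.
// The interval defaults to the 20 second long-poll timeout mentioned above.
function nextLiveCursor(nowMs: number, longPollTimeoutSeconds: number = 20): number {
  // Whole seconds elapsed since the epoch (truncated, like a :second diff).
  const secondsSinceEpoch = Math.trunc((nowMs - ELECTRIC_EPOCH_MS) / 1000);
  // Divide by the interval, round up, multiply back.
  return Math.ceil(secondsSinceEpoch / longPollTimeoutSeconds) * longPollTimeoutSeconds;
}

// Two requests arriving 5s and 19s after the epoch land in the same window,
// while one arriving at 21s lands in the next.
const cursorAt5s = nextLiveCursor(ELECTRIC_EPOCH_MS + 5000);
const cursorAt19s = nextLiveCursor(ELECTRIC_EPOCH_MS + 19000);
const cursorAt21s = nextLiveCursor(ELECTRIC_EPOCH_MS + 21000);
```

With a 20 second interval, `cursorAt5s` and `cursorAt19s` are both `20` and `cursorAt21s` is `40`, so all live requests served within one window hand out the same cursor.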


netlify · 2024-10-09T20:18:36Z

✅ Deploy Preview for electric-next ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | `b63ee9a` |
| 🔍 Latest deploy log | https://app.netlify.com/sites/electric-next/deploys/6707c8e7287e6e00087d44c1 |
| 😎 Deploy Preview | https://deploy-preview-1826--electric-next.netlify.app |

To edit notification comments on pull requests, go to your Netlify site configuration.

balegas

Okay, this will work. Now we can increase the max-age if we want, right? More collapsing, more data at the edge.
It also helps with nextjs caching behaviour, which I was sidestepping by adding a random to the URL... sounds familiar.

I've approved, but might be useful if team has a look at the Elixir before merging.

KyleAMathews · 2024-10-09T21:13:45Z

> Now we can increase the max-age if we want, right

Yup — we could make it configurable. Probably also in practice we'll need to let clients set the long-polling seconds as well as there's some proxies that have pretty short timeouts. But the default could be much higher.

msfstef

What an annoying issue from lack of standardisation! I like having a custom cache cursor though for the live requests, relatively cheap to implement and maintain and gives us better control.

I've left some comments and questions for clarification

msfstef · 2024-10-10T12:46:15Z

  >()

  #lastOffset: Offset
+  #nextLiveCursor: string // Seconds since our Electric Epoch 😎


nit: slight inconsistency in naming between "live next cursor" and "next live cursor" - maybe we can even name it "live cache buster" or "live cache cursor" to make its purpose explicit in its name?

true — I'll fix

msfstef · 2024-10-10T12:50:07Z

    validateOptions(options)
    this.options = { subscribe: true, ...options }
    this.#lastOffset = this.options.offset ?? `-1`
+    this.#nextLiveCursor = ``


what does this mean for the request collapsing behaviour on the first live request, before a shared cursor is retrieved? as I'm thinking about it I don't think it's anything serious but worth clarifying

yeah the initial request is different so collapsing/caching will be different. I couldn't think of any way around this but it's also not that big of deal as it's just one more request basically getting to Electric.
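For illustration, here is a rough sketch of how a client might thread the cursor through its live requests. The query parameter names (`offset`, `live`, `cursor`) and the `buildLiveQuery` helper are assumptions for this example, as is omitting the parameter while the cursor is still empty; the actual client may differ.

```typescript
// Builds the query string for a live request.
// `liveCursor` is the value returned by the server on the previous live
// response, or an empty string before the first one has arrived.
function buildLiveQuery(offset: string, liveCursor: string): string {
  const params: [string, string][] = [
    ["offset", offset],
    ["live", "true"],
  ];
  // Assumption: no cursor parameter is sent until the server has provided one,
  // so the first live request has a different cache key from later ones.
  if (liveCursor !== "") params.push(["cursor", liveCursor]);
  return params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
}

const firstLiveQuery = buildLiveQuery("0_0", "");
const laterLiveQuery = buildLiveQuery("0_0", "40");
```

Clients that received the same cursor at the same offset build the same query string, so a proxy can still collapse their requests, while the previous window's cached response sits under a different key.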

msfstef · 2024-10-10T12:52:25Z

+      now = DateTime.utc_now()
+
+      diff_in_seconds = DateTime.diff(now, oct9th2024, :second)
+      next_interval = ceil(diff_in_seconds / 20) * 20


this is an arbitrary 20 here even though it's supposed to be related to the long poll timeout - perhaps make this function take an "interval size" as a parameter and use conn.assigns.config[:long_poll_timeout] when it gets called to ensure it stays consistent?

Oh sweet! I didn't know there was a config already for this — I'll switch to it.

msfstef · 2024-10-10T12:53:47Z


+  defmodule TimeUtils do
+    def seconds_since_oct9th_2024_next_interval do
+      oct9th2024 = DateTime.from_naive!(~N[2024-10-09 00:00:00], "Etc/UTC")


you can store this value as a module attribute in this module, like

@oct9th2024 DateTime.from_naive!(~N[2024-10-09 00:00:00], "Etc/UTC")
def seconds_since_oct9th_2024_next_interval do ...

so it gets calculated once and reused rather than parsing the date on every call

ok good idea as this will be called a ton

Co-authored-by: Stefanos Mousafeiris <msfstef@gmail.com>


Fixes #2589

~We introduced the concept of a cursor in #1826 in order to avoid infinite loops of clients running into cached live responses by artificially "moving the cache forward" via the time based, coordinated cache buster.~

~However we should not need it, as our cache buster is already the `offset` parameter. We were running into this issue because we were caching live responses that do not move the cache forward, i.e. empty live responses, so clients would continuously hit the same cache over and over again.~

~Even with the "cursor" fix we still run into this issue but in a different way - empty live responses create a "chain" rather than a loop of cache hits, that can be arbitrarily long as we allow these cached live responses to be revalidated as well. This means someone who is at the tip of the log might end up following a huge chain of responses with no changes in them, and each of those requests made might trigger a separate revalidation request to the origin.~

~Request collapsing on cache misses works regardless of the cache policy you set on a response (since the CDN does not yet know what the caching policy on the response will be). This is [a canonical example](https://developers.cloudflare.com/cache/concepts/revalidation/#example-2) from Cloudflare.~

~Request collapsing on stale cache hits, as discussed [in the other Cloudflare example](https://developers.cloudflare.com/cache/concepts/revalidation/#example-1) will send back only a single revalidation out of the collapsed requests.~

~Therefore to avoid these infinite loops, we can simply _not cache_ live responses with no changes in them, which retains the request collapsing behaviour without creating any infinite loops.~

~For live responses that _do_ contain changes, we set the usual 5 second lifetime + 5 second stale lifetime, so that clients that are slightly behind can catch up using the cache, but the cursor is not needed as these cached live responses move the `offset` forward.~

~If we go ahead with this change we do need to keep the cursor header present as we require it in our official client, although I've ripped out any logic for it since it won't actually be used and it is better to not change it between requests to ensure cache consistency - we can discuss a path to deprecation~

### UPDATE

We just use an etag that is always different for live responses that contain no changes to ensure they never get revalidated.
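A rough sketch of the etag idea from the update, under stated assumptions: the `etagForLiveResponse` helper, its parameters and the way the etag is made unique are all hypothetical and not taken from the server code. The only point illustrated is that an empty live response gets an etag that never repeats, so a conditional revalidation of it cannot match.

```typescript
// Returns the etag for a live response.
// `baseEtag` stands in for whatever the server would normally derive from the
// response, and `nonce` is any value that differs on every response (for
// example a timestamp or counter). Both are assumptions for this sketch.
function etagForLiveResponse(baseEtag: string, hasChanges: boolean, nonce: string): string {
  // Responses with changes keep a stable etag so they can be cached and revalidated.
  if (hasChanges) return `"${baseEtag}"`;
  // Empty responses get a unique etag so a later revalidation never matches them.
  return `"${baseEtag}:${nonce}"`;
}

const emptyEtagOne = etagForLiveResponse("shape-1:0_0", false, "1001");
const emptyEtagTwo = etagForLiveResponse("shape-1:0_0", false, "1002");
const changedEtagOne = etagForLiveResponse("shape-1:0_0", true, "1001");
const changedEtagTwo = etagForLiveResponse("shape-1:0_0", true, "1002");
```

Here the two empty responses get different etags, while the two responses with changes share one.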

KyleAMathews added 5 commits October 9, 2024 13:48

fix: during live requests, the server returns a cursor for the client…

218e466

… to use for cache-busting

Add changeset

12c2b67

Fix formatting

fc04d63

fix tests

a1b2674

Add public a few more places

026fb5a

missing public

e774600

thruflo reviewed Oct 9, 2024


Comment thread website/electric-api.yaml

Update openapi spec

f22b278

balegas approved these changes Oct 9, 2024


Merge remote-tracking branch 'origin/main' into live-cursor

b63ee9a

msfstef reviewed Oct 10, 2024


KyleAMathews and others added 5 commits October 10, 2024 06:11

Update packages/typescript-client/src/constants.ts

66b3182

Co-authored-by: Stefanos Mousafeiris <msfstef@gmail.com>

Update packages/typescript-client/src/client.ts

1585b72

Co-authored-by: Stefanos Mousafeiris <msfstef@gmail.com>

use config value for long poll timeout & cache creation of epoch date

84dcc3c

Improve name in client for cache buster

f030e5b

Fix

1d35059

KyleAMathews merged commit 41845cb into main Oct 10, 2024

KyleAMathews deleted the live-cursor branch October 10, 2024 15:37

thruflo mentioned this pull request Oct 10, 2024

Add Elixir client #1833

Merged

This was referenced Oct 11, 2024

Fix race condition for live-polling cursors #1837

Closed

Fix: prevent the server from returning the same cursor as it received #1856

Merged

KyleAMathews mentioned this pull request Apr 11, 2025

Don't revalidate old live requests #2589

Closed

msfstef mentioned this pull request Apr 14, 2025

feat: Do not cache empty live responses #2593

Merged


4 participants
