After shape.synced resolves, there is some delay before data available in local DB #1365

Gobbo89 · 2024-06-13T23:15:42Z

Hello, I was talking with @balegas about this potential issue a few weeks ago.
I've noticed this since electric-sql v0.9.x.
I personally do not think it's the intended behavior, it looks more like a bug or at least something that could be improved.

As the title says, it looks like after shape.synced resolves, there is still some delay before the data is actually available in the local DB.
This happens with an already initialized client DB. While the client is not running, you add some data in PG. Then you start the client.

The following code is just a shortened version of the problem; on its own it might not be enough to reproduce it.

const shape = await db.items.sync()
await shape.synced

const syncedItems = await db.items.findMany()
console.log('Synced items', syncedItems) // Gives me []

// However, if I wait an arbitrary amount of time...
setTimeout(async () => {
  const syncedItems = await db.items.findMany()
  console.log(`Synced items (timeout)`, syncedItems) // Gives me ['...', (...)]
}, 15)

Because it might be a bit tricky to understand the conditions under which this occurs, I have created a repo to reproduce this consistently, along with some log screenshots.

Thank you

linear · 2024-06-13T23:15:45Z

VAX-1963 After shape.synced resolves, there is some delay before data available in local DB

msfstef · 2024-06-17T15:48:52Z

@Gobbo89 the first sync that you perform on a client establishes the shape subscription, and receives the "initial sync data" to get the database caught up to the point where the subscription was established.

Shape subscriptions are persisted to the database, so the next time you run your client and subscribe to the same shape, the sync promise resolves immediately because the subscription is persisted, and basically replication resumes from the point it was left at. This means that shape.synced immediately resolves, your query returns the data already available in the database, and because replication has resumed it very shortly applies the more recent changes from upstream, matching the behaviour that you observed.

There is no concept of being fully "synced" or "caught up" with the database, as changes are continuously streamed to the client with no well-defined end to them. That said, I understand that the API can be confusing: it has a promise called synced which, when first establishing a shape subscription, resolves once the initial batch of data is delivered.

Would a name like syncedOnce perhaps clarify the observed behaviour? Also, it should be possible to create a separate API to wait for the replication to "catch up" to the changes available at the point of starting the replication - would that be closer to what you were expecting out of the synced promise?
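
A toy model can make the two cases above concrete. The sketch below does not use the Electric client: `ToyServer`, `ToyClient` and the 10 ms replication delay are all made up for illustration. It only mirrors the behaviour described in this comment, assuming that the first `sync()` loads the initial data before resolving, while a resumed subscription resolves straight away and applies newer changes shortly afterwards.

```typescript
// Toy simulation only. None of these names come from the Electric client.
type Row = { id: number; value: string }

class ToyServer {
  rows: Row[] = []
}

class ToyClient {
  private server: ToyServer
  private localRows: Row[] = []
  // Stands in for the subscription that the real client persists to its DB.
  private subscribed = false

  constructor(server: ToyServer) {
    this.server = server
  }

  // The returned promise stands in for `shape.synced`.
  sync(): Promise<void> {
    if (!this.subscribed) {
      // First subscription: the initial data is loaded before resolving.
      this.subscribed = true
      this.localRows = [...this.server.rows]
      return Promise.resolve()
    }
    // Resumed subscription: resolve immediately. Replication picks up where
    // it left off and applies newer changes a little later.
    setTimeout(() => {
      this.localRows = [...this.server.rows]
    }, 10)
    return Promise.resolve()
  }

  findMany(): Row[] {
    return [...this.localRows]
  }
}

async function demo(): Promise<{ first: number; resumed: number; later: number }> {
  const server = new ToyServer()
  server.rows.push({ id: 1, value: 'a' })
  const client = new ToyClient(server)

  await client.sync()
  const first = client.findMany().length // initial data is already there

  // A row is added upstream while the client is "offline".
  server.rows.push({ id: 2, value: 'b' })

  await client.sync()
  const resumed = client.findMany().length // new row not applied yet

  await new Promise<void>((resolve) => setTimeout(resolve, 50))
  const later = client.findMany().length // new row has arrived by now

  return { first, resumed, later }
}
```

Calling `demo()` reproduces the pattern from the original report: the query right after the second `sync()` misses the new row, and the same query a few milliseconds later sees it.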

Gobbo89 · 2024-06-17T18:47:28Z

Hey @msfstef thanks for the clarification.

"This means that shape.synced immediately resolves, your query returns the data already available in the database"

What led me to think otherwise is this part of the docs:
"the second synced promise resolves when the data in the shape has fully synced onto the local device"
That is indeed true only if we are talking about the very first sync between a client and the sync service, but it's very misleading in the other cases, since the synced promise resolves immediately, like you said.

Put that together with point no. 1 stating "the first sync() promise resolves when the shape subscription has been confirmed by the server" and it sounds like the second promise always resolves after the server data (which we are aware of, after no. 1) is loaded in the DB on the local device.

"Would a name like syncedOnce perhaps clarify the observed behaviour?"

I don't know. You risk going down the rabbit hole of trying to make variable and function names as descriptive as possible, when in this case it's more complicated than that, and you can't put that logic in just a name.
At the end of the day, in some cases that promise does resolve when the data is synced, just not always.
Maybe simply adjusting that point no.2 in the docs is enough. For example, I find the comment in the code sample much safer:

// Resolves once the initial data load
// for the shape is complete.

Elaborating on that could help avoid this type of confusion.

"Also, it should be possible to create a separate API to wait for the replication to 'catch up' to the changes available at the point of starting the replication - would that be closer to what you were expecting out of the synced promise?"

Oh, so you'd rename synced to syncedOnce and introduce a new, separate synced that actually waits for the "catch up"?
Like I've said above, I am not sure about function names, but for sure having a separate API that does that would be really helpful for me! As long as it's well documented I believe it'd be a win for other users too.

icehaunter · 2024-06-18T08:37:21Z

Hey @Gobbo89, thanks a lot for the extensive response!

If you don't mind, can you tell us about how the "caught up" API would be useful for you, given it won't guarantee that no changes are pending propagation from the server (since it's a distributed system there is no "stable" state)?

We can make an API that tells you that you're "caught up" to whatever the state of the server was when you connected, and I see how that can be useful, but I'd love a user's perspective on this so that we expose this feature in a useful way. Another limitation to note is that this "caught up" promise will have to be resolved at the same time for all already-established subscriptions upon reconnection: since the replication stream is shared and any transaction may contain changes relevant to any number of shapes, "caught up" is about the client's position in the replication stream, not that of any particular subscription.

Gobbo89 · 2024-06-18T09:23:09Z

Hey @icehaunter,

In my scenario I have an Electric Sync client running in a Node.js server.
That's an edge server, local to my shop, meant to be the source of truth for its clients.
Please note that those clients will not use Electric to sync with the edge server, that is done in another way.

Now, the cloud PG DB can receive create/update/delete operations by other sources, that will be active even while the shop is shut down and offline.
If I start up my local Node.js server after a couple of days of downtime (let's say the weekend?), it might take a good while to "catch up" with the state of PG tables.
Therefore, I would like said Node.js server to accept requests from its clients only after it "completely" caught up with PG.

Of course "completely" is a relative concept here: we are talking about that specific instant, because the next instant other changes might be pushed to PG, which would then need to be synced with the local DB. However, you would assume that the total size of changes accumulated during an N-day downtime will always be much bigger than instant-to-instant changes.

Those small (in terms of size) changes do not need special handling, while from my POV it makes sense to handle the possibly massive delta on Node.js server startup in a more controlled way.

Hope this helps you understand my perspective.
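
The startup behaviour described above could be sketched roughly as follows. This is only an illustration under assumptions: `caughtUp` is a hypothetical promise, since no such API exists at this point in the thread, and `createGatedHandler` and `GateResponse` are names invented for the example. The idea is simply that the edge server answers 503 until the catch-up promise resolves, and serves requests normally afterwards.

```typescript
// Hypothetical sketch: `caughtUp` is NOT an existing Electric API.
type GateResponse = { status: number; body: string }

function createGatedHandler(
  caughtUp: Promise<void>,
  handle: (path: string) => GateResponse
): (path: string) => GateResponse {
  let ready = false

  // Flip the gate once the (assumed) catch-up promise resolves.
  // If it rejects, the gate stays closed and the rejection is left
  // for the caller to handle.
  caughtUp.then(() => {
    ready = true
  })

  return (path) =>
    ready
      ? handle(path)
      : { status: 503, body: 'Still catching up with upstream, retry later' }
}
```

Later, smaller changes would keep flowing through normal replication without closing the gate again, which matches the distinction between the large startup delta and instant-to-instant changes.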

"Another limitation to note is this 'caught up' promise will have to be resolved at the same time for all already-established subscriptions upon reconnection"

"the 'caught up' is about the client's position in the replication stream, not of any particular subscriptions"

Yes of course, that's clear to me.

balegas · 2024-06-20T07:54:57Z

It has been on our radar that we need some way of catching up with the server when reconnecting. A well-known anomaly of not handling the pending log of operations is that the waterfall of updates we receive after reconnection triggers reactivity, and we see the app state advancing in fast-forward mode.

It's true that being up-to-date is a moving target and therefore we would not provide the guarantee that we caught up with the tip of the server, but we definitely can let the client know that 'some point' has been reached. Many strategies could be used: reaching a certain LSN; associating LSNs with time; control messages when there are no more changes for a user.
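
As a rough illustration of the first strategy (reaching a certain LSN), here is a minimal sketch. Everything in it is an assumption made for the example: `CatchUpTracker`, `waitFor` and `onApplied` are invented names, and LSNs are simplified to plain increasing numbers rather than Postgres's 64-bit positions. A single tracker follows the client's position in the shared replication stream, so every waiter with the same target is released together, regardless of which shapes it cares about.

```typescript
// Hypothetical sketch, not part of any existing client API.
// LSNs are modelled as plain increasing numbers for simplicity.
class CatchUpTracker {
  private applied = 0
  private waiters: Array<{ target: number; resolve: () => void }> = []

  // Resolves once every change up to `targetLsn` has been applied locally.
  // `targetLsn` would be the server position observed when connecting.
  waitFor(targetLsn: number): Promise<void> {
    if (this.applied >= targetLsn) return Promise.resolve()
    return new Promise<void>((resolve) => {
      this.waiters.push({ target: targetLsn, resolve })
    })
  }

  // To be called by the (assumed) replication loop after each transaction
  // has been applied to the local database.
  onApplied(lsn: number): void {
    this.applied = Math.max(this.applied, lsn)
    const reached = this.waiters.filter((w) => w.target <= this.applied)
    this.waiters = this.waiters.filter((w) => w.target > this.applied)
    reached.forEach((w) => w.resolve())
  }
}
```

Reaching the target only says that the client has passed the point the server was at on connection. Changes committed after that point are still pending, which is the moving-target caveat mentioned above.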

I think we should open an issue to track this internally, but there might be one already :).

msfstef · 2024-06-20T09:11:54Z

Updated the docs and added explanatory docstrings on the sync API and synced promise. Closing this issue. Thank you for raising this, as it can indeed be confusing!

msfstef self-assigned this Jun 17, 2024

msfstef added the Improvement label Jun 18, 2024

thruflo mentioned this issue Jun 18, 2024

docs: clarify when await shape.synced resolves. #1380

Merged

msfstef linked a pull request Jun 20, 2024 that will close this issue

docs: clarify when await shape.synced resolves. #1380

Merged

thruflo added a commit that referenced this issue Jun 20, 2024

docs: clarify when await shape.synced resolves. (#1380)

a94e860

Addresses some of the suggestions in #1365. --------- Co-authored-by: msfstef <msfstef@gmail.com>

thruflo closed this as completed in #1380 Jun 20, 2024
