Sync between devices or cloud: approaches? #136

Closed
joeblew99 opened this issue Sep 5, 2019 · 10 comments

Comments

@joeblew99

A client-side database raises the prospect of a user having many devices (web, desktop, mobile), and the question of how best to approach data synchronisation between them.

Event Log
One way is to maintain an Event Log of local mutations on each client.
Each entry contains the mutation as JSON.
The Event Log entries are then sent to a cloud server, stored against the device ID, and deleted on the client.
On the cloud server the events are NEVER deleted, because a newly added device needs to fetch the events and replay them.
Your other devices then receive the events via an SSE push and update their local DB.
BAD

  • The event entries must be versioned, because one device's software could be at a different version from another's. This can be mitigated by forcing a software update before pulling in the event log, but a user who does not update then cannot receive event log updates, which is not good.
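
The flow above can be sketched in plain Dart. This is only an illustration under stated assumptions: `Event`, `SyncServer` and `ClientOutbox` are hypothetical names, the server is an in-memory stand-in, and the SSE push to other devices is left out.

```dart
// Hypothetical names throughout; nothing here is an existing library API.

/// One local mutation, stamped with the schema version that produced it.
class Event {
  Event(this.deviceId, this.schemaVersion, this.mutation);

  final String deviceId;
  final int schemaVersion;
  final Map<String, Object?> mutation; // the JSON of the mutation
}

/// In-memory stand-in for the cloud server: events are appended, never deleted.
class SyncServer {
  final List<Event> _log = [];

  void upload(List<Event> events) => _log.addAll(events);

  /// Events produced by other devices, e.g. for replay on a new device.
  List<Event> eventsFor(String deviceId) =>
      _log.where((e) => e.deviceId != deviceId).toList();
}

/// Client-side log of mutations that have not reached the server yet.
class ClientOutbox {
  ClientOutbox(this.deviceId, this.schemaVersion);

  final String deviceId;
  final int schemaVersion;
  final List<Event> _pending = [];

  int get pendingCount => _pending.length;

  void record(Map<String, Object?> mutation) =>
      _pending.add(Event(deviceId, schemaVersion, mutation));

  /// Sends the pending events, then deletes them locally.
  void flush(SyncServer server) {
    server.upload(List.of(_pending));
    _pending.clear();
  }
}
```

Because every event carries its `schemaVersion`, a receiving device could at least detect entries written by a newer version before trying to apply them.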

Etcd could be a very nice and simple option for the Cloud server.
There is an SSE system for server and Flutter that uses this.
https://github.com/wallforfry/dart_mercure
https://github.com/dunglas/mercure

This is just one way to skin this cat, of course.
Very curious to see what others think.


It's important to note that this is NOT trying to do transactions between different users.
That's a whole different area.

@joeblew99
Author

What if a user is on an old version and makes mutations offline?

When they go online, the events (stamped with a version number) go back to the sync server.
But their other devices are a version ahead, so when one of them receives the event and tries to update its local DB, it fails because the schema is different.

Maybe Protobufs will help?
Protocol buffers might help because the fields are numbered: if a client receives protobuf data containing a field that its protobuf type does not define, it just ignores that field and can still update the database cleanly. However, when it upgrades to the next version, which does have that Protobuf type and table, it will still hold that record without the field from before.
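
A rough sketch of that "ignore what you don't know" behaviour, using plain Dart maps rather than real protobuf messages. `DecodedRow` and `decodeTolerantly` are hypothetical helpers, and the set of known columns is assumed to come from the client's current schema.

```dart
/// Result of decoding a payload against the columns this client knows.
class DecodedRow {
  DecodedRow(this.row, this.dropped);

  final Map<String, Object?> row; // fields that can be written locally
  final Set<String> dropped; // fields this client version does not know
}

/// Keeps only the known fields and reports which ones were ignored.
DecodedRow decodeTolerantly(
    Map<String, Object?> payload, Set<String> knownColumns) {
  final row = <String, Object?>{};
  final dropped = <String>{};
  payload.forEach((key, value) {
    if (knownColumns.contains(key)) {
      row[key] = value;
    } else {
      dropped.add(key);
    }
  });
  return DecodedRow(row, dropped);
}
```

One possible use of `dropped`: a client could mark such rows as incomplete and fetch them again after it upgrades, which would address the missing-field problem described above. That is a design assumption, not something protobuf provides by itself.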

Time:
I am discounting the problem of time differences between devices because a user is assumed to only be using one device at a time, and so the system should never get a clash.

Changing the same record on many devices:
In this case we have no choice but to use the last-in-wins pattern.
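
A minimal sketch of last-in-wins, assuming each record carries an `updatedAt` timestamp set by the device that wrote it. `VersionedRecord` and `lastWriteWins` are hypothetical names.

```dart
/// A record plus the time of its last modification.
class VersionedRecord {
  VersionedRecord(this.id, this.value, this.updatedAt);

  final String id;
  final String value;
  final DateTime updatedAt;
}

/// Returns whichever copy was written last; on a tie the local copy is kept.
VersionedRecord lastWriteWins(VersionedRecord local, VersionedRecord remote) =>
    remote.updatedAt.isAfter(local.updatedAt) ? remote : local;
```

Note that this compares timestamps produced on different devices, so it leans on the one-device-at-a-time assumption made in the "Time" paragraph above.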

@simolus3
Owner

simolus3 commented Sep 7, 2019

Some thoughts I have on this:

On the cloud Server the Events are NEVER deleted. This is because when you add a new device you need to get the Events and replay them to it.

I've seen event logs being used for cross-device synchronization. But storing the entire history on the server and replaying it on new devices can get very expensive for long-time or very active users, who can easily have hundreds of thousands of events. Also, some events of the past might not be relevant anymore: Say a user does something like

  1. create a file
  2. make a bunch (or a lot) of edits in that file
  3. delete that file

If another client missed all three steps (which could contain hundreds of events), that's not a problem at all because all these events, when combined, have no effect on any state. I think the most common way event logs are used is that the server does not store any events, ever (or at least it doesn't expose them). Instead, it only stores the current state. When a client connects, it sends all the local edits to the server, which takes care of updating the state. The client can then grab the fresh state snapshot from the server. This can solve some problems on how clients are supposed to deal with outdated event logs / protocol changes.
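
To illustrate why a create/edit/delete sequence can be skipped entirely, here is a small sketch that folds file events into a snapshot. `FileEvent`, `FileEventType` and `applyEvents` are hypothetical, and the snapshot is simply a map from path to content.

```dart
enum FileEventType { create, edit, delete }

class FileEvent {
  FileEvent(this.type, this.path, [this.content = '']);

  final FileEventType type;
  final String path;
  final String content;
}

/// Folds events into a new snapshot (path -> content) without mutating the old one.
Map<String, String> applyEvents(
    Map<String, String> snapshot, List<FileEvent> events) {
  final next = Map<String, String>.of(snapshot);
  for (final event in events) {
    switch (event.type) {
      case FileEventType.create:
      case FileEventType.edit:
        next[event.path] = event.content;
        break;
      case FileEventType.delete:
        next.remove(event.path);
        break;
    }
  }
  return next;
}
```

Applying a create, any number of edits, and a delete for the same path yields a snapshot equal to the starting one, so a client that only ever downloads snapshots loses nothing by missing those events.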

Of course, different approaches will work to a different degree based on the actual use case. In a chat application, it might be desirable to always have all messages sync across all devices, even very old ones, so it makes more sense to actually store all event logs. In a notes app, being aware of older snapshots might be less important, which can justify not storing the entire event log.

@joeblew99
Author

Thanks for the discussion.

During sync, what do you imagine a snapshot is? You said that the client sends all its events, which I presume are applied in a transaction against mysql, and mysql then gives the client a snapshot.

Really appreciate this discussion as I am playing with the FFI stuff and the mysql FFI stuff. It's quite a compelling proposition in terms of reducing complexity in the stack

@simolus3
Owner

simolus3 commented Oct 18, 2019

During sync what do you imagine is a snapshot

I would imagine some redux-like model where the client sends all their actions (or events) to the server. The server can then fold them into state (e.g. f(oldState, event) = newState). The server could probably get away with only keeping the current state (or snapshot) in most cases.

For example, let's imagine a scenario where we're managing users, and we'd like to store their name. In that case, the current name would be a snapshot (which I assume the server could make available via some GET /user/name call). So handling name changes could just look like

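// Server-side logic: fold the client's events into the stored name.
Future<void> handleClientSync(String clientId, List<Event> eventsFromClient) async {
  final currentName = await serverDb.loadName(clientId);
  // The last event's name wins; with no events, the current name is kept.
  final nextName = eventsFromClient.fold<String>(currentName, (name, event) => event.name);
  await serverDb.changeName(clientId, nextName); // store the folded result, not the old name
}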

So the server doesn't store the events in its database, only the current name, which we can call a snapshot.

@joeblew99
Author

joeblew99 commented Oct 19, 2019 via email

@simolus3
Owner

Yeah, it would probably be harder to implement a proper conflict resolution algorithm if the server only has the current state. But solving those conflicts is always tough and highly dependent on the actual problem domain.
Even Google Docs sometimes asks you to just pick a version when it can't merge changes together, and it's very good at sync in general. So while conflict resolution in offline sync is a very challenging problem to solve, in most cases it's "good enough" to just naively apply those changes that the server receives first. If there's a conflict, it might be acceptable to just reject the data and ask the user to re-do their changes.
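
As a sketch of "apply what arrives first, reject the rest", the server could keep a version counter next to the state and refuse changes based on a stale version. `NameStore` and `SyncResult` are hypothetical, and the version counter is an assumption that goes beyond what is described above.

```dart
/// Outcome of a sync attempt, including the version the server is now at.
class SyncResult {
  SyncResult(this.accepted, this.serverVersion);

  final bool accepted;
  final int serverVersion;
}

/// In-memory stand-in for the server state: the current name plus a version.
class NameStore {
  String name = '';
  int version = 0;

  /// Applies the change only if the client based it on the current version.
  SyncResult tryUpdate(String newName, int baseVersion) {
    if (baseVersion != version) {
      // Conflict: another change arrived first, so the client must re-sync.
      return SyncResult(false, version);
    }
    name = newName;
    version++;
    return SyncResult(true, version);
  }
}
```

A rejected client would download the fresh snapshot and ask the user to redo the change against it.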

@listepo

listepo commented Mar 22, 2021

@simolus3 any plans to add sync like https://nozbe.github.io/WatermelonDB/Advanced/Sync.html ?

@simolus3
Owner

No, it would be much more complex since sqlite3 doesn't really have a synchronization protocol.

@davidmartos96
Contributor

davidmartos96 commented May 14, 2021

@simolus3

No, it would be much more complex since sqlite3 doesn't really have a synchronization protocol.

What do you mean by that? Doesn't WatermelonDB use sqlite under the hood?
Looking at the details of their sync implementation (https://nozbe.github.io/WatermelonDB/Implementation/SyncImpl.html#sync-procedure) I'd say all steps would be doable with either moor or sqflite primitives.
The only part I'm not that sure about is the write-only locks, but maybe it could be done with WAL mode or "BEGIN IMMEDIATE" transactions.

I don't know how good the general sync solution of WatermelonDB is, as I don't have prior experience with it, but it looks popular. Have you used it before @listepo ?

@CodingSoot

Looking at the details of their sync implementation (https://nozbe.github.io/WatermelonDB/Implementation/SyncImpl.html#sync-procedure) I'd say all steps would be doable with either moor or sqflite primitives.

I think drift is one of the most suitable Dart packages to include a sync mechanism like the one WatermelonDB provides. It doesn't have to suit all use cases, only be good enough for the most common ones. WatermelonDB did an amazing job at that.
