crdt usage leading to unintended consequences #29

Closed

@mbrevoort
Contributor

The synchronization of state between clients and server is leading to some unintended duplication and orphaning of data.

  1. When a client disconnects from the server, the server correctly removes the service from the crdt set, but the row is not actually removed from the crdt document and remains in its history. When that client reconnects and syncs with the server, every item in the history (including rows that are no longer part of the services set) fires as a "register" event, so the client's copy believes services that actually went offline are still online. This causes duplicate entries for every stop/start of the client.
  2. When the server is restarted, it gets its state from the client. If the server itself registered any services, these are not properly unregistered, and when they are synced back from the client, queries to the server return duplicate entries for every time the server was stopped/started.
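To make issue 1 concrete, here is a minimal sketch of the replay problem. This is not the actual crdt API; it is a simplified model where set membership is derived from an append-only history, so a removal that only touches the live set (and never lands in the history) reappears on the next full sync:

```javascript
// Simplified model (not the real crdt/seaport internals): each node keeps
// an append-only history; the services set is derived by replaying it.

function makeNode() {
  return { history: [], services: new Set() };
}

function apply(node, op) {
  node.history.push(op);
  if (op.type === 'register') node.services.add(op.id);
  if (op.type === 'free') node.services.delete(op.id);
}

// Server registers a service, then "removes" it only from the live set.
// This models the bug: no corresponding op is written to the history.
const server = makeNode();
apply(server, { type: 'register', id: 'web@5000' });
server.services.delete('web@5000');          // set-only removal

// A reconnecting client syncs by replaying the server's full history,
// resurrecting the service as a ghost entry.
const client = makeNode();
server.history.forEach(op => apply(client, op));

console.log([...client.services]);           // [ 'web@5000' ] — a ghost
```

The fix discussed below amounts to making removal itself an operation in the history (or pruning stale rows), so replays converge on the correct set.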

I've found a way around 1, but I've opened a clarifying issue about how to properly delete from crdt: dominictarr/crdt#21

I don't have a solution for number 2 yet, and I wonder, given the way crdt is being employed here, whether it should be made explicit that the server's version of the state is authoritative. In the past, before seaport was changed to use crdt, this worked very cleanly: the server would restart and all of the clients would reassert themselves.

I'm happy to continue working through this and submit a fix as a pull request, but I need to better understand the intent and agree on an approach. What should happen to the state of things when the server restarts?

Contributor

When a node restarts it's given a different (randomly generated) node id, so any data associated with the previous node id that has been synchronized to other nodes may be orphaned. This is especially likely when the node is a seaport service, or when the node goes down at the same time as a seaport server and the server can't unregister the node's data in time to emit the change to the other nodes.

This results in ghost nodes and stale data, as well as memory leaks.

Contributor

I think I have a reasonable solution for this, but I'll wait to submit the pull request until my crdt pull request is accepted since there is a dependency there.

Here's the approach:

If a client loses contact with the Seaport server and then reconnects, delete all of the crdt rows not owned by that client node.

This prevents the client from polluting the server and other nodes with potentially stale and/or orphaned data. It also, in essence, gives authority back to the Seaport server.
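The reconnect cleanup above can be sketched as follows. The names here (`pruneForeignRows`, the `rows`/`owner` shape) are illustrative assumptions, not seaport's actual internals: on reconnect, the client drops every cached row owned by another node, and the server's stream repopulates them authoritatively.

```javascript
// Hypothetical sketch of the reconnect cleanup: drop every locally
// cached row owned by another node so the server's sync stream can
// repopulate them from its authoritative state.

function pruneForeignRows(doc, ownId) {
  for (const id of Object.keys(doc.rows)) {
    if (doc.rows[id].owner !== ownId) delete doc.rows[id];
  }
}

const doc = {
  rows: {
    a1: { owner: 'node-a', role: 'web@5000' },
    b1: { owner: 'node-b', role: 'db@5432' }, // possibly stale peer row
  },
};

pruneForeignRows(doc, 'node-a');
console.log(Object.keys(doc.rows)); // [ 'a1' ]
```

Rows the client owns survive the prune, so it can immediately reassert its own registrations while trusting the server for everything else.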

Thoughts?

@mbrevoort mbrevoort added a commit to mbrevoort/seaport that referenced this pull request Apr 8, 2013
@mbrevoort mbrevoort Fixes for crdt data consistency issues #29
Depends on new version of crdt pending acceptance of dominictarr/crdt#21
c3ed120
Owner

Once the crdt patches go through I'm ready to merge this.

@mbrevoort mbrevoort referenced this pull request in dominictarr/crdt Apr 10, 2013
Merged

Proper way to delete? #21

@mbrevoort mbrevoort Fixes for crdt data consistency issues #29
Depends on new version of crdt pending acceptance of dominictarr/crdt#21
cf1c517
Contributor

The new crdt version has been published and I've squashed my commits. I removed my original reinitialize logic and replaced it with a heartbeat. Relevant discussion here: dominictarr/crdt#21
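For context, a heartbeat-based approach generally works like this (the constants and row shape below are assumptions for illustration, not seaport's actual implementation): each row carries a last-seen timestamp that its owner refreshes periodically, and rows that miss enough heartbeats get reaped.

```javascript
// Illustrative heartbeat expiry: owners periodically refresh a
// last-seen timestamp; a reaper removes rows that have gone stale.

const HEARTBEAT_MS = 1000;
const TIMEOUT_MS = 3 * HEARTBEAT_MS; // tolerate a few missed beats

function beat(rows, id, now) {
  rows[id] = { ...rows[id], seen: now };
}

function reap(rows, now) {
  for (const id of Object.keys(rows)) {
    if (now - rows[id].seen > TIMEOUT_MS) delete rows[id];
  }
}

const rows = {};
beat(rows, 'web@5000', 0);
beat(rows, 'db@5432', 0);
beat(rows, 'web@5000', 2500); // only the web service keeps beating

reap(rows, 4000);             // db row is 4000ms stale, past the timeout
console.log(Object.keys(rows)); // [ 'web@5000' ]
```

This sidesteps the history-replay problem: a node that dies without unregistering simply stops heartbeating, and its rows expire everywhere.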

Owner

This was merged. The squashing must've confused GitHub about auto-closing.

@substack substack closed this Apr 13, 2013
Contributor

Sorry about the confusion, didn't want to unnecessarily confuse the history. Thanks!
