Syncing initial state or state after a crash where state is lost #290

cmeiklejohn · 2018-04-11T09:38:57Z

With the state-based propagation backend, anti-entropy isn't triggered immediately when a new node joins the cluster. As a result, it may take some time before the new node sees updates from the other nodes in the cluster.

Reproducer:

Server 1 starts up
Server 1 adds Process 1 to a lasp_pg group
Server 2 starts up
Server 2 joins as peer
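
To make the reproducer concrete, here is a minimal toy simulation in Python. It is only a sketch: `Node`, `join`, `anti_entropy_tick`, and the `sync_on_join` flag are hypothetical names invented for illustration and are not part of Lasp or lasp_pg. A grow-only set stands in for the group membership, and the timer is replaced by an explicit call.

```python
# Toy model only: names and structure are hypothetical, not Lasp's API.

class Node:
    def __init__(self, name):
        self.name = name
        self.members = set()   # grow-only set standing in for a lasp_pg group
        self.peers = []

    def join(self, other, sync_on_join=False):
        """Become peers with `other`; optionally pull its full state right away."""
        self.peers.append(other)
        other.peers.append(self)
        if sync_on_join:
            self.members |= other.members  # merge for a grow-only set is union

    def anti_entropy_tick(self):
        """Periodic state-based propagation: push full state to every peer."""
        for peer in self.peers:
            peer.members |= self.members


def reproduce(sync_on_join):
    server1 = Node("server1")
    server1.members.add("process1")      # Server 1 adds Process 1 to the group
    server2 = Node("server2")
    server2.join(server1, sync_on_join)  # Server 2 joins as peer
    seen_after_join = set(server2.members)
    server1.anti_entropy_tick()          # the next scheduled round eventually fires
    seen_after_tick = set(server2.members)
    return seen_after_join, seen_after_tick
```

In this model, `reproduce(False)` leaves Server 2 with an empty view right after the join and only fills it in once the tick runs, while `reproduce(True)` shows what triggering a sync at join time would look like.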

The issue becomes more problematic with the delta-based propagation backend when a node is new, or has failed and is recovering. Consider the following example:

Server 1 starts up
Server 2 joins
Server 1 updates
Server 1 buffers the update and sends deltas to Server 2
Server 2 acknowledges deltas
Server 2 shuts down or crashes (or rejoins with a new identifier)
Server 2 will receive no changes until the next change to that same data item: Server 1 has nothing buffered for that node, and because Server 2 recovered with no disk, anything it had buffered itself is gone too
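
The delta scenario above can be sketched as a small Python simulation. This is a toy model under stated assumptions, not Lasp's delta backend: `DeltaNode` and all of its methods are hypothetical, delivery is treated as an immediate acknowledgement, and `full_state_sync` is just one possible remedy (falling back to shipping full state), not something this issue prescribes.

```python
# Toy model only: hypothetical names, not Lasp's delta backend.

class DeltaNode:
    def __init__(self, name):
        self.name = name
        self.state = set()    # grow-only set; merge is union
        self.buffer = {}      # peer name -> deltas not yet acknowledged by that peer
        self.peers = {}       # peer name -> DeltaNode

    def join(self, other):
        self.peers[other.name] = other
        other.peers[self.name] = self
        self.buffer.setdefault(other.name, [])
        other.buffer.setdefault(self.name, [])

    def update(self, element):
        """Apply a local update and buffer the delta for every peer."""
        delta = {element}
        self.state |= delta
        for name in self.peers:
            self.buffer[name].append(delta)

    def send_deltas(self):
        """Ship buffered deltas; a delivered delta counts as acknowledged and is dropped."""
        for name, peer in self.peers.items():
            for delta in self.buffer[name]:
                peer.state |= delta
            self.buffer[name] = []

    def crash_and_recover(self):
        """Assumption: the node restarts with no disk, so its local state is gone."""
        self.state = set()

    def full_state_sync(self, name):
        """Possible remedy (assumption): ship the full state to one peer."""
        self.peers[name].state |= self.state


def scenario():
    s1, s2 = DeltaNode("server1"), DeltaNode("server2")
    s2.join(s1)                 # Server 2 joins
    s1.update("a")              # Server 1 updates
    s1.send_deltas()            # Server 2 receives and acknowledges the delta
    s2.crash_and_recover()      # Server 2 loses its state
    s1.send_deltas()            # buffer is already empty, nothing is resent
    lost = set(s2.state)
    s1.full_state_sync("server2")
    repaired = set(s2.state)
    return lost, repaired
```

Calling `scenario()` in this model returns Server 2's view after recovery (empty, because the acknowledged delta is no longer buffered anywhere) and its view after a full-state exchange.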

cc: @russelldb @tsloughter @vitorenesduarte

cmeiklejohn changed the title ~~Syncing initial state~~ Syncing initial state or state after a crash where state is lost Apr 11, 2018

cmeiklejohn mentioned this issue Apr 11, 2018

Syncing initial state lasp-lang/lasp_pg#13

Closed
