This is a work-in-progress and is not battle-tested.
bower install --save logoot
npm install --save logoot
What is Logoot?
Logoot is a conflict-free replicated data type that can be used to represent a sequence of atoms. Atoms in a Logoot sequence might be chunks of content or even characters in a string of text.
The key to Logoot is the way in which it generates identifiers for atoms. To understand this, here are a few definitions to start with:
pis an integer and
sis a globally unique site identifier. A "site" represents any independent copy of a given Logoot CRDT. One web client with independent editable views of the same sequence would be two separate sites on that client.
positionA list of identifiers.
atom identifierA pair
posis a position, and
cis the value of a vector clock at a given site. The site maintains a vector clock that increases incrementally with each no atom identifier generated.
sequence atomA pair
identis an atom identifier, and
vis any arbitrary value. This could be a single text character or a block in a text editor.
sequenceA list of sequence atoms. This might represent a document or a list of blocks in an editor. Every sequence implicitly has a minimum sequence atom and a maximum sequence atom. All other atoms in the sequence are created somewhere between these two.
Because of how these atom identifiers are structured, they are totally ordered, as opposed to causally ordered. No identifier cares about the identifier before it once it's been created, and so tombstones are not necessary.
Generating a Sequence Atom
Let's start with the empty sequence before any user has made edits to it:
[ [[[[0, 0]], 0], null], // Minimum sequence atom [[[[32767, 0]], 1], null] // Maximum sequence atom ]
Aside: Note that the integer in the maximum sequence atom's value is
32767. This is chosen somewhat arbitrarily, but it is common for
implementations to use the maximum unsigned 16-bit integer (and the original
paper recommends it). One wouldn't want to choose an integer greater than any
implementation's maximum safe integer value, and all implementations that
communicate with one another must share this same maximum.
Now, a user at site
1 inserts the first line into their local copy of the
[ [[[[0, 0]], 0], null], // Minimum sequence atom [[[[6589, 1]], 0], "Hello, world!"], [[[[32767, 0]], 1], null] // Maximum sequence atom ]
Because there is free space between the integer of the minimum sequence atom and the integer of the maximum sequence atom, Logoot chooses a random integer between the two (how this is chosen is somewhat arbitrary—it just must be between them) and ends up with the sequence identifier:
[ [ [ 6589, // Number between min/max 1 // Site identifier ] ], 0 // Next value of site's vector clock ]
As a result, the document is properly sequenced. Ordering of sequence atoms is done by iterating over their position list and comparing first the integer, and then the site identifier if the integer is equal.
Note that vector clock values are not compared. Vector clock values are used to ensure unique atom identifiers, not for ordering.
Let's look at a more complex example. Start with a document that looks like this:
[ [[[[0, 0]]], null], // Minimum sequence atom [[[[1, 1], [3, 2]], 5], "Hello, world from site 2!"], [[[[1, 1], [5, 4]], 1], "I came from site 4!"], [[[[32767, 0]]], null] // Maximum sequence atom ]
Now, at site
3, the user wants to insert a line between the two user-created
lines in the above sequence. Logoot iterates over the pairs of identifiers in
the "before" and "after" positions. Because the first identifier of each
[1, 1], Logoot can not insert an identifier directly between them,
so it moves on to the next pair,
[3, 2] and
[5, 4]. Because site 3's
site identifier is greater than site 2's, it can insert the identifier
here and preserve ordering, since
[3, 2] < [3, 3] < [5, 4].
The resulting sequence would be (assuming 3's vector clock is at
[ [[[[0, 0]]], null], // Minimum sequence atom [[[[1, 1], [3, 2]], 5], "Hello, world from site 2!"], [[[[1, 1], [3, 3]], 1], "Hello from site 3!"], [[[[1, 1], [5, 4]], 1], "I came from site 4!"], [[[[32767, 0]]], null] // Maximum sequence atom ]
Note that if this were actually site
1, things would be different, because
[3, 2] is not less than
[3, 1]. Instead, Logoot generates a random integer
between 3 and 5 (which is of course
4), and our resulting identifier would be:
[[[1, 1], [4, 1]], 1]
Hopefully this provides a good enough explanation of what Logoot is and why it may be an excellent option for a sequence CRDT. The paper presenting it is a relatively easy read, and you may also want to look at this project's Logoot.Sequence module and its tests.
- Make min and max implicit, do not force user to provide them.
- Prevent deleting min and max atoms.
- Make idempotent insert atom function.
- Make idempotent delete atom function.