Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LWW Registers in Maps may go "backwards" in time #100

Closed
lenary opened this issue Aug 1, 2014 · 1 comment
Closed

LWW Registers in Maps may go "backwards" in time #100

lenary opened this issue Aug 1, 2014 · 1 comment
Assignees

Comments

@lenary
Copy link
Contributor

lenary commented Aug 1, 2014

The merging behaviour around LWW and the Map's reset-remove semantic can produce some behaviour where the value of a LWW register may go back to a value with a previous timestamp.

So, to setup the scenario, imagine 2 writing nodes, A and B.

  1. Two things concurrently happen:
    a. B writes <b> into a register inside a map, with ts 0.
    b. concurrently, A writes <a> into the same register inside a map, with ts 1.
  2. A replicates to B. i.e. B := merge(B, A)
  3. A removes the register inside the map, at ts >1.
  4. A and B merge, so A := merge(A,B) and B := merge(B,A) (merge is commutative, so order of arguments doesn't actually produce any difference in result).

In Riak 2.0, the final value of the register inside the map on both nodes is <b>. However, if we examine the history of the register at B, we can see something peculiar. After step 1, the value is evidently <b> with ts 0. After step 2, the value must be <a> with ts 1, as it has a higher timestamp. However, after step 4, the value is back to <b> with ts 0. The timestamp has decreased, which should never happen.

The root cause of this is the mixture of the causal ordering (used by the map) and the timestamp ordering (used by the register inside the map). We are currently investigating fixes, but it looks likely that we will replace the lww-register (which only uses timestamps for value resolution) with a causal-lww register, which uses causal order as much as possible for value resolution, but falls back on timestamps when there are concurrent edits. This is very much like how riak_object works with allow_mult=false.

Unfortunately, due to the time required to develop and test a causal-lww register, it seems unlikely we will get this fixed before 2.0 comes out.

@lenary lenary added this to the 2.0 milestone Aug 1, 2014
@lenary lenary self-assigned this Aug 1, 2014
@lenary lenary removed this from the 2.0 milestone Aug 1, 2014
@russelldb
Copy link
Member

Hey, I don't think it needs "fixing" for 2.0. It's a known issue. Lets add it to the release notes and think about it for future maps. I think the Map we are releasing behaves quite well wrt to LWW reg, since it keeps dot->reg mappings for concurrent writes and only merges on update. The new 'Paulo' Map is more problematic since it merges-on-merge and values "escape" earlier.

There is no well defined semantic for this, in the literature or in general use, so what we go with becomes the state-of-the-art until we/someone gets smarter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants