riak_core_ring:reconcile is inconsistent when updates occur less than 1 second apart [JIRA: RIAK-2772] #855

Closed
joecaswell opened this issue Aug 24, 2016 · 2 comments

@joecaswell

If updates to the same key in the ring metadata occur on different nodes during the same second, they are not reconciled. When two metadata entries differ but carry the same timestamp, the merge_meta function selects the remote value, so each node changes its local ring and regossips. This can lead to nodes flip-flopping the value and to a storm of gossip messages that causes extremely long message queues and high heap usage in the gossip processes.

The snippet below demonstrates that the ring produced by the reconcile function depends on the order of its arguments rather than on a deterministic property of the data.

OtherNode = 'node1@127.0.0.1'.
%% keep trying until both updates land in the same second
Simultaneous = fun(RingA, RingB, Fun) ->
       {StartMega, StartSec, _} = os:timestamp(),
       Ring1 = riak_core_ring:update_meta(key,val,RingA),
       Ring2 = rpc:call(OtherNode, riak_core_ring, update_meta,[key,val1,RingB]),
       {EndMega, EndSec, _} = os:timestamp(),
       case StartMega == EndMega andalso StartSec == EndSec of
           true ->
               {Ring1, Ring2};
           false ->
               Fun(RingA, RingB, Fun)
       end
end.
%% load the record definitions into the shell before any record syntax is used
rr(riak_core_ring).

{ok,LocalRing} = riak_core_ring_manager:get_my_ring(),
{ok,RemoteRing} = rpc:call(OtherNode,riak_core_ring_manager,get_my_ring,[]),
%% make conflicting simultaneous changes to both rings
{NewLocal, NewRemote} = Simultaneous(LocalRing, RemoteRing, Simultaneous),
%% also add some non-conflicting changes
Ring1 = riak_core_ring:update_meta(key3,val3,riak_core_ring:update_meta(key2,val2,NewLocal)),
timer:sleep(1001),
Ring2 = riak_core_ring:update_meta(key4,val4,riak_core_ring:update_meta(key2,valB,NewRemote)),
Meta1 = dict:fetch(key,Ring1#chstate_v2.meta),
Meta2 = dict:fetch(key,Ring2#chstate_v2.meta),
%% verify we have 2 conflicting changes with the same lastmod:
%% the values differ (false) while the timestamps match (true)
Meta1#meta_entry.value == Meta2#meta_entry.value.
Meta1#meta_entry.lastmod == Meta2#meta_entry.lastmod.
%% simulate each node getting the other's update
{new_ring, Ring3} = riak_core_ring:reconcile(Ring1, Ring2),
{new_ring, Ring4} = riak_core_ring:reconcile(Ring2, Ring1),
%% show that the rings differ
riak_core_ring:equal_rings(Ring3,Ring4).
%% list the metadata keys from each to show they all merged 
%% except the ones with the same timestamp
Ring3#chstate_v2.nodename.
[ {K,riak_core_ring:get_meta(K, Ring3)} || K <- [key, key2, key3, key4]].
Ring4#chstate_v2.nodename.
[ {K,riak_core_ring:get_meta(K, Ring4)} || K <- [key, key2, key3, key4]].
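
The order dependence above comes down to how a tie on lastmod is resolved. The following is a minimal stand-alone sketch of that idea, not riak_core's actual code: the module name `meta_tiebreak`, both function names, and the simplified `meta_entry` record are hypothetical, and breaking the tie on the entry's value is only one possible deterministic rule, not necessarily the one used in the eventual fix.

```erlang
%% Hypothetical stand-alone model; not riak_core's implementation.
-module(meta_tiebreak).
-export([pick_remote_on_tie/2, pick_deterministic/2, demo/0]).

%% Simplified stand-in for the metadata entry record (assumption).
-record(meta_entry, {value, lastmod}).

%% Order-dependent rule: on equal lastmod the second ("remote") argument
%% wins, so swapping the arguments swaps the result.
pick_remote_on_tie(Local, Remote) ->
    case Local#meta_entry.lastmod > Remote#meta_entry.lastmod of
        true  -> Local;
        false -> Remote
    end.

%% Order-independent rule: compare {lastmod, value} using Erlang term
%% order, so a tie on lastmod is broken by the data itself.
pick_deterministic(A, B) ->
    KeyA = {A#meta_entry.lastmod, A#meta_entry.value},
    KeyB = {B#meta_entry.lastmod, B#meta_entry.value},
    case KeyA >= KeyB of
        true  -> A;
        false -> B
    end.

demo() ->
    %% Same-second timestamp, mirroring the conflicting updates above.
    Now = calendar:datetime_to_gregorian_seconds(calendar:universal_time()),
    A = #meta_entry{value = val,  lastmod = Now},
    B = #meta_entry{value = val1, lastmod = Now},
    %% The remote-wins rule disagrees with itself when arguments are swapped.
    false = pick_remote_on_tie(A, B) =:= pick_remote_on_tie(B, A),
    %% The tie-broken rule picks the same winner either way.
    true = pick_deterministic(A, B) =:= pick_deterministic(B, A),
    ok.
```

Compiling the module and calling `meta_tiebreak:demo()` should return `ok`. Any total order over the conflicting entries would serve equally well as a tie-break, as long as every node applies the same one.
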
@Basho-JIRA changed the title from "riak_core_ring:reconcile is inconsistent when updates occur less than 1 second apart" to "riak_core_ring:reconcile is inconsistent when updates occur less than 1 second apart [JIRA: RIAK-2772]" on Aug 24, 2016
@Basho-JIRA

When we pick this up (2.2.1?) please bring [~eleitch] along for the ride for knowledge sharing.

_[posted via JIRA by Douglas Rohrer]_

@Basho-JIRA

2.0.8 PR: #886

_[posted via JIRA by Brian Sparrow]_
