Commits on Oct 21, 2011
Commits on Oct 11, 2011
  1. Merge pull request #102 from basho/bz1242-stalled-handoff

    Fix BZ1242: Stalled handoff when ring is fixed-up
    jtuple committed Oct 11, 2011
Commits on Oct 10, 2011
  1. Fix bug that prevented handoff if ring was fixed-up.

    Fix BZ1242.
    
    Change riak_core_vnode to use get_raw_ring instead of get_my_ring, because
    the vector clock for get_my_ring may have local modifications that prevent
    riak_core_ring:ring_ready from returning true for the ring, even if the
    base raw ring was ready.
    
    Updated core_vnode_eqc to start the ring manager in order to retrieve the
    raw ring during tests.
    jtuple committed Oct 10, 2011
Commits on Oct 7, 2011
  1. Fix bz1235

    Change core_vnode to return an error when receiving sync messages while
    exiting, and change handoff_receiver to check for error responses.
    jtuple committed Oct 7, 2011
Commits on Oct 3, 2011
Commits on Sep 29, 2011
Commits on Sep 27, 2011
  1. Perform final sync once all handoff data has been sent.

    The new cluster membership code switched to forwarding
    once handoff is complete.  Without this change the vnode
    starts forwarding while the new owner is still processing
    buffered TCP data.
    jonmeredith committed Sep 27, 2011
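
The ordering argument in this commit can be illustrated with a small model. The sketch below is Python, not Erlang, and does not reflect riak_core's handoff wire protocol: `Receiver`, `send_handoff`, and the message tuples are hypothetical names. It assumes only that the receiver applies buffered messages in the order they were sent, so reaching a trailing sync marker implies every earlier object has been applied.

```python
from collections import deque


class Receiver:
    """Hypothetical new owner that buffers incoming messages before applying them."""

    def __init__(self):
        self.inbox = deque()   # stands in for buffered TCP data
        self.store = {}

    def deliver(self, msg):
        # Data has arrived but has not been processed yet.
        self.inbox.append(msg)

    def drain(self):
        """Apply buffered messages in order; return True if a sync was reached."""
        synced = False
        while self.inbox:
            kind, payload = self.inbox.popleft()
            if kind == "obj":
                key, value = payload
                self.store[key] = value
            elif kind == "sync":
                synced = True
        return synced


def send_handoff(receiver, items):
    """Send every item, then a final sync marker."""
    for key, value in items.items():
        receiver.deliver(("obj", (key, value)))
    receiver.deliver(("sync", None))


receiver = Receiver()
items = {"k1": "v1", "k2": "v2"}
send_handoff(receiver, items)

# Everything has been sent, yet nothing has been applied on the receiver.
sent_but_unprocessed = len(receiver.store) == 0

# Only once the sync has been processed is it safe to start forwarding.
safe_to_forward = receiver.drain() and receiver.store == items
```

In this model, forwarding right after `send_handoff` returns would hit a receiver whose store is still empty, which is the window the final sync is meant to close.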
Commits on Sep 26, 2011
  1. Fix bug with nodes leaving the cluster earlier than intended.

    Change ring_ready to wait on exiting nodes in addition to valid and leaving
    nodes. This ensures the ring converges on a node's intent to leave before the
    node leaves the cluster.
    
    Stop the claimant from moving itself from exiting to invalid. Instead, after
    the claimant moves to exiting, a new claimant will emerge that will move the
    previous claimant to invalid and initiate shutdown.
    jtuple committed Sep 26, 2011
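
A rough way to picture the ring_ready change is shown below. This is a hypothetical Python model, not the riak_core_ring implementation: member statuses are plain strings, and an integer ring version stands in for the vector clock each node has seen. The only thing taken from the commit message is which member statuses the readiness check waits on.

```python
def ring_ready(members, seen, ring_version, wait_on):
    """Return True when every member whose status is in `wait_on`
    has seen `ring_version` (a stand-in for the ring's vector clock)."""
    return all(
        seen.get(node) == ring_version
        for node, status in members.items()
        if status in wait_on
    )


# Hypothetical three-node cluster in which dev3 is on its way out.
members = {"dev1": "valid", "dev2": "leaving", "dev3": "exiting"}
seen = {"dev1": 7, "dev2": 7, "dev3": 6}

# Old behaviour: exiting nodes are ignored, so the ring looks ready
# even though dev3 has not caught up.
ready_before = ring_ready(members, seen, 7, {"valid", "leaving"})

# New behaviour: exiting nodes are waited on as well.
ready_after = ring_ready(members, seen, 7, {"valid", "leaving", "exiting"})

# Once dev3 has seen the current ring, the stricter check passes too.
seen["dev3"] = 7
ready_converged = ring_ready(members, seen, 7, {"valid", "leaving", "exiting"})
```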
Commits on Sep 23, 2011
  1. Fixed update_forwarding_mode return in deleted case.

    The caller wraps the state with the next state information.
    jonmeredith committed Sep 23, 2011
  2. Bump lager dependency version

    Jared Morrow committed Sep 23, 2011
  3. Made Mod:delete happen before unregister.

    Prevent a race with the master starting a new vnode.
    Changed coverage to run while in handoff - otherwise
    listkeys et al will bomb during partition transfer.
    jonmeredith committed Sep 23, 2011
  4. Added infinity timeout on finish_handoff call.

    A very busy 6-node stagedevrel cluster was hitting:
    11:35:18.950 [error] gen_fsm <0.171.0> in state active terminated with reason: {timeout,{gen_server,call,[riak_core_gossip,{finish_handoff,45671926166590716193865151022383844364247891968,'dev1@127.0.0.1','dev3@127.0.0.1',riak_pipe_vnode}]}}
    
    The process is local and the call is monitored in case gossip dies.
    jonmeredith committed Sep 23, 2011
  5. Changed vnode to unregister from master before cleaning up.

    Fullsync repl was hanging because it delivered a fold message
    while finish_handoff was being called.  The message was never
    processed as the vnode immediately shut down rather than
    forwarding the messages in the queue.
    
    On completion of handoff, async unregister from the vnode master. The
    unregister call now passes the pid of the vnode unregistering
    and now the master sends an unregistered event once the vnode
    is removed from the master ETS table.
    
    While waiting for the acknowledgment of unregister the vnode goes
    into forwarding mode.
    jonmeredith committed Sep 23, 2011
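
The unregister sequence described in the last commit above can be sketched as a small state machine. This is an illustrative Python model under stated assumptions, not riak_core code: `Master`, `Vnode`, `cast_unregister`, and the `mailbox` list are hypothetical, and a plain dict stands in for the master's ETS table. It shows the vnode forwarding requests that arrive between finishing handoff and receiving the unregistered event.

```python
class Master:
    """Hypothetical vnode master; `table` stands in for the master's ETS table."""

    def __init__(self):
        self.table = {}
        self.mailbox = []

    def register(self, index, vnode):
        self.table[index] = vnode

    def cast_unregister(self, index, vnode):
        # Asynchronous request: queued now, handled later by process().
        self.mailbox.append((index, vnode))

    def process(self):
        while self.mailbox:
            index, vnode = self.mailbox.pop(0)
            self.table.pop(index, None)
            # Notify the requesting vnode only after it is out of the table.
            vnode.handle_event("unregistered")


class Vnode:
    def __init__(self, index, master):
        self.index = index
        self.master = master
        self.state = "active"
        self.handled = []     # requests served locally
        self.forwarded = []   # requests passed on to the new owner
        master.register(index, self)

    def finish_handoff(self):
        # Ask to be unregistered, then forward until the master confirms.
        self.master.cast_unregister(self.index, self)
        self.state = "forwarding"

    def handle_request(self, request):
        if self.state == "active":
            self.handled.append(request)
        elif self.state == "forwarding":
            self.forwarded.append(request)
        else:
            raise RuntimeError("vnode has stopped")

    def handle_event(self, event):
        if event == "unregistered":
            self.state = "stopped"


master = Master()
vnode = Vnode(index=0, master=master)
vnode.handle_request("put")
vnode.finish_handoff()
vnode.handle_request("fold")   # arrives before the master has processed the unregister
master.process()
```

In this model the late `fold` request ends up in `forwarded` instead of being dropped, and the vnode stops only after the master has removed it from the table.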
Commits on Sep 21, 2011
  1. Update new partition claim algorithm after review + bug fixes

    Change claim_simulation.erl eunit test to run a simulation with both the
    new and old claim algorithm as suggested.
    
    Rename riak_core_new_claim:new_claim/2 to new_choose_claim/2 to match
    default_choose_claim/2.
    
    Fix two bugs in riak_core_new_claim.erl that are on code paths that cannot
    occur in 1.0 due to existing invariants, but should be fixed nevertheless:
    - Match error in prefilter_violations: change CNth to {CNth, _}.
    - Handle case where new_choose_claim fails to claim partitions by falling
      back to claim_rebalance_n.
    jtuple committed Sep 21, 2011
Commits on Sep 20, 2011
  1. Add new partition claim function and claim simulator

    Add riak_core_new_claim:new_wants_claim/2 and new_claim/2.
    Merge in claim simulation code provided by Greg Nelson (grourk@dropcam.com).
    Add pretty_print function to riak_core_ring.
    
    The new claim function is designed to reduce the number of partition transfers
    that occur when rebalancing the ring, aiming as close as possible to minimal
    consistent hashing.
    jtuple committed Sep 20, 2011
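
To make the goal of fewer partition transfers concrete, the sketch below counts how many partitions change owner between two ring assignments. It is a hypothetical Python illustration, not the riak_core_new_claim algorithm or the claim simulator: `count_transfers` and the example rings are made up, and the model ignores any replica-placement constraints a real claim function must respect.

```python
def count_transfers(old_owners, new_owners):
    """Count partitions whose owner differs between two assignments."""
    if len(old_owners) != len(new_owners):
        raise ValueError("ownership lists must cover the same partitions")
    return sum(1 for old, new in zip(old_owners, new_owners) if old != new)


# Eight partitions owned by two nodes; node "c" then joins.
old = ["a", "b", "a", "b", "a", "b", "a", "b"]

# Reassigning round-robin from scratch moves partitions between a and b too.
round_robin = ["a", "b", "c", "a", "b", "c", "a", "b"]

# Handing c only the partitions it needs leaves everything else in place.
targeted = ["a", "b", "c", "b", "a", "c", "a", "b"]

round_robin_moves = count_transfers(old, round_robin)
targeted_moves = count_transfers(old, targeted)
```

Both new assignments give "c" two partitions and leave "a" and "b" with three each, but the targeted assignment moves only the two partitions that "c" takes over.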