base fork: basho/riak_core
base: cet-bg-mgr-proto
head fork: basho/riak_core
compare: jdb-join-claim-improvement
Commits on Jun 20, 2011
Joseph Blomstedt jtuple Add model of proposed join/claim improvement 66efaa3
Commits on Jul 08, 2011
Joseph Blomstedt jtuple Update model of proposed join/claim improvement fde0f37
Commits on Jul 22, 2011
Joseph Blomstedt jtuple Merge branch 'master' into az462-join-claim-improvement 69e4040
Commits on Jul 28, 2011
Joseph Blomstedt jtuple Update join/claim model based on feedback/testing.
    -- Model uses real riak_core_ring structure and related functions.
    -- Model allows removed nodes to rejoin.
    -- Model no longer assumes removed/left nodes actually shut down.
    -- Model forces handoff before testing for eventual ring convergence.
    -- Model/EQC correctly works with code that uses random module.
    -- Model no longer fails on certain undefined but acceptable behavior.

    Changes to the proposal reflected in model:
    -- Ring state includes a cluster name.
    -- Gossip from invalid nodes is ignored, and shutdown cast re-sent.
    -- Gossip from nodes in another cluster (by name) is ignored.
    -- Invalid status takes precedence over valid when status reconciled.
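
The precedence rule in the last item can be illustrated with a minimal sketch. This is not the riak_core implementation: the function name `reconcile_status` is hypothetical, and only the two statuses named above are handled.

```python
def reconcile_status(local, remote):
    """Merge two views of one member's status (hypothetical helper).

    Only 'valid' and 'invalid' are modelled here; the source does not
    say how other statuses are merged, so they are rejected.
    """
    for status in (local, remote):
        if status not in ("valid", "invalid"):
            raise ValueError("unmodelled status: %r" % (status,))
    # 'invalid' wins whenever either side reports it.
    return "invalid" if "invalid" in (local, remote) else "valid"


print(reconcile_status("valid", "invalid"))  # invalid
```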
Joseph Blomstedt jtuple Initial implementation of new join/claim approach a6befc7
Commits on Jul 29, 2011
Joseph Blomstedt jtuple Merge branch 'az533-join-claim-improvement' into az533-join-claim-imp…

Commits on Aug 02, 2011
Joseph Blomstedt jtuple Add support for multiple vnode modules to new join/claim approach 7538e52
Commits on Aug 03, 2011
Joseph Blomstedt jtuple Cleanup and refactor new join/claim code 8430404
Joseph Blomstedt jtuple Additional cleanup of new join/claim code a335dbe
Joseph Blomstedt jtuple Backport changes to join/claim implementation to model + minor changes 1ffdd8b
Joseph Blomstedt jtuple Merge branch 'master' into az533-join-claim-improvement-master 62eba32
Joseph Blomstedt jtuple Merge branch 'az533-join-claim-improvement' into az533-join-claim-imp…
Commits on Aug 09, 2011
Joseph Blomstedt jtuple Fix bugs related to ring_ready and ring convergence f67f8a1
Commits on Aug 10, 2011
Joseph Blomstedt jtuple Change control commands to return error tuples not io:format e0d7a19
Joseph Blomstedt jtuple Ensure vnodes necessary for ownership changes are started ca814ea
Commits on Aug 12, 2011
Joseph Blomstedt jtuple Change various get/set ring updates to ring_trans c579f51
Joseph Blomstedt jtuple Change riak_core_ring: ring reconciliation, ring_ready, and all_members
Merge vclocks on all ring reconciliation paths, revert to old ring ready
behavior, and have all_members/1 use the private get_members/1.
Joseph Blomstedt jtuple Add preliminary claim/member status commands 925506e
Commits on Aug 16, 2011
Joseph Blomstedt jtuple Refactor ring_status into riak_core_status and enhance console output 745f9bc
Joseph Blomstedt jtuple Add logging of membership status changes 3f4d6c2
Joseph Blomstedt jtuple Add lager:debug for claimant transitions and vnode forwarding 1c8804d
Commits on Aug 17, 2011
Joseph Blomstedt jtuple Add riak_core_stat with gossip/ring statistics 6acc961
Joseph Blomstedt jtuple Augment random gossip with fixed gossip for faster state convergence
Consolidate code for random gossip into new riak_core_gossip:random_gossip,
and update all locations that manually implement random gossip to use this

Add new deterministic gossip code, riak_core_gossip:recursive_gossip, that
sends a node's ring to its children vertices in a tree decomposition of the
cluster members list.

Change the "gossip on ring changed" code to use recursive_gossip.
Continue using random_gossip for periodic (gossip_interval) gossip.
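
As a rough illustration of gossiping to child vertices in a tree laid over the members list, the sketch below uses a binary-heap layout over the sorted list. The layout, the fan-out of 2, and the name `tree_children` are assumptions made for the example; the commit message does not specify how the tree is built.

```python
def tree_children(members, node, fanout=2):
    """Return the members that `node` would send its ring to.

    Assumption: members are sorted and arranged like a heap, so the
    member at position i has children at fanout*i + 1 .. fanout*i + fanout.
    """
    ordered = sorted(members)
    i = ordered.index(node)
    first = fanout * i + 1
    return ordered[first:first + fanout]


members = ["n1", "n2", "n3", "n4", "n5", "n6", "n7"]
print(tree_children(members, "n1"))  # ['n2', 'n3']
print(tree_children(members, "n2"))  # ['n4', 'n5']
print(tree_children(members, "n4"))  # []
```

With this layout each node sends at most `fanout` messages per ring change, and a change that starts at the first member reaches all members in a logarithmic number of hops.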
Commits on Aug 19, 2011
Joseph Blomstedt jtuple Add random_recursive_gossip to augment recursive_gossip
Change recursive_gossip to be done in two parts. The initial ring change uses
random_recursive_gossip to start the recursive gossip at a random starting
node. The reconciliation logic then continues to use the fixed recursive_gossip
logic to propagate the gossip forward. This change decreases gossip hot spots
when using recursive_gossip.
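
The two-part scheme can be sketched as follows, under stated assumptions: the fixed tree is modelled as a heap layout over the sorted members list whose child positions wrap around the end of the list, so that propagation can begin at any member. The wrap-around and the names `cyclic_children`, `propagate`, and `random_start` are hypothetical and chosen only for illustration.

```python
import random
from collections import deque


def cyclic_children(members, node, fanout=2):
    """Children of `node` in a fixed tree whose positions wrap around (assumed)."""
    ordered = sorted(members)
    n = len(ordered)
    i = ordered.index(node)
    kids = []
    for k in range(1, fanout + 1):
        child = ordered[(fanout * i + k) % n]
        if child != node and child not in kids:
            kids.append(child)
    return kids


def propagate(members, start):
    """Follow the fixed child links from `start`; return every member reached."""
    reached = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in cyclic_children(members, node):
            if child not in reached:
                reached.add(child)
                queue.append(child)
    return reached


def random_start(members, rng=random):
    """Part one: pick the member at which the recursive gossip begins."""
    return rng.choice(sorted(members))


members = ["n1", "n2", "n3", "n4", "n5", "n6", "n7"]
start = random_start(members)
print(propagate(members, start) == set(members))  # True
```

Because the starting member varies between ring changes, no single member is always the first sender, which is the hot-spot reduction the commit message describes.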
Joseph Blomstedt jtuple Update ring eunit tests and code comments, update rename_node
Fix existing riak_core tests to work with the new cluster membership
code, as well as add several new tests that cover the new reconciliation

Update riak_core_ring:rename_node to support the new members and seen fields.
Joseph Blomstedt jtuple Fix data races in gossip and shutdown/restart process
Change gossip to use ring_trans rather than set_my_ring in order to prevent
a data race with concurrent ring changes based on join/leave/remove commands.

Change the refresh_ring logic to be guarded by cluster name, avoiding the
case where a stale refresh cast arrives at a node after it has already
shut down and been restarted. Since the restarted node will have a new
cluster name, the stale cast can be detected and ignored.
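
A minimal sketch of the cluster-name guard, assuming the refresh message carries the cluster name its sender expected. The `NodeState` class and `handle_refresh` function are hypothetical names, not riak_core APIs.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    cluster_name: str
    refresh_count: int = 0


def handle_refresh(state, expected_cluster_name):
    """Apply a refresh only if it was addressed to this node's current cluster."""
    if expected_cluster_name != state.cluster_name:
        # The cast was sent before this node restarted under a new name.
        return False
    state.refresh_count += 1
    return True


restarted = NodeState(cluster_name="cluster-b")
print(handle_refresh(restarted, "cluster-a"))  # False
print(handle_refresh(restarted, "cluster-b"))  # True
```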
Commits on Aug 22, 2011
Joseph Blomstedt jtuple Add pending percent to member_status, fix bug in ring_ready_info
Add pending ring percent as a field in member_status that displays a node's
ring ownership after all pending ownership transfers have completed.
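
The pending percentage can be illustrated with a small calculation. The data shapes here are assumptions for the example: `owners` maps a partition index to its current owner, and `pending` is a list of `(index, new_owner)` transfers. Neither mirrors the actual ring structure.

```python
def ownership_percent(owners):
    """Percentage of partitions owned by each node."""
    total = len(owners)
    counts = {}
    for node in owners.values():
        counts[node] = counts.get(node, 0) + 1
    return {node: 100.0 * count / total for node, count in counts.items()}


def pending_percent(owners, pending):
    """Ownership percentages once every pending transfer has completed."""
    future = dict(owners)
    for index, new_owner in pending:
        future[index] = new_owner
    return ownership_percent(future)


owners = {i: "n1" for i in range(8)}
pending = [(4, "n2"), (5, "n2"), (6, "n2"), (7, "n2")]
print(ownership_percent(owners))         # {'n1': 100.0}
print(pending_percent(owners, pending))  # {'n1': 50.0, 'n2': 50.0}
```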

Fix riak_core_ring:ring_ready_info so that only nodes considered for ring
convergence are checked.
Commits on Aug 23, 2011
Jon Meredith jonmeredith Changed prop_claim_ensure_unique_nodes to add test node to ring before claiming.
Commits on Aug 24, 2011
Joseph Blomstedt jtuple Merge pull request #65 from basho/az533-join-claim-improvement-master
New join/claim/membership implementation and model
Commits on Aug 26, 2011
Joseph Blomstedt jtuple Add joining/down member status, fix various bugs
Add 'joining' member status to new cluster membership model and
implementation. When a node joins a cluster, it comes in with status
'joining' rather than 'valid'. The claimant then moves a node from 'joining'
to 'valid' after it ensures all cluster members have learned of the new
node joining the cluster. This change guarantees that all 'valid' members
vote on ring ready consensus under various failure scenarios.

Add 'down' member status to new cluster membership model and implementation.
The state is designed to allow a user to mark a down node as 'down' in order
to allow the rest of the cluster to converge. A vote from a 'down' node is
not necessary for ring ready consensus, and therefore a 'down' node's ring
state may become outdated. If a 'down' node gossips to another node that
believes it to be down (such as after coming back online), the other node
tells the 'down' node to rejoin the cluster, therefore making its state
current. Nodes do not gossip to 'down' nodes, and 'down' node ownership is
not changed during a rebalance.
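
The effect of the 'down' status on consensus and gossip can be sketched as follows. This is a simplified model, not the riak_core logic: it covers only 'valid' and 'down' members, and the names `ring_ready`, `gossip_targets`, and `seen_current` are hypothetical.

```python
def ring_ready(status, seen_current):
    """True when every member whose vote is required has seen the current ring.

    `status` maps node -> 'valid' or 'down'; `seen_current` is the set of
    nodes known to have seen the current ring. Other statuses are not modelled.
    """
    voters = [node for node, s in status.items() if s != "down"]
    return all(node in seen_current for node in voters)


def gossip_targets(status, me):
    """Members this node may gossip to: everyone except itself and 'down' nodes."""
    return sorted(node for node, s in status.items() if s != "down" and node != me)


status = {"n1": "valid", "n2": "valid", "n3": "valid"}
seen = {"n1", "n2"}                 # n3 is unreachable and has not voted
print(ring_ready(status, seen))     # False

status["n3"] = "down"               # operator marks the node as down
print(ring_ready(status, seen))     # True
print(gossip_targets(status, "n1")) # ['n2']
```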

Incorporate minor changes and bug fixes:
-- Fix next merging bug in riak_core_ring:remove_node and model.
-- Fix bug with nodes moving from 'leaving' to 'exiting' while having pending
-- Fix negative random seed bug in join/membership model.
-- Change core vnode so that a vnode does not shut down if a completed handoff
   was to a node that is now known to be 'invalid'.
-- Change update_ring to remove tuples from next for all invalid nodes, even
   for completed transfers.
-- Change leave in model to be a local transition like in the implementation.
-- Handle "neither claimant valid" case in reconcile_ring.
Commits on Aug 29, 2011
Joseph Blomstedt jtuple Fix wrong future_indices arity, add ping test to riak_core:down b7ba083
Joseph Blomstedt jtuple Fix recursive_gossip when node() is non-active member 3ec1adc
Joseph Blomstedt jtuple Change ring convergence check to not consider exiting nodes 53ff9db
Commits on Aug 30, 2011
Joseph Blomstedt jtuple Change riak_core:join to disallow joining to self 133d1eb
Joseph Blomstedt jtuple Merge pull request #71 from basho/az612-membership-changes
az612: Add joining/down status, fix bugs
Commits on Sep 01, 2011
Joseph Blomstedt jtuple Add legacy gossip mode as well as support for rolling gossip upgrades
Change riak_core_ring to have two different versioned records corresponding
to the new and old ring data-structure, and add upgrade/downgrade functions
that convert between the two formats.

Add legacy gossip mode that uses the old ring reconciliation logic as well
as the old gossip/claim procedure. The legacy mode uses the old logic but
encapsulates its data inside the new ring format (using upgrade/downgrade)
in order to minimize code duplication. This mode is enabled by setting
the application environment variable riak_core/legacy_gossip to true.

Add member metadata to the new ring format and the related get_member_meta,
update_member_meta accessors.

Add support for rolling upgrades and mixed-gossip hybrid clusters. The
appropriate ring format is negotiated through member metadata when
possible, falling back to RPC queries when necessary. The cluster gossip
protocol is determined as follows:
  -- If all nodes support the new membership protocol and are not running
     in legacy mode, the new protocol is used.

  -- If an old node or legacy-mode node joins the cluster, the entire
     cluster downgrades to legacy mode.

  -- If the old/legacy nodes leave the cluster, the new nodes return to the
     new protocol.
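
The three rules above reduce to a single predicate over the current membership. The sketch below assumes each member can be described by two flags; the `Member` type and `cluster_protocol` function are hypothetical and do not show how the negotiation through member metadata or RPC is performed.

```python
from collections import namedtuple

# supports_new: the node runs code that understands the new membership protocol.
# legacy_mode:  the node has riak_core/legacy_gossip set to true.
Member = namedtuple("Member", ["supports_new", "legacy_mode"])


def cluster_protocol(members):
    """Return 'new' only if every member supports it and none is in legacy mode."""
    if all(m.supports_new and not m.legacy_mode for m in members.values()):
        return "new"
    return "legacy"


cluster = {"n1": Member(True, False), "n2": Member(True, False)}
print(cluster_protocol(cluster))  # new

cluster["old1"] = Member(False, False)   # an old node joins
print(cluster_protocol(cluster))  # legacy

del cluster["old1"]                      # the old node leaves
print(cluster_protocol(cluster))  # new
```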
Joseph Blomstedt jtuple Ensure invalid nodes are not scheduled for ownership transfer d0542ec
Joseph Blomstedt jtuple Fix call cycle rejoin->join->legacy_gossip 6c4a003
Joseph Blomstedt jtuple Update outdated eunit tests 3f94935
Commits on Sep 02, 2011
Joseph Blomstedt jtuple Merge pull request #74 from basho/az630-gossip-rolling-upgrade
Az630: Add legacy gossip mode + rolling upgrade between gossip versions
Joseph Blomstedt jtuple Add preliminary vnode manager, change vnode forwarding logic
Add a preliminary version of a vnode manager which will someday replace
much of the pid-tracking functionality currently in riak_core_vnode_master.
The current vnode manager provides an easy interface for finding vnode
mod/idx/pid information and informs vnodes about ring changes as appropriate.

Move vnode forward-on-ownership-change logic out of request critical path,
taking advantage of vnode manager's vnode ring notification feature.
Joseph Blomstedt jtuple Allow vnode manager to trigger handoff independent of inactivity
Add ability to manually trigger vnode handoff, regardless of inactivity.

Change the vnode manager to use this feature to force pending ownership
transfers, thereby allowing the cluster to rebalance under load. These
forced transfers are throttled by only forcing the first N pending
transfers scheduled by the claimant, where N is the application variable
riak_core/forced_ownership_handoff (default 8). Additional handoff can
still occur due to inactivity timeouts, but all handoff ultimately remains
limited by the handoff_concurrency setting.
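
The throttling described above can be sketched as two successive limits. The default of 8 for the forced limit comes from the commit message; the `handoff_concurrency` value of 4, the way the two limits are combined, and the name `plan_handoffs` are assumptions made for illustration.

```python
def plan_handoffs(pending, forced_limit=8, handoff_concurrency=4):
    """Split pending transfers into those forced and those started right away.

    `pending` is the list of transfers in the order scheduled by the claimant.
    Only the first `forced_limit` are forced regardless of vnode inactivity,
    and no more than `handoff_concurrency` (assumed value) run at once.
    """
    forced = pending[:forced_limit]
    start_now = forced[:handoff_concurrency]
    return forced, start_now


pending = ["transfer-%d" % i for i in range(12)]
forced, start_now = plan_handoffs(pending)
print(len(forced), len(start_now))  # 8 4
```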
Joseph Blomstedt jtuple Add forwarding of coverage commands de99b37
Joseph Blomstedt jtuple Add comment about vnode states 5e0f62b
Joseph Blomstedt jtuple Merge pull request #75 from basho/az642-vnode-manager
Az642: Preliminary VNode Manager / Rebalance Under Load
Joseph Blomstedt jtuple Minor changes to admin commands + add ready_members query
-- Join checks against current ring_size rather than application variable.
-- Join requires joining node to not already be part of a cluster.
-- Down fails if the cluster is in legacy gossip mode.
-- riak_core_ring:ready_members returns nodes guaranteed safe for requests.
Joseph Blomstedt jtuple Merge pull request #77 from basho/jdb-admin-changes
Minor changes before rebase/merge of new cluster membership code