Commits on Sep 2, 2011
  1. Merge pull request #77 from basho/jdb-admin-changes

    jtuple committed Sep 2, 2011
    Minor changes before rebase/merge of new cluster membership code
  2. Minor changes to admin commands + add ready_members query

    jtuple committed Sep 2, 2011
    -- Join checks against current ring_size rather than application variable.
    -- Join requires joining node to not already be part of a cluster.
    -- Down fails if the cluster is in legacy gossip mode.
    -- riak_core_ring:ready_members returns nodes guaranteed safe for requests.
  3. Merge pull request #75 from basho/az642-vnode-manager

    jtuple committed Sep 2, 2011
    Az642: Preliminary VNode Manager / Rebalance Under Load
  4. Allow vnode manager to trigger handoff independent of inactivity

    jtuple committed Sep 1, 2011
    Add ability to manually trigger vnode handoff, regardless of inactivity.
    Change the vnode manager to use this feature to force pending ownership
    transfers, thereby allowing the cluster to rebalance under load. These
    forced transfers are throttled by only forcing the first N pending
    transfers scheduled by the claimant, where N is the application variable
    riak_core/forced_ownership_handoff (default 8). Additional handoff can
    still occur due to inactivity timeouts, but all handoff ultimately remains
    limited by the handoff_concurrency setting.
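
The throttling described above can be sketched as follows. This is an illustrative Python model, not the riak_core code: the function names, the tuple shape of a transfer, and the choice to start forced transfers before inactivity-driven ones are all assumptions. Only the default of 8 and the overall handoff_concurrency cap come from the commit message.

```python
def select_forced_transfers(pending, forced_limit=8):
    """Return the transfers to force: the first `forced_limit` pending ones.

    `pending` is the list of ownership transfers in the order scheduled by
    the claimant. The default of 8 mirrors riak_core/forced_ownership_handoff.
    """
    return pending[:forced_limit]


def start_handoffs(forced, inactive, handoff_concurrency):
    """Pick the handoffs to start, never exceeding the concurrency cap.

    Assumption: forced transfers are considered before transfers triggered
    by inactivity timeouts. Duplicates are started only once.
    """
    started = []
    for transfer in forced + inactive:
        if len(started) >= handoff_concurrency:
            break
        if transfer not in started:
            started.append(transfer)
    return started
```

With 12 pending transfers and the default limit, only the first 8 are forced, and the number actually started is still bounded by whatever value handoff_concurrency has.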
  5. Add preliminary vnode manager, change vnode forwarding logic

    jtuple committed Sep 1, 2011
    Add a preliminary version of a vnode manager which will someday replace
    much of the pid-tracking functionality currently in riak_core_vnode_master.
    The current vnode_manager provides an easy interface for finding vnode mod/idx/pid
    information and informs vnodes about ring changes as appropriate.
    Move vnode forward-on-ownership-change logic out of request critical path,
    taking advantage of vnode manager's vnode ring notification feature.
  6. Merge pull request #74 from basho/az630-gossip-rolling-upgrade

    jtuple committed Sep 2, 2011
    Az630: Add legacy gossip mode + rolling upgrade between gossip versions
Commits on Sep 1, 2011
  1. Update outdated eunit tests

    jtuple committed Sep 1, 2011
  2. Add legacy gossip mode as well as support for rolling gossip upgrades

    jtuple committed Aug 29, 2011
    Change riak_core_ring to have two different versioned records corresponding
    to the new and old ring data-structure, and add upgrade/downgrade functions
    that convert between the two formats.
    Add legacy gossip mode that uses the old ring reconciliation logic as well
    as the old gossip/claim procedure. The legacy mode uses the old logic but
    encapsulates its data inside the new ring format (using upgrade/downgrade)
    in order to minimize code duplication. This mode is enabled by setting
    the application environment variable riak_core/legacy_gossip to true.
    Add member metadata to the new ring format and the related get_member_meta,
    update_member_meta accessors.
    Add support for rolling upgrades and mixed-gossip hybrid clusters. The
    appropriate ring format is negotiated through member metadata when
    possible, falling back to RPC queries when necessary. The cluster gossip
    protocol is determined as follows:
      -- If all nodes support the new membership protocol and are not running
         in legacy mode, the new protocol is used.
      -- If an old node or legacy-mode node joins the cluster, the entire
         cluster downgrades to legacy mode.
      -- If the old/legacy nodes leave the cluster, the new nodes return to the
         new protocol.
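
The three rules above reduce to a single check over the current member list. The sketch below is a hypothetical Python model of that decision; the `Member` type and its field names are invented for illustration and do not correspond to the ring's member metadata format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    supports_new_protocol: bool  # False models an old node
    legacy_gossip: bool = False  # True models riak_core/legacy_gossip = true


def cluster_gossip_mode(members):
    """Return "legacy" if any member is old or in legacy mode, else "new"."""
    for member in members:
        if not member.supports_new_protocol or member.legacy_gossip:
            return "legacy"
    return "new"
```

Because the mode is recomputed from the membership, the return to the new protocol after old or legacy nodes leave needs no separate rule.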
Commits on Aug 30, 2011
  1. Merge pull request #71 from basho/az612-membership-changes

    jtuple committed Aug 30, 2011
    az612: Add joining/down status, fix bugs
Commits on Aug 29, 2011
Commits on Aug 26, 2011
  1. Add joining/down member status, fix various bugs

    jtuple committed Aug 24, 2011
    Add 'joining' member status to new cluster membership model and
    implementation. When a node joins a cluster, it comes in with status
    'joining' rather than 'valid'. The claimant then moves a node from 'joining'
    to 'valid' after it ensures all cluster members have learned of the new
    node joining the cluster. This change guarantees that all 'valid' members
    vote on ring ready consensus under various failure scenarios.
    Add 'down' member status to new cluster membership model and implementation.
    The state is designed to allow a user to mark a down node as 'down' in order
    to allow the rest of the cluster to converge. A vote from a 'down' node is
    not necessary for ring ready consensus, and therefore a 'down' node's ring
    state may become outdated. If a 'down' node gossips to another node that
    believes it to be down (such as after coming back online), the other node
    tells the 'down' node to rejoin the cluster, thereby making its state
    current. Nodes do not gossip to 'down' nodes, and 'down' node ownership is
    not changed during a rebalance.
    Incorporate minor changes and bug fixes:
    -- Fix next merging bug in riak_core_ring:remove_node and model.
    -- Fix bug with nodes moving from 'leaving' to 'exiting' while having pending
    -- Fix negative random seed bug in join/membership model.
    -- Change core vnode so that a vnode does not shut down if a completed handoff
       was to a node that is now known to be 'invalid'.
    -- Change update_ring to remove tuples from next for all invalid nodes, even
       for completed transfers.
    -- Change leave in model to be a local transition like in the implementation.
    -- Handle "neither claimant valid" case in reconcile_ring.
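
The 'joining' and 'down' rules from this commit can be modeled roughly as below. This is a simplified Python sketch with hypothetical function names. Two points are assumptions rather than statements from the commit message: that 'down' members are excluded from the set that must learn of a joining node, and that only 'valid' members are required to vote (other statuses are outside the sketch).

```python
def promote_joining(statuses, informed):
    """Claimant step: move 'joining' nodes to 'valid' once every member
    not marked 'down' has learned of them.

    statuses: dict of node -> status string
    informed: dict of joining node -> set of members that know about it
    """
    required = {n for n, s in statuses.items() if s != "down"}
    result = dict(statuses)
    for node, status in statuses.items():
        if status == "joining" and required <= informed.get(node, set()):
            result[node] = "valid"
    return result


def ring_ready(statuses, votes):
    """Consensus needs a vote from every 'valid' member; 'down' is skipped."""
    return all(votes.get(n, False) for n, s in statuses.items() if s == "valid")


def gossip_targets(statuses, me):
    """Nodes do not gossip to 'down' nodes (or to themselves)."""
    return sorted(n for n, s in statuses.items() if n != me and s != "down")


def handle_gossip(statuses, sender):
    """A node believed to be 'down' is told to rejoin instead of reconciling."""
    return "rejoin" if statuses.get(sender) == "down" else "reconcile"
```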
Commits on Aug 24, 2011
  1. Merge pull request #65 from basho/az533-join-claim-improvement-master

    jtuple committed Aug 24, 2011
    New join/claim/membership implementation and model
Commits on Aug 23, 2011
Commits on Aug 22, 2011
  1. Add pending percent to member_status, fix bug in ring_ready_info

    jtuple committed Aug 22, 2011
    Add pending ring percent as a field in member_status that displays a node's
    ring ownership after all pending ownership transfers have completed.
    Fix riak_core_ring:ring_ready_info so that only nodes considered for ring
    convergence are checked.
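
As a rough illustration of the pending percent field, the sketch below applies every pending transfer to a copy of the ownership list and recomputes each node's share. It is a hypothetical Python model: the data shapes (a list of owners indexed by partition, a dict of pending transfers) are assumptions, not the riak_core_ring representation.

```python
def ownership_percent(owners):
    """Percent of partitions owned per node; `owners` has one entry per partition."""
    total = len(owners)
    counts = {}
    for node in owners:
        counts[node] = counts.get(node, 0) + 1
    return {node: 100.0 * count / total for node, count in counts.items()}


def pending_percent(owners, pending):
    """Ownership percent after all pending transfers have completed.

    `pending` maps a partition position in `owners` to its next owner.
    """
    future = list(owners)
    for position, next_owner in pending.items():
        future[position] = next_owner
    return ownership_percent(future)
```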
Commits on Aug 19, 2011
  1. Fix data races in gossip and shutdown/restart process

    jtuple committed Aug 19, 2011
    Change gossip to use ring_trans rather than set_my_ring in order to prevent
    a data race with concurrent ring changes based on join/leave/remove commands.
    Change the refresh_ring logic to be guarded by cluster name, avoiding the
    case where a stale refresh cast arrives at a node after it has already
    shut down and been restarted. Since the restarted node will have a new
    cluster name, the stale cast can be detected and ignored.
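
The cluster-name guard can be illustrated with a small Python model. The `Node` class, its method names, and the use of a random UUID as the cluster name are assumptions made for the sketch. The only behavior taken from the commit message is that a restarted node has a new cluster name and ignores refresh casts that carry the old one.

```python
import uuid


class Node:
    def __init__(self):
        self.cluster_name = uuid.uuid4().hex
        self.refreshes = 0

    def restart(self):
        """Model a shutdown/restart: the node comes back with a new cluster name."""
        self.cluster_name = uuid.uuid4().hex

    def handle_refresh_ring(self, cast_cluster_name):
        """Apply a refresh only if it was issued for the current cluster name."""
        if cast_cluster_name != self.cluster_name:
            return "ignored"
        self.refreshes += 1
        return "refreshed"
```

A cast tagged before the restart no longer matches and is dropped, while a cast tagged with the current name is still applied.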
  2. Update ring eunit tests and code comments, update rename_node

    jtuple committed Aug 18, 2011
    Fix existing riak_core tests to work with the new cluster membership
    code, as well as add several new tests that cover the new reconciliation
    Update riak_core_ring:rename_node to support the new members and seen fields.
  3. Add random_recursive_gossip to augment recursive_gossip

    jtuple committed Aug 18, 2011
    Change recursive_gossip to be done in two parts. The initial ring change uses
    random_recursive_gossip to start the recursive gossip at a random starting
    node; the reconciliation logic then continues to use the fixed recursive_gossip
    logic to propagate the gossip forward. This change decreases gossip hot spots
    when using recursive_gossip.
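
To see why a random starting point reduces hot spots, consider the Python sketch below. It assumes the members list is treated as a binary tree in list order, which the commit message does not state, and the helper names are hypothetical. With a fixed start, every ring change would first hit the same nodes. Here the first recipient is drawn at random and the fixed tree logic takes over from there.

```python
import random


def tree_children(members, node, fanout=2):
    """Children of `node` when `members` is laid out as a complete tree."""
    first = members.index(node) * fanout + 1
    return members[first:first + fanout]


def random_recursive_gossip_targets(members, rng=random):
    """Pick a random start node, then continue with that node's fixed children.

    Returns the nodes that receive the ring in this first round:
    the random start followed by its children in the tree.
    """
    start = rng.choice(members)
    return [start] + tree_children(members, start)
```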
Commits on Aug 17, 2011
  1. Augment random gossip with fixed gossip for faster state convergence

    jtuple committed Aug 17, 2011
    Consolidate code for random gossip into new riak_core_gossip:random_gossip,
    and update all locations that manually implement random gossip to use this
    Add new deterministic gossip code, riak_core_gossip:recursive_gossip, that
    sends a node's ring to its child vertices in a tree decomposition of the
    cluster members list.
    Change the "gossip on ring changed" code to use recursive_gossip.
    Continue using random_gossip for periodic (gossip_interval) gossip.
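
One way to picture the deterministic gossip is to lay the members list out as a complete binary tree in list order, so that each node forwards its ring to a fixed set of children. The Python sketch below shows that layout only. The fanout of 2, the absence of wraparound from leaves back to the top, and the function name are assumptions, since the commit message says only "tree decomposition of the cluster members list".

```python
def gossip_children(members, node, fanout=2):
    """Return the members that `node` forwards its ring to.

    The member at position i has children at positions
    i*fanout + 1 through i*fanout + fanout, when those positions exist.
    """
    position = members.index(node)
    first = position * fanout + 1
    return members[first:first + fanout]
```

Because every node computes the same tree from the same members list, a ring change started at the root reaches all members in a number of rounds logarithmic in the cluster size, without any coordination.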
Commits on Aug 16, 2011
Commits on Aug 12, 2011
  1. Change riak_core_ring: ring reconciliation, ring_ready, and all_members

    jtuple committed Aug 12, 2011
    Merge vclocks on all ring reconciliation paths, revert to old ring ready
    behavior, and have all_members/1 use the private get_members/1.
Commits on Aug 10, 2011
Commits on Aug 9, 2011