
raft: non-voting member for Catching up new servers #8568

Closed
lishuai87 opened this issue Sep 18, 2017 · 6 comments

lishuai87 commented Sep 18, 2017

In the Raft thesis, section 4.2.1 ("Catching up new servers"), it says:

When a server is added to the cluster, it typically will not store any log entries. If it is added to the
cluster in this state, its log could take quite a while to catch up to the leader’s, and during this time,
the cluster is more vulnerable to unavailability. For example, a three-server cluster can normally
tolerate one failure with no loss in availability. However, if a fourth server with an empty log is
added to the same cluster and one of the original three servers fails, the cluster will be temporarily
unable to commit new entries (see Figure 4.4(a)). Another availability issue can occur if many new
servers are added to a cluster in quick succession, where the new servers are needed to form a
majority of the cluster (see Figure 4.4(b)). In both cases, until the new servers’ logs were caught up
to the leader’s, the clusters would be unavailable.

etcd currently behaves like this. The solution is the following:

In order to avoid availability gaps, Raft introduces an additional phase before the configuration
change, in which a new server joins the cluster as a non-voting member. The leader replicates
log entries to it, but it is not yet counted towards majorities for voting or commitment purposes.
Once the new server has caught up with the rest of the cluster, the reconfiguration can proceed as
described above. (The mechanism to support non-voting servers can also be useful in other contexts;
for example, it can be used to replicate the state to a large number of servers, which can serve read-
only requests with relaxed consistency.)
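The quoted rule, replicate to a non-voting member but never count it toward majorities, can be sketched in Go as follows. The field and function names here are illustrative only, not etcd's actual API:

```go
package main

import "sort"

// Progress tracks replication state for one peer (illustrative fields,
// not etcd's actual struct).
type Progress struct {
	Match     uint64 // highest log index known to be on this peer
	IsLearner bool   // non-voting member: replicated to, never counted
}

// committedIndex returns the highest index stored on a majority of
// *voting* members. Non-voters receive entries but cannot advance it.
func committedIndex(prs map[uint64]*Progress) uint64 {
	matches := make([]uint64, 0, len(prs))
	for _, pr := range prs {
		if pr.IsLearner {
			continue // skip non-voters entirely
		}
		matches = append(matches, pr.Match)
	}
	// Sort descending; the quorum-th largest match index is committed.
	sort.Slice(matches, func(i, j int) bool { return matches[i] > matches[j] })
	return matches[len(matches)/2]
}
```

With voters at Match 5 and 1 plus a learner at Match 5, the result is 1: the learner's up-to-date log does not let the leader commit past what a true majority of voters stores.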
@lishuai87

Both https://github.com/logcabin/logcabin and https://github.com/hashicorp/raft (see https://github.com/hashicorp/raft/blob/library-v2-stage-one/membership.md) have implemented non-voting members. From the hashicorp/raft proposal:

We propose to re-define a configuration as a set of servers, where each server includes an address (as it does today) and a mode that is either:

    Voter: a server whose vote is counted in elections and whose match index is used in advancing the leader's commit index.
    Nonvoter: a server that receives log entries but is not considered for elections or commitment purposes.
    Staging: a server that acts like a nonvoter with one exception: once a staging server receives enough log entries to catch up sufficiently to the leader's log,
             the leader will invoke a membership change to change the staging server to a voter.
+-----------------------------------------------------------------------------+
|                                                                             |
|                      Start ->  +--------+                                   |
|            ,------<------------|        |                                   |
|           /                    | absent |                                   |
|          /       RemovePeer--> |        | <---RemovePeer                    |
|         /            |         +--------+               \                   |
|        /             |            |                      \                  |
|   AddNonvoter        |         AddVoter                   \                 |
|       |       ,->---' `--<-.      |                        \                |
|       v      /              \     v                         \               |
|  +----------+                +----------+                    +----------+   |
|  |          | ---AddVoter--> |          | -log caught up --> |          |   |
|  | nonvoter |                | staging  |                    |  voter   |   |
|  |          | <-DemoteVoter- |          |                 ,- |          |   |
|  +----------+         \      +----------+                /   +----------+   |
|                        \                                /                   |
|                         `--------------<---------------'                    |
|                                                                             |
+-----------------------------------------------------------------------------+
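The diagram above can be sketched as a transition function. This follows the proposal's edge labels but is only a sketch, not hashicorp/raft's implementation:

```go
package main

// ServerMode is the membership mode from the hashicorp/raft proposal.
type ServerMode int

const (
	Absent   ServerMode = iota // not in the configuration
	Nonvoter                   // receives entries, never counted
	Staging                    // nonvoter pending automatic promotion
	Voter                      // full member
)

// transition applies one edge of the state diagram; unknown or
// invalid edges leave the mode unchanged.
func transition(mode ServerMode, op string) ServerMode {
	switch op {
	case "AddNonvoter":
		if mode == Absent {
			return Nonvoter
		}
	case "AddVoter":
		if mode == Absent || mode == Nonvoter {
			return Staging // catch the log up first, promote later
		}
	case "LogCaughtUp": // leader-driven promotion of a staging server
		if mode == Staging {
			return Voter
		}
	case "DemoteVoter":
		if mode == Staging || mode == Voter {
			return Nonvoter
		}
	case "RemovePeer":
		return Absent
	}
	return mode
}
```

The key property is that `AddVoter` never grants suffrage directly: an absent or non-voting server always passes through `Staging`, and only the leader's "log caught up" decision promotes it to `Voter`.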

We can also use this approach by introducing a new state, SuffrageState:

enum SuffrageState {
	Nonvoter = 0;
	Staging  = 1;
	Voter    = 2;
}

Then add SuffrageState to Raft (for the node itself) and to Progress (for other nodes), and persist SuffrageState in Snapshot, ConfState...
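For elections, only peers whose SuffrageState is Voter would cast votes or count toward the quorum. A minimal sketch with hypothetical types (not the raft package's actual code):

```go
package main

// SuffrageState mirrors the enum above.
type SuffrageState int

const (
	SuffrageNonvoter SuffrageState = iota
	SuffrageStaging
	SuffrageVoter
)

type peer struct {
	suffrage    SuffrageState
	voteGranted bool
}

// electionWon counts only Voter peers: staging and nonvoter members
// neither cast votes nor enlarge the quorum a candidate must reach.
func electionWon(peers []peer) bool {
	voters, granted := 0, 0
	for _, p := range peers {
		if p.suffrage != SuffrageVoter {
			continue
		}
		voters++
		if p.voteGranted {
			granted++
		}
	}
	return granted > voters/2
}
```

Two grants out of three voters wins even when several non-voters are present, because non-voters never grow the denominator.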

@BlueBlue-Lee

mark


xiang90 commented Sep 18, 2017

/cc @bdarnell


xiang90 commented Sep 18, 2017

It seems fine to add this feature.

AddNonVoter should be fairly easy to add. Adding a new node state is probably enough (so we won't count it in voting and it won't vote). I'm not sure we want to implement DemoteVoter initially, though.
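On the node's own side, "it won't vote" could look like the following sketch (hypothetical fields, not the actual raft state machine):

```go
package main

// node is a toy follower; isLearner marks a non-voting member.
type node struct {
	isLearner        bool
	elapsed, timeout int
}

// tick advances the election timer. A learner never campaigns, so it
// can never become a candidate or leader no matter how long it waits.
func (n *node) tick() (campaign bool) {
	n.elapsed++
	return !n.isLearner && n.elapsed >= n.timeout
}

// grantVote: a learner always declines, so even a confused candidate
// cannot count its response toward a majority.
func (n *node) grantVote() bool {
	return !n.isLearner
}
```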


xiang90 commented Nov 12, 2017

fixed by #8751

xiang90 closed this as completed Nov 12, 2017

sachja commented Jan 12, 2018

Does this solve the unavailability problem in the case where the number of live nodes is exactly a majority, a new node is added (or a node that was down for a while comes back up), and at the same time one of the current nodes goes down?
