Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cluster membership changes #4

Open
keathley opened this issue Feb 26, 2018 · 6 comments
Open

Cluster membership changes #4

keathley opened this issue Feb 26, 2018 · 6 comments
Labels
enhancement New feature or request

Comments

@keathley
Copy link
Member

We need to support adding and removing peers to the raft cluster. While this is described briefly in section 6 of the raft paper I feel like the explanation is underspecified specifically with regards to the rejection of request vote rpcs and the explanation of "catching up" new peers before initiating the joint consensus.

I'm reading through the raft mailing list, and other resources that I'll link here in order to get a better feel for potential improvements to the solution. Right now my intuition is that we should look at using the AddServer and RemoveServer RPCs from the "ongaro thesis" which I've linked below.

Research / Links

@bitwalker
Copy link
Contributor

I'm certainly interested in working on this, so I'll set aside some time to read the paper again and catch up on some of the other links you mentioned here. My thought would be to review how etcd or Consul do it, and borrow their approach, since it is relatively battle-tested, rather than implementing our own interpretation of it (assuming we can't find a concrete "this is exactly how you do it" process).

@keathley
Copy link
Member Author

Most real implementations use the "pre-vote rpc" along with the AddServer and RemoveServer Rpcs. Both are defined in that "modifications for Raft consensus" document. Based on my research thats what etcd does as well. I'm not sure hashicorps raft has incorporated these changes at this point.

@keathley
Copy link
Member Author

keathley commented Mar 1, 2018

So after reading through all this I think it makes the most sense to use the explicit steps for initializing the cluster, adding peers, and removing peers. I think we can use something like:

@type peer :: module() | {module(), node()}

@spec initialize_cluster(peer()) :: :ok | {:error, term()}
@spec add_peer(peer(), peer()) :: :ok | {:error, {:redirect, peer()}} | {:error, :not_leader}
@spec remove_peer(peer(), peer()) :: :ok | {:error, {:redirect, peer()}} | {:error, :not_leader}

I think I have a pretty good handle on how all of this works based on the thesis paper so I'll try to lay that out here when I get a chance.

@bitwalker
Copy link
Contributor

So my take on this is that while an explicit API is nice to have (particularly for testing), it is possible to make this automatic using node monitoring (where we effectively get the equivalent of add_peer and remove_peer events). This is how Swarm handles automatically forming a cluster, although it does support a way to blacklist/whitelist nodes so that only those you want participating in the cluster are able to.

Reading through the paper and it's section on cluster membership changes, I didn't see anything that indicated this would be a problem, but I haven't dug into the internals of this library yet, so I don't know if there is a constraint based on the implementation.

I spent some time yesterday putting together a distributed process registry based on raft, and by far the thing that stood out to me was that initializing the cluster was a manual process, and there isn't a great place to put that in our own applications. It would be preferable if the implementation of the state machine server was working to form a cluster on it's own. This behaviour would need to be configurable, because there are obviously times where you may want to manually manage the configuration (testing again is an example), but I think for a library like this to be successful, it needs to be easy to use by default, and offer escape hatches for things which may need tweaking in some situations.

Thoughts?

@keathley keathley added the enhancement New feature or request label Mar 7, 2018
@keathley
Copy link
Member Author

keathley commented Mar 7, 2018

I totally agree that we need to make cluster membership changes automatic. I can work on implementing the underlying functions such as initialize_cluster, add_peer, etc. so that we can leverage them from a higher level abstraction that can automatically call those based on node monitoring.

@tzumby
Copy link

tzumby commented Oct 30, 2019

Hey guys, this may be a dumb question but wouldn't it be possible to just enumerate Node.list() and as nodes join the cluster have the already existing AppendEntries RPC get them to catch up ? I could see this being a problem if we have lots of nodes but from what I gather that's not the case with Raft clusters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants