Cluster membership changes #4
Comments
I'm certainly interested in working on this, so I'll set aside some time to read the paper again and catch up on some of the other links you mentioned here. My thought would be to review how etcd or Consul do it, and borrow their approach, since it is relatively battle-tested, rather than implementing our own interpretation of it (assuming we can't find a concrete "this is exactly how you do it" process).
Most real implementations use the "pre-vote rpc" along with the |
So after reading through all this, I think it makes the most sense to use explicit steps for initializing the cluster, adding peers, and removing peers. I think we can use something like:

    @type peer :: module() | {module(), node()}
    @spec initialize_cluster(peer()) :: :ok | {:error, term()}
    @spec add_peer(peer(), peer()) :: :ok | {:error, {:redirect, peer()}} | {:error, :not_leader}
    @spec remove_peer(peer(), peer()) :: :ok | {:error, {:redirect, peer()}} | {:error, :not_leader}

I think I have a pretty good handle on how all of this works based on the thesis paper, so I'll try to lay that out here when I get a chance.
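To make the proposed return values concrete, here is a minimal sketch of how a server might answer these calls. It is an illustration under assumptions, not this library's implementation: the `MembershipSketch` module, the shape of the state map, and the choice to return `{reply, state}` tuples are all hypothetical, and a real Raft server would commit each configuration change through the log instead of editing a set directly.

```elixir
defmodule MembershipSketch do
  # Hypothetical, illustration only: a pure model of the replies a server
  # could give to add_peer/remove_peer. The state map is an assumption.

  @type peer :: module() | {module(), node()}
  @type reply :: :ok | {:error, {:redirect, peer()}} | {:error, :not_leader}
  @type state :: %{
          me: peer(),
          role: :leader | :follower | :candidate,
          leader: peer() | nil,
          peers: MapSet.t(peer())
        }

  # Assumption: a freshly initialized single-server cluster is modeled as
  # already being its own leader (a real server would win an election first).
  @spec initialize_cluster(peer()) :: state()
  def initialize_cluster(me) do
    %{me: me, role: :leader, leader: me, peers: MapSet.new([me])}
  end

  @spec add_peer(state(), peer()) :: {reply(), state()}
  def add_peer(state, peer), do: change(state, &MapSet.put(&1, peer))

  @spec remove_peer(state(), peer()) :: {reply(), state()}
  def remove_peer(state, peer), do: change(state, &MapSet.delete(&1, peer))

  # Only the leader applies the change.
  defp change(%{role: :leader} = state, fun) do
    {:ok, %{state | peers: fun.(state.peers)}}
  end

  # No known leader: the caller has to retry elsewhere.
  defp change(%{leader: nil} = state, _fun), do: {{:error, :not_leader}, state}

  # A known leader: point the caller at it.
  defp change(%{leader: leader} = state, _fun) do
    {{:error, {:redirect, leader}}, state}
  end
end
```

With this model, a follower that knows the current leader answers `{:error, {:redirect, leader}}`, and a server that knows no leader answers `{:error, :not_leader}`, which matches the two error shapes in the specs above.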
So my take on this is that while an explicit API is nice to have (particularly for testing), it is possible to make this automatic using node monitoring (where we effectively get the equivalent of add_peer and remove_peer events). This is how Swarm handles automatically forming a cluster, although it does support a way to blacklist/whitelist nodes so that only those you want participating in the cluster are able to. Reading through the paper and its section on cluster membership changes, I didn't see anything that indicated this would be a problem, but I haven't dug into the internals of this library yet, so I don't know if there is a constraint based on the implementation. I spent some time yesterday putting together a distributed process registry based on
Thoughts?
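As a rough illustration of the node-monitoring idea, the sketch below turns distributed Erlang `:nodeup` / `:nodedown` messages into add/remove events, with a filter function standing in for a whitelist. The `NodeMembershipSketch` module, the event tuples, and the `allow` callback are hypothetical names made up for this example; only `:net_kernel.monitor_nodes/1` and the message shapes it delivers come from OTP.

```elixir
defmodule NodeMembershipSketch do
  # Hypothetical sketch: map node up/down messages to membership events.
  # Nothing here is taken from Swarm or from this library.

  @type event :: {:add_peer, node()} | {:remove_peer, node()} | :ignore

  # Pure translation step, so it can be exercised without real distribution.
  @spec to_event({:nodeup | :nodedown, node()}, (node() -> boolean())) :: event()
  def to_event({:nodeup, node}, allow) do
    if allow.(node), do: {:add_peer, node}, else: :ignore
  end

  def to_event({:nodedown, node}, allow) do
    if allow.(node), do: {:remove_peer, node}, else: :ignore
  end

  # Subscribes the calling process to node events and forwards the
  # translated events to `on_event`. Runs until the process is stopped.
  @spec listen((event() -> any()), (node() -> boolean())) :: no_return()
  def listen(on_event, allow) do
    :ok = :net_kernel.monitor_nodes(true)
    loop(on_event, allow)
  end

  defp loop(on_event, allow) do
    receive do
      {kind, node} when kind in [:nodeup, :nodedown] ->
        case to_event({kind, node}, allow) do
          :ignore -> :ok
          event -> on_event.(event)
        end
    end

    loop(on_event, allow)
  end
end
```

One design question this leaves open is whether a `:nodedown` should map straight to a removal: a node that is only briefly unreachable would also trigger it, so a real implementation might want to wait or require confirmation before shrinking the cluster.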
I totally agree that we need to make cluster membership changes automatic. I can work on implementing the underlying functions such as |
Hey guys, this may be a dumb question but wouldn't it be possible to just enumerate |
We need to support adding and removing peers to the Raft cluster. While this is described briefly in section 6 of the Raft paper, I feel the explanation is underspecified, specifically with regard to the rejection of RequestVote RPCs and the "catching up" of new peers before initiating the joint consensus.
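For the "catching up" part, Ongaro's thesis describes a round-based rule: the leader replicates to the new server in rounds, waits a fixed number of rounds (it suggests something like 10), and adds the server only if the last round took less than an election timeout, aborting otherwise. A minimal sketch of that decision, assuming round durations are tracked in milliseconds; the `CatchUpSketch` module, its function name, and the return atoms are hypothetical:

```elixir
defmodule CatchUpSketch do
  # Hypothetical sketch of the round-based catch-up check from Ongaro's
  # thesis. Names, units, and the default of 10 rounds are assumptions
  # made for illustration.

  @type decision :: :continue | :add_server | :abort

  @spec decide([non_neg_integer()], non_neg_integer(), pos_integer()) :: decision()
  def decide(round_durations_ms, election_timeout_ms, max_rounds \\ 10)

  # Fewer than `max_rounds` rounds completed: keep replicating.
  def decide(durations, _timeout, max_rounds) when length(durations) < max_rounds do
    :continue
  end

  # All rounds done: the new server counts as caught up only if the final
  # round finished faster than an election timeout.
  def decide(durations, timeout, _max_rounds) do
    if List.last(durations) < timeout, do: :add_server, else: :abort
  end
end
```

For example, `CatchUpSketch.decide([400, 90], 150, 2)` yields `:add_server`, because the second and final round (90 ms) is shorter than the 150 ms election timeout.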
I'm reading through the Raft mailing list, and other resources that I'll link here, in order to get a better feel for potential improvements to the solution. Right now my intuition is that we should look at using the AddServer and RemoveServer RPCs from the "Ongaro thesis", which I've linked below.

Research / Links