Dynamic membership changes #176
I've been thinking about this one for the last few days. Sorry for the long post 😟, but here are some ramblings about what I have in mind for this. Starting with my assumptions:
- All nodes keep track of view changes, calculating who left and joined. Only add non-RAFT members, only remove RAFT members.
- Creating a sequence of changes, one single change at a time. This calculation is deterministic, so all nodes calculate the same thing.
- Only the leader can issue these requests, and once a node receives the append request, it removes the sequence head element.
- For two events that cancel each other, I think we are safe not applying them, for example, adding D and then removing it.
- Also, we need to identify the RAFT ID from the view addresses.
- A node that is not a RAFT member, be it through the configuration or by the internal command, does nothing. The extraneous member accepts requests only after the other members are aware.
- Nodes that restart can retrieve the membership information they last stored, join directly without waiting, and then receive the remaining data from the other members.

The {A, B, C} -> {A, -B, -C, +D} The In more extreme cases, such as: {A, B, C} -> {A, B, C, +D, +E} -> {A, B, -C, -D, -E}
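To make the assumptions above concrete, here is a minimal sketch of how each node could derive the same sequence of single-member changes from two consecutive views. Everything in it is hypothetical and not part of jgroups-raft: the `MembershipPlanner` class, the `Change` record, and the ordering rule (removals before additions, each sorted by RAFT ID) are assumptions chosen only to keep the result deterministic. It also assumes view addresses have already been mapped to RAFT IDs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper, not part of jgroups-raft. */
final class MembershipPlanner {

    /** One pending change: add (true) or remove (false) a single RAFT ID. */
    record Change(boolean add, String raftId) {}

    /** Committed members with the still-pending changes applied on top. */
    static Set<String> effective(Set<String> committed, List<Change> pending) {
        Set<String> result = new TreeSet<>(committed);
        for (Change c : pending) {
            if (c.add()) result.add(c.raftId()); else result.remove(c.raftId());
        }
        return result;
    }

    /**
     * Derives the changes implied by a view change. Sorted iteration makes
     * the sequence identical on every node (ordering is an assumption).
     */
    static List<Change> plan(Set<String> members, Set<String> oldView, Set<String> newView) {
        List<Change> changes = new ArrayList<>();
        for (String left : new TreeSet<>(oldView)) {
            // Only remove IDs that are RAFT members.
            if (!newView.contains(left) && members.contains(left)) {
                changes.add(new Change(false, left));
            }
        }
        for (String joined : new TreeSet<>(newView)) {
            // Only add IDs that are not RAFT members yet.
            if (!oldView.contains(joined) && !members.contains(joined)) {
                changes.add(new Change(true, joined));
            }
        }
        return changes;
    }

    /** Appends newer changes, dropping pairs that cancel each other (add D, then remove D). */
    static List<Change> merge(List<Change> pending, List<Change> newer) {
        List<Change> result = new ArrayList<>(pending);
        for (Change c : newer) {
            Change opposite = new Change(!c.add(), c.raftId());
            if (!result.remove(opposite)) {
                result.add(c);
            }
        }
        return result;
    }
}
```

With members {A, B, C}, the view change {A, B, C} -> {A, D} yields remove B, remove C, add D in this sketch. If D joins and then leaves before its addition was issued, `merge` leaves the pending sequence empty.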
Some points of attention that come to mind are:
I wouldn't be surprised if I overlooked something or am oversimplifying. What do you think? Do we need to prepare something before starting? Again, sorry for the long post.
Let's think about the requirements in a first step:
The more I think about this, the more I'm inclined to stay with a static membership... Let's discuss in the new year; by then I'll have had time to give this some thought.
Yes, it would be something that adds or removes members automatically. But this requires an initial membership configured to start and correctly elect the first leader. Although, I think it makes sense to have this arbitrary membership disabled and have the static version by default. In cases of partitions, the node has to be the leader and replicate the new member list with the other nodes before changing the current membership/majority. For example, in the case Generally speaking, that would be the behavior for all changes that affect the majority. The current leader steps down, and the cluster becomes unavailable until the previous members join again and a new election happens. And it would still be unavailable even if other nodes (with different IDs) joined. But let's discuss and see if it makes sense to have something like this. Maybe this automatic membership is not needed after all.
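The availability argument can be checked with a small sketch. The `QuorumCheck` class and its method names are hypothetical, and the rule used is the usual Raft majority (more than half of the configured members). The point it illustrates is that only configured RAFT IDs count toward the majority, so nodes joining with different IDs do not help.

```java
import java.util.Set;

/** Hypothetical helper, not part of jgroups-raft. */
final class QuorumCheck {

    /** Smallest number of members that forms a majority. */
    static int majority(int memberCount) {
        return memberCount / 2 + 1;
    }

    /** True if enough of the configured members are reachable; other IDs are ignored. */
    static boolean hasMajority(Set<String> members, Set<String> reachable) {
        long present = members.stream().filter(reachable::contains).count();
        return present >= majority(members.size());
    }
}
```

For members {A, B, C}, a partition that leaves A together with newcomers D and E reachable still has no majority under this check, while A and B together do.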
Hi Jose, a discussion with @pruivo and @ttarant@redhat.com is needed... I want clear requirements before making such a big change...
Is this a user request? IMO, automatic membership change is dangerous, and, except if you have a good reason, it shouldn't be implemented. The only request I have from Infinispan is to add methods to
I agree with your assertion that dynamic membership changes are dangerous. No, this is not a user request but came from discussions at the last F2F (or the one before).
Created #200 to keep track of that.
When a new member D is added to {A,B,C}, then all configuration files need to be changed to set members="A,B,C,D". This should be done dynamically, so that no configuration file changes are required:
add-member, remove-member), it uses the new configuration immediately, without waiting for it to be committed (Raft dissertation, ch. 4.1).
members configuration is stored in/with the snapshot.
members changed
members
The advantage is that the config file never needs to be changed. However, members still need to be added/removed manually or programmatically.
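As a rough illustration of the behavior described above, the sketch below applies an add-member or remove-member entry to the in-memory member list as soon as the entry is appended, without waiting for commit, and renders the list in the same comma-separated form as the `members` attribute so it could be stored with a snapshot. The `MemberConfig` class and its methods are hypothetical. This is not the jgroups-raft implementation, and it leaves out rollback of uncommitted entries.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch, not the jgroups-raft implementation. */
final class MemberConfig {
    private final TreeSet<String> members;

    MemberConfig(Set<String> initial) {
        this.members = new TreeSet<>(initial);
    }

    /** Applies a membership entry as soon as it is appended, before it is committed. */
    void onAppend(String command, String raftId) {
        switch (command) {
            case "add-member" -> members.add(raftId);
            case "remove-member" -> members.remove(raftId);
            default -> throw new IllegalArgumentException("unknown command: " + command);
        }
    }

    /** Majority computed from the latest configuration in the log. */
    int majority() {
        return members.size() / 2 + 1;
    }

    /** Comma-separated member list, in the form used by members="A,B,C,D". */
    String snapshotValue() {
        return String.join(",", members);
    }
}
```

Starting from {A, B, C} and appending add-member for D, the majority in this sketch becomes 3 immediately and the stored value reads A,B,C,D.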