Dynamic membership changes #176

Closed
belaban opened this issue May 25, 2022 · 7 comments

@belaban
Member

belaban commented May 25, 2022

When a new member D is added to {A,B,C}, all configuration files currently need to be changed to set members="A,B,C,D". This should instead happen dynamically, so that no configuration file changes are required:

  • When a member receives a configuration change (add-member, remove-member), it uses the new configuration immediately, without waiting for it to be committed (Raft dissertation, ch. 4.1).
  • This requires that the membership configuration is stored in/with the snapshot.
  • When a new member starts, its initial membership is taken from the XML config file.
  • Then the snapshot (if present) is read and the members are changed accordingly.
  • Then the log is read; it may also contain configuration changes, which change the members again.

The advantage is that the config file never needs to be changed. However, members still need to be added/removed manually or programmatically.
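
The start-up order in the bullets above could be sketched roughly as follows. This is only an illustration: `MembershipBootstrap` and `ConfigChange` are made-up names, not jgroups-raft classes, and the sketch assumes that a membership stored in the snapshot replaces the configured one, which the proposal does not spell out.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types exist in jgroups-raft.
final class MembershipBootstrap {

    /** A membership change recorded in the log (assumed shape). */
    static final class ConfigChange {
        final boolean add;     // true = add-member, false = remove-member
        final String member;

        ConfigChange(boolean add, String member) {
            this.add = add;
            this.member = member;
        }
    }

    /**
     * Resolves the effective membership at start-up:
     * 1. start from the members listed in the XML config file,
     * 2. if a snapshot exists, use the membership stored with it instead (assumption),
     * 3. replay the configuration changes found in the log, in log order.
     */
    static Set<String> resolve(List<String> fromConfig,
                               List<String> fromSnapshot,   // null when there is no snapshot
                               List<ConfigChange> fromLog) {
        Set<String> members = new LinkedHashSet<>(fromConfig);
        if (fromSnapshot != null) {
            members = new LinkedHashSet<>(fromSnapshot);
        }
        for (ConfigChange change : fromLog) {
            if (change.add) {
                members.add(change.member);
            } else {
                members.remove(change.member);
            }
        }
        return members;
    }
}
```

With a config file listing A,B,C, a snapshot containing A,B,C,D and a log entry removing B, `resolve` yields {A,C,D} without the config file ever being edited.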

@belaban belaban added this to the 1.0.10 milestone May 25, 2022
@belaban belaban self-assigned this May 25, 2022
@belaban belaban modified the milestones: 1.0.10, 1.0.11 Jun 13, 2022
@jabolina
Member

I've been thinking about this one for the last few days. Sorry for the long post 😟, but here are some ramblings about what I have in mind. Starting with my assumptions:

  • The cluster needs an initial configuration to know who the members are and how many there are. We need this to calculate the majority, and we can see it as a minimal number of nodes.
  • ELECTION still works, electing the leader (if necessary and possible) when the view changes. We need this for safety.
  • All nodes receive the view updates in the same order.
  • The nodes can keep the same RAFT ID through restarts.

All nodes keep track of view changes, calculating who left and who joined. Only non-RAFT members are added and only RAFT members are removed. This creates a sequence of changes, applied one single change at a time. The calculation is deterministic, so all nodes calculate the same sequence. Only the leader can issue these requests, and once a node receives the append request, it removes the head element of the sequence. For two events that cancel each other, for example adding D and then removing it, I think we are safe not applying them. Also, we need to identify the RAFT ID from the view addresses.
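
To make the bookkeeping concrete, here is a rough sketch of how each node could derive that sequence. All names (`PendingChanges`, `Change`, `onViewChange`) are hypothetical. The sketch assumes views are already expressed as RAFT IDs and that sorting the IDs is an acceptable way to get the same order on every node.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; these types are illustrative and not part of jgroups-raft.
final class PendingChanges {

    enum Kind { ADD, REMOVE }

    static final class Change {
        final Kind kind;
        final String raftId;

        Change(Kind kind, String raftId) {
            this.kind = kind;
            this.raftId = raftId;
        }

        @Override
        public String toString() {
            return (kind == Kind.ADD ? "+" : "-") + raftId;
        }
    }

    // Changes waiting to be issued by the leader, one at a time, oldest first.
    private final Deque<Change> queue = new ArrayDeque<>();

    /** Called on every view change; raftMembers is the current RAFT membership. */
    void onViewChange(Set<String> raftMembers, Set<String> oldView, Set<String> newView) {
        // TreeSet gives a sorted, and therefore node-independent, iteration order.
        for (String id : new TreeSet<>(newView)) {
            if (!oldView.contains(id)) joined(id, raftMembers);
        }
        for (String id : new TreeSet<>(oldView)) {
            if (!newView.contains(id)) left(id, raftMembers);
        }
    }

    private void joined(String id, Set<String> raftMembers) {
        // A pending removal of the same node is cancelled rather than followed by an add.
        boolean cancelled = queue.removeIf(c -> c.kind == Kind.REMOVE && c.raftId.equals(id));
        if (!cancelled && !raftMembers.contains(id)) {
            queue.addLast(new Change(Kind.ADD, id));      // only add non-RAFT members
        }
    }

    private void left(String id, Set<String> raftMembers) {
        // A pending add of the same node is cancelled rather than followed by a remove.
        boolean cancelled = queue.removeIf(c -> c.kind == Kind.ADD && c.raftId.equals(id));
        if (!cancelled && raftMembers.contains(id)) {
            queue.addLast(new Change(Kind.REMOVE, id));   // only remove RAFT members
        }
    }

    /** Called when the append request for the head change is received. */
    Change removeHead() {
        return queue.pollFirst();
    }

    /** Pending changes in issue order, e.g. ["+D", "+E"]. */
    List<String> pending() {
        List<String> out = new ArrayList<>();
        for (Change c : queue) out.add(c.toString());
        return out;
    }
}
```

With membership {A,B,C}, a view change to {A,B,C,D,E} leaves `+D, +E` pending. If the next view is {A,B}, the two adds are cancelled and only `-C` remains.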

A node that is not a RAFT member, be it through the configuration or by the internal command, does nothing. The extraneous member accepts requests only after the other members are aware of it. Nodes that restart can retrieve the membership information they last stored, join directly without waiting, and then receive the remaining data from the other members.

The ELECTION will ensure that we have a majority. We can proceed with adding and removing members as long as we have a leader, so we have a guarantee that we will not have a split with different member views. For a configuration containing members A, B, and C, cases such as:

{A, B, C} -> {A, -B, -C, +D}

The ELECTION guarantees that we proceed only when it is safe. If we receive another view adding B or C again, we can proceed directly without issuing add commands, as they are in the configuration file. After a leader is elected, it requests to add the new member D.

In more extreme cases, such as:

{A, B, C} -> {A, B, C, +D, +E} -> {A, B, -C, -D, -E}

ELECTION again guarantees that we do not proceed with changing the members. Since we don't have a majority, the requests can only be issued once a majority is available again. The nodes keep track of the pending removals in the sequence, and once we receive a view adding one of those members back, we don't need to issue the request. Pretty much any change that affects the majority behaves this way.
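
As a small worked example of the majority check in these scenarios (a sketch with made-up names, not the actual ELECTION code): a view only has a majority if it contains more than half of the current RAFT membership. Whether the second scenario has a majority depends on whether D and E had already become members, so both readings are shown.

```java
import java.util.Set;

// Hypothetical helper; not part of jgroups-raft.
final class Quorum {

    /** Smallest number of members that forms a majority of a cluster of the given size. */
    static int majority(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    /** True if the view contains a majority of the RAFT membership. */
    static boolean hasMajority(Set<String> membership, Set<String> view) {
        long present = membership.stream().filter(view::contains).count();
        return present >= majority(membership.size());
    }

    public static void main(String[] args) {
        Set<String> abc = Set.of("A", "B", "C");
        // {A, B, C} -> {A, -B, -C, +D}: only A is a member, so no majority.
        System.out.println(hasMajority(abc, Set.of("A", "D")));                      // false
        // View {A, B} measured against {A, B, C}: 2 of 3.
        System.out.println(hasMajority(abc, Set.of("A", "B")));                      // true
        // View {A, B} measured against {A, B, C, D, E}: 2 of 5.
        System.out.println(hasMajority(Set.of("A", "B", "C", "D", "E"), Set.of("A", "B")));  // false
    }
}
```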

Some points of attention that come to mind are:

  • What to do when a node receives a remove for itself?
  • Waiting to be accepted as a member is the safe way, but could it take too much time?
  • It would be better not to send events to non-RAFT nodes at all, instead of discarding them at the destination.

I wouldn't be surprised if I overlooked something or am overly simplifying. What do you think? Do we need to prepare something before starting? Again, sorry for the long post.

@belaban
Member Author

belaban commented Dec 26, 2022

Let's think about the requirements in a first step:

  • When a member whose ID is not in RAFT.membership joins, do we automatically want to add it to the membership, i.e. change the majority?
  • When a member leaves, do we automatically want to change the membership? E.g. we have {A,B,C} and B and C leave: what happens? Perhaps we want the cluster to be non-operational, as there is no majority. Or do we want A to be able to commit changes, as A has a majority in membership {A}?
  • In the latter case, what do we do when we have a split brain, with {A} on one side and {B} on the other? Following the case above, both A and B would be able to assume leadership and commit changes!
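
The split-brain concern in the last bullet can be illustrated with a few lines of code. This is a hypothetical sketch (`SplitBrainDemo` and `canElect` are made-up names) comparing a static membership with one that automatically shrinks to whatever each partition can see.

```java
import java.util.Set;

// Hypothetical illustration; not part of jgroups-raft.
final class SplitBrainDemo {

    /** True if the nodes in the partition hold more than half of the given membership. */
    static boolean canElect(Set<String> membership, Set<String> partition) {
        long votes = partition.stream().filter(membership::contains).count();
        return votes > membership.size() / 2;
    }

    public static void main(String[] args) {
        Set<String> configured = Set.of("A", "B", "C");
        Set<String> left = Set.of("A");
        Set<String> right = Set.of("B");

        // Static membership: both sides are measured against {A,B,C}.
        System.out.println(canElect(configured, left) + " " + canElect(configured, right));  // false false

        // Automatic shrinking: each side treats its own view as the membership.
        System.out.println(canElect(left, left) + " " + canElect(right, right));             // true true
    }
}
```

With the static membership neither side can elect a leader, so the cluster is unavailable but safe. With automatic shrinking both sides can, which is exactly the two-leader case described above.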

The more I think about this, the more I'm inclined to stay with a static membership... Let's discuss in the new year; by then I'll have had time to give this some thought.
@pruivo and Tristan should be part of this, too.

@jabolina
Member

Yes, it would be something that adds or removes members automatically. But this requires an initial membership to be configured in order to start and correctly elect the first leader. That said, I think it makes sense to have this automatic membership disabled and the static version as the default.

In cases of partitions, a node has to be the leader and replicate the new member list to the other nodes before changing the current membership/majority. For example, in the case where B and C leave, the cluster would be non-operational, as leader A calculates the majority on the list {A, B, C} before issuing the remove commands. A would then become a follower. The cluster would only be operational again when there is a majority to elect a new leader, and this would only happen if B or C came back.

Generally speaking, that would be the behavior for all changes that affect the majority. The current leader steps down, and the cluster becomes unavailable until the previous members join again and a new election happens. And it would still be unavailable even if other nodes (with different IDs) joined.

But let's discuss and see if it makes sense to have something like this. Maybe this automatic membership is not needed after all.

@belaban
Member Author

belaban commented Jan 3, 2023

Hi Jose,
the question is: when we have {A,B,C}, and B and C leave, do we want to dynamically change the membership to {A}? In this case, A would become leader as it has the majority. If not, the cluster would be non-operational as A does not have a majority.
In the former case, we can get into trouble with multiple leaders in a split brain scenario (see the last bullet item in my comment before yours)...

A discussion with @pruivo and @ttarant@redhat.com is needed... I want clear requirements before making such a big change...

@pruivo
Collaborator

pruivo commented Jan 3, 2023

Is this a user request? IMO, automatic membership change is dangerous, and, except if you have a good reason, it shouldn't be implemented.

The only request I have from Infinispan is to add methods to RaftHandle to add and remove members. It would be nice to have JMX operations too.
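
To show the shape such explicit operations might take, here is a minimal sketch. `MembershipAdmin` and `ManualMembership` are hypothetical names chosen for illustration. This is not the RaftHandle API, and the in-memory implementation skips replication entirely.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;

// Hypothetical interface; it only illustrates explicit, caller-driven membership changes.
interface MembershipAdmin {
    /** Adds a member; the future completes with the resulting membership. */
    CompletableFuture<Set<String>> addMember(String raftId);

    /** Removes a member; the future completes with the resulting membership. */
    CompletableFuture<Set<String>> removeMember(String raftId);
}

// In-memory stand-in: a real implementation would replicate the change through the log.
final class ManualMembership implements MembershipAdmin {
    private final Set<String> members = new TreeSet<>();

    ManualMembership(Set<String> initial) {
        members.addAll(initial);
    }

    @Override
    public synchronized CompletableFuture<Set<String>> addMember(String raftId) {
        if (!members.add(raftId)) {
            return CompletableFuture.failedFuture(
                    new IllegalArgumentException(raftId + " is already a member"));
        }
        return CompletableFuture.completedFuture(new TreeSet<>(members));
    }

    @Override
    public synchronized CompletableFuture<Set<String>> removeMember(String raftId) {
        if (!members.remove(raftId)) {
            return CompletableFuture.failedFuture(
                    new IllegalArgumentException(raftId + " is not a member"));
        }
        return CompletableFuture.completedFuture(new TreeSet<>(members));
    }
}
```

The point of the design is that nothing changes the membership unless a caller (application code or an operator) asks for it, which avoids the automatic-change hazards discussed above.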

@belaban
Member Author

belaban commented Jan 3, 2023

I agree with your assertion that dynamic membership changes are dangerous. No, this is not a user request but came from discussions at the last F2F (or the one before).
@jabolina I guess let's refine the new requirements (add them (and this comment) to the JIRA) and provide what Pedro needs.

@jabolina
Member

jabolina commented Jan 3, 2023

> I agree with your assertion that dynamic membership changes are dangerous. No, this is not a user request but came from discussions at the last F2F (or the one before). @jabolina I guess let's refine the new requirements (add them (and this comment) to the JIRA) and provide what Pedro needs.

Created #200 to keep track of that.

@belaban belaban closed this as completed Mar 2, 2023