
Only a higher term leader can receive a client request to add v to the log #11

Open
dawsonme opened this issue May 17, 2021 · 3 comments

Comments

@dawsonme

When I use TLC to model-check the Raft specification, I find a situation in which the cluster has two leaders with different terms, e.g., one leader with currentTerm 4 and another with currentTerm 5. I think only the leader with currentTerm 5 should be able to receive a client request to add v to the log, but in the Raft specification the Timeout(i) transition doesn't restrict this, so the leader with currentTerm 4 can also receive the client request.

@fritzalder

Any leader can receive client requests but since you need a quorum to become leader, there should not be two leaders...? Not if you have no network partitioning at least. Could you send some trace that results in two leaders? I would be interested in that.

@dawsonme
Author

Thanks for your reply. When I use TLC to model-check the Raft specification, I set an invariant stating that there are never two leaders in the system, and TLC reported an error trace. For example, there are three servers: server one is the leader with currentTerm 2, and servers two and three are both followers with currentTerm 2. Then server three times out and becomes a candidate with currentTerm 3. After that, server three receives votes from server two and itself and becomes the new leader with currentTerm 3. Now there are two leaders in the system with different currentTerms: server one with currentTerm 2 and server three with currentTerm 3. I think Raft guarantees only that there are never two leaders with the same currentTerm, not that there are never two leaders with different currentTerms.
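
To make the difference between the two properties concrete, here is a minimal Python sketch of the final state of the trace described above. It is not taken from the TLA+ specification; the `Server` class and both predicate names are made up for illustration. It only shows that "at most one leader overall" fails on this state while "at most one leader per term" still holds.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Server:
    name: str
    state: str         # "Follower", "Candidate", or "Leader"
    current_term: int


def no_two_leaders(servers):
    """Overly strong property: at most one server is in the Leader state."""
    return sum(s.state == "Leader" for s in servers) <= 1


def one_leader_per_term(servers):
    """Weaker property: no two leaders share the same current_term."""
    leaders = [s for s in servers if s.state == "Leader"]
    return all(a.current_term != b.current_term
               for a, b in combinations(leaders, 2))


# State at the end of the trace: server one has not yet heard of term 3.
trace_end = [
    Server("s1", "Leader", 2),    # stale leader from term 2
    Server("s2", "Follower", 3),  # adopted term 3 when it voted for s3
    Server("s3", "Leader", 3),    # won term 3 with votes from s2 and itself
]

print(no_two_leaders(trace_end))       # False
print(one_leader_per_term(trace_end))  # True
```

Checking the second predicate instead of the first would be one way to state the invariant so that this trace is no longer reported as a violation.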

@chiyc

chiyc commented Jan 9, 2023

I'm late to this, but my understanding is that the leader of the older term 2 will not successfully complete any client requests. Being in the leader state does not necessarily make a server the leader. The leader of the latest term is still the one and only valid leader for the cluster, and for safety purposes it is also the only leader in practice from the client's perspective.

Let's say the leader of term 2 becomes reconnected after a network partition and is still in the leader state. It receives a client request and sends AppendEntries requests for term 2 to its peers who are now on term 3. Then, it will find out that its term is outdated and become a follower. The spec doesn't cover responses back to the client, but this is probably where the server will let the client know it's not the leader.
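
A rough Python sketch of that step-down, assuming the usual Raft rule that a server seeing a higher term adopts it and reverts to follower. The `Node` class and the function name are hypothetical, not from the spec or any implementation.

```python
from dataclasses import dataclass


@dataclass
class Node:
    state: str
    current_term: int


def on_append_entries_response(node, response_term, success):
    """Handle a peer's reply; return True only if the entry was acknowledged."""
    if response_term > node.current_term:
        # The peer is on a newer term, so this node's leadership is stale.
        node.current_term = response_term
        node.state = "Follower"
        return False
    return node.state == "Leader" and success


# The term-2 leader reconnects and hears back from a peer that is on term 3.
stale = Node("Leader", 2)
acked = on_append_entries_response(stale, response_term=3, success=False)
print(acked, stale.state, stale.current_term)  # False Follower 3
```

The entry is never acknowledged, so the old leader has nothing it could report to the client as committed.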

If it's still disconnected from its peers but not from the client for some reason, it still won't be able to fulfill the client request, because it cannot replicate the entry to a majority. The stale leader will soon step down when its election timeout elapses without successful heartbeat responses from a majority of the cluster. This detail is found in the Raft dissertation. The client or some other system can also implement availability mechanisms to handle such situations and improve latency.
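
Here is a small Python sketch of how a leader that has lost contact with its peers might step down. It is only an illustration under my own assumptions: the `Leader` class, its fields, and the explicit `now` argument are hypothetical, and a real implementation would track this differently.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    cluster_size: int
    election_timeout: float
    state: str = "Leader"
    last_ack: dict = field(default_factory=dict)  # peer -> time of last heartbeat reply

    def record_ack(self, peer, now):
        self.last_ack[peer] = now

    def tick(self, now):
        """Step down if a majority has not replied within one election timeout."""
        recent = sum(1 for t in self.last_ack.values()
                     if now - t <= self.election_timeout)
        reachable = recent + 1  # the leader counts itself
        if reachable * 2 <= self.cluster_size:
            self.state = "Follower"
        return self.state


leader = Leader(cluster_size=3, election_timeout=1.0)
leader.record_ack("s2", now=0.0)
print(leader.tick(now=0.5))  # Leader
print(leader.tick(now=2.0))  # Follower
```

Time is passed in explicitly so the example is deterministic; with a real clock the check would run on a timer.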


3 participants