New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nodes should not forget who they voted for #56

Open
colin-scott opened this Issue Aug 4, 2015 · 11 comments

Comments

Projects
None yet
3 participants
@colin-scott
Contributor

colin-scott commented Aug 4, 2015

When a Candidate receives an AppendEntries message from a leader in the same term, it correctly steps down to Follower.

However, it currently forgets who it voted for that term.

This can lead to two leaders being elected in the same term: the Candidate that stepped down may now vote for another Candidate who will eventually become the second leader.

This behavior can occur even after merging in #55.

I have a test case that triggers this behavior. Here are the messages delivered to trigger this bug [format is sender,receiver,message, and the cluster has 4 nodes in it]:

MsgEvent(deadLetters,raft-member-4,ChangeConfiguration(StableRaftConfiguration(Set(raft-member-1, raft-member-2, raft-member-3, raft-member-4))))
MsgEvent(deadLetters,raft-member-1,ChangeConfiguration(StableRaftConfiguration(Set(raft-member-1, raft-member-2, raft-member-3, raft-member-4))))
MsgEvent(deadLetters,raft-member-1,Timer(election-timer,ElectionTimeout,false,1))
MsgEvent(deadLetters,raft-member-4,Timer(election-timer,ElectionTimeout,false,1))
MsgEvent(deadLetters,raft-member-3,ChangeConfiguration(StableRaftConfiguration(Set(raft-member-1, raft-member-2, raft-member-3, raft-member-4))))
MsgEvent(deadLetters,raft-member-1,Timer(election-timer,ElectionTimeout,false,3))
MsgEvent(deadLetters,raft-member-4,Timer(election-timer,ElectionTimeout,false,3))
MsgEvent(raft-member-4,raft-member-4,BeginElection)
MsgEvent(deadLetters,raft-member-2,ChangeConfiguration(StableRaftConfiguration(Set(raft-member-1, raft-member-2, raft-member-3, raft-member-4))))
MsgEvent(raft-member-4,raft-member-1,RequestVote(Term(2),Actor[akka://new-system-0/user/raft-member-4#-1663394226],Term(0),0))
MsgEvent(deadLetters,raft-member-3,Timer(election-timer,ElectionTimeout,false,1))
MsgEvent(deadLetters,raft-member-2,Timer(election-timer,ElectionTimeout,false,1))
MsgEvent(deadLetters,raft-member-3,Timer(election-timer,ElectionTimeout,false,3))
MsgEvent(raft-member-3,raft-member-3,BeginElection)
MsgEvent(raft-member-3,raft-member-3,BeginElection)
MsgEvent(raft-member-4,raft-member-4,BeginElection)
MsgEvent(deadLetters,raft-member-2,Timer(election-timer,ElectionTimeout,false,3))
MsgEvent(raft-member-1,raft-member-4,VoteCandidate(Term(2)))
MsgEvent(raft-member-4,raft-member-2,RequestVote(Term(2),Actor[akka://new-system-0/user/raft-member-4#-1663394226],Term(0),0))
MsgEvent(raft-member-2,raft-member-4,VoteCandidate(Term(2)))
MsgEvent(raft-member-4,raft-member-4,ElectedAsLeader)
MsgEvent(raft-member-4,raft-member-1,AppendEntries(term:Term(2),prevLog:(Term(1),1),entries:List()))
MsgEvent(raft-member-3,raft-member-1,RequestVote(Term(2),Actor[akka://new-system-0/user/raft-member-3#-463317739],Term(0),0))
MsgEvent(raft-member-4,raft-member-2,AppendEntries(term:Term(2),prevLog:(Term(1),1),entries:List()))
MsgEvent(raft-member-1,raft-member-3,VoteCandidate(Term(2)))
MsgEvent(raft-member-3,raft-member-2,RequestVote(Term(2),Actor[akka://new-system-0/user/raft-member-3#-463317739],Term(0),0))
MsgEvent(raft-member-2,raft-member-3,VoteCandidate(Term(2)))

At this point, both raft-member-3 and raft-member-4 are leaders for Term(2).

Let me know if you want more detailed reproducing steps.

@dmitraver

This comment has been minimized.

Show comment
Hide comment
@dmitraver

dmitraver Aug 6, 2015

Contributor

I don't quite understand how two leaders can be elected. When the the candidate that stepped down will start a new election he will increment his term and will start sending requests to all other members with a new term in a request and soon previous leader will discover that his term is out of date and will revert to the follower state. This behavior of leader however is not implemented because now he doesn't revert to follower after receiving new term number.

Contributor

dmitraver commented Aug 6, 2015

I don't quite understand how two leaders can be elected. When the the candidate that stepped down will start a new election he will increment his term and will start sending requests to all other members with a new term in a request and soon previous leader will discover that his term is out of date and will revert to the follower state. This behavior of leader however is not implemented because now he doesn't revert to follower after receiving new term number.

@colin-scott

This comment has been minimized.

Show comment
Hide comment
@colin-scott

colin-scott Aug 6, 2015

Contributor

@dmitraver in this particular test case, when the candidate that stepped down starts a new election, it does indeed increment its term and send requests to all members. However, those requests are delayed indefinitely, so others do not step down immediately.

And, regardless of how long the requests are delayed for: Raft's "LeaderSafety" invariant states that at no point in time, should there be two nodes who both believe themselves to be leader for the same Term. You're quite right this should eventually correct itself, but in the interrim you could have a very bad situation, e.g. the two leaders could overwrite eachother's committed entries, and end up with corrupted state machines by the time the leaders right themselves out.

Does that make sense?

Contributor

colin-scott commented Aug 6, 2015

@dmitraver in this particular test case, when the candidate that stepped down starts a new election, it does indeed increment its term and send requests to all members. However, those requests are delayed indefinitely, so others do not step down immediately.

And, regardless of how long the requests are delayed for: Raft's "LeaderSafety" invariant states that at no point in time, should there be two nodes who both believe themselves to be leader for the same Term. You're quite right this should eventually correct itself, but in the interrim you could have a very bad situation, e.g. the two leaders could overwrite eachother's committed entries, and end up with corrupted state machines by the time the leaders right themselves out.

Does that make sense?

@dmitraver

This comment has been minimized.

Show comment
Hide comment
@dmitraver

dmitraver Aug 6, 2015

Contributor

Yep, I think that checking for correct term should be enough to solve this problem because members should also reject AppendEntries if they come from different term. Correct me if I'm wrong. Anyway I will fix the term handling for leader and see what is going on then and as I understand your test can be used to check this behavior. Could you please tell me how can I run it myself?

Contributor

dmitraver commented Aug 6, 2015

Yep, I think that checking for correct term should be enough to solve this problem because members should also reject AppendEntries if they come from different term. Correct me if I'm wrong. Anyway I will fix the term handling for leader and see what is going on then and as I understand your test can be used to check this behavior. Could you please tell me how can I run it myself?

@colin-scott

This comment has been minimized.

Show comment
Hide comment
@colin-scott

colin-scott Aug 6, 2015

Contributor

Hm, I don't understand your point. Members currently do reject AppendEntries if they come from a lower term [see Follower and Candidate].

Anyhow, here are the reproducing steps for the execution:

git clone -b raft-56 git@github.com:NetSys/sts2-applications.git
git clone git@github.com:NetSys/sts2-experiments.git experiments
sbt assembly
java -d64 -Xmx15g -cp target/scala-2.11/randomSearch-assembly-0.1.jar Main

Near the bottom of the console output, you should see:

Violation?: Some(RaftViolation(Map(ElectionSafety:raft-member-1:raft-member-4 -> Set(raft-member-4, raft-member-1))))

Meaning that raft-member-1 and raft-member-4 are leaders for the same term.

I also added println's to RaftActor to show what the state of each actor is before and after each message receive. Look for BEFORE RECEIVE and AFTER RECEIVE markers.

Note that this snapshot of the akka-raft code differs slightly from the master branch -- I wanted to show that this bug can be triggered in isolation from the other issues.

The changes are:

  • I merged your pull request #55 [reject if term < currentTerm, and for all nodes other than leader, step down if term > currentLeader]
  • I added code in Leader.scala to step down if term > currentTerm for RequestVote and VoteCandidate
  • Fix #29 [allow nodes to reissue votes for the same candidate in the same term] and fix #45 [distinguish votes from the same follower]
Contributor

colin-scott commented Aug 6, 2015

Hm, I don't understand your point. Members currently do reject AppendEntries if they come from a lower term [see Follower and Candidate].

Anyhow, here are the reproducing steps for the execution:

git clone -b raft-56 git@github.com:NetSys/sts2-applications.git
git clone git@github.com:NetSys/sts2-experiments.git experiments
sbt assembly
java -d64 -Xmx15g -cp target/scala-2.11/randomSearch-assembly-0.1.jar Main

Near the bottom of the console output, you should see:

Violation?: Some(RaftViolation(Map(ElectionSafety:raft-member-1:raft-member-4 -> Set(raft-member-4, raft-member-1))))

Meaning that raft-member-1 and raft-member-4 are leaders for the same term.

I also added println's to RaftActor to show what the state of each actor is before and after each message receive. Look for BEFORE RECEIVE and AFTER RECEIVE markers.

Note that this snapshot of the akka-raft code differs slightly from the master branch -- I wanted to show that this bug can be triggered in isolation from the other issues.

The changes are:

  • I merged your pull request #55 [reject if term < currentTerm, and for all nodes other than leader, step down if term > currentLeader]
  • I added code in Leader.scala to step down if term > currentTerm for RequestVote and VoteCandidate
  • Fix #29 [allow nodes to reissue votes for the same candidate in the same term] and fix #45 [distinguish votes from the same follower]
@colin-scott

This comment has been minimized.

Show comment
Hide comment
@colin-scott

colin-scott Aug 6, 2015

Contributor

As far as I can tell, the only remaining issue here is with this line of code:

https://github.com/ktoso/akka-raft/blob/master/src/main/scala/pl/project13/scala/akka/raft/Candidate.scala#L67

The Candidate forgets who it voted for when it steps down to Follower, and then votes again in the same term for someone else.

Contributor

colin-scott commented Aug 6, 2015

As far as I can tell, the only remaining issue here is with this line of code:

https://github.com/ktoso/akka-raft/blob/master/src/main/scala/pl/project13/scala/akka/raft/Candidate.scala#L67

The Candidate forgets who it voted for when it steps down to Follower, and then votes again in the same term for someone else.

@dmitraver

This comment has been minimized.

Show comment
Hide comment
@dmitraver

dmitraver Aug 7, 2015

Contributor

Excellent work Colin. I will investigate your changes on a weekend and will update my pull request accordingly.

Contributor

dmitraver commented Aug 7, 2015

Excellent work Colin. I will investigate your changes on a weekend and will update my pull request accordingly.

@colin-scott

This comment has been minimized.

Show comment
Hide comment
@colin-scott

colin-scott Aug 8, 2015

Contributor

For what it's worth, I have a (non-pull-request-worthy) fix for this issue here:

NetSys/demi-applications@ef443f5

Contributor

colin-scott commented Aug 8, 2015

For what it's worth, I have a (non-pull-request-worthy) fix for this issue here:

NetSys/demi-applications@ef443f5

@ktoso ktoso added the protocol-fix label Aug 8, 2015

@ktoso

This comment has been minimized.

Show comment
Hide comment
@ktoso

ktoso Aug 8, 2015

Owner

@colin-scott I totally think all those fixes are definitely PR worthy actually! Would you not want to send in PRs with those (I'd like to give credit where it's due for the fixes this way)?
Thanks for the continued research/work on it!

Owner

ktoso commented Aug 8, 2015

@colin-scott I totally think all those fixes are definitely PR worthy actually! Would you not want to send in PRs with those (I'd like to give credit where it's due for the fixes this way)?
Thanks for the continued research/work on it!

@colin-scott

This comment has been minimized.

Show comment
Hide comment
@colin-scott

colin-scott Aug 8, 2015

Contributor

Sure I can send in PRs, I just don't have time right now to write proper unit tests. Do you still want me to send them?

Contributor

colin-scott commented Aug 8, 2015

Sure I can send in PRs, I just don't have time right now to write proper unit tests. Do you still want me to send them?

@ktoso

This comment has been minimized.

Show comment
Hide comment
@ktoso

ktoso Aug 8, 2015

Owner

I'd be more than happy to see those small fixes as PRs and would then add more and more tests (best to uncover what this specific PR fixes). I.e. I'd take it upon myself to add tests proving that the patch makes sense as described :)

Owner

ktoso commented Aug 8, 2015

I'd be more than happy to see those small fixes as PRs and would then add more and more tests (best to uncover what this specific PR fixes). I.e. I'd take it upon myself to add tests proving that the patch makes sense as described :)

@colin-scott

This comment has been minimized.

Show comment
Hide comment
@colin-scott

colin-scott Aug 8, 2015

Contributor

OK, sounds good -- I'll do that sometime today

Contributor

colin-scott commented Aug 8, 2015

OK, sounds good -- I'll do that sometime today

@ktoso ktoso modified the milestone: 1.0 Aug 9, 2015

@colin-scott colin-scott referenced this issue Aug 14, 2015

Open

Raft 56 #73

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment