indexOnMajority is wrong #42

Open
schuster opened this Issue Apr 15, 2015 · 3 comments

Comments

Projects
None yet
3 participants
@schuster

Instead of calculating the newest index that a majority of the servers have in their logs, indexOnMajority appears to calculate something like the "mode" of matching indices: it returns the index that the largest number of members have as their match index.

For example, say that a cluster of five servers have match indices 1, 1, 2, 3, and 4. The majority of servers have at least entry 2 in their logs, but this function will return 1 because more servers have 1 as their match index.

I think this algorithm (pseudocode) should work instead:
sortedIndices = sortDescending(matchIndices)
return sortedIndices[ceiling(length(config.members) / 2) - 1]

@ktoso ktoso added the protocol-fix label Apr 15, 2015

@colin-scott

This comment has been minimized.

Show comment
Hide comment
@colin-scott

colin-scott Aug 8, 2015

Contributor

I have an execution trace that triggers this behavior:

In a four node cluster, the leader is raft-member-3. Right before receiving an AppendSuccessful from raft-member-2 for the 1st log entry [and before receiving AppendSuccessful messages from any other members], the leader's state is:

BEFORE RECEIVE, LOG: ReplicatedLog(List(Entry(AppendWord(WORD1),Term(1),1,Some(Actor[akka://new-system-0/deadLetters]))),0,5)
BEFORE RECEIVE, STATE: LeaderMeta(Actor[akka://new-system-0/user/raft-member-3#-2011805761],Term(1),StableRaftConfiguration(Set(raft-member-1, raft-member-2, raft-member-3, raft-member-4)))
BEFORE RECEIVE, nextIndex: LogIndexMap(Map(Actor[akka://new-system-0/user/raft-member-1#-1436375817] -> 2, Actor[akka://new-system-0/user/raft-member-2#1657544712] -> 2, Actor[akka://new-system-0/user/raft-member-3#-2011805761] -> 2, Actor[akka://new-system-0/user/raft-member-4#1901315846] -> 2),2)
BEFORE RECEIVE, matchIndex: LogIndexMap(Map(Actor[akka://new-system-0/user/raft-member-1#-1436375817] -> 0, Actor[akka://new-system-0/user/raft-member-2#1657544712] -> 0, Actor[akka://new-system-0/user/raft-member-3#-2011805761] -> 1, Actor[akka://new-system-0/user/raft-member-4#1901315846] -> 0),0)

Upon receiving the message:

[INFO] [08/08/2015 01:18:38.792] [new-system-0-akka.actor.default-dispatcher-3] [akka://new-system-0/user/raft-member-3] Follower Actor[akka://new-system-0/user/raft-member-2#1657544712] took write in term: Term(1), next index was: 2
AFTER RECEIVE, LOG: ReplicatedLog(List(Entry(AppendWord(WORD1),Term(1),1,Some(Actor[akka://new-system-0/deadLetters]))),1,5)
**[INFO] [08/08/2015 01:18:38.797] [new-system-0-akka.actor.default-dispatcher-3] [akka://new-system-0/user/raft-member-3] Consensus for persisted index: 1. (Comitted index: 0, will commit now: true)**
AFTER RECEIVE, STATE: LeaderMeta(Actor[akka://new-system-0/user/raft-member-3#-2011805761],Term(1),StableRaftConfiguration(Set(raft-member-1, raft-member-2, raft-member-3, raft-member-4)))
AFTER RECEIVE, nextIndex: LogIndexMap(Map(Actor[akka://new-system-0/user/raft-member-1#-1436375817] -> 2, Actor[akka://new-system-0/user/raft-member-2#1657544712] -> 2, Actor[akka://new-system-0/user/raft-member-3#-2011805761] -> 2, Actor[akka://new-system-0/user/raft-member-4#1901315846] -> 2),2)
AFTER RECEIVE, matchIndex: LogIndexMap(Map(Actor[akka://new-system-0/user/raft-member-1#-1436375817] -> 0, Actor[akka://new-system-0/user/raft-member-2#1657544712] -> 1, Actor[akka://new-system-0/user/raft-member-3#-2011805761] -> 1, Actor[akka://new-system-0/user/raft-member-4#1901315846] -> 0),0)

[It should not have reached consensus at that point]

Note that this console output is using 1-indexed logs

Contributor

colin-scott commented Aug 8, 2015

I have an execution trace that triggers this behavior:

In a four node cluster, the leader is raft-member-3. Right before receiving an AppendSuccessful from raft-member-2 for the 1st log entry [and before receiving AppendSuccessful messages from any other members], the leader's state is:

BEFORE RECEIVE, LOG: ReplicatedLog(List(Entry(AppendWord(WORD1),Term(1),1,Some(Actor[akka://new-system-0/deadLetters]))),0,5)
BEFORE RECEIVE, STATE: LeaderMeta(Actor[akka://new-system-0/user/raft-member-3#-2011805761],Term(1),StableRaftConfiguration(Set(raft-member-1, raft-member-2, raft-member-3, raft-member-4)))
BEFORE RECEIVE, nextIndex: LogIndexMap(Map(Actor[akka://new-system-0/user/raft-member-1#-1436375817] -> 2, Actor[akka://new-system-0/user/raft-member-2#1657544712] -> 2, Actor[akka://new-system-0/user/raft-member-3#-2011805761] -> 2, Actor[akka://new-system-0/user/raft-member-4#1901315846] -> 2),2)
BEFORE RECEIVE, matchIndex: LogIndexMap(Map(Actor[akka://new-system-0/user/raft-member-1#-1436375817] -> 0, Actor[akka://new-system-0/user/raft-member-2#1657544712] -> 0, Actor[akka://new-system-0/user/raft-member-3#-2011805761] -> 1, Actor[akka://new-system-0/user/raft-member-4#1901315846] -> 0),0)

Upon receiving the message:

[INFO] [08/08/2015 01:18:38.792] [new-system-0-akka.actor.default-dispatcher-3] [akka://new-system-0/user/raft-member-3] Follower Actor[akka://new-system-0/user/raft-member-2#1657544712] took write in term: Term(1), next index was: 2
AFTER RECEIVE, LOG: ReplicatedLog(List(Entry(AppendWord(WORD1),Term(1),1,Some(Actor[akka://new-system-0/deadLetters]))),1,5)
**[INFO] [08/08/2015 01:18:38.797] [new-system-0-akka.actor.default-dispatcher-3] [akka://new-system-0/user/raft-member-3] Consensus for persisted index: 1. (Comitted index: 0, will commit now: true)**
AFTER RECEIVE, STATE: LeaderMeta(Actor[akka://new-system-0/user/raft-member-3#-2011805761],Term(1),StableRaftConfiguration(Set(raft-member-1, raft-member-2, raft-member-3, raft-member-4)))
AFTER RECEIVE, nextIndex: LogIndexMap(Map(Actor[akka://new-system-0/user/raft-member-1#-1436375817] -> 2, Actor[akka://new-system-0/user/raft-member-2#1657544712] -> 2, Actor[akka://new-system-0/user/raft-member-3#-2011805761] -> 2, Actor[akka://new-system-0/user/raft-member-4#1901315846] -> 2),2)
AFTER RECEIVE, matchIndex: LogIndexMap(Map(Actor[akka://new-system-0/user/raft-member-1#-1436375817] -> 0, Actor[akka://new-system-0/user/raft-member-2#1657544712] -> 1, Actor[akka://new-system-0/user/raft-member-3#-2011805761] -> 1, Actor[akka://new-system-0/user/raft-member-4#1901315846] -> 0),0)

[It should not have reached consensus at that point]

Note that this console output is using 1-indexed logs

@ktoso ktoso modified the milestone: 1.0 Aug 9, 2015

@colin-scott

This comment has been minimized.

Show comment
Hide comment
@colin-scott

colin-scott Aug 13, 2015

Contributor

For what it's worth, I have a fix for this (soon to be pull-requested) here:

NetSys/demi-applications@60c330f

[Assumes 1-indexed logs]

Contributor

colin-scott commented Aug 13, 2015

For what it's worth, I have a fix for this (soon to be pull-requested) here:

NetSys/demi-applications@60c330f

[Assumes 1-indexed logs]

@ktoso

This comment has been minimized.

Show comment
Hide comment
@ktoso

ktoso Aug 13, 2015

Owner

Great :) I don't mind switching to 1-indexed along the way, makes things more consistent with the paper => easier to follow the code.

Owner

ktoso commented Aug 13, 2015

Great :) I don't mind switching to 1-indexed along the way, makes things more consistent with the paper => easier to follow the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment