New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: don't ack MsgPreVotes when it cannot succeed #8517

Closed
wants to merge 1 commit into
base: master
from

Conversation

4 participants
@irfansharif
Contributor

irfansharif commented Sep 7, 2017

The changeset tightens the constraint around granted PreVotes.
Previously PreVote requests of the same term as the node's current term
would be granted despite there being no way for it to succeed
(MsgPreVotes currently specify the next term, if the next term of the requesting
node is the same as our current term, when subsequently calling for an
election it will fail). In this case we will now simply reject the vote.


While here we add a panic for a comment specified-constraint.

// The m.Term > r.Term clause is for MsgPreVote. For MsgVote m.Term
// should always equal r.Term.

TestRawNodeStep was in violation of this by sending in Msg{Pre,}Vote
messages with unspecified terms (defaulting to 0) when the recipient
node was at a higher term.

+cc @bdarnell @xiang90

raft: don't ack MsgPreVotes when it cannot succeed
The changeset tightens the constraint around positive PreVote responses.
Previously PreVote requests of the same term as the node's current term
would be granted despite there being no way for it to succeed
(MsgPreVotes specify the next term, if the next term of the requesting
node is the same as our current term, when subsequently calling for an
election it will fail). In this case we will now simply reject the vote.

---

While here we add a panic for a comment specified-constraint.

    // The m.Term > r.Term clause is for MsgPreVote. For MsgVote m.Term
    // should always equal r.Term.

TestRawNodeStep was in violation of this by sending in Msg{Pre,}Vote
messages with unspecified terms (defaulting to 0) when the recipient
node was at a higher term.
@bdarnell

This comment has been minimized.

Show comment
Hide comment
@bdarnell

bdarnell Sep 7, 2017

Member

LGTM, but can you add a test of a PreVote at the current term that would fail without this change?

Member

bdarnell commented Sep 7, 2017

LGTM, but can you add a test of a PreVote at the current term that would fail without this change?

@xiang90

This comment has been minimized.

Show comment
Hide comment
@xiang90

xiang90 Sep 8, 2017

Contributor

lgtm after adding the test @bdarnell suggested.

Contributor

xiang90 commented Sep 8, 2017

lgtm after adding the test @bdarnell suggested.

@xiang90 xiang90 added the Raft label Sep 8, 2017

@xiang90

This comment has been minimized.

Show comment
Hide comment
@xiang90

xiang90 Sep 14, 2017

Contributor

/cc @javaforfun would you like to help write a test for this code? i would like to get this merged as soon as possible.

Contributor

xiang90 commented Sep 14, 2017

/cc @javaforfun would you like to help write a test for this code? i would like to get this merged as soon as possible.

if m.Type == pb.MsgVote && m.Term != r.Term {
panic(fmt.Sprintf("m.Type == pb.MsgVote && m.Term != r.Term: %d != %d", m.Term, r.Term))
}
if (r.Vote == None || r.Vote == m.From || (m.Type == pb.MsgPreVote && m.Term > r.Term)) &&

This comment has been minimized.

@lishuai87

lishuai87 Sep 15, 2017

Contributor

seems no different to previous.
do you mean this?
(r.Vote == None || r.Vote == m.From) && (m.Type == pb.MsgVote || (m.Type == pb.MsgPreVote && m.Term > r.Term)) &&

@lishuai87

lishuai87 Sep 15, 2017

Contributor

seems no different to previous.
do you mean this?
(r.Vote == None || r.Vote == m.From) && (m.Type == pb.MsgVote || (m.Type == pb.MsgPreVote && m.Term > r.Term)) &&

This comment has been minimized.

@xiang90

xiang90 Sep 18, 2017

Contributor

the difference that we do not grant prevote if m.term <= t.term like before.

@xiang90

xiang90 Sep 18, 2017

Contributor

the difference that we do not grant prevote if m.term <= t.term like before.

This comment has been minimized.

@xiang90

xiang90 Sep 18, 2017

Contributor

the if checking seems incorrect though. i am fixing it right now.

@xiang90

xiang90 Sep 18, 2017

Contributor

the if checking seems incorrect though. i am fixing it right now.

This comment has been minimized.

@bdarnell

bdarnell Jan 23, 2018

Member

I'm looking at this again and the change doesn't make sense to me. It doesn't change the behavior of MsgPreVote - for MsgPreVote the condition was m.Term > r.Term both before and after the change. And adding the guard m.Type == pb.MsgPreVote doesn't really change anything because we know that for MsgVote, the terms are equal and the m.Term > r.Term condition will never be true.

The new code is slightly more explicit, so we might want to merge this as a cleanup, but I don't think there's a bug being fixed here. I've been unable to modify the test in #8571 to find any cases where this change makes a difference.

I do see one small bug: If m.Type == MsgPreVote && m.Term == r.Term && r.Vote == None && r.lead != None, then we will grant a PreVote when we shouldn't. However, there are not going to be enough nodes with r.Vote == None && r.lead != None to allow the PreVote to reach a quorum unless there have also been membership changes during the term.

I'll send a PR with some cleanups here. Testing the "leader without vote" scenario is difficult because it relies on membership changes, so unless you feel strongly I think I'll skip the test for now (I'm going to do some work on PreVote testing separately).

@bdarnell

bdarnell Jan 23, 2018

Member

I'm looking at this again and the change doesn't make sense to me. It doesn't change the behavior of MsgPreVote - for MsgPreVote the condition was m.Term > r.Term both before and after the change. And adding the guard m.Type == pb.MsgPreVote doesn't really change anything because we know that for MsgVote, the terms are equal and the m.Term > r.Term condition will never be true.

The new code is slightly more explicit, so we might want to merge this as a cleanup, but I don't think there's a bug being fixed here. I've been unable to modify the test in #8571 to find any cases where this change makes a difference.

I do see one small bug: If m.Type == MsgPreVote && m.Term == r.Term && r.Vote == None && r.lead != None, then we will grant a PreVote when we shouldn't. However, there are not going to be enough nodes with r.Vote == None && r.lead != None to allow the PreVote to reach a quorum unless there have also been membership changes during the term.

I'll send a PR with some cleanups here. Testing the "leader without vote" scenario is difficult because it relies on membership changes, so unless you feel strongly I think I'll skip the test for now (I'm going to do some work on PreVote testing separately).

@xiang90 xiang90 closed this in #9204 Jan 23, 2018

xiang90 pushed a commit that referenced this pull request Jan 23, 2018

raft: Clarify conditions for granting votes and prevotes.
This includes one theoretical logic change: A node that knows the
leader of the current term will no longer grant votes, even if it has
not yet voted in this term. It also adds a `m.Type == MsgPreVote`
guard on the `m.Term > r.Term` check, which was previously thought to
be incorrect (see #8517) but was actually just unclear.

Closes #8517
Closes #8571
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment