Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upstorage: correctly handle empty forwarded proposals #31066
Conversation
tschottdorf
requested a review
from cockroachdb/core-prs
as a
code owner
Oct 8, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
tschottdorf
reviewed
Oct 8, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/storage/replica.go, line 4107 at r1 (raw file):
} if rg.Status().RaftState != raft.StateLeader {
Isn't this not enough though? We might think we're the leader but we're not. Wouldn't that result in us tracking something that may never actually get committed, and rejecting reproposals based on that?
tschottdorf
requested a review
from
nvanbenschoten
Oct 8, 2018
petermattis
approved these changes
Oct 8, 2018
for the protection against processing empty proposals.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/storage/replica.go, line 4107 at r1 (raw file):
We might think we're the leader but we're not.
How could that happen? This method is called immediately after raftGroup.Step which should have an up to date view of leadership. One problem that could happen is that we're not the leader for part of Message, but one of the Entries steps the state machine causing us to become the leader.
tschottdorf
reviewed
Oct 8, 2018
Reviewable status:
complete! 1 of 0 LGTMs obtained
pkg/storage/replica.go, line 4107 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
We might think we're the leader but we're not.
How could that happen? This method is called immediately after
raftGroup.Stepwhich should have an up to date view of leadership. One problem that could happen is that we're not the leader for part ofMessage, but one of theEntriessteps the state machine causing us to become the leader.
That's just the local view of leadership, the rest of the replica set could've moved on to a new leader under a higher term.
I think this is OK though, for then the next reproposal would reach a new leader.
Good point though about obtaining leadership. Aren't we in the clear though? We Step the whole message into the Raft group before checking the leadership.
petermattis
reviewed
Oct 8, 2018
Reviewable status:
complete! 1 of 0 LGTMs obtained
pkg/storage/replica.go, line 4107 at r1 (raw file):
Good point though about obtaining leadership. Aren't we in the clear though? We Step the whole message into the Raft group before checking the leadership.
I think you're correct. We step the whole message, but ErrProposalDropped will be returned if we're not the leader at the start of processing Message.Entries and see MsgProp.
bdarnell
reviewed
Oct 8, 2018
Reviewable status:
complete! 1 of 0 LGTMs obtained
pkg/storage/replica.go, line 4119 at r1 (raw file):
switch e.Type { case raftpb.EntryNormal: if len(e.Data) == 0 {
Both kinds of "empty" proposals exist. Raft itself generates log entries with len(e.Data) == 0 for a couple of reasons (new leaders and concurrent config changes), and we generate the "command id with empty data" ones when waking the leader. I think we still need the guard here for empty data after decoding the command id, in addition to the guard for completely empty entries.
I also don't understand how a raft-generated empty entry would ever get forwarded. These entries are only generated on leaders and become MsgApp immediately, never MsgProp.
nvanbenschoten
requested changes
Oct 8, 2018
Reviewable status:
complete! 1 of 0 LGTMs obtained
pkg/storage/replica.go, line 4107 at r1 (raw file):
I think this is OK though, for then the next reproposal would reach a new leader.
Yes, that's the rationale that's being used to justify this. However, I'm really starting to doubt our ability to figure out what's going to end up in the Raft log and what's not. I'd like to discuss #30064 (comment) as an alternative to all of this.
pkg/storage/replica_test.go, line 8575 at r1 (raw file):
}, expDrop: false, expRemotePropsAfter: 2,
Hold on, we don't want to make this change. This was the cause of a memory leak before which was fixed in #29618.
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
tschottdorf
Oct 9, 2018
Member
|
I'm going to let this sit.
#30064 (comment)
seems reasonable to me!
On Tue, Oct 9, 2018 at 12:48 AM Nathan VanBenschoten < ***@***.***> wrote:
***@***.**** requested changes on this pull request.
[image: <img class="emoji" title=":lgtm_cancel:" alt=":lgtm_cancel:" align="absmiddle" src="https://reviewable.io/lgtm_cancel.png" height="20" width="61"/>]
<https://camo.githubusercontent.com/a4a681670ff21a0685eaeb1e6cfe1d2543652ce0/68747470733a2f2f72657669657761626c652e696f2f6c67746d5f63616e63656c2e706e67>
*Reviewable
<https://reviewable.io/reviews/cockroachdb/cockroach/31066#-:-LOKrEQmDIiiUFldWYRN:b8b79qu>*
status: [image:
--
…-- Tobias
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
Closing. @nvanbenschoten will rip all of this out. |
tschottdorf commentedOct 8, 2018
The code assumed that an empty proposal would be a command ID with empty
data, but it's really the empty slice (command IDs+data is our encoding;
Raft itself emits an entry with completely blank data).
Fixes #31050.
Release note: None