-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: ignore non-live probing follower during log truncation #34502
Conversation
e519e7a
to
6f3a0cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This deserves a test. I'm surprised no existing tests break due to this.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/storage/raft_log_queue.go, line 326 at r1 (raw file):
if !progress.RecentActive { // Make no exceptions for followers who haven't contacted // us within a reasonable period of time.
Can you elaborate on what it means to "make no exceptions"? Perhaps this should be something like: "If a follower hasn't contacted us within a reasonable period of time, do not include that follower's Raft log position in the decision for where to truncate".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This deserves a test.
Absolutely. Wasn't my intention to merge without one. Just wanted to show the fix.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I completely missed that comment and went straight to the code. Apologies.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
6f3a0cf
to
5b2bccd
Compare
Ready for a look. Turns out that a bug in the test prevented me from catching this in the first place. |
5b2bccd
to
007afcf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/storage/raft_log_queue.go, line 326 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Can you elaborate on what it means to "make no exceptions"? Perhaps this should be something like: "If a follower hasn't contacted us within a reasonable period of time, do not include that follower's Raft log position in the decision for where to truncate".
Ping.
pkg/storage/raft_log_queue_test.go, line 260 at r2 (raw file):
Match: 0, Next: v, }
These test cases are super hard to read. There is a minimum of code, yet I think we'd be better served by more code where each test case is easier to understand (i.e. datadriven
tests). That said, this is fine for now. Just my griping.
In the previous code, a follower in probing status which was not recently active (i.e. a dead node) would permanently suppress log truncations unless the Raft log was above threshold size (but the size tracks only what the current leaseholder has written, i.e., it can undercount dramatically). As a result, snapshots to other nodes would get blocked if the log was in fact large (>16mb), leading to ranges which effectively couldn't change their set of members. This wasn't caught in TestComputeTruncateDecisionProgressStatusProbe because of a bug in the test, which was accidentally setting the Match instead of the Next of the probing follower. By setting Match, the probing follower behaved differently from the case which would trigger the bug, and so the outcome the test asserts is actually still the same (except that it failed when the bug in the test was fixed until the bug in the truncation code was also fixed). Release note (bug fix): Prevent a problem that would cause the Raft log to grow very large which in turn could prevent replication changes. wiptest
After recent fixes to the test and the code, the test was still not suggesting a truncation, but only because fewer than 90 entries were truncatable. To make it abundantly clear that a truncation would happen if this weren't the case, switch out the probing follower in the test so that a truncation would result. Release note: None
007afcf
to
678e0fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @petermattis)
pkg/storage/raft_log_queue.go, line 326 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Ping.
Thanks. Pong.
pkg/storage/raft_log_queue_test.go, line 260 at r2 (raw file):
Previously, petermattis (Peter Mattis) wrote…
These test cases are super hard to read. There is a minimum of code, yet I think we'd be better served by more code where each test case is easier to understand (i.e.
datadriven
tests). That said, this is fine for now. Just my griping.
I actually looked at that before settling on this based on need to be cherry-picked and not wanting to hold up the fix. Other than that, I agree with you and I have a WIP locally in which I try to bike shed what the datadriven test would look like. I'll send that as a separate PR (maybe not for this particular test, but I do want to start translating some to establish good examples to lean on).
It wasn't looking at the progress.State, which should be harmless but better not to trust that the Match field is correctly populated for followers in probing status. Note that there's a potential behavior change here: if a follower needs a snapshot, it will have a Match field. But if we're computing a quorum index, we implicitly assume that progress can be made from that index since a quorum of followers "has it". A follower which needs a snapshot is not able to help out with progress until it has been caught up, so including it in the quorum index is not beneficial. Release note: None
678e0fe
to
328fadc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/storage/raft_log_queue_test.go, line 260 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
I actually looked at that before settling on this based on need to be cherry-picked and not wanting to hold up the fix. Other than that, I agree with you and I have a WIP locally in which I try to bike shed what the datadriven test would look like. I'll send that as a separate PR (maybe not for this particular test, but I do want to start translating some to establish good examples to lean on).
Agreed that switching to datadriven tests will making backporting more difficult. We should avoid big cleanups like that when making changes we'll want to backport.
bors r=petermattis |
34502: storage: ignore non-live probing follower during log truncation r=petermattis a=tbg Need to write tests and such, but I wanted to at least post this before the weekend. ---- In the previous code, a follower in probing status which was not recently active (i.e. a dead node) would permanently suppress log truncations unless the Raft log was above threshold size (but the size tracks only what the current leaseholder has written, i.e., it can undercount dramatically). As a result, snapshots to other nodes would get blocked if the log was in fact large (>16mb), leading to ranges which effectively couldn't change their set of members. Release note (bug fix): Prevent a problem that would cause the Raft log to grow very large which in turn could prevent replication changes. Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
Build succeeded |
Need to write tests and such, but I wanted to at least post this
before the weekend.
In the previous code, a follower in probing status which was not
recently active (i.e. a dead node) would permanently suppress
log truncations unless the Raft log was above threshold size (but the
size tracks only what the current leaseholder has written, i.e., it
can undercount dramatically). As a result, snapshots to other nodes
would get blocked if the log was in fact large (>16mb), leading to
ranges which effectively couldn't change their set of members.
Release note (bug fix): Prevent a problem that would cause the Raft log
to grow very large which in turn could prevent replication changes.