Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: ignore non-live probing follower during log truncation #34502

Merged
merged 3 commits into from
Feb 8, 2019

Conversation

tbg
Copy link
Member

@tbg tbg commented Feb 1, 2019

Need to write tests and such, but I wanted to at least post this
before the weekend.


In the previous code, a follower in probing status which was not
recently active (i.e. a dead node) would permanently suppress
log truncations unless the Raft log was above threshold size (but the
size tracks only what the current leaseholder has written, i.e., it
can undercount dramatically). As a result, snapshots to other nodes
would get blocked if the log was in fact large (>16mb), leading to
ranges which effectively couldn't change their set of members.

Release note (bug fix): Prevent a problem that would cause the Raft log
to grow very large which in turn could prevent replication changes.

@tbg tbg requested a review from a team February 1, 2019 21:09
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deserves a test. I'm surprised no existing tests break due to this.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @tbg)


pkg/storage/raft_log_queue.go, line 326 at r1 (raw file):

		if !progress.RecentActive {
			// Make no exceptions for followers who haven't contacted
			// us within a reasonable period of time.

Can you elaborate on what it means to "make no exceptions"? Perhaps this should be something like: "If a follower hasn't contacted us within a reasonable period of time, do not include that follower's Raft log position in the decision for where to truncate".

Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deserves a test.

Absolutely. Wasn't my intention to merge without one. Just wanted to show the fix.

✂-1

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I completely missed that comment and went straight to the code. Apologies.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

@tbg
Copy link
Member Author

tbg commented Feb 6, 2019

Ready for a look. Turns out that a bug in the test prevented me from catching this in the first place.

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @tbg)


pkg/storage/raft_log_queue.go, line 326 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Can you elaborate on what it means to "make no exceptions"? Perhaps this should be something like: "If a follower hasn't contacted us within a reasonable period of time, do not include that follower's Raft log position in the decision for where to truncate".

Ping.


pkg/storage/raft_log_queue_test.go, line 260 at r2 (raw file):

						Match:        0,
						Next:         v,
					}

These test cases are super hard to read. There is a minimum of code, yet I think we'd be better served by more code where each test case is easier to understand (i.e. datadriven tests). That said, this is fine for now. Just my griping.

In the previous code, a follower in probing status which was not
recently active (i.e. a dead node) would permanently suppress
log truncations unless the Raft log was above threshold size (but the
size tracks only what the current leaseholder has written, i.e., it
can undercount dramatically). As a result, snapshots to other nodes
would get blocked if the log was in fact large (>16mb), leading to
ranges which effectively couldn't change their set of members.

This wasn't caught in TestComputeTruncateDecisionProgressStatusProbe
because of a bug in the test, which was accidentally setting the Match
instead of the Next of the probing follower. By setting Match, the
probing follower behaved differently from the case which would trigger
the bug, and so the outcome the test asserts is actually still the same
(except that it failed when the bug in the test was fixed until the bug
in the truncation code was also fixed).

Release note (bug fix): Prevent a problem that would cause the Raft log
to grow very large which in turn could prevent replication changes.

wiptest
After recent fixes to the test and the code, the test was still not
suggesting a truncation, but only because fewer than 90 entries were
truncatable. To make it abundantly clear that a truncation would
happen if this weren't the case, switch out the probing follower in
the test so that a truncation would result.

Release note: None
Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @petermattis)


pkg/storage/raft_log_queue.go, line 326 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Ping.

Thanks. Pong.


pkg/storage/raft_log_queue_test.go, line 260 at r2 (raw file):

Previously, petermattis (Peter Mattis) wrote…

These test cases are super hard to read. There is a minimum of code, yet I think we'd be better served by more code where each test case is easier to understand (i.e. datadriven tests). That said, this is fine for now. Just my griping.

I actually looked at that before settling on this based on need to be cherry-picked and not wanting to hold up the fix. Other than that, I agree with you and I have a WIP locally in which I try to bike shed what the datadriven test would look like. I'll send that as a separate PR (maybe not for this particular test, but I do want to start translating some to establish good examples to lean on).

It wasn't looking at the progress.State, which should be harmless
but better not to trust that the Match field is correctly populated
for followers in probing status.

Note that there's a potential behavior change here: if a follower
needs a snapshot, it will have a Match field. But if we're computing
a quorum index, we implicitly assume that progress can be made from
that index since a quorum of followers "has it". A follower which
needs a snapshot is not able to help out with progress until it
has been caught up, so including it in the quorum index is not
beneficial.

Release note: None
Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @tbg)


pkg/storage/raft_log_queue_test.go, line 260 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I actually looked at that before settling on this based on need to be cherry-picked and not wanting to hold up the fix. Other than that, I agree with you and I have a WIP locally in which I try to bike shed what the datadriven test would look like. I'll send that as a separate PR (maybe not for this particular test, but I do want to start translating some to establish good examples to lean on).

Agreed that switching to datadriven tests will making backporting more difficult. We should avoid big cleanups like that when making changes we'll want to backport.

@tbg
Copy link
Member Author

tbg commented Feb 8, 2019

bors r=petermattis

craig bot pushed a commit that referenced this pull request Feb 8, 2019
34502: storage: ignore non-live probing follower during log truncation r=petermattis a=tbg

Need to write tests and such, but I wanted to at least post this
before the weekend.

----

In the previous code, a follower in probing status which was not
recently active (i.e. a dead node) would permanently suppress
log truncations unless the Raft log was above threshold size (but the
size tracks only what the current leaseholder has written, i.e., it
can undercount dramatically). As a result, snapshots to other nodes
would get blocked if the log was in fact large (>16mb), leading to
ranges which effectively couldn't change their set of members.

Release note (bug fix): Prevent a problem that would cause the Raft log
to grow very large which in turn could prevent replication changes.

Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
@tbg tbg mentioned this pull request Feb 8, 2019
17 tasks
@craig
Copy link
Contributor

craig bot commented Feb 8, 2019

Build succeeded

@craig craig bot merged commit 328fadc into cockroachdb:master Feb 8, 2019
@tbg tbg deleted the fix/log-trunc-bug branch March 13, 2019 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants