storage: ignore non-live probing follower during log truncation #34502

tbg · 2019-02-01T21:09:28Z

Need to write tests and such, but I wanted to at least post this
before the weekend.

In the previous code, a follower in probing status which was not
recently active (i.e. a dead node) would permanently suppress
log truncations unless the Raft log was above threshold size (but the
size tracks only what the current leaseholder has written, i.e., it
can undercount dramatically). As a result, snapshots to other nodes
would get blocked if the log was in fact large (>16mb), leading to
ranges which effectively couldn't change their set of members.

Release note (bug fix): Prevent a problem that would cause the Raft log
to grow very large which in turn could prevent replication changes.
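
For readers skimming the thread, here is a minimal sketch of the idea behind the fix. It is not the code in `raft_log_queue.go`: the `followerProgress` type, the state constants, and `truncationIndex` are hypothetical, simplified stand-ins, and the sketch assumes that a live probing follower pins truncation at the first index while any other live follower pins it at its `Match`.

```go
package main

import "fmt"

// progressState is an illustrative stand-in for a Raft follower's progress state.
type progressState int

const (
	stateProbe progressState = iota
	stateReplicate
)

// followerProgress is a hypothetical, simplified view of what the leader
// tracks for each follower.
type followerProgress struct {
	State        progressState
	Match        uint64
	Next         uint64
	RecentActive bool
}

// truncationIndex returns the index up to which the log could be truncated
// without cutting off any follower that is still worth waiting for.
// Assumption: quorumIndex is the starting candidate and firstIndex is the
// first entry currently in the log.
func truncationIndex(firstIndex, quorumIndex uint64, followers []followerProgress) uint64 {
	idx := quorumIndex
	for _, p := range followers {
		if !p.RecentActive {
			// A follower that hasn't contacted us recently does not get a say
			// in where to truncate, regardless of its state.
			continue
		}
		protect := p.Match
		if p.State == stateProbe {
			// We don't know where a live probing follower is, so be
			// conservative and keep the whole log for it.
			protect = firstIndex
		}
		if protect < idx {
			idx = protect
		}
	}
	return idx
}

func main() {
	live := []followerProgress{
		{State: stateReplicate, Match: 100, Next: 101, RecentActive: true},
		{State: stateReplicate, Match: 100, Next: 101, RecentActive: true},
	}
	deadProbe := followerProgress{State: stateProbe, Match: 0, Next: 1, RecentActive: false}
	liveProbe := followerProgress{State: stateProbe, Match: 0, Next: 1, RecentActive: true}

	fmt.Println(truncationIndex(10, 100, append(live[:2:2], deadProbe))) // 100
	fmt.Println(truncationIndex(10, 100, append(live[:2:2], liveProbe))) // 10
}
```

With the dead probing follower the candidate index stays at the quorum index, so truncation can proceed. Before the fix, the equivalent of the second case applied to dead followers too.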

cockroach-teamcity · 2019-02-01T21:09:36Z


petermattis

This deserves a test. I'm surprised no existing tests break due to this.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

pkg/storage/raft_log_queue.go, line 326 at r1 (raw file):

		if !progress.RecentActive {
			// Make no exceptions for followers who haven't contacted
			// us within a reasonable period of time.

Can you elaborate on what it means to "make no exceptions"? Perhaps this should be something like: "If a follower hasn't contacted us within a reasonable period of time, do not include that follower's Raft log position in the decision for where to truncate".

tbg

This deserves a test.

Absolutely. Wasn't my intention to merge without one. Just wanted to show the fix.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

petermattis

Oops, I completely missed that comment and went straight to the code. Apologies.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

tbg · 2019-02-06T12:14:39Z

Ready for a look. Turns out that a bug in the test prevented me from catching this in the first place.

petermattis

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

pkg/storage/raft_log_queue.go, line 326 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Can you elaborate on what it means to "make no exceptions"? Perhaps this should be something like: "If a follower hasn't contacted us within a reasonable period of time, do not include that follower's Raft log position in the decision for where to truncate".

Ping.

pkg/storage/raft_log_queue_test.go, line 260 at r2 (raw file):

						Match:        0,
						Next:         v,
					}

These test cases are super hard to read. There is a minimum of code, yet I think we'd be better served by more code where each test case is easier to understand (i.e. datadriven tests). That said, this is fine for now. Just my griping.

In the previous code, a follower in probing status which was not recently active (i.e. a dead node) would permanently suppress log truncations unless the Raft log was above threshold size (but the size tracks only what the current leaseholder has written, i.e., it can undercount dramatically). As a result, snapshots to other nodes would get blocked if the log was in fact large (>16mb), leading to ranges which effectively couldn't change their set of members.

This wasn't caught in TestComputeTruncateDecisionProgressStatusProbe because of a bug in the test, which was accidentally setting the Match instead of the Next of the probing follower. By setting Match, the probing follower behaved differently from the case which would trigger the bug, and so the outcome the test asserts is actually still the same (except that it failed when the bug in the test was fixed until the bug in the truncation code was also fixed).

Release note (bug fix): Prevent a problem that would cause the Raft log to grow very large which in turn could prevent replication changes.

wiptest

After recent fixes to the test and the code, the test was still not suggesting a truncation, but only because fewer than 90 entries were truncatable. To make it abundantly clear that a truncation would happen if this weren't the case, switch out the probing follower in the test so that a truncation would result.

Release note: None

tbg

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @petermattis)

pkg/storage/raft_log_queue.go, line 326 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Ping.

Thanks. Pong.

pkg/storage/raft_log_queue_test.go, line 260 at r2 (raw file):

Previously, petermattis (Peter Mattis) wrote…

These test cases are super hard to read. There is a minimum of code, yet I think we'd be better served by more code where each test case is easier to understand (i.e. datadriven tests). That said, this is fine for now. Just my griping.

I actually looked at that before settling on this, because the fix needs to be cherry-picked and I didn't want to hold it up. Other than that, I agree with you, and I have a WIP locally in which I try to bikeshed what the datadriven test would look like. I'll send that as a separate PR (maybe not for this particular test, but I do want to start translating some to establish good examples to lean on).

It wasn't looking at the progress.State. That should be harmless, but it is better not to trust that the Match field is correctly populated for followers in probing status.

Note that there's a potential behavior change here: if a follower needs a snapshot, it will have a Match field. But if we're computing a quorum index, we implicitly assume that progress can be made from that index since a quorum of followers "has it". A follower which needs a snapshot is not able to help out with progress until it has been caught up, so including it in the quorum index is not beneficial.

Release note: None
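
The reasoning in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: `followerProgress`, the state constants, and `quorumIndex` are hypothetical names, and the sketch assumes that only followers in the replicating state contribute their `Match` while all others count as index 0.

```go
package main

import (
	"fmt"
	"sort"
)

// progressState is an illustrative stand-in for a Raft follower's progress state.
type progressState int

const (
	stateProbe progressState = iota
	stateReplicate
	stateSnapshot
)

// followerProgress is a hypothetical, simplified per-follower record.
type followerProgress struct {
	State progressState
	Match uint64
}

// quorumIndex returns the highest index that a quorum of replicas is known
// to have, trusting Match only for followers that are actively replicating.
func quorumIndex(followers []followerProgress) uint64 {
	if len(followers) == 0 {
		return 0
	}
	matches := make([]uint64, 0, len(followers))
	for _, p := range followers {
		if p.State == stateReplicate {
			matches = append(matches, p.Match)
		} else {
			// Probing or snapshot-needing followers can't help make progress
			// from their Match, so treat them as having nothing.
			matches = append(matches, 0)
		}
	}
	// Sort descending; the entry at position quorum-1 is held by a quorum.
	sort.Slice(matches, func(i, j int) bool { return matches[i] > matches[j] })
	quorum := len(followers)/2 + 1
	return matches[quorum-1]
}

func main() {
	followers := []followerProgress{
		{State: stateReplicate, Match: 100},
		{State: stateReplicate, Match: 90},
		{State: stateSnapshot, Match: 95},
	}
	// The snapshot follower's Match of 95 is ignored, so the result is 90.
	fmt.Println(quorumIndex(followers))
}
```

If the follower needing a snapshot were counted with its `Match` of 95, the result would be 95 instead of 90, which is the behavior change the message describes.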

petermattis

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

pkg/storage/raft_log_queue_test.go, line 260 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I actually looked at that before settling on this based on need to be cherry-picked and not wanting to hold up the fix. Other than that, I agree with you and I have a WIP locally in which I try to bike shed what the datadriven test would look like. I'll send that as a separate PR (maybe not for this particular test, but I do want to start translating some to establish good examples to lean on).

Agreed that switching to datadriven tests will make backporting more difficult. We should avoid big cleanups like that when making changes we'll want to backport.

tbg · 2019-02-08T09:07:25Z

bors r=petermattis

34502: storage: ignore non-live probing follower during log truncation r=petermattis a=tbg

Need to write tests and such, but I wanted to at least post this before the weekend.

----

In the previous code, a follower in probing status which was not recently active (i.e. a dead node) would permanently suppress log truncations unless the Raft log was above threshold size (but the size tracks only what the current leaseholder has written, i.e., it can undercount dramatically). As a result, snapshots to other nodes would get blocked if the log was in fact large (>16mb), leading to ranges which effectively couldn't change their set of members.

Release note (bug fix): Prevent a problem that would cause the Raft log to grow very large which in turn could prevent replication changes.

Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>

craig · 2019-02-08T09:58:17Z

Build succeeded

tbg requested a review from a team February 1, 2019 21:09

tbg force-pushed the fix/log-trunc-bug branch from e519e7a to 6f3a0cf Compare February 1, 2019 21:11

petermattis reviewed Feb 1, 2019

tbg mentioned this pull request Feb 1, 2019

storage: inability to up-replicate if Raft log is too large #33071

Closed

tbg commented Feb 1, 2019

petermattis reviewed Feb 1, 2019

tbg force-pushed the fix/log-trunc-bug branch from 6f3a0cf to 5b2bccd Compare February 6, 2019 12:14

tbg force-pushed the fix/log-trunc-bug branch from 5b2bccd to 007afcf Compare February 6, 2019 13:29

petermattis approved these changes Feb 6, 2019

tbg added 2 commits February 7, 2019 09:20

tbg force-pushed the fix/log-trunc-bug branch from 007afcf to 678e0fe Compare February 7, 2019 08:20

tbg commented Feb 7, 2019

tbg force-pushed the fix/log-trunc-bug branch from 678e0fe to 328fadc Compare February 7, 2019 13:37

petermattis approved these changes Feb 7, 2019

tbg mentioned this pull request Feb 8, 2019

release: v2.2.0-alpha.20190211 #34288

Closed

17 tasks

craig bot merged commit 328fadc into cockroachdb:master Feb 8, 2019

tbg mentioned this pull request Feb 11, 2019

backport-2.1: storage: log truncation bug fixes #34774

Merged

tbg deleted the fix/log-trunc-bug branch March 13, 2019 11:51
