Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: terminate nodes in minority on stats mismatch #41902

Merged
merged 1 commit into from Oct 28, 2019

Conversation

tbg
Copy link
Member

@tbg tbg commented Oct 24, 2019

First commit is #41893


Following up on the parent commit, the consistency checker now instructs
the replicas corresponding to the SHA with the lowest multiplicity to
terminate.

Note that this change is a candidate for backporting into 19.2. An "old"
leaseholder will never populate the newly introduced field, and will always
terminate itself. A new leaseholder will populate the field, but only new
nodes will interpret it. In the worst case, the result will be a
consistency failure that does not lead to a fatal error (but collects
RocksDB checkpoints just the same). Since no significant time is usually
spent in a mixed-patch-releases, this is not a concern that outweighs the
benefit of terminating (heuristically) the "right" nodes.

Release note (general improvement): when the replicas within a range have
found to been corrupted, the outliers will be terminated. Previously, the
leaseholder replica would terminate, regardless of which replicas disagreed
with each other. This is expected to curb the spreading of corrupted data
better than the previous approach.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Mostly just asking for comments to help my 🔬 🐶 self. Remember to change the second commit message to remove reference to "parent commit" once #41893 is merged separately.

Reviewed 2 of 2 files at r1, 11 of 11 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @tbg)


pkg/roachpb/api.proto, line 488 at r2 (raw file):

  // directory and needs to be removed manually to avoid leaking disk space.
  bool checkpoint = 4;
  repeated ReplicaDescriptor terminate = 5 [(gogoproto.nullable) = false];

Could use a comment, as below. I have no real suggestions to make but 'terminate' sounds like a boolean, not a list of replicas to be terminated.


pkg/roachpb/api.proto, line 1208 at r2 (raw file):

  // in the engine's auxiliary directory.
  bool checkpoint = 6;
  // If set, specifies the replicas which are the most likely source of the

[nit] set -> non-empty.


pkg/storage/consistency_queue_test.go, line 243 at r2 (raw file):

			notifyReportDiff <- struct{}{}
		}
	// Store 0 will panic.

Update this comment. also, notifyFatal?


pkg/storage/replica_consistency.go, line 268 at r2 (raw file):

	}

	// Diff was printed above, so call logFunc with a short message only.

logFunc no longer applicable.


pkg/storage/replica_proposal.go, line 254 at r2 (raw file):

		if shouldFatal {
			// This node should fatal as a result of a previous consistency
			// check (i.e. this round is carried out only to obtain a diff).

Aside: The fact that this takes place over two rounds, is this documented somewhere? From a quick skim I couldn't find it. If there, the fact that replicas self terminate in the second round merits a mention there.


pkg/storage/storagepb/proposer_kv.proto, line 87 at r2 (raw file):

  // inconsistency and we want to preserve as much state as possible.
  bool checkpoint = 4;
  repeated roachpb.ReplicaDescriptor terminate = 6 [(gogoproto.nullable) = false];

Comment here.

Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

change the second commit message to remove reference to "parent commit" once #41893 is merged separately.

Done

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif)


pkg/roachpb/api.proto, line 488 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Could use a comment, as below. I have no real suggestions to make but 'terminate' sounds like a boolean, not a list of replicas to be terminated.

Done.


pkg/roachpb/api.proto, line 1208 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

[nit] set -> non-empty.

Done.


pkg/storage/consistency_queue_test.go, line 243 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Update this comment. also, notifyFatal?

Done.


pkg/storage/replica_consistency.go, line 268 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

logFunc no longer applicable.

Done.


pkg/storage/replica_proposal.go, line 254 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Aside: The fact that this takes place over two rounds, is this documented somewhere? From a quick skim I couldn't find it. If there, the fact that replicas self terminate in the second round merits a mention there.

Done.


pkg/storage/storagepb/proposer_kv.proto, line 87 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Comment here.

Done.

Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 9 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif and @tbg)


pkg/storage/consistency_queue_test.go, line 243 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Done.

Did you mean s1?


pkg/storage/storagepb/proposer_kv.proto, line 87 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Done.

ha, "processing this comment"?

Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 9 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif and @tbg)

Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 9 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @tbg)

Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif)


pkg/storage/consistency_queue_test.go, line 243 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Did you mean s1?

It's always a struggle with indexes vs IDs -- mtc.Store(1).Ident.StoreID == 2, s2 refers to "the store with ID 2". I added a reference to the index.


pkg/storage/storagepb/proposer_kv.proto, line 87 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

ha, "processing this comment"?

🙈 done.

Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

Following up on cockroachdb#41893, the consistency checker now instructs the
replicas corresponding to the SHA with the lowest multiplicity to
terminate.

Note that this change is a candidate for backporting into 19.2. An "old"
leaseholder will never populate the newly introduced field, and will
always terminate itself. A new leaseholder will populate the field, but
only new nodes will interpret it. In the worst case, the result will be
a consistency failure that does not lead to a fatal error (but collects
RocksDB checkpoints just the same). Since no significant time is usually
spent in a mixed-patch-releases, this is not a concern that outweighs
the benefit of terminating (heuristically) the "right" nodes.

Release note (general improvement): when the replicas within a range
have found to been corrupted, the outliers will be terminated.
Previously, the leaseholder replica would terminate, regardless of
which replicas disagreed with each other. This is expected to curb
the spreading of corrupted data better than the previous approach.
@tbg tbg merged commit 8394d86 into cockroachdb:master Oct 28, 2019
@tbg tbg deleted the deathrattle-2 branch October 28, 2019 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants