New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: terminate nodes in minority on stats mismatch #41902
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly just asking for comments to help my 🔬 🐶 self. Remember to change the second commit message to remove reference to "parent commit" once #41893 is merged separately.
Reviewed 2 of 2 files at r1, 11 of 11 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/roachpb/api.proto, line 488 at r2 (raw file):
// directory and needs to be removed manually to avoid leaking disk space. bool checkpoint = 4; repeated ReplicaDescriptor terminate = 5 [(gogoproto.nullable) = false];
Could use a comment, as below. I have no real suggestions to make but 'terminate' sounds like a boolean, not a list of replicas to be terminated.
pkg/roachpb/api.proto, line 1208 at r2 (raw file):
// in the engine's auxiliary directory. bool checkpoint = 6; // If set, specifies the replicas which are the most likely source of the
[nit] set -> non-empty.
pkg/storage/consistency_queue_test.go, line 243 at r2 (raw file):
notifyReportDiff <- struct{}{} } // Store 0 will panic.
Update this comment. also, notifyFatal?
pkg/storage/replica_consistency.go, line 268 at r2 (raw file):
} // Diff was printed above, so call logFunc with a short message only.
logFunc
no longer applicable.
pkg/storage/replica_proposal.go, line 254 at r2 (raw file):
if shouldFatal { // This node should fatal as a result of a previous consistency // check (i.e. this round is carried out only to obtain a diff).
Aside: The fact that this takes place over two rounds, is this documented somewhere? From a quick skim I couldn't find it. If there, the fact that replicas self terminate in the second round merits a mention there.
pkg/storage/storagepb/proposer_kv.proto, line 87 at r2 (raw file):
// inconsistency and we want to preserve as much state as possible. bool checkpoint = 4; repeated roachpb.ReplicaDescriptor terminate = 6 [(gogoproto.nullable) = false];
Comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
change the second commit message to remove reference to "parent commit" once #41893 is merged separately.
Done
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif)
pkg/roachpb/api.proto, line 488 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Could use a comment, as below. I have no real suggestions to make but 'terminate' sounds like a boolean, not a list of replicas to be terminated.
Done.
pkg/roachpb/api.proto, line 1208 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[nit] set -> non-empty.
Done.
pkg/storage/consistency_queue_test.go, line 243 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Update this comment. also, notifyFatal?
Done.
pkg/storage/replica_consistency.go, line 268 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
logFunc
no longer applicable.
Done.
pkg/storage/replica_proposal.go, line 254 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Aside: The fact that this takes place over two rounds, is this documented somewhere? From a quick skim I couldn't find it. If there, the fact that replicas self terminate in the second round merits a mention there.
Done.
pkg/storage/storagepb/proposer_kv.proto, line 87 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Comment here.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 9 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif and @tbg)
pkg/storage/consistency_queue_test.go, line 243 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Done.
Did you mean s1?
pkg/storage/storagepb/proposer_kv.proto, line 87 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Done.
ha, "processing this comment"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 9 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif and @tbg)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 9 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @tbg)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif)
pkg/storage/consistency_queue_test.go, line 243 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Did you mean s1?
It's always a struggle with indexes vs IDs -- mtc.Store(1).Ident.StoreID == 2
, s2
refers to "the store with ID 2". I added a reference to the index.
pkg/storage/storagepb/proposer_kv.proto, line 87 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
ha, "processing this comment"?
🙈 done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
Following up on cockroachdb#41893, the consistency checker now instructs the replicas corresponding to the SHA with the lowest multiplicity to terminate. Note that this change is a candidate for backporting into 19.2. An "old" leaseholder will never populate the newly introduced field, and will always terminate itself. A new leaseholder will populate the field, but only new nodes will interpret it. In the worst case, the result will be a consistency failure that does not lead to a fatal error (but collects RocksDB checkpoints just the same). Since no significant time is usually spent in a mixed-patch-releases, this is not a concern that outweighs the benefit of terminating (heuristically) the "right" nodes. Release note (general improvement): when the replicas within a range have found to been corrupted, the outliers will be terminated. Previously, the leaseholder replica would terminate, regardless of which replicas disagreed with each other. This is expected to curb the spreading of corrupted data better than the previous approach.
First commit is #41893
Following up on the parent commit, the consistency checker now instructs
the replicas corresponding to the SHA with the lowest multiplicity to
terminate.
Note that this change is a candidate for backporting into 19.2. An "old"
leaseholder will never populate the newly introduced field, and will always
terminate itself. A new leaseholder will populate the field, but only new
nodes will interpret it. In the worst case, the result will be a
consistency failure that does not lead to a fatal error (but collects
RocksDB checkpoints just the same). Since no significant time is usually
spent in a mixed-patch-releases, this is not a concern that outweighs the
benefit of terminating (heuristically) the "right" nodes.
Release note (general improvement): when the replicas within a range have
found to been corrupted, the outliers will be terminated. Previously, the
leaseholder replica would terminate, regardless of which replicas disagreed
with each other. This is expected to curb the spreading of corrupted data
better than the previous approach.