kvserver: ensure cleanup of all abandoned learners before an `AdminRelocateRange` #80205

aayushshah15 · 2022-04-19T22:21:46Z

Previously, after #79405,
maybeLeaveAtomicChangeReplicasAndRemoveLearners() could return a range
descriptor that contained learner replicas. This was because of the idempotency
introduced in #79405, which relaxed how/when we failed replication changes due
to descriptor mismatch. The hazard was that, while we would ensure that the
learner we were trying to remove was indeed gone, we wouldn't do enough to
ensure that there weren't new learners that were added to the range due to
other concurrent AdminRelocateRange calls.

This was a violation of its contract and could sometimes fail
AdminRelocateRange requests.

This commit makes it such that when maybeLeaveAtomicChangeReplicasAndRemoveLearners
returns a non-error response, it guarantees that all the learner replicas
have been removed according to the returned range descriptor.

Resolves #79887

Release note: None

cockroach-teamcity · 2022-04-19T22:21:54Z

This change is

aayushshah15 · 2022-04-19T22:22:37Z

@nvanbenschoten, this PR is just a proposal and not something I'm super sold on. What do you think?

…locateRange` Previously, after cockroachdb#79405, `maybeLeaveAtomicChangeReplicasAndRemoveLearners()` could return a range descriptor that contained learner replicas. This was because of the idempotency introduced in cockroachdb#79405, which relaxed how/when we failed replication changes due to descriptor mismatch. The hazard was that, while we would ensure that the learner we were trying to remove was indeed gone, we wouldn't do enough to ensure that there weren't new learners that were added to the range due to other concurrent `AdminRelocateRange` calls. This was a violation of its contract and could sometimes fail `AdminRelocateRange` requests. This commit makes it such that when `maybeLeaveAtomicChangeReplicasAndRemoveLearners` returns a non-error response, it guarantees that _all_ the learner replicas have been removed according to the returned range descriptor. Release note: None

This commit is an alternative to cockroachdb#80205. In cockroachdb#79405, we relaxed the check that `AdminChangeReplicas` performs comparing the expected range descriptor state against the range descriptor in KV when we were performing single learner removals. This was because these learner removals are performed at the end of a replication change (i.e. after its already completed) and thus, could be made idempotent. The above change messed with a subtle part of the contract of `maybeLeaveAtomicChangeReplicasAndRemoveLearners()` since now it was no longer guaranteed to always successfully return with a range desc that did not contain learners (due to intervening replication changes that were now no longer detected because of the relaxation mentioned above). This commit fixes this regression by making `maybeLeaveAtomicChangeReplicasAndRemoveLearners` check at the end if we've raced with another replication change. If so, we return an error instead of incorrectly returning a successful response with a descriptor that could still have learners or be in a joint config. Release note: None

nvanbenschoten

I think I like this approach better than #80205. That PR feels like a partial regression of #79405, so I don't understand the benefit of a state with both of those PRs vs. a state with neither.

However, this PR does pose a risk of getting stuck in the loop if some other actor continuously adds learners.

I wonder if there's a third option that scopes down the call to maybeLeaveAtomicChangeReplicasAndRemoveLearners at the end of changeReplicasImpl. If we only want the demoted learners to be removed in this case, why are we requiring that all learners be removed?

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)

pkg/kv/kvserver/replica_command.go line 1324 at r1 (raw file):

	// Now the config isn't joint any more, but we may have demoted some voters
	// into learners. These learners should go as well.
	for {

Now that #79379 has landed, Should we be checking hasOutstandingLearnerSnapshotInFlight inside of this loop?

pkg/kv/kvserver/replica_command.go line 1338 at r1 (raw file):

		var err error
		log.VEventf(ctx, 2, `removing learner replica %v from %v`, removalTarget, desc)
		desc, err = execChangeReplicasTxn(

Having seen fallout already, I'm somewhat wary that #79405 might have other consequences for callers that are removing a single learner replica and expect execChangeReplicasTxn to be a true compare-and-swap on the full descriptor. Do you think that's a concern? If so, do you think there's merit in using changeReplicasTxnArgs to scope these limited compare-and-swap fast-paths to just the callers that want them, instead of inferring them from the shape of the internalReplicationChanges?

aayushshah15 requested a review from a team as a code owner April 19, 2022 22:21

aayushshah15 force-pushed the 20220419_fixRelocateRoachtest branch from 3aae6de to f4d79e9 Compare April 20, 2022 19:24

aayushshah15 requested a review from nvanbenschoten April 20, 2022 19:53

aayushshah15 mentioned this pull request Apr 21, 2022

kvserver: detect (and fail on) races during learner replica removals #80329

Open

nvanbenschoten reviewed Apr 27, 2022

View reviewed changes

nvanbenschoten mentioned this pull request Jul 18, 2022

roachtest: kv50/rangelookups/relocate/nodes=8 failed [waiting for backport] #79887

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kvserver: ensure cleanup of all abandoned learners before an `AdminRelocateRange` #80205

kvserver: ensure cleanup of all abandoned learners before an `AdminRelocateRange` #80205

aayushshah15 commented Apr 19, 2022 •

edited

cockroach-teamcity commented Apr 19, 2022

aayushshah15 commented Apr 19, 2022

nvanbenschoten left a comment

kvserver: ensure cleanup of all abandoned learners before an AdminRelocateRange #80205

Are you sure you want to change the base?

kvserver: ensure cleanup of all abandoned learners before an AdminRelocateRange #80205

Conversation

aayushshah15 commented Apr 19, 2022 • edited

cockroach-teamcity commented Apr 19, 2022

aayushshah15 commented Apr 19, 2022

nvanbenschoten left a comment

Choose a reason for hiding this comment

kvserver: ensure cleanup of all abandoned learners before an `AdminRelocateRange` #80205

kvserver: ensure cleanup of all abandoned learners before an `AdminRelocateRange` #80205

aayushshah15 commented Apr 19, 2022 •

edited