Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-20.1: kvserver: synchronize replica removal with read-only requests #64377

Merged

Conversation

erikgrinaker
Copy link
Contributor

Backport 1/1 commits from #64324. Also backport necessary test infrastructure from #59059, #59086 (partial), #59333 (partial), and #64242.

/cc @cockroachdb/release @cockroachdb/kv


Replica removal did not synchronize with in-flight read-only requests,
which could cause them to be evaluated on a removed (empty) replica,
returning an empty result.

This patch fixes the problem by locking Replica.readOnlyCmdMu during
replica removal, thus either waiting for read-only requests to complete
or not evaluating them.

Resolves #64325.

Release note (bug fix): Fixed a race condition where read-only requests
during replica removal (e.g. during range merges or rebalancing) could
be evaluated on the removed replica, returning an empty result.

@erikgrinaker erikgrinaker self-assigned this Apr 29, 2021
@cockroach-teamcity
Copy link
Member

This change is Reviewable

lunevalex and others added 4 commits April 29, 2021 13:31
Makes progress on cockroachdb#8299

This commit introduces a new type of ManualClock a HybridManualClock.
and wires it into the TestCluster.  This clock follows the physical
wall time of a regular clock, but allows the developer to move it
forward. This is needed to be able to test functionality around
lease expiration and other time based mechanisms.

To verify that the clock is usefull in tests, a single test in
client_merge_tests.go is converted to use TestCluster with a
HybridManualClock. To make the test possible we also need a
simple way to create a range with an expiration based lease. This
is done through the new TestCluster.ScratchRangeWithExpirationLease
function.

Release note: None
This backports the `HybridManualClock` changes from cockroachdb#59333, but not the
client replica test changes which have additional backport dependencies.

---

We introduce a new function on hlc.HybridManualClock to pause the clock.
This allows a test to conduct specific time based measurements, which
are otherwise tricky to do when the time is moving.

Release note: None
Pause() reads the clock and saved the reading to be
returned from now on. The bgu was that the reading and the saving were
not atomic - it was possible to read, then the still-not-paused clock to
hand out higher timestamps to others, then to write. This created a
backwards clock jump, which is unexpected. In particular, such a jump
broke the logic in TestLeaseholdersRejectClockUpdateWithJump.

Fixes cockroachdb#60914

Release note: None
The clock pretended to support different underlying time sources, but it
didn't really. This patch removes indirection when calling UnixNano().

Release note: None
@erikgrinaker erikgrinaker force-pushed the backport20.1-64324 branch 2 times, most recently from 9ff3fcb to 288cdd4 Compare April 29, 2021 17:23
This backports `TestCluster.MoveRangeLeaseNonCooperatively` and related
test infra from cockroachdb#59086, avoiding backporting the code changes.

Release note: None
Replica removal did not synchronize with in-flight read-only requests,
which could cause them to be evaluated on a removed (empty) replica,
returning an empty result.

This patch fixes the problem by locking `Replica.readOnlyCmdMu` during
replica removal, thus either waiting for read-only requests to complete
or not evaluating them.

Release note (bug fix): Fixed a race condition where read-only requests
during replica removal (e.g. during range merges or rebalancing) could
be evaluated on the removed replica, returning an empty result.
@erikgrinaker erikgrinaker merged commit abbee62 into cockroachdb:release-20.1 Apr 30, 2021
@erikgrinaker erikgrinaker deleted the backport20.1-64324 branch April 30, 2021 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants