Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-23.1: kvserver: eagerly extend expiration leases in Raft scheduler #100686

Merged

Conversation

erikgrinaker
Copy link
Contributor

@erikgrinaker erikgrinaker commented Apr 5, 2023

Backport:

Please see individual PRs for details.

/cc @cockroachdb/release

Release justification: fixes rangefeed resolved timestamps with expiration leases, gated behind default-off cluster setting.

@erikgrinaker erikgrinaker self-assigned this Apr 5, 2023
@erikgrinaker erikgrinaker requested a review from a team April 5, 2023 08:23
@erikgrinaker erikgrinaker requested a review from a team as a code owner April 5, 2023 08:23
@blathers-crl
Copy link

blathers-crl bot commented Apr 5, 2023

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@erikgrinaker
Copy link
Contributor Author

Need to sort out some test flakes before merging this, but won't have time until next week. See #100693 and #100689.

The store lease renewer, which eagerly extends expiration-based leases,
did not correctly (un)register ranges on splits/merges/relocations. This
could cause the lease renewer to not eagerly extend leases for some
ranges, but was mitigated by the lease renewer only being used for the
meta and liveness ranges which rarely/never split.

This patch makes sure the lease renewer is notified on these events. It
also incidentally fixes an unlikely race where a range split immediately
following an expiration-based lease transfer could leave the RHS using
an expiration-based lease until it was transferred elsewhere.

Epic: none
Release note: None
This patch prevents quiescence with expiration-based leases, since we'll
have to propose a lease extension shortly anyway. Out of caution, we
only do this when `kv.expiration_leases_only.enabled` is `true`, with an
eye towards 23.1 backports. For 23.2, we will always do this.

Epic: none
Release note: None
This patch eagerly extends expiration leases in the Raft scheduler,
during Raft ticks, but only if `kv.expiration_leases_only.enabled` is
`true`. This is possible because we no longer allow ranges
with expiration leases to quiesce in this case.

The old store lease renewer is disabled in this case, but still tracks
local system ranges with expiration leases in case the setting is later
changed. The store lease renewer is still retained for use by default,
to allow backporting this change to 23.1 while keeping the existing
default behavior.

Doing this during Raft scheduling is compelling, since it already does
efficient goroutine scheduling (including liveness range prioritization)
and holds the replica mutex.

Compared to an alternative implementation that relied on a separate,
improved store lease renewer which also allowed ranges to quiesce
between extensions, this has significantly better performance. The main
reason for this is that unquiescing involves a Raft append to wake the
leader. On an idle cluster with 20.000 ranges using only expiration
leases, CPU usage was 26% vs.  30%, write IOPS was 500 vs. 900, and
network traffic was 0.8 MB/s vs.  1.0 MB/s. KV benchmark results:

```
name                    old ops/sec  new ops/sec  delta
kv0/enc=false/nodes=3    14.6k ± 3%   15.2k ± 4%  +3.72%  (p=0.003 n=8+8)
kv95/enc=false/nodes=3   30.6k ± 3%   32.2k ± 3%  +5.22%  (p=0.000 n=8+8)

name                    old p50      new p50      delta
kv0/enc=false/nodes=3     11.3 ± 3%    11.0 ± 0%  -2.76%  (p=0.006 n=8+6)
kv95/enc=false/nodes=3    4.59 ± 8%    4.55 ± 3%    ~     (p=0.315 n=8+8)

name                    old p95      new p95      delta
kv0/enc=false/nodes=3     31.5 ± 0%    30.2 ± 4%  -4.25%  (p=0.003 n=6+8)
kv95/enc=false/nodes=3    19.5 ± 7%    18.1 ± 4%  -7.28%  (p=0.000 n=8+7)

name                    old p99      new p99      delta
kv0/enc=false/nodes=3     48.7 ± 3%    47.4 ± 6%    ~     (p=0.059 n=8+8)
kv95/enc=false/nodes=3    32.8 ± 9%    30.2 ± 4%  -8.04%  (p=0.001 n=8+8)
```

Epic: none
Release note: None
Previously, the replicate queue would only reacquire epoch leases, and
allow expiration leases to expire. It now reacquires all leases, since
they are eagerly extended.

In the future, when quiescence is removed entirely, this responsibility
should be moved to the Raft tick loop.

Epic: none
Release note: None
If `kv.expiration_leases_only.enabled´ is `true`, this patch disables
expiration lease extension during request processing, since the Raft
scheduler will eagerly extend leases in this case. This will eventually
be removed completely, but is kept for now with an eye towards a 23.1
backport.

Epic: none
Release note: None
The store lease renewer begins tracking an system range's
expiration-based lease for renewal when it's first applied. However, it
only did this if the lease was still valid when first applied. It was
possible for the node to apply a lease after it's lapsed, but then to
retain and extend the lease, and in this case it would not be registered
with the lease renewer.

Epic: none
Release note: None
Epic: none
Release note: None
@erikgrinaker erikgrinaker merged commit 285fc42 into cockroachdb:release-23.1 Apr 12, 2023
@erikgrinaker erikgrinaker deleted the backport23.1-100392-100430 branch May 11, 2023 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants