-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: reject lease proposal locally when leader not known #118435
Comments
cc @cockroachdb/replication |
I'm not sure this will work. If the range does not have quorum nor a lease, and we don't attempt to propose a lease, then the replica circuit breakers will never trip and the client will keep retrying indefinitely -- exactly the scenario replica circuit breakers are meant to avoid. If we are to do this, then we'll have to implement circuit breakers at the DistSender level instead. That may come for free with #105168. |
How does proposing a lease help the range reach a quorum and avoid the indefinite retry? When a replica is in this state where it does not know the leader, it is simultaneously in the process of campaigning and trying to establish or learn leadership. Placing a proposal to sit idle in Are you saying that we will need client-side intervention in the case where all replicas are disconnected and no leader can be elected, to avoid an infinite This does feel like the client's responsibility though, right? |
Yeah, I think that's a better way to handle it. Today though, we rely on replica circuit breakers to avoid clients getting permanently stuck on an unavailable range (e.g. one that's lost quorum). If we simply remove this lease acquisition without also adding client-side circuit breakers then we're effectively removing that protection, leaving clients to get stuck on unavailable ranges again. As a short-term, backportable fix I think we should get #118737 in now. Then once we have range-level DistSender circuit breakers we can reject these proposals (and consider removing replica circuit breakers entirely). Note that range-level circuit breakers are out of scope for #118943, and probably not landing in 24.1 given stability is nearly upon us. |
One additional aspect here is that if the range is quiesced but has lost its leaseholder/leader, we have to wake up the replica so that it can actually hold an election and gain leadership. Proposing a lease acquisition will indirectly do that. If we reject lease acquisitions then we need to make sure we still unquiesce the range -- it's possible that it already does so, given we only reject this in the proposal buffer, but we should check and add a test case for this. |
+1 to all of this. There is a call to |
Previously we could propose a lease change if the leader was unknown. This request could stall as we waited for the lease request to fail. Not knowing the leader means we can't immediately submit this lease proposal. We rely on the client to retry requests until a new leader has been established. Epic: none Fixes: cockroachdb#120073 Fixes: cockroachdb#118435 Release note: None
Fixes cockroachdb#125676. This commit sets the `kv.lease.reject_on_leader_unknown.enabled` cluster setting to true in the c2c roachtests. This should avoid lease requests from getting stuck after a restart with intentionally high replication lag in `c2c/disconnect`. We can remove this when we complete cockroachdb#118435. Release note: None
125813: kv: reject lease request on unknown leader in c2c roachtests r=nvanbenschoten a=nvanbenschoten Fixes #125676. This commit sets the `kv.lease.reject_on_leader_unknown.enabled` cluster setting to true in the c2c roachtests. This should avoid lease requests from getting stuck after a restart with intentionally high replication lag in `c2c/disconnect`. We can remove this when we complete #118435. Release note: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Closes cockroachdb#123498. Informs cockroachdb#118435. This commit extracts lease request and lease transfer validation into the `kv/kvserver/leases` package which was introduced in cockroachdb#124682. This avoids the scattered leases validity checks which were previously spread between `replica_range_lease.go`, `replica_proposal_buf.go` `cmd_lease_request.go`, `cmd_lease_transfer.go`. This makes the logic easier to understand and easier to test. It will also make the logic easier to adjust when we make the following two changes in service of cockroachdb#118435: - default to rejecting lease requests on unknown leaders - perform lease request validation above raft, like we do for lease transfers This second change also ties in with Leader leases (cockroachdb#123847). Until we make that change, we will occasionally need to propose expiration-based leases (e.g. when we're not the leader) just for them to be rejected by the proposal buffer. Release note: None
Closes cockroachdb#123498. Informs cockroachdb#118435. This commit extracts lease request and lease transfer validation into the `kv/kvserver/leases` package which was introduced in cockroachdb#124682. This avoids the scattered leases validity checks which were previously spread between `replica_range_lease.go`, `replica_proposal_buf.go` `cmd_lease_request.go`, `cmd_lease_transfer.go`. This makes the logic easier to understand and easier to test. It will also make the logic easier to adjust when we make the following two changes in service of cockroachdb#118435: - default to rejecting lease requests on unknown leaders - perform lease request validation above raft, like we do for lease transfers This second change also ties in with Leader leases (cockroachdb#123847). Until we make that change, we will occasionally need to propose expiration-based leases (e.g. when we're not the leader) just for them to be rejected by the proposal buffer. Release note: None
126182: kv: consolidate range lease validation r=arulajmani a=nvanbenschoten Closes #123498. Informs #118435. This commit extracts lease request and lease transfer validation into the `kv/kvserver/leases` package which was introduced in #124682. This avoids the scattered leases validity checks which were previously spread between `replica_range_lease.go`, `replica_proposal_buf.go` `cmd_lease_request.go`, `cmd_lease_transfer.go`. This makes the logic easier to understand and easier to test. It will also make the logic easier to adjust when we make the following two changes in service of #118435: - default to rejecting lease requests on unknown leaders - perform lease request validation above raft, like we do for lease transfers This second change also ties in with Leader leases (#123847). Until we make that change, we will occasionally need to propose expiration-based leases (e.g. when we're not the leader) just for them to be rejected by the proposal buffer. This is a fairly large PR, and I'm willing to split it up if that makes sense. Release note: None 126845: orchestration: released CockroachDB version 24.1.2. Next version: 24.1.3 r=ZhouXing19 a=cockroach-teamcity Release note: None Epic: None Release justification: non-production (release infra) change. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com> Co-authored-by: Justin Beaver <teamcity@cockroachlabs.com>
Closes cockroachdb#123498. Informs cockroachdb#118435. This commit extracts lease request and lease transfer validation into the `kv/kvserver/leases` package which was introduced in cockroachdb#124682. This avoids the scattered leases validity checks which were previously spread between `replica_range_lease.go`, `replica_proposal_buf.go` `cmd_lease_request.go`, `cmd_lease_transfer.go`. This makes the logic easier to understand and easier to test. It will also make the logic easier to adjust when we make the following two changes in service of cockroachdb#118435: - default to rejecting lease requests on unknown leaders - perform lease request validation above raft, like we do for lease transfers This second change also ties in with Leader leases (cockroachdb#123847). Until we make that change, we will occasionally need to propose expiration-based leases (e.g. when we're not the leader) just for them to be rejected by the proposal buffer. Release note: None
127082: kv: don't propose lease acquisition if raft leader is unknown, by default r=nvanbenschoten a=nvanbenschoten Fixes #118435. Fixes #120073. This commit updates the `kv.lease.reject_on_leader_unknown.enabled` cluster setting added in 282d953 to be enabled by default. This setting instructs a Raft follower who is not aware of the current Raft leader to reject lease acquisitions immediately, rather than waiting for the local replica establishing connectivity to its quorum (which may never happen). In cases where a range truly has no leader, we now rely on the client to retry requests (with backoff) until a new leader has been established. This behavior is is still configurable with the cluster setting, so we can disable it if we find that it causes issues. Release note: None 127169: roachtest: fix broken failover test naming r=nvanbenschoten a=nvanbenschoten failover roachtest names were recently broken by 04e093f, which unintentionally introduced variable aliasing and caused the "/read-only" and "/read-write" strings to be appended to every test name. This commit fixes the names and avoids the variable shadowing to make the code less error-prone. The roachtest names are now fixed: ```sh ➜ roachtest list ^failover Listing tests which match regex "^failover" and are compatible with cloud "gce". failover/chaos/read-only [kv] failover/chaos/read-only/lease=expiration [kv] failover/chaos/read-write [kv] failover/chaos/read-write/lease=expiration [kv] failover/liveness/blackhole [kv] failover/liveness/blackhole-recv [kv] failover/liveness/blackhole-recv/lease=expiration [kv] failover/liveness/blackhole-send [kv] failover/liveness/blackhole-send/lease=expiration [kv] failover/liveness/blackhole/lease=expiration [kv] failover/liveness/crash [kv] failover/liveness/crash/lease=expiration [kv] failover/liveness/deadlock [kv] failover/liveness/deadlock/lease=expiration [kv] failover/liveness/disk-stall [kv] failover/liveness/disk-stall/lease=expiration [kv] failover/liveness/pause [kv] failover/liveness/pause/lease=expiration [kv] failover/non-system/blackhole [kv] failover/non-system/blackhole-recv [kv] failover/non-system/blackhole-recv/lease=expiration [kv] failover/non-system/blackhole-send [kv] failover/non-system/blackhole-send/lease=expiration [kv] failover/non-system/blackhole/lease=expiration [kv] failover/non-system/crash [kv] failover/non-system/crash/lease=expiration [kv] failover/non-system/deadlock [kv] failover/non-system/deadlock/lease=expiration [kv] failover/non-system/disk-stall [kv] failover/non-system/disk-stall/lease=expiration [kv] failover/non-system/pause [kv] failover/non-system/pause/lease=expiration [kv] failover/partial/lease-gateway [kv] failover/partial/lease-gateway/lease=expiration [kv] failover/partial/lease-leader [kv] failover/partial/lease-leader/lease=expiration [kv] failover/partial/lease-liveness [kv] failover/partial/lease-liveness/lease=expiration [kv] failover/system-non-liveness/blackhole [kv] failover/system-non-liveness/blackhole-recv [kv] failover/system-non-liveness/blackhole-recv/lease=expiration [kv] failover/system-non-liveness/blackhole-send [kv] failover/system-non-liveness/blackhole-send/lease=expiration [kv] failover/system-non-liveness/blackhole/lease=expiration [kv] failover/system-non-liveness/crash [kv] failover/system-non-liveness/crash/lease=expiration [kv] failover/system-non-liveness/deadlock [kv] failover/system-non-liveness/deadlock/lease=expiration [kv] failover/system-non-liveness/disk-stall [kv] failover/system-non-liveness/disk-stall/lease=expiration [kv] failover/system-non-liveness/pause [kv] failover/system-non-liveness/pause/lease=expiration [kv] ``` Epic: None Release note: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Closes cockroachdb#128642. Reopens cockroachdb#118435. Reopens cockroachdb#120073. This commit is a partial revert of cockroachdb#127082. It switches the default value for the `kv.lease.reject_on_leader_unknown.enabled` cluster setting back to `false`. This avoids the regression we see in cockroachdb#128642 and may also help stabilize cockroachdb#127724. When manually testing range failover with epoch-based leases, we see that the cluster setting is causing hung raft elections. As hypothesized in cockroachdb#128642, by redirecting immediately after unquiescing a range, finding a dead leader, and campaigning (`shouldCampaignOnWake`), instead of just sitting tight and waiting for a raft leader election to complete, we trigger multiple replicas to campaign for leadership in quick succession. This leads to a hung election and requires a new election to be called after an additional (jittered) election timeout. We don't see this behavior with the setting disabled. Release note: None
128708: kv: propose lease acquisition if raft leader is unknown, by default r=nvanbenschoten a=nvanbenschoten Closes #128642. Reopens #118435. Reopens #120073. This commit is a partial revert of #127082. It switches the default value for the `kv.lease.reject_on_leader_unknown.enabled` cluster setting back to `false`. This avoids the regression we see in #128642 and may also help stabilize #127724. When manually testing range failover with epoch-based leases, we see that the cluster setting is causing hung raft elections. As hypothesized in #128642, by redirecting immediately after unquiescing a range, finding a dead leader, and campaigning (`shouldCampaignOnWake`), instead of just sitting tight and waiting for a raft leader election to complete, we trigger multiple replicas to campaign for leadership in quick succession. This leads to a hung election and requires a new election to be called after an additional (jittered) election timeout. We don't see this behavior with the setting disabled. Release note: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
In #128642, we found that this caused a regression for quiesced ranges. One idea is to tie the default for this setting to the per-range lease type preference ( |
Is your feature request related to a problem? Please describe.
If a replica doesn't know who the leader is (possible just rejoined cluster) we still let the lease acquisition request be proposed, despite knowing it will fail.
cockroach/pkg/kv/kvserver/replica_proposal_buf.go
Line 685 in 59ad05a
This can hang if the replica is behind and is still catching up e.g. just rejoined the cluster.
Describe the solution you'd like
Reject the proposal, similar to the known leader case and return a
NotLeaseHolderError
.cockroach/pkg/kv/kvserver/replica_proposal_buf.go
Line 703 in 59ad05a
Additional context
Related #118266
Jira issue: CRDB-35731
Epic CRDB-37522
The text was updated successfully, but these errors were encountered: