Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

batcheval: don't take out latches for LeaseInfo #119130

Merged
merged 1 commit into from
Feb 20, 2024

Conversation

erikgrinaker
Copy link
Contributor

@erikgrinaker erikgrinaker commented Feb 13, 2024

Extracted from #118943.


This patch removes the LeaseInfo read latch on RangeLeaseKey. This makes the request latchless, which allows the request to be used as a simple replica health check without worrying about latch delays (e.g. with DistSender circuit breaker probes).

This latch was ineffectual anyway, since lease requests don't take out write latches (they're evaluated on the proposer). The lease is read from the replica's in-memory state.

Touches #104262.
Touches #118943.
Epic: none
Release note: None

Copy link

blathers-crl bot commented Feb 13, 2024

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@erikgrinaker erikgrinaker force-pushed the lease-info-latch branch 2 times, most recently from e352955 to 422fd53 Compare February 13, 2024 17:11
This patch removes the `LeaseInfo` read latch on `RangeLeaseKey`. This
makes the request latchless, which allows the request to be used as a
simple replica health check without worrying about latch delays (e.g.
with DistSender circuit breaker probes).

This latch was ineffectual anyway, since lease requests don't take out
write latches (they're evaluated on the proposer). The lease is read
from the replica's in-memory state.

Epic: none
Release note: None
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @erikgrinaker)


-- commits line 11 at r1:
The latches are effective at synchronizing with a lease transfer. However, we're already splitting (Next)Lease and CurrentLease in LeaseInfoResponse, so it seems reasonable to make this latchless.


pkg/kv/kvserver/batcheval/cmd_lease_info.go line 38 at r1 (raw file):

) error {
	// We don't take out a latch on the range lease key, since lease requests
	// bypass latches anyway (they're evaluated on the proposer). The lease is

Will the request still block on in-flight lease requests? Based on the commentary around LeaseInfoRequest.flags, I don't think it will, as long as it's issued in an INCONSISTENT batch. I think that's the behavior we want, so that's good.

However, this now seems unsafe if not issued in an INCONSISTENT batch. It might lead to spinning between checkLeaseRLocked and executeBatchWithConcurrencyRetries. Should we be setting the skipsLeaseCheck flag on this request? I can't tell from #8752 why we didn't to begin with. Maybe to provide flexibility, which we're removing here.

Copy link
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @nvanbenschoten)


-- commits line 11 at r1:

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

The latches are effective at synchronizing with a lease transfer. However, we're already splitting (Next)Lease and CurrentLease in LeaseInfoResponse, so it seems reasonable to make this latchless.

Yeah, I'm not too worried about a race there.


pkg/kv/kvserver/batcheval/cmd_lease_info.go line 38 at r1 (raw file):

this now seems unsafe if not issued in an INCONSISTENT batch. It might lead to spinning between checkLeaseRLocked and executeBatchWithConcurrencyRetries. Should we be setting the skipsLeaseCheck flag on this request?

I don't understand why this is any more unsafe now than with the latch. Is it because we'll fail fast on the lease acquisition due to the lease revocation, and keep spinning until we apply the new lease?

In any case, I sort of considered the lease check a feature here, in terms of using it as a makeshift health check. Being able to discover or acquire a lease seems pretty fundamental to being a functional replica.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @erikgrinaker)


pkg/kv/kvserver/batcheval/cmd_lease_info.go line 38 at r1 (raw file):
I was considering the case where the lease is being transferred away and was concerned about not waiting on the latches held by TransferLease. But I think we will wait on the in-progress transfer in redirectOnOrAcquireLeaseForRequest without spinning, so I don't see a hazard anymore.

In any case, I sort of considered the lease check a feature here, in terms of using it as a makeshift health check. Being able to discover or acquire a lease seems pretty fundamental to being a functional replica.

So the plan is to issue LeaseInfo requests in a CONSISTENT batch so that the recipient attempts to acquire a lease if it doesn't know who the leaseholder is? I figured that would work against our goal of not getting stuck on a partitioned replica, but maybe that's fine if we are only issuing these LeaseInfo requests in circuit breaker async tasks.

Copy link
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @nvanbenschoten)


pkg/kv/kvserver/batcheval/cmd_lease_info.go line 38 at r1 (raw file):

I think we will wait on the in-progress transfer

Yeah, that's my understanding too.

So the plan is to issue LeaseInfo requests in a CONSISTENT batch so that the recipient attempts to acquire a lease if it doesn't know who the leaseholder is? I figured that would work against our goal of not getting stuck on a partitioned replica, but maybe that's fine if we are only issuing these LeaseInfo requests in circuit breaker async tasks.

Yes, at least for now.

Specifically, in the case of a replica that's disconnected from the leader, we have to attempt to acquire a lease and fail/time out in order to trip the DistSender circuit breaker. If we do an INCONSISTENT read and fetch the stale lease, then the health check will succeed even though the replica is non-functional/partitioned.

This is not particularly principled, but it's cheap to implement and has the desired semantics.

I think we'll find that we need more here though. We probably want to batch these probes per node, but we don't want a timeout to apply for the entire batch, we want it to apply for each individual request on the server-side, such that we'll return all the successful responses even if some individual replicas time out. We could adapt LeaseInfo to do something like that, or we could add a dedicated health check request that doesn't implicitly rely on lease acquisition.

For now though, I think this will do.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist and @erikgrinaker)


pkg/kv/kvserver/batcheval/cmd_lease_info.go line 38 at r1 (raw file):

Specifically, in the case of a replica that's disconnected from the leader, we have to attempt to acquire a lease and fail/time out in order to trip the DistSender circuit breaker. If we do an INCONSISTENT read and fetch the stale lease, then the health check will succeed even though the replica is non-functional/partitioned.

This is not particularly principled, but it's cheap to implement and has the desired semantics.

The semantics seem correct for client-side understanding of which replicas may hold the lease, but are they correct for circuit-breaking follower read requests? What exactly are the desired semantics for those requests?

Follower reads are interesting because their need to understand who holds the lease and be connected to that leaseholder is dependent on how far in the past they are backdated. An exact staleness -3s follower read will fail (and attempt to acquire the lease) if it hits a replica who is disconnected from the leaseholder. Meanwhile, an exact staleness -1h follower read, or a bounded staleness follower read, can likely succeed even on a replica that has been partitioned from the leaseholder. Because of this, it's not clear that a single circuit breaker policy should apply for all follower reads against a given range.

So taking a step back, how were you imagining that follower reads should interact with the DistSender circuit breakers in #118943?

Copy link
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r+

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist)


pkg/kv/kvserver/batcheval/cmd_lease_info.go line 38 at r1 (raw file):

The semantics seem correct for client-side understanding of which replicas may hold the lease

Not only that, they are required to avoid persistent unavailability with client timeouts below the lease acquisition timeout.

are they correct for circuit-breaking follower read requests? What exactly are the desired semantics for those requests?

This is a good point, I hadn't considered this. I suppose a straightforward way to handle this would be to communicate and track the replica's closed timestamp via the LeaseInfo probes and allow read requests to bypass the circuit breaker if they are below the closed timestamp and the last probe successfully responded with a closed timestamp.

I've added this as a follow-up item in #118943, and I'll submit a PR shortly to extend LeaseInfoResponse with the LAI and closed timestamp so we can get any protocol changes in before the branch cut.

@craig
Copy link
Contributor

craig bot commented Feb 20, 2024

Build succeeded:

@craig craig bot merged commit 3161ca9 into cockroachdb:master Feb 20, 2024
16 of 17 checks passed
Copy link
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/kv/kvserver/batcheval/cmd_lease_info.go line 38 at r1 (raw file):

I'll submit a PR shortly to extend LeaseInfoResponse with the LAI and closed timestamp so we can get any protocol changes in before the branch cut

Of course, in this case we'll get a NLHE back from the replica rather than a LeaseInfoResponse, so that doesn't buy us anything. I suppose we could send two probes back-to-back: one inconsistent (to probe the replica itself for follower reads) and one consistent (to probe lease acquisition/detection for leased reads).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants