-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-23.1: kv: prevent lease interval regression during expiration-to-epoch promotion #123560
Open
nvanbenschoten
wants to merge
4
commits into
cockroachdb:release-23.1
Choose a base branch
from
nvanbenschoten:backport23.1-123442
base: release-23.1
Could not load branches
Branch not found: {{ refName }}
Could not load tags
Nothing to show
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
release-23.1: kv: prevent lease interval regression during expiration-to-epoch promotion #123560
nvanbenschoten
wants to merge
4
commits into
cockroachdb:release-23.1
from
nvanbenschoten:backport23.1-123442
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Thanks for opening a backport. Please check the backport criteria before merging:
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
Also, please add a brief release justification to the body of your PR to justify this |
blathers-crl
bot
added
the
backport
Label PR's that are backports to older release branches
label
May 3, 2024
sumeerbhola
approved these changes
May 3, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has an extra commit. Assuming you had to pickup an old commit to make a clean backport.
This can be triggered rapidly because each replica might call this as it tries and fails to acquire a lease.
This commit adds a check that a replica does not perform a lease transfer if it does not own the previous lease. This allows us to make a stronger assumption a layer down. Epic: None Release note: None
…otion Fixes cockroachdb#121480. Fixes cockroachdb#122016. This commit resolves a bug in the expiration-based to epoch-based lease promotion transition, where the lease's effective expiration could be allowed to regress. To prevent this, we detect when such cases are about to occur and synchronously heartbeat the leaseholder's liveness record. This works because the liveness record interval and the expiration-based lease interval are the same, so a synchronous heartbeat ensures that the liveness record has a later expiration than the prior lease by the time the lease promotion goes into effect. The code structure here leaves a lot to be desired, but since we're going to be cleaning up and/or removing a lot of this code soon anyway, I'm prioritizing backportability. This is therefore more targeted and less general than it could be. The resolution here also leaves something to be desired. A nicer fix would be to introduce a minimum_lease_expiration field on epoch-based leases so that we can locally ensure that the expiration does not regress. This is what we plan to do for leader leases in the upcoming release. We don't make this change because it would be require a version gate to avoid replica divergence, so it would not be backportable. Release note (bug fix): Fixed a rare bug where a lease transfer could lead to a `side-transport update saw closed timestamp regression` panic. The bug could occur when a node was overloaded and failing to heartbeat its node liveness record.
This commit adds a check that `args.PrevLease` is equivalent to `cArgs.EvalCtx.GetLease()` to RequestLease. This ensures that the validation here is consistent with the validation that was performed when the lease request was constructed. Release note: None Epic: None
nvanbenschoten
force-pushed
the
backport23.1-123442
branch
from
May 30, 2024 18:03
3556e14
to
6ad65c0
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport 3/3 commits from #123442.
/cc @cockroachdb/release
Fixes #121480.
Fixes #122016.
This commit resolves a bug in the expiration-based to epoch-based lease promotion transition, where the lease's effective expiration could be allowed to regress. To prevent this, we detect when such cases are about to occur and synchronously heartbeat the leaseholder's liveness record. This works because the liveness record interval and the expiration-based lease interval are the same, so a synchronous heartbeat ensures that the liveness record has a later expiration than the prior lease by the time the lease promotion goes into effect.
The code structure here leaves a lot to be desired, but since we're going to be cleaning up and/or removing a lot of this code soon anyway, I'm prioritizing backportability. This is therefore more targeted and less general than it could be.
The resolution here also leaves something to be desired. A nicer fix would be to introduce a minimum_lease_expiration field on epoch-based leases so that we can locally ensure that the expiration does not regress. This is what we plan to do for leader leases in the upcoming release. We don't make this change because it would be require a version gate to avoid replica divergence, so it would not be backportable.
Release note (bug fix): Fixed a rare bug where a lease transfer could lead to a
side-transport update saw closed timestamp regression
panic. The bug could occur when a node was overloaded and failing to heartbeat its node liveness record.Release justification: fixes rare, but serious panic.