Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

txn: Do constraint check when handling repeated acqurie_pessimsitic_lock request #14037

Conversation

MyonKeminta
Copy link
Contributor

@MyonKeminta MyonKeminta commented Jan 11, 2023

What is changed and how it works?

Issue Number: Close #14038
Close pingcap/tidb#40114

What's Changed:

Fixes the problem that when handling repeated acquire_pessimistic_lock requests is recevied, should_not_exist is ignored. 

TiKV provides idempotency for these RPC requests, but for acquire_pessimistic_lock, it ignored the possibility that the client may expect a pessimistic_rollback between two acquire_pessimistic_lock request on the same key. In this case the second request may come from another statement and carries `should_not_exist` that wasn't set in the previously finished pessimistic lock request. If the first request successfully acquired the lock and the pessimistic_rollback failed, TiKV may return a sucessful response, making the client believe that the key doesn't exist before. In some rare cases, this has risk to cause data inconsistency.

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test

Side effects

  • Performance regression
    • Consumes more CPU

Release note

Fixed a problem that when a transaction in TiDB fails to execute a pessimistic DML and then executes another DML, if there are random network failures between TiDB and TiKV, it has risk to cause data inconsistency.

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@ti-chi-bot
Copy link
Member

ti-chi-bot commented Jan 11, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • cfzjywxk
  • sticnarf

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@MyonKeminta
Copy link
Contributor Author

/release

if let Some((write, commit_ts)) = write {
// Here `get_write_with_commit_ts` returns only the latest PUT if it exists and
// is not deleted. It's still ok to pass it into `check_data_constraint`.
if locked_with_conflict_ts.is_none() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment about why locked_with_conflict_ts cases can be skipped? I cannot think of any problem but I also want to hear how you think about it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be a problem for this case:

  1. Raise an Insert pessimistic lock
  2. Locking is successful with conflict and the constraint check is not done
  3. Response of 2 is lost, the client retry
  4. The locked_with_conflict_ts is Some and the constraint check is skipped unexpectedly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It acquires the lock and lets the client (TiDB) retry the statement in this case. TiDB will find the key already exists when it retries the statement. It's a part of avoiding the key being locked by another transaction when the current transaction does a statement retry, making the retry a waste.
But actually I want to adjust this behavior later: When should_not_exist is set and the key exists and the latest version is newer than for_update_ts, return write conflict error even allow_lock_with_conflict is set. I think in most cases, when the statement retries, it's likely that it still want to insert the same key and it should still fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be a problem for this case:

  1. Raise an Insert pessimistic lock
  2. Locking is successful with conflict and the constraint check is not done
  3. Response of 2 is lost, the client retry
  4. The locked_with_conflict_ts is Some and the constraint check is skipped unexpectedly

In the current implementation (which I'm planning to adjust later), the retried request in the 4th step will still produce a locked_with_conflict result, and TiDB will still retry. It will notice the key already exists when retrying the statement and abort the statement then, at which time the lock will be pessimistic_rollback-ed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. So the correctness depends on that the returned pessimistic lock result must contain "value existing" information, maybe we could add this in the comment too in case we forget to return the required information?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Value field is required in PessimisticLockKeyResult::LockedWithConflict. I think it's just fine for locked_with_conflict cases? I'm actually not very sure where you prefer to add the comment you said

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it's fine.

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Copy link
Contributor

@sticnarf sticnarf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MyonKeminta
Copy link
Contributor Author

/merge

@ti-chi-bot
Copy link
Member

@MyonKeminta: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot added status/LGT2 Status: PR - There are already 2 approvals and removed status/LGT1 Status: PR - There is already 1 approval labels Jan 13, 2023
@MyonKeminta
Copy link
Contributor Author

/merge

@ti-chi-bot
Copy link
Member

@MyonKeminta: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 6824bba

@ti-chi-bot ti-chi-bot added the status/can-merge Status: Can merge to base branch label Jan 13, 2023
@ti-chi-bot
Copy link
Member

@MyonKeminta: Your PR was out of date, I have automatically updated it for you.

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-5.3: #14047.

ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request Jan 13, 2023
close tikv#14038, close pingcap/tidb#40114

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request Jan 13, 2023
close tikv#14038, close pingcap/tidb#40114

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-5.4: #14048.

@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-6.1: #14049.

ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request Jan 13, 2023
close tikv#14038, close pingcap/tidb#40114

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #14050.

@MyonKeminta MyonKeminta deleted the m/pessimistic-lock-idempotency-check-existence branch January 13, 2023 03:55
ti-chi-bot added a commit that referenced this pull request Jan 13, 2023
…ock request (#14037) (#14049)

ref #14037, close #14038, close pingcap/tidb#40114

Fixes the problem that when handling repeated acquire_pessimistic_lock requests is recevied, should_not_exist is ignored. 

TiKV provides idempotency for these RPC requests, but for acquire_pessimistic_lock, it ignored the possibility that the client may expect a pessimistic_rollback between two acquire_pessimistic_lock request on the same key. In this case the second request may come from another statement and carries `should_not_exist` that wasn't set in the previously finished pessimistic lock request. If the first request successfully acquired the lock and the pessimistic_rollback failed, TiKV may return a sucessful response, making the client believe that the key doesn't exist before. In some rare cases, this has risk to cause data inconsistency.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

Co-authored-by: MyonKeminta <9948422+MyonKeminta@users.noreply.github.com>
Co-authored-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
ti-chi-bot added a commit that referenced this pull request Feb 14, 2023
…ock request (#14037) (#14050)

ref #14037, close #14038, close pingcap/tidb#40114

Fixes the problem that when handling repeated acquire_pessimistic_lock requests is recevied, should_not_exist is ignored. 

TiKV provides idempotency for these RPC requests, but for acquire_pessimistic_lock, it ignored the possibility that the client may expect a pessimistic_rollback between two acquire_pessimistic_lock request on the same key. In this case the second request may come from another statement and carries `should_not_exist` that wasn't set in the previously finished pessimistic lock request. If the first request successfully acquired the lock and the pessimistic_rollback failed, TiKV may return a sucessful response, making the client believe that the key doesn't exist before. In some rare cases, this has risk to cause data inconsistency.

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

Co-authored-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-cherry-pick-release-5.3 Type: Need cherry pick to release-5.3 needs-cherry-pick-release-5.4 Type: Need cherry pick to release-5.4 needs-cherry-pick-release-6.1 needs-cherry-pick-release-6.5 release-note size/L status/can-merge Status: Can merge to base branch status/LGT2 Status: PR - There are already 2 approvals
Projects
None yet
5 participants