causal_ts: fix issue #12498 #12573

Merged
merged 6 commits into from May 19, 2022

pingyu
Contributor

@pingyu pingyu commented May 18, 2022

Issue Number: #12498

Signed-off-by: pingyu <yuping@pingcap.com>

What is changed and how it works?

Issue Number: Close #12498

What's Changed:

Fix issue #12498 

By logging the timestamps of regions, we found that on_role_change can observe the change from follower to leader later than the real role change in the raft state, and later than the adjacent write commands.
When this happens, the adjacent write commands get timestamps that are unexpectedly smaller than those of the peer that leadership was transferred from.

The following screenshot shows such a case.
image

The "leader transfer" marker indicates where the leader transfer happens. The log line with "pre_propose" is the adjacent write command. "[is_leader=false]" indicates that the change of peer role has not been observed yet. "on_role_change" is the point where the peer role change is observed.
We can see that the ts of "pre_propose" is smaller than the ts of "on_role_change", which is exactly the violation of causality correctness.
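The effect described above can be reproduced with a small model. This is only a sketch under assumptions that are not stated in this PR: each store hands out timestamps from a locally cached batch allocated by a global oracle, and a flush discards that cache and fetches a fresh batch. `Oracle`, `TsCache` and `transfer` are hypothetical names for illustration, not TiKV APIs.

```rust
/// Hypothetical stand-in for a global, monotonically increasing timestamp oracle.
struct Oracle {
    next: u64,
}

impl Oracle {
    /// Hands out the half-open range [start, start + n).
    fn alloc(&mut self, n: u64) -> (u64, u64) {
        let start = self.next;
        self.next += n;
        (start, self.next)
    }
}

/// Hypothetical per-store cache of pre-allocated timestamps (an assumption).
struct TsCache {
    next: u64,
    end: u64,
}

impl TsCache {
    fn new(oracle: &mut Oracle, batch: u64) -> Self {
        let (next, end) = oracle.alloc(batch);
        TsCache { next, end }
    }

    /// Discards whatever is cached and fetches a fresh batch from the oracle.
    fn flush(&mut self, oracle: &mut Oracle, batch: u64) {
        let (next, end) = oracle.alloc(batch);
        self.next = next;
        self.end = end;
    }

    /// Returns the next cached timestamp, refilling when the batch is used up.
    fn get(&mut self, oracle: &mut Oracle, batch: u64) -> u64 {
        if self.next == self.end {
            self.flush(oracle, batch);
        }
        let ts = self.next;
        self.next += 1;
        ts
    }
}

/// Simulates a leader transfer and returns
/// (last ts used by the old leader, first ts used by the new leader).
fn transfer(flush_before_propose: bool) -> (u64, u64) {
    let mut oracle = Oracle { next: 1 };
    let batch = 10;
    // The future leader cached its batch early, while still a follower.
    let mut new_leader = TsCache::new(&mut oracle, batch);
    let mut old_leader = TsCache::new(&mut oracle, batch);
    // The old leader serves two writes before leadership moves.
    let _ = old_leader.get(&mut oracle, batch);
    let last_old = old_leader.get(&mut oracle, batch);
    // Leadership moves; the flush may or may not run before the first write.
    if flush_before_propose {
        new_leader.flush(&mut oracle, batch);
    }
    let first_new = new_leader.get(&mut oracle, batch);
    (last_old, first_new)
}
```

In this model `transfer(false)` returns `(12, 1)`: the first write on the new leader gets a smaller timestamp than the last write on the old leader. `transfer(true)` returns `(12, 21)`, which preserves the ordering.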

In this PR, we observe the region role change to "Candidate" rather than "Leader", so that the flush does not happen too late.
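Why the earlier trigger helps can be sketched as an ordering check over observed events. The event order used here is an assumption drawn from the description above (the "Leader" observation arrives after an adjacent write is proposed, while the "Candidate" observation arrives before it). `Role`, `Event` and both functions are hypothetical and only mirror the names used in this PR.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Role {
    Follower,
    Candidate,
    Leader,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Event {
    /// The observer is notified of a role change (models `on_role_change`).
    RoleObserved(Role),
    /// A write command is about to be proposed (models `pre_propose`).
    PrePropose,
}

/// Returns true if a flush triggered by observing `trigger`
/// would run before the first write proposal in `events`.
fn flush_precedes_first_propose(events: &[Event], trigger: Role) -> bool {
    for e in events {
        match e {
            Event::RoleObserved(r) if *r == trigger => return true,
            Event::PrePropose => return false,
            _ => {}
        }
    }
    false
}

/// Assumed event order on the peer that receives leadership:
/// the Leader notification is observed only after a write was proposed.
fn transfer_leader_events() -> Vec<Event> {
    vec![
        Event::RoleObserved(Role::Candidate),
        Event::PrePropose,
        Event::RoleObserved(Role::Leader),
    ]
}
```

With this order, triggering on `Role::Candidate` flushes before the first proposal, while triggering on `Role::Leader` does not. The sketch does not cover a region with a single peer, which never passes through the Candidate state in this model.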

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
    No.
  • Need to cherry-pick to the release branch
    No.

Check List

Tests

  • Manual test (add detailed scripts or steps below)
    E2E testing, with "shuffle_leader_scheduler".

Side effects

  • No.

Release note

Fix issue #12498.

Issue Number: tikv#12498

Signed-off-by: pingyu <yuping@pingcap.com>
@ti-chi-bot
Member

ti-chi-bot commented May 18, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • BusyJay
  • Connor1996

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@pingyu
Contributor Author

pingyu commented May 18, 2022

/cc @BusyJay @Connor1996

@pingyu
Contributor Author

pingyu commented May 18, 2022

/cc @BusyJay @Connor1996

PTAL, thanks~

Member

@BusyJay BusyJay left a comment

Test case should be added.

@ti-chi-bot ti-chi-bot added the status/LGT1 Status: PR - There is already 1 approval label May 18, 2022
Member

@Connor1996 Connor1996 left a comment

LGTM

// So we observe role change to Candidate to fix this issue.
// Also note that when there is only one peer, it would become leader directly.
if role_change.state == StateRole::Candidate
    || (ctx.region().peers.len() == 1 && role_change.state == StateRole::Leader)
Member

If there is only one peer, it becomes leader directly, and on_role_change is called later. You would still miss calling flush().

@ti-chi-bot ti-chi-bot added status/LGT2 Status: PR - There are already 2 approvals and removed status/LGT1 Status: PR - There is already 1 approval labels May 19, 2022
Member

@Connor1996 Connor1996 left a comment

to revert LGTM

@ti-chi-bot ti-chi-bot removed the status/LGT2 Status: PR - There are already 2 approvals label May 19, 2022
@Connor1996 Connor1996 added the status/LGT1 Status: PR - There is already 1 approval label May 19, 2022
@ti-chi-bot ti-chi-bot removed the status/LGT1 Status: PR - There is already 1 approval label May 19, 2022
Member

@Connor1996 Connor1996 left a comment

LGTM

@ti-chi-bot ti-chi-bot added the status/LGT1 Status: PR - There is already 1 approval label May 19, 2022
@ti-chi-bot ti-chi-bot added status/LGT2 Status: PR - There are already 2 approvals and removed status/LGT1 Status: PR - There is already 1 approval labels May 19, 2022
@pingyu
Contributor Author

pingyu commented May 19, 2022

> Test case should be added.

OK~
I will add test cases in another PR coming soon.

@iosmanthus
Member

/merge

@ti-chi-bot
Member

@iosmanthus: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: b78a519

@ti-chi-bot ti-chi-bot added the status/can-merge Status: Can merge to base branch label May 19, 2022
@pingyu
Contributor Author

pingyu commented May 19, 2022

/run-all-tests

@pingyu
Contributor Author

pingyu commented May 19, 2022

/run-test

@ti-chi-bot
Member

@pingyu: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@pingyu
Contributor Author

pingyu commented May 19, 2022

/run-test

@ti-chi-bot ti-chi-bot merged commit cc16b0f into tikv:master May 19, 2022
joccau pushed a commit to joccau/tikv that referenced this pull request Jun 23, 2022
close tikv#12498, ref tikv#12498

Fix issue tikv#12498

Signed-off-by: pingyu <yuping@pingcap.com>

Co-authored-by: iosmanthus <dengliming@pingcap.com>
Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: joccau <zak.zhao@pingcap.com>
Labels
release-note size/S status/can-merge Status: Can merge to base branch status/LGT2 Status: PR - There are already 2 approvals
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RawKV API V2 timestamp causality violation
5 participants