
fix hung in remote lock if cn restart quickly to 1.0-dev #13004

Merged

Conversation

zhangxu19830126
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #12554

What this PR does / why we need it:

Fix a hang in remote lock when a cn restarts quickly. In that case the remote lock is called on the rpc io goroutine, which blocks all remote requests.

Assume we have cn0, cn1, and table1, and consider the following timing:

  1. At time t0, cn0 obtains the t1 lock table, and the lock-table bind is t1-cn0-table1-version1.
  2. At time t1, cn0 goes down.
  3. At time t2, cn0 restarts, and (t2-t1) < cfg.KeepBindTimeout, so the lock-table allocator keeps
    the bind t1-cn0-table1-version1 valid.
  4. cn1 tries to lock table1 and gets the bind t1-cn0-table1-version1 from the allocator or its local cache, then
    sends a lock request to cn0.
  5. cn0 receives the lock request, but the lock-table bind is t1-cn0-table1-version2, and cn0 will consider
    this lock-table bind to be a remote lock table, because serviceID(t1-cn0) != serviceID(t2-cn0). This
    blocks the rpc handler.

@matrix-meow matrix-meow added the size/S Denotes a PR that changes [10,99] lines label Nov 27, 2023
@mergify mergify bot requested a review from sukki37 November 27, 2023 07:01
@mergify mergify bot added the kind/bug Something isn't working label Nov 27, 2023
@sukki37 sukki37 merged commit bef2827 into matrixorigin:1.0-dev Nov 27, 2023
14 of 18 checks passed
4 participants