-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FAP: Support cancel FAP tasks #8458
Conversation
Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>
Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>
Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>
Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>
Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>
/hold for proxy |
Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>
Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>
/run-all-tests |
@@ -333,6 +333,12 @@ void KVStore::handleDestroy(UInt64 region_id, TMTContext & tmt) | |||
void KVStore::handleDestroy(UInt64 region_id, TMTContext & tmt, const KVStoreTaskLock & task_lock) | |||
{ | |||
const auto region = getRegion(region_id); | |||
if (tmt.getContext().getSharedContextDisagg()->isDisaggregatedStorageMode()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure the FAP tasks being cleaned before apply legacy snapshot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is moving these lines to the beginning of this function to handle some corner cases? Maybe add some comment about it.
if (cancel_handle->blockedWaitFor(std::chrono::milliseconds(1000))) | ||
{ | ||
LOG_INFO(log, "Cancel FAP during peer selecting, region_id={}", region_id); | ||
// Just remove the task from AsyncTasks, it will not write anything in disk during this stage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the task in select stage won't write anything to disk
dbms/src/Storages/KVStore/MultiRaft/Disagg/FastAddPeerContext.cpp
Outdated
Show resolved
Hide resolved
/run-all-tests |
/run-all-tests |
/run-unit-test |
/run-integration-test |
/run-unit-test |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: JaySon-Huang, Lloyd-Pottiger The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
Signed-off-by: JaySon-Huang <tshent@qq.com>
Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>
/run-all-tests |
/run-unit-test |
/unhold |
Signed-off-by: CalvinNeo <calvinneo1995@gmail.com>
/run-all-tests |
/run-all-tests |
What problem does this PR solve?
Issue Number: close #8382
Problem Summary:
What is changed and how it works?
RELATED PROXY PR: pingcap/tidb-engine-ext#363
FAP phase 1 can be canceled in:
FastAddPeer
when timeoutIt this happens,
FastAddPeer
will block waitFastAddPeerImpl
to finish all cleaning work, and then return to Proxy. Otherwise, a later legacy snapshot may meet a cleaning FAP phase1 task.It can't be canceled elsewhere in TiFlash, since once the task is canceled, it is not accessible from
AsyncTasks
. Then aFastAddPeer
will create a new task for this.It can't be canceled from Proxy, since it involves many modifications in Proxy.
It can't be canceled by a legacy snapshot, since the only way a legacy snapshot is trigger is that the FAP is finished or fallback so it can continue to receive MsgSnapshot.
It can't be canceled by a destroy peer event, since we don't handle MsgAppend in FAP phase 1 either. If there are some corner cases, then it can:
FAP phase 1 result can be cleaned in:
FastAddPeerImpl
when received cancel signal fromFastAddPeer
handleDestroy
prehandleSnapshot
Check List
Tests
Side effects
Documentation
Release note