Fix bug introduced by #7644 #7694
SeaRise
left a comment
If two different MPP queries share a query set and one of them has a limit, will the other query be canceled unexpectedly?
This could be a problem, but it is not related to this PR. I think it needs to be solved by cancellation at the MPP gather level.
ok
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: SeaRise, yibin87
What problem does this PR solve?
Issue Number: close #7687, close #7678, close #7677
Problem Summary:
#7644 introduced a query-level `process_list_entry` for MPP queries: all the MPP tasks of a query share the same `process_list_entry` and `memory_tracker`. The `process_list_entry` is saved in `MPPQueryTaskSet` in `MPPTaskManager`.
This works fine for normal queries, but for queries that may contain multiple independent sub-MPP-queries, there is a potential data race.
Consider a query with a non-correlated subquery: `select * from t where id > (select max(a) from s)`. This query contains two independent subqueries:
- subquery_1: `select max(a) from s`
- subquery_2: `select * from t where id > result_of_subquery_1`

subquery_1 is executed first, so a possible data race is:
- A task calls `initProcessListEntry` and gets the `process_list_entry`/`memory_tracker` saved in `MPPQueryTaskSet`.
- A task calls `unregisterTask`, and `MPPTaskManager` finds that there are no active MPP tasks, so the `MPPQueryTaskSet` is also removed.
- A task calls `initProcessListEntry`, which creates a new `MPPQueryTaskSet` and a new `process_list_entry`/`memory_tracker`.

As you can see, after t5, some of subquery 2's tasks still hold the `process_list_entry`/`memory_tracker` that was generated for subquery 1.
What is changed and how it works?
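A minimal, single-threaded sketch of the interleaving described in the problem summary may help make the lifetime issue concrete. It is a toy model, not TiFlash code: `ToyTaskManager`, `QueryTaskSet`, `Entry`, `ReleaseRule`, `obtainEntry`, `makeActive`, `subquery2TasksShareEntry` and the task ids are hypothetical names, and which task performs each step is an assumption inferred from the statement that some of subquery 2's tasks end up holding the entry generated for subquery 1. The `ReleaseRule` switch contrasts releasing the set when no task is active with releasing it only when no registered task remains.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the shared process_list_entry/memory_tracker.
struct Entry
{
    int generation;
};

// Toy model of a per-query task set: the shared entry plus its known tasks.
struct QueryTaskSet
{
    std::shared_ptr<Entry> entry;
    std::map<int, bool> tasks; // task id -> is the task active (visible)?
};

enum class ReleaseRule
{
    NoActiveTasks,     // release the set as soon as no task is active
    NoRegisteredTasks, // release the set only when no registered task remains
};

class ToyTaskManager
{
public:
    explicit ToyTaskManager(ReleaseRule rule_) : rule(rule_) {}

    // Registers the task (not yet active) and returns the entry saved for the
    // query, creating the set and the entry if they do not exist.
    std::shared_ptr<Entry> obtainEntry(const std::string & query_id, int task_id)
    {
        QueryTaskSet & set = sets[query_id];
        if (!set.entry)
            set.entry = std::make_shared<Entry>(Entry{++next_generation});
        set.tasks.emplace(task_id, false);
        return set.entry;
    }

    void makeActive(const std::string & query_id, int task_id)
    {
        sets.at(query_id).tasks.at(task_id) = true;
    }

    // Removes the task, then drops the whole set if the rule allows it.
    void unregisterTask(const std::string & query_id, int task_id)
    {
        auto it = sets.find(query_id);
        if (it == sets.end())
            return;
        it->second.tasks.erase(task_id);
        if (canRelease(it->second))
            sets.erase(it);
    }

private:
    bool canRelease(const QueryTaskSet & set) const
    {
        if (rule == ReleaseRule::NoRegisteredTasks)
            return set.tasks.empty();
        for (const auto & task : set.tasks)
            if (task.second)
                return false; // an active task keeps the set alive
        return true;
    }

    ReleaseRule rule;
    int next_generation = 0;
    std::map<std::string, QueryTaskSet> sets;
};

// Replays the interleaving; returns true if both subquery 2 tasks hold the same entry.
bool subquery2TasksShareEntry(ReleaseRule rule)
{
    ToyTaskManager manager(rule);
    const std::string query = "q";
    manager.obtainEntry(query, 1); // subquery 1: last running task (assumed id 1)
    manager.makeActive(query, 1);
    auto held_by_first = manager.obtainEntry(query, 2);  // subquery 2, first task, not active yet
    manager.unregisterTask(query, 1);                    // subquery 1's last task finishes
    auto held_by_second = manager.obtainEntry(query, 3); // subquery 2, second task
    assert(held_by_first && held_by_second);
    return held_by_first == held_by_second;
}
```

Under these assumptions, `subquery2TasksShareEntry(ReleaseRule::NoActiveTasks)` returns `false`, because the set is dropped while task 2 still holds the old entry and task 3 gets a fresh one. With `ReleaseRule::NoRegisteredTasks` it returns `true`, since the registered but not yet active task keeps the set alive.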
The root cause of this problem is that after `initProcessListEntry`, `MPPQueryTaskSet` can still be released because there are no active MPP tasks.
The PR refines the flow of `registerTask`. An MPP task now interacts with `MPPTaskManager` as follows:
- ... `process_list_entry`/`memory_tracker`. Unlike the previous version of `registerTask`, in the new implementation a task is not visible after `registerTask`.
- ... `establishMPPConnection` can connect to the MPP tunnel.
- ... `MPPQueryTaskSet` can only be released if all the registered MPP tasks are unregistered.

Check List
Tests
Run random failpoint manually
Side effects
Documentation
Release note