Skip to content

Fix bug introduced by #7644#7694

Merged
ti-chi-bot[bot] merged 13 commits intopingcap:masterfrom
windtalker:query_memory_tracker_fix
Jun 27, 2023
Merged

Fix bug introduced by #7644#7694
ti-chi-bot[bot] merged 13 commits intopingcap:masterfrom
windtalker:query_memory_tracker_fix

Conversation

@windtalker
Copy link
Contributor

@windtalker windtalker commented Jun 25, 2023

What problem does this PR solve?

Issue Number: close #7687, close #7678, close #7677

Problem Summary:

In #7644, it introduces query level process_list_entry for mpp query, that is to say, for a mpp query, all the mpp tasks share the same process_list_entry and memory_tracker. And the process_list_entry is saved in MPPQueryTaskSet in MPPTaskManager.
It works fine for normal queries, but for queries that may contains multiple independent sub-mpp-queries, there is a potential data race.
Consider a query with non-correlated subquery: select * from t where id > (select max(a) from s), this query contains two independent sub queries:
subquery_1: select max(a) from s
subquery_2: select * from t where id > result_of_subquery_1
subquery_1 is executed first, then a possible data race will be

time description
t1 subquery 1's tasks are dispatched and executing
t2 subquery 2's tasks are dispatched
t3 some of subquery 2's tasks call initProcessListEntry, and get the process_list_entry/memory_tracker saved in MPPQueryTaskSet
t4 subquery 1 is finished, all its tasks called unregisterTask, and MPPTaskManager found that there is no active mpp tasks, so the MPPQueryTaskSet is also removed
t5 the rest of the subquery 2's tasks call initProcessListEntry, it will create a new MPPQueryTaskSet and create new process_list_entry/memory_tracker

As you can see, after t5, some of subquery 2's task still holds process_list_entry/memory_tracker that was generated for subquery 1.

What is changed and how it works?

The root cause of these problem is after initProcessListEntry, MPPQueryTaskSet can still be released due to no active mpp tasks.
The pr refine the flow of registerTask, now for a mpp task, it will interact with MPPTaskManager as follows:

  1. register task: this will register the task meta to MPPTaskManager, and get process_list_entry/memory_tracker, unlike the previous version of registerTask, in the new implementation, a task is not visible after registerTask.
  2. make task public: this will make the task visible to other, so establishMPPConnection can connect to the mpp tunnel
  3. unregister task: this will remove the task from MPPTaskManager

MPPQueryTaskSet can only be released if all the registered mpp tasks are unregistered.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
    Run random failpoint manually
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

windtalker and others added 7 commits June 23, 2023 16:16
Signed-off-by: xufei <xufei@pingcap.com>
Signed-off-by: xufei <xufei@pingcap.com>
Signed-off-by: xufei <xufei@pingcap.com>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-triage-completed release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed do-not-merge/needs-triage-completed labels Jun 25, 2023
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
@windtalker
Copy link
Contributor Author

/rebuild

@SeaRise SeaRise self-requested a review June 27, 2023 02:56
Copy link
Contributor

@SeaRise SeaRise left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If two different mpp queries share a query set, if one of the queries has a limit, will the other query be canceled unexpectedly?

@windtalker
Copy link
Contributor Author

her query be cancel

This could be a problems, but it is not related to this pr, I think this problem need to be solved by mpp gather level's cancel.

@SeaRise
Copy link
Contributor

SeaRise commented Jun 27, 2023

her query be cancel

This could be a problems, but it is not related to this pr, I think this problem need to be solved by mpp gather level's cancel.

ok

@SeaRise SeaRise self-requested a review June 27, 2023 04:25
Copy link
Contributor

@SeaRise SeaRise left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Others LGTM

Copy link
Contributor

@yibin87 yibin87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Jun 27, 2023
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 27, 2023
Copy link
Contributor

@SeaRise SeaRise left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ti-chi-bot ti-chi-bot bot added the lgtm label Jun 27, 2023
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jun 27, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SeaRise, yibin87

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jun 27, 2023
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jun 27, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-06-27 08:14:06.548690034 +0000 UTC m=+695411.949940468: ☑️ agreed by yibin87.
  • 2023-06-27 08:25:22.988571808 +0000 UTC m=+696088.389822256: ☑️ agreed by SeaRise.

@windtalker
Copy link
Contributor Author

/run-all-tests

@windtalker
Copy link
Contributor Author

/run-integration-test

@windtalker
Copy link
Contributor Author

/run-all-tests

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jun 27, 2023

@windtalker: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

trigger some heavy tests which will not run always when PR updated.

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@SeaRise
Copy link
Contributor

SeaRise commented Jun 27, 2023

/rebuild

@SeaRise
Copy link
Contributor

SeaRise commented Jun 27, 2023

/run-integration-test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

3 participants