planner: make sure mpp join task's hashCols are all contained of its plan's schema #52836

Merged
merged 4 commits into from
Apr 25, 2024

Conversation

@yibin87 yibin87 commented Apr 23, 2024

What problem does this PR solve?

Issue Number: close #52828

Problem Summary:

  1. In #52143 (planner: Column prune improvement for MPP Join and TableScan+Filter operators), new projections may be added above MPP join operators to prune unused columns. However, each MPP task has an attribute named "hashCols", which is used to eliminate unnecessary exchange nodes, and these columns may not be among the columns that survive pruning. The join task inherits its hashCols from the outer task:

// The outer task is the task that passes its MPPPartitionType to the join result.
// For a broadcast inner join, it should be the non-broadcast side; since the broadcast side
// is always the build side, using the probe side is fine.
// For a hash inner join, either side is fine; by default, we use the probe side.
// For an outer join, it should always be the outer side of the join.
// For a semi join, it should be the left side (the same as a left outer join).
outerTaskIndex := 1 - p.InnerChildIdx
if p.JoinType != InnerJoin {
	if p.JoinType == RightOuterJoin {
		outerTaskIndex = 1
	} else {
		outerTaskIndex = 0
	}
}
// Cannot use the task from tasks because it may have been updated.
outerTask := lTask
if outerTaskIndex == 1 {
	outerTask = rTask
}
task := &mppTask{
	p:        p,
	partTp:   outerTask.partTp,
	hashCols: outerTask.hashCols,
}

For example:
select A.id from A join B on A.id = B.id; Suppose B is the probe side and the join is a hash inner join.
After logical column pruning, the output schema of A join B contains only A.id, while the task's hashCols is B.id. To make matters worse, hashCols may be used to check whether an extra cast projection needs to be added, in which case the newly added projection expects B.id in its input schema. So this PR makes sure that hashCols are included in task.p's schema.
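
The invariant described above can be illustrated with a small standalone sketch. It is not the code from this PR: `Column`, `missingHashCols`, and `ensureHashCols` are hypothetical names, and a plain slice stands in for the planner's schema type. It only shows the idea of checking which hash columns were pruned from the output schema and adding them back.

```go
package main

import "fmt"

// Column is a hypothetical stand-in for a planner column,
// identified by a unique ID.
type Column struct {
	UniqueID int64
	Name     string
}

// missingHashCols returns the hash columns that do not appear in schema.
func missingHashCols(schema, hashCols []Column) []Column {
	present := make(map[int64]struct{}, len(schema))
	for _, c := range schema {
		present[c.UniqueID] = struct{}{}
	}
	var missing []Column
	for _, c := range hashCols {
		if _, ok := present[c.UniqueID]; !ok {
			missing = append(missing, c)
			// Record the column so a repeated hash column is added only once.
			present[c.UniqueID] = struct{}{}
		}
	}
	return missing
}

// ensureHashCols returns a copy of schema extended with any hash columns
// that pruning removed, so every hash column is part of the output.
func ensureHashCols(schema, hashCols []Column) []Column {
	out := append([]Column(nil), schema...)
	return append(out, missingHashCols(schema, hashCols)...)
}

func main() {
	aID := Column{UniqueID: 1, Name: "A.id"}
	bID := Column{UniqueID: 2, Name: "B.id"}

	// After pruning, the join outputs only A.id, but the task hashes on B.id.
	schema := []Column{aID}
	hashCols := []Column{bID}

	fmt.Println(ensureHashCols(schema, hashCols))
}
```

With the example query, the pruned schema holds only A.id while hashCols holds B.id, so the sketch appends B.id. When every hash column is already present, the schema is returned unchanged.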

  2. In physical projection elimination, when a projection sits above another projection and certain conditions are satisfied, we remove the current projection and use its schema to reset the child projection's schema. However,
    an empty projection (one whose schema has been entirely pruned) with a projection child would set the child projection's schema to empty, which doesn't make sense. In that situation, keeping the child projection's schema seems more reasonable.
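
The second fix can be sketched in the same spirit. This is a simplified illustration under stated assumptions, not the planner's actual elimination rule: `Projection` and `eliminateParent` are hypothetical, schemas are plain string slices, and the conditions that decide whether elimination applies are omitted entirely.

```go
package main

import "fmt"

// Projection is a hypothetical, simplified projection node.
type Projection struct {
	Schema []string
	Child  *Projection
}

// eliminateParent removes projection p and returns its child projection.
// The child takes over p's schema, except when p's schema is empty
// (fully pruned); in that case the child keeps its own schema.
func eliminateParent(p *Projection) *Projection {
	child := p.Child
	if child == nil {
		// Nothing to merge into; keep p as it is.
		return p
	}
	if len(p.Schema) > 0 {
		child.Schema = append([]string(nil), p.Schema...)
	}
	return child
}

func main() {
	// The parent's schema was entirely pruned; the child still outputs B.id.
	child := &Projection{Schema: []string{"B.id"}}
	parent := &Projection{Schema: nil, Child: child}

	merged := eliminateParent(parent)
	fmt.Println(merged.Schema)
}
```

The only design point shown here is the guard on the empty schema: overwriting the child's schema with nothing would leave operators above it with no columns to resolve against.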

What changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

Signed-off-by: yibin <huyibin@pingcap.com>
@ti-chi-bot ti-chi-bot bot added release-note-none sig/planner SIG: Planner size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 23, 2024

tiprow bot commented Apr 23, 2024

Hi @yibin87. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

yibin87 commented Apr 23, 2024

/cc @windtalker @winoros

@ti-chi-bot ti-chi-bot bot requested review from windtalker and winoros April 23, 2024 06:51

codecov bot commented Apr 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 55.9880%. Comparing base (4354682) to head (111abf3).
Report is 31 commits behind head on master.

Additional details and impacted files
@@                Coverage Diff                @@
##             master     #52836         +/-   ##
=================================================
- Coverage   72.3201%   55.9880%   -16.3322%     
=================================================
  Files          1474       1589        +115     
  Lines        427611     602598     +174987     
=================================================
+ Hits         309249     337383      +28134     
- Misses        99057     242136     +143079     
- Partials      19305      23079       +3774     
Flag Coverage Δ
integration 37.1170% <11.9047%> (?)
unit 71.2739% <100.0000%> (+0.0860%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 53.9957% <ø> (ø)
parser ∅ <ø> (∅)
br 50.0685% <ø> (+8.9298%) ⬆️

Signed-off-by: yibin <huyibin@pingcap.com>
@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 23, 2024

yibin87 commented Apr 23, 2024

/test check-dev

tiprow bot commented Apr 23, 2024

@yibin87: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test check-dev

@winoros winoros left a comment

So LGTM currently.
But you could add a TODO for our planner side.
The task.HashCols should be stored in the p.schema. i.e. Planner should maintain it instead of letting you re-add it here.

yibin87 commented Apr 24, 2024

/hold

@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 24, 2024
Signed-off-by: yibin <huyibin@pingcap.com>
Signed-off-by: yibin <huyibin@pingcap.com>

yibin87 commented Apr 24, 2024

/test pull-mysql-client-test

yibin87 commented Apr 24, 2024

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 24, 2024

yibin87 commented Apr 24, 2024

/cc @SeaRise

@ti-chi-bot ti-chi-bot bot requested a review from SeaRise April 24, 2024 03:51

ti-chi-bot bot commented Apr 24, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SeaRise, winoros

@ti-chi-bot ti-chi-bot bot added the lgtm label Apr 24, 2024

ti-chi-bot bot commented Apr 24, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-04-23 15:30:58.918717287 +0000 UTC m=+101415.658620198: ☑️ agreed by winoros.
  • 2024-04-24 05:09:21.517769643 +0000 UTC m=+150518.257672555: ☑️ agreed by SeaRise.

yibin87 commented Apr 24, 2024

/test mysql-test

yibin87 commented Apr 24, 2024

/test pull-integration-ddl-test

yibin87 commented Apr 24, 2024

/test check-dev2

yibin87 commented Apr 24, 2024

/hold

@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 24, 2024

yibin87 commented Apr 25, 2024

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 25, 2024
@ti-chi-bot ti-chi-bot bot merged commit 9fffc2f into pingcap:master Apr 25, 2024
23 checks passed
Labels
approved lgtm release-note-none sig/planner SIG: Planner size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tiflash mpp join generate physical plan failed in resolve index phase
3 participants