[refactor](multicast) change the way multicast do filter, project and shuffle #21412
run buildall
clang-tidy review says "All clean, LGTM! 👍"
clang-tidy review says "All clean, LGTM! 👍"
02685fd to 7a6532e
clang-tidy review says "All clean, LGTM! 👍"
clang-tidy review says "All clean, LGTM! 👍"
run buildall
PR approved by anyone and no changes requested.
7a6532e to 6faa3a6
run buildall
clang-tidy review says "All clean, LGTM! 👍"
clang-tidy review says "All clean, LGTM! 👍"
run buildall
HappenLee
left a comment
LGTM
PR approved by at least one committer and no changes requested.
… shuffle (#21412)

Co-authored-by: Jerry Hu <mrhhsg@gmail.com>

1. Filtering is done at the sending end rather than the receiving end
2. Projection is done at the sending end rather than the receiving end
3. Each sender can use different shuffle policies to send data
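The three points in the commit message above can be illustrated with a small sketch. This is not the Doris implementation (the backend is C++ and its actual classes are not shown on this page). The names `Sender`, `broadcast`, and `hash_by` are hypothetical and exist only to show the idea: each sender applies its own filter and projection before any data leaves it, and each sender carries its own shuffle policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Row = Dict[str, int]
# A shuffle policy maps a row to the indexes of the receivers that should get it.
ShufflePolicy = Callable[[Row, int], List[int]]


def broadcast(row: Row, num_receivers: int) -> List[int]:
    """Hypothetical policy: every receiver gets every row."""
    return list(range(num_receivers))


def hash_by(column: str) -> ShufflePolicy:
    """Hypothetical policy factory: route a row by one integer column."""
    def policy(row: Row, num_receivers: int) -> List[int]:
        return [row[column] % num_receivers]
    return policy


@dataclass
class Sender:
    """Illustrative sender that filters, projects, then shuffles (assumed design)."""
    predicate: Callable[[Row], bool]   # filter evaluated on the sending side
    columns: Sequence[str]             # projection evaluated on the sending side
    shuffle: ShufflePolicy             # per-sender shuffle policy
    num_receivers: int

    def send(self, rows: Sequence[Row]) -> List[List[Row]]:
        outboxes: List[List[Row]] = [[] for _ in range(self.num_receivers)]
        for row in rows:
            if not self.predicate(row):
                continue  # filtered rows are never transmitted
            projected = {c: row[c] for c in self.columns}
            # Route on the original row so the shuffle key need not be projected.
            for target in self.shuffle(row, self.num_receivers):
                outboxes[target].append(projected)
        return outboxes
```

In this sketch, two senders reading the same rows can be configured independently, for example one with `hash_by("id")` and a selective predicate, another with `broadcast` and no filtering. Receivers only ever see rows that have already been filtered and projected.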
…nsumer (#58964)

Related PR: #21412

Problem Summary:

This pull request improves the handling of distribution properties (specifically "must shuffle") for `PhysicalProject` and `PhysicalFilter` nodes in the query planner, and adds comprehensive unit tests to ensure correctness. The main logic ensures that when certain child nodes require shuffling, the planner correctly adjusts the distribution requirements, especially in the presence of `Project`, `Filter`, and `Limit` nodes. Key changes include:

**Distribution Property Handling Enhancements:**

* Added logic in `ChildrenPropertiesRegulator` to check if a child node under a `PhysicalProject` or `PhysicalFilter` requires a "must shuffle" distribution, and to adjust the children’s properties accordingly. This is done via the new `mustShuffleUnderProjectOrFilter` method.
* Included `PhysicalLimit` in the set of nodes that can trigger a shuffle requirement, by updating imports and logic.

**Testing Improvements:**

* Added a new test class `ChildrenPropertiesRegulatorTest.java` with detailed unit tests for the handling of "must shuffle" properties under `Project`, `Filter`, and `Limit` nodes. These tests use mocks to simulate various plan trees and assert correct distribution specification propagation.

**Regression Test Coverage:**

* Added a new regression test in `cte.groovy` to verify correct behavior when multiple `Project` nodes are present on a CTE consumer, ensuring the planner handles such cases as expected.

These changes collectively make the planner more robust in handling complex plan trees with respect to distribution requirements, and ensure correctness through thorough testing.
Proposed changes
Issue Number: close #xxx
Co-authored-by: Jerry Hu mrhhsg@gmail.com
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...