[multistage] Fix Predicate Pushdown by Using Rule Collection #10409
@@ -115,4 +116,12 @@ private PinotQueryRuleSets() {
PinotAggregateExchangeNodeInsertRule.INSTANCE,
PinotWindowExchangeNodeInsertRule.INSTANCE
);

public static final Collection<RelOptRule> FILTER_PUSHDOWN_RULES = ImmutableList.of(
TODO: By default, Calcite will use depth-first order for running these rules. Also, it won't do a "fullRestartAfterTransformation" unless we use HepMatchOrder.TOP_DOWN or HepMatchOrder.BOTTOM_UP.
I think using depth-first order without doing full restarts after transformation should be fine, but it would be good if someone else also chimes in. Note that the match order can be changed for only this collection (it's a HepInstruction), so it doesn't need to be a global setting.
I think for now we should be good. The HEP planner is used here to avoid a lengthy Volcano planner run that adds latency to the planning phase. As long as the plan results are deterministic, we can always change the way we configure the planner, IMO.
Codecov Report
@@             Coverage Diff              @@
##             master   #10409      +/-   ##
============================================
+ Coverage     63.24%   70.36%    +7.12%
- Complexity     5069     6104     +1035
============================================
  Files          2030     2055       +25
  Lines        110629   111394      +765
  Branches      16842    16940       +98
============================================
+ Hits          69965    78384     +8419
+ Misses        35538    27517     -8021
- Partials       5126     5493      +367
45a65cd to fc34fb7
// ---- rules apply before exchange insertion.
PinotFilterExpandSearchRule.INSTANCE,
The filter expand search rule can be applied with either the filter rules or the basic rules, IMO.
// First run the basic rules using 1 HepInstruction per rule.
for (RelOptRule relOptRule : PinotQueryRuleSets.BASIC_RULES) {
  hepProgramBuilder.addRuleInstance(relOptRule);
}
Why do we need 1 HepInstruction per rule? It might be good to add some explanation.
"\n LogicalProject(col4=[$0], col2=[$1], col3=[$2])", | ||
"\n LogicalFilter(condition=[AND(>=($2, 0), =($1, 'pink floyd'))])", | ||
"\n LogicalTableScan(table=[[a]])", | ||
"\nLogicalProject(avg=[/(CASE(=($1, 0), null:DECIMAL(1000, 0), $0), $1)])", |
I wonder what made merging these 2 projects possible after the rule changes.
This is because I am now doing the pruning/merging of redundant operators at the end and using a RuleCollection (if that's what you meant).
ah. gotcha. this is great!
lgtm mostly. please kindly take a look at the comments. thank you for the rule split. this is amazing
lgtm. retriggering CI to make sure it is clean
At present the optimizer rules have the following issues:
This PR leverages RuleCollection to fix these issues. A RuleCollection is a collection of rules that are run in a single HepInstruction, which ensures that we don't need to rely on the ordering of the rules in a given collection. E.g., say we have Filter > Project > Join > Project > Join > Project > TableScan: if we use a different HepInstruction for pushing the filter past projects or joins, then we won't be able to push the filter down all the way.
The UTs changed in this PR also demonstrate the expected improvements in the plans.
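To illustrate the Filter > Project > Join > ... example, here is a minimal toy model in plain Java. It does not use Calcite: the plan is flattened to a list of operator names (the join's other input is ignored), and the class and method names (FilterPushdownSketch, runInstruction, onePerRule, asCollection) are hypothetical. It assumes, as a simplification, that each instruction applies its own rules until none of them matches and is never revisited afterwards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: NOT Calcite code. All names here are hypothetical.
public class FilterPushdownSketch {

    // Try to move the Filter one step down, if the operator directly below it
    // is one that the given rules know how to push a filter past.
    static boolean applyOnce(List<String> plan, List<String> pushPast) {
        int i = plan.indexOf("Filter");
        if (i >= 0 && i + 1 < plan.size() && pushPast.contains(plan.get(i + 1))) {
            Collections.swap(plan, i, i + 1);
            return true;
        }
        return false;
    }

    // One toy "instruction": apply its rules repeatedly until none matches.
    static void runInstruction(List<String> plan, List<String> pushPast) {
        while (applyOnce(plan, pushPast)) {
            // keep going until this instruction's rules stop matching
        }
    }

    // One instruction per rule: each rule gets a single turn, in order.
    static List<String> onePerRule(List<String> input) {
        List<String> plan = new ArrayList<>(input);
        runInstruction(plan, List.of("Project")); // filter-past-project only
        runInstruction(plan, List.of("Join"));    // filter-past-join only
        return plan;
    }

    // Both rules in one instruction: they interleave until neither matches.
    static List<String> asCollection(List<String> input) {
        List<String> plan = new ArrayList<>(input);
        runInstruction(plan, List.of("Project", "Join"));
        return plan;
    }

    public static void main(String[] args) {
        List<String> plan = List.of(
            "Filter", "Project", "Join", "Project", "Join", "Project", "TableScan");
        System.out.println(String.join(" > ", onePerRule(plan)));
        System.out.println(String.join(" > ", asCollection(plan)));
    }
}
```

In this sketch, the one-instruction-per-rule variant leaves the Filter stuck above the second Project, while the collection variant moves it down to sit directly above the TableScan.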