-
Notifications
You must be signed in to change notification settings - Fork 980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRILL-4531: Add a Drill customized rule for pushing filter past aggre… #444
Conversation
|
||
import org.apache.calcite.rel.rules.FilterAggregateTransposeRule; | ||
|
||
public class DrillFilterAggregateTransposeRule{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this rule does not have an onMatch(), can you add comments about the justification ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the only thing we are doing is creating a FilterAggregateTransposeRule with a custom filter factory, does it make sense to simply create this in planner phase rather than having a custom rule class? It seems like the rule class isn't actually a rule.
Also, can you put a little information next to the rule definition of what happens if we don't use a custom factory? It would seem like the existing transpose rule would have worked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The customized rule is not only adding a customer filter factory, it also specifies the filter / aggregate has to be in Drill Logical. The default rule patten will match Filter and Aggregate [1]. In debug, we saw that the default one will match with a LogicalFilter on top of DrillAggregate, which I feel is not right. We do not want to match Rels with mixed conventions.
@sudheeshkatkam helped track to the following error, which seems indicate the rule matching a LogicalFilter /DrillAggregate. Then, exception handling seems hit a cyclic somehow, and hang there forever.
The intention of this customized rule is to ensure that we do not match rel nodes with mixed convention.
Internal error: Error while applying rule FilterAggregateTransposeRule, args [rel#217:LogicalFilter.NONE.ANY([]).[[]](input=rel#126:Subset#5.LOGICAL.ANY%28[]%29.[],condition=IS NOT NULL%28$1%29), rel#6972:DrillAggregateRel.LOGICAL.ANY([]).[](input=rel#6971:Subset#1714.NONE.ANY%28[]%29.[],group={0, 1})]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that restricting the rule patten to Drill logical only would have side effect, since there is some filter pushdown rule works under calcite logical world.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So probably two instances, one that is concrete Calcite logical and one that is concrete Drill logical?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right. that's something I'm thinking of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jacques-n @amansinha100 , could you please take another look at this PR? I modified the patch based on the discussion we had.
In stead of creating two instances (which causes Calcite complain duplicate rule descriptions error), I override match() method to check the rels have the same convention trait.
As a separate work, I'm working on DRILL-3855, to re-enable two rules (FilterSetOpTranspose, ProjectSetOpTranspose). These rules were disabled in DRILL-3257, since it caused TPC-DS Q74 to hang in planning, very similar to what was reported in this JIRA.
+1. Are there other rules that also may be getting applied with mixed convention (Calcite logical and Drill logical) ? I am surprised that we don't encountered more such cases.. |
We did have other rules getting applied mixed convention. For DRILL-3855, I'll fix push filter/project past setop rules. I'll open another JIRA to go through all the other rules. Even with the mixed convention, the issue reported in this jira did not always happen for every query. That's probably why we did not encountered more such cases before. |
I think there is another bug here. DrillAggregateRel should assert when a none convention is passed in. If you add that assertion where do we fail? We should fix that |
In general, this pattern should fail earlier. There should be an AbstractConverter between these rels. We should probably make sure that a rels assert their convention to avoid this type of bug earlier. |
Adding assertion to DrillAggregateRel did not show new failure when using the default FilterAggregateTransposeRule. In the rule, copy() method is used to create new instance of aggregate. So, the aggregate created by this rule should not violate the convention check. The patten of mixed rels does happen and it's the expected thing, since Drill's logical planning is dealing with two convention: NONE and LOGICAL. So, at any time, the RelSet would have RelSubSets for both conventions. It's normal to see a LogicalFilter whose input is a RelSubset with a DrillAggregate as a member. The problem is not we should not have such mixed patten. In stead, the problem is that the Rule should not match such pattern. If the rule matches such mixed Rel, then looks like it could cause problem. With the default rule, the trace of running 50 seconds shows FilterAggregateTransposeRule is fired abnormally high (it could be even high if it does not timeout) If we add the restriction, then the new rule was just fired coupled of times. |
Your expectation about mixed conventions is inconsistent with my own. My expectation: a RelSubset should only have a single convention (in fact it should have only a single trait definition and all children should have satisfying trait definitions). I would expect the following: (1) Prels should only have PHYSICAL My previous statement mistook the convention of the DrillAggregateRel in the below. I thought it said NONE. [rel#217:LogicalFilter.NONE.ANY([]).[], rel#6972:DrillAggregateRel.LOGICAL.ANY([]).] It seems like the actual problem is that LogicalFilter is allowing an input of LOGICAL. I don't believe that LOGICAL should satisfy NONE. It seems that LogicalFilter should assert that its child is of Convention NONE. (In other words, a LogicalFilter shouldn't have Convention transformation properties.) Otherwise, we'll see a large number of invalid (but infinite) plans. |
I agree that your expectation for RelSubset makes sense to me. However, for now it does not happen that way. The following is the trace for the query which went through planning with this customized rule (I removed some rels). Set#1, type: RecordType(BIGINT custkey, ANY custAddress) rel#345:LogicalFilter has a child rel#273 with LOGICAL convention. As another example, for the following query: Select n_name, n_nationkey from cp. The trace: Again, rel#53:LogicalProject has a child rel#35 whose convention is LOGICAL. I think the reason that we have such mixed rels is we have different kinds of rules, used in a single Volcano planning phase.
For instance, ProjectMergeRule, which matches base Project, yet uses default RelFactory, will match both LogicalProject and DrillProject, but produce LogicalProject as outcome. That will cause the mixed rels. 2 things we may consider to fix this:
The above 2 things would take some considerably effort, though. |
Yeah, you're right and thanks for reviewing fully. We can shelve for now but this definitely seems like something we should get to. +1 on the patch. |
Thanks for the discussion, @jacques-n . I opened DRILL-4541 to track the issues we discussed. I'll work on it and see if I can find a solution. |
…gate