DRILL-3855: Enable FilterSetOpTransposeRule, DrillProjectSetOpTransposeRule #1226

Status: Closed. Wants to merge 2 commits into base: master.
Conversation (3 participants)
@vdiravka
Member

vdiravka commented Apr 19, 2018

…seRule

@arina-ielchiieva

Contributor

arina-ielchiieva commented Apr 20, 2018

@amansinha100 / @vvysotskyi could you please review?

@@ -238,26 +238,27 @@ protected DrillRel convertToRawDrel(final RelNode relNode) throws SqlUnsupported
// HEP Directory pruning .
final RelNode pruned = transform(PlannerType.HEP_BOTTOM_UP, PlannerPhase.DIRECTORY_PRUNING, relNode);
final RelTraitSet logicalTraits = pruned.getTraitSet().plus(DrillRel.DRILL_LOGICAL);
final RelNode intermediateNode = transform(PlannerType.HEP, PlannerPhase.PRE_LOGICAL_PLANNING, pruned);

@amansinha100

amansinha100 Apr 20, 2018

Contributor

Did you consider applying this even before DIRECTORY_PRUNING? If a filter is on a dirN column and gets pushed to both sides of the union-all, then directory pruning can be done.
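The transpose being discussed can be sketched as a list identity (a minimal Python sketch of the idea, not Drill/Calcite code; the directory values and predicate are made up for illustration):

```python
# Sketch of FilterSetOpTransposeRule's effect:
#   Filter(UnionAll(left, right)) == UnionAll(Filter(left), Filter(right))
# Once the filter on a dirN partition column sits directly above each scan,
# directory pruning can drop non-matching directories on each side.

def filter_rows(rows, pred):
    return [r for r in rows if pred(r)]

left = [{"dir0": "2017", "key": 1}, {"dir0": "2018", "key": 2}]
right = [{"dir0": "2018", "key": 3}, {"dir0": "2019", "key": 4}]
pred = lambda r: r["dir0"] == "2018"

# Filter above the union-all (before the rule fires).
above = filter_rows(left + right, pred)
# Filter pushed below the union-all (after the rule fires).
pushed = filter_rows(left, pred) + filter_rows(right, pred)

assert above == pushed  # the rewrite preserves the result
```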

@vdiravka

vdiravka Apr 24, 2018

Member

Nice suggestion. Done.

@@ -178,6 +178,12 @@ public RuleSet getRules(OptimizerRulesContext context, Collection<StoragePlugin>
PlannerPhase.getPhysicalRules(context),
getStorageRules(context, plugins, this));
}
},
PRE_LOGICAL_PLANNING("Planning with Hep planner only for rules, which are failed for Volcano planner") {

@amansinha100

amansinha100 Apr 20, 2018

Contributor

I thought about whether a separate group of rules is needed instead of adding them to the existing group of HEP rules. But I suppose it is OK to create the separate group, with the expectation that once CALCITE-1271 is fixed for the Volcano planner, we have the option to remove this group. Thinking more about this: filter pushdown past a union-all is cost-safe, since union-all output rows are equal to or greater than input rows. (It is not cost-safe for pushdown past union-distinct.)

@vdiravka

vdiravka Apr 24, 2018

Member

You are right, it is necessary to add these rules to Drill's staticRuleSet so that these SetOpTranspose rules can cover more cases.

When this is added to the Volcano planner, we can decide whether to remove PlannerPhase.PRE_LOGICAL_PLANNING, leave it for more complete directory pruning, or move PlannerPhase.DIRECTORY_PRUNING to the staticRuleSet for the Volcano planner too.

".*UnionAll.*\n" +
".*Scan.*columns=\\[`n_regionkey`\\].*\n" +
".*Scan.*columns=\\[`r_regionkey`\\].*"};
final String[] expectedPlan = {"Sort.*\n" +

@amansinha100

amansinha100 Apr 20, 2018

Contributor

Apart from these tests, do we have any unit tests that produce empty results once the filter is pushed to either side of the union-all? I expect that some issues could be uncovered due to empty-batch handling; however, those could be treated as a separate JIRA.

@vdiravka

vdiravka Apr 24, 2018

Member

Looks like there are no issues related to it. I have added an additional test case to verify it:
select n_nationkey from (select n_nationkey, n_name, n_comment from cp."tpch/nation.parquet" union all select r_regionkey, r_name, r_comment from cp."tpch/region.parquet") where n_nationkey > 4;
When the filter is pushed down, the result from the right side of the UNION will be empty.

Generally, using the UNION operator with empty data was resolved in DRILL-4185, and we have unit tests for other cases with empty data batches.
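The scenario in that test case can be sketched numerically (a Python sketch assuming the standard TPC-H sample data, where region has r_regionkey 0..4 and nation has n_nationkey 0..24):

```python
# After pushdown, the predicate n_nationkey > 4 is applied to each input of
# the UNION ALL separately; the region side then yields an empty batch.
nation_keys = list(range(25))  # left input:  n_nationkey from nation.parquet
region_keys = list(range(5))   # right input: r_regionkey from region.parquet

pushed_left = [k for k in nation_keys if k > 4]
pushed_right = [k for k in region_keys if k > 4]

assert pushed_right == []                      # right side is empty
assert pushed_left == list(range(5, 25))       # left side keeps 20 rows
```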

@vdiravka

@amansinha100 Thank you for review. I have answered your comments.
Please take a look.


@amansinha100

Contributor

amansinha100 commented Apr 24, 2018

Thanks for making the changes. +1

@asfgit asfgit closed this in 6f6229a Apr 29, 2018
