[SPARK-11179] [SQL] Push filters through aggregate #9167

nitin2goyal · 2015-10-19T06:55:38Z

Push conjunctive predicates through Aggregate operators when their references are a subset of the groupingExpressions.

Query plan before optimisation:
Filter ((c#138L = 2) && (a#0 = 3))
 Aggregate [a#0], [a#0,count(b#1) AS c#138L]
  Project [a#0,b#1]
   LocalRelation [a#0,b#1,c#2]

Query plan after optimisation:
Filter (c#138L = 2)
 Aggregate [a#0], [a#0,count(b#1) AS c#138L]
  Filter (a#0 = 3)
   Project [a#0,b#1]
    LocalRelation [a#0,b#1,c#2]
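
The rewrite described above can be sketched on a toy plan model. This is a minimal illustration, not the code in this PR and not Catalyst's API: every type below (`Expr`, `Plan`, `PushThroughAggregateSketch`, and so on) is hypothetical, attributes are plain strings, and the sketch assumes grouping columns pass through the aggregate under their own names, so it ignores aliases and non-deterministic expressions.

```scala
// Toy expression model (hypothetical, not Catalyst): attributes are plain strings.
sealed trait Expr { def references: Set[String] }
case class EqualTo(attr: String, value: Int) extends Expr {
  def references: Set[String] = Set(attr)
}
case class And(left: Expr, right: Expr) extends Expr {
  def references: Set[String] = left.references ++ right.references
}

// Toy logical plan model (hypothetical, not Catalyst).
sealed trait Plan
case class Relation(output: Seq[String]) extends Plan
case class Project(columns: Seq[String], child: Plan) extends Plan
case class Filter(condition: Expr, child: Plan) extends Plan
// `aggregates` maps an output name to a description such as "count(b)".
case class Aggregate(grouping: Seq[String], aggregates: Map[String, String], child: Plan)
  extends Plan

object PushThroughAggregateSketch {
  // Flatten nested Ands into the list of individual conjuncts.
  def splitConjuncts(e: Expr): Seq[Expr] = e match {
    case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
    case other     => Seq(other)
  }

  // Rebuild a single condition from a non-empty list of conjuncts.
  private def conjoin(es: Seq[Expr]): Expr = es.reduceLeft[Expr]((l, r) => And(l, r))

  def apply(plan: Plan): Plan = plan match {
    case Filter(condition, agg @ Aggregate(grouping, _, child)) =>
      val groupingSet: Set[String] = grouping.toSet
      // Conjuncts that only touch grouping attributes can run before aggregation.
      val (pushDown, stayUp) =
        splitConjuncts(condition).partition(_.references.subsetOf(groupingSet))
      if (pushDown.isEmpty) plan
      else {
        val pushed = agg.copy(child = Filter(conjoin(pushDown), child))
        if (stayUp.isEmpty) pushed else Filter(conjoin(stayUp), pushed)
      }
    case other => other
  }
}
```

Applied to a toy version of the "before" plan, with the count exposed under the assumed name `cnt`, the conjunct on `a` should end up in a `Filter` directly below the `Aggregate`, while the conjunct on `cnt` stays above it.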

marmbrus · 2015-10-19T20:34:01Z

Tests please. Look here for examples.

nitin2goyal · 2015-10-20T07:22:26Z

Added tests

hvanhovell · 2015-10-20T07:40:49Z

We could do a similar thing for window functions.

rxin · 2015-10-20T08:31:01Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala

rather than checking for complete overlap, can we pull out the expressions for group by columns and push those down?

nitin2goyal · 2015-10-20

Good point. Incorporated.
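
One reading of the suggestion above, sketched with hypothetical types rather than the PR's code: an all-or-nothing check pushes the filter only when the whole condition references nothing but grouping columns, whereas a per-conjunct split pushes whichever conjuncts qualify and leaves the rest above the aggregate. `Conjunct` and `OverlapVsSplit` are illustrative names, and each predicate is reduced to its text plus the set of attributes it references.

```scala
// Hypothetical stand-in for a predicate: its text plus the attributes it references.
case class Conjunct(text: String, references: Set[String])

object OverlapVsSplit {
  // All-or-nothing: true only if every attribute in the whole condition is a grouping attribute.
  def canPushWholeCondition(conjuncts: Seq[Conjunct], grouping: Set[String]): Boolean =
    conjuncts.flatMap(_.references).toSet.subsetOf(grouping)

  // Per-conjunct: returns (conjuncts pushed below the aggregate, conjuncts kept above it).
  def splitByGrouping(
      conjuncts: Seq[Conjunct],
      grouping: Set[String]): (Seq[Conjunct], Seq[Conjunct]) =
    conjuncts.partition(_.references.subsetOf(grouping))
}
```

For the condition `(c = 2) && (a = 3)` grouped by `a`, the first check rejects the push entirely, while the split still sends `a = 3` below the aggregate.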

marmbrus · 2015-10-20T17:35:41Z

ok to test

marmbrus · 2015-10-20T17:37:32Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala

nit: indentation. I'm not sure we have a strict rule, but no indent is kinda hard to follow. I'd probably try to make it a tree if it fits?

case filter @ Filter(condition, aggregate @ Aggregate(groupingExpressions, aggregateExpressions, grandChild)) =>

or just 4 space indent?

marmbrus · 2015-10-20T17:39:41Z

This looks great, thanks for doing it!

Can you clean up the title: [SPARK-11179] [SQL] Push filters through aggregate
and the description: Push conjunctive predicates through Aggregate operators when their references are a subset of the groupingExpressions.

If you are feeling ambitious I'd include query plans before and after the optimization too.

These become the commit message when we use our merge tool.

SparkQA · 2015-10-20T17:52:31Z

Test build #43995 has finished for PR 9167 at commit 671fbb3.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

nitin2goyal · 2015-10-20T18:43:34Z

Thanks for reviewing it, Michael. Addressed one code review comment, fixed a couple of scalastyle issues, and cleaned up the title and description in the latest commit and this PR.

SparkQA · 2015-10-20T18:47:14Z

Test build #44002 has finished for PR 9167 at commit f422aa8.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

nitin2goyal · 2015-10-20T19:03:19Z

@marmbrus The test failed again due to a whitespace issue. I have already removed the whitespace from that line, and I am not getting a scalastyle issue when compiling locally (I don't see any whitespace in the code review either). Am I missing something?

[error] /home/jenkins/workspace/SparkPullRequestBuilder/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FilterPushdownSuite.scala:664:0: Whitespace at end of line

marmbrus · 2015-10-20T19:04:57Z

try sbt test:scalastyle

rxin · 2015-10-20T19:07:09Z

You can run local style tests by

dev/lint-scala

nitin2goyal · 2015-10-21T04:47:04Z

Somehow, I wasn't getting scalastyle issue on my branch with above 2 commands. Cloned apache spark master, cherry-picked my changes and then I got the error. Checked-in the whitespace removal. Please test this.

SparkQA · 2015-10-21T07:10:47Z

Test build #44041 has finished for PR 9167 at commit 82fc386.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

marmbrus · 2015-10-21T17:43:45Z

Thanks, merging to master!

SPARK-11179: Push filters through aggregate if filters are subset of …

4ee8058

…'group by' attribute set

SPARK-11179: Push filters through aggregate if filters are subset of …

3b016b7

…'group by' attribute set

rxin reviewed Oct 20, 2015

SPARK-11179: Push filters through aggregate if filters are subset of …

671fbb3

…'group by' attribute set

marmbrus reviewed Oct 20, 2015

nitin2goyal changed the title ~~SPARK-11179: Push filters through aggregate if filters are subset of …~~ [SPARK-11179] [SQL] Push filters through aggregate Oct 20, 2015

asfgit closed this in f62e326 Oct 21, 2015

6 participants