[SPARK-20246][SQL] should not push predicate down through aggregate with non-deterministic expressions #17562

Closed
wants to merge 2 commits into from

Conversation

cloud-fan

What changes were proposed in this pull request?

Similar to Project, when an Aggregate contains non-deterministic expressions, we should not push predicates down through it. Doing so changes the number of input rows, and therefore changes the values that the non-deterministic expressions in the Aggregate evaluate to.
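
To make the reasoning concrete, here is a minimal sketch in plain Scala with no Spark dependency. Everything in it is a hypothetical stand-in rather than Catalyst code: `Row`, `aggregateWithCounter`, and the stateful counter, which plays the role of a non-deterministic expression whose value depends on how many rows were evaluated before it.

```scala
object AggregatePushdownSketch {
  // Hypothetical row type: a grouping key and a payload column.
  final case class Row(key: Int, value: Int)

  // Toy aggregate: one output row per distinct key, paired with the next value
  // of a stateful counter (an assumed stand-in for a non-deterministic expression).
  def aggregateWithCounter(rows: Seq[Row]): Seq[(Int, Int)] = {
    var next = 0
    rows.map(_.key).distinct.sorted.map { k =>
      val id = next
      next += 1
      (k, id)
    }
  }

  val input: Seq[Row] = Seq(Row(1, 10), Row(2, 20), Row(2, 21), Row(3, 30))

  // Original plan: aggregate first, then filter on the grouping key.
  val filterAbove: Seq[(Int, Int)] =
    aggregateWithCounter(input).filter(_._1 > 1) // (2,1), (3,2)

  // Rewritten plan: the same predicate pushed below the aggregate.
  val filterBelow: Seq[(Int, Int)] =
    aggregateWithCounter(input.filter(_.key > 1)) // (2,0), (3,1)

  def main(args: Array[String]): Unit = {
    println(filterAbove)
    println(filterBelow)
  }
}
```

The two plans return the same keys but different counter values, because the pushed-down filter removed the group for key 1 before the counter was evaluated.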

How was this patch tested?

Added a new regression test.

@@ -134,15 +134,20 @@ class FilterPushdownSuite extends PlanTest {
comparePlans(optimized, correctAnswer)
}

test("nondeterministic: can't push down filter with nondeterministic condition through project") {
Contributor Author

This test was wrong. We can actually push a nondeterministic filter down through a project, as long as every expression in the project list is deterministic.
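
A small Spark-free sketch of why this is safe, under the assumption that a deterministic project neither adds, drops, nor reorders rows. The names `everyOther` and `project` are hypothetical: `everyOther` models a nondeterministic filter whose outcome depends only on how many rows it has seen, not on their contents.

```scala
object ProjectPushdownSketch {
  // Assumed stand-in for a nondeterministic predicate: keeps the 1st, 3rd, 5th, ...
  // row it is evaluated on, regardless of what the row contains.
  def everyOther[A](rows: Seq[A]): Seq[A] = {
    var n = 0
    rows.filter { _ => n += 1; n % 2 == 1 }
  }

  // A deterministic project: one output row per input row, in the same order.
  def project(rows: Seq[Int]): Seq[Int] = rows.map(_ * 2)

  val input: Seq[Int] = Seq(1, 2, 3, 4, 5)

  // Filter above the project versus filter pushed below it.
  val filterAbove: Seq[Int] = everyOther(project(input)) // 2, 6, 10
  val filterBelow: Seq[Int] = project(everyOther(input)) // 2, 6, 10
}
```

Both orders see the same number of rows in the same order, so the stateful predicate makes identical decisions either way.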

@cloud-fan
Contributor Author

cc @liancheng @gatorsmile

@@ -134,15 +134,20 @@ class FilterPushdownSuite extends PlanTest {
     comparePlans(optimized, correctAnswer)
   }

-  test("nondeterministic: can't push down filter with nondeterministic condition through project") {
+  test("nondeterministic: can push down nondeterministic filter through project") {
Member

nit: through project with all deterministic fields.

@viirya
Member

viirya commented Apr 7, 2017

LGTM except a minor comment.

@SparkQA

SparkQA commented Apr 7, 2017

Test build #75597 has finished for PR 17562 at commit a2599be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 7, 2017

Test build #75598 has finished for PR 17562 at commit e6a1bfe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 7, 2017

Test build #75600 has finished for PR 17562 at commit e6546be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -792,7 +793,8 @@ object PushDownPredicate extends Rule[LogicalPlan] with PredicateHelper {
         filter
       }

-    case filter @ Filter(condition, aggregate: Aggregate) =>
+    case filter @ Filter(condition, aggregate: Aggregate)
+      if aggregate.aggregateExpressions.forall(_.deterministic) =>
Member

Could you move this case above `case filter @ Filter(condition, w: Window)`?

Given the comment you added above, that ordering would be easier for readers to follow.
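
For readers unfamiliar with how such a guard behaves, here is a minimal sketch on a toy plan ADT. All types here (`Expr`, `Plan`, `Rand`, `GreaterThan`, and so on) are hypothetical simplifications defined in the block, not Catalyst classes, and the toy rule ignores details the real rule handles, such as splitting the condition and checking which columns it references.

```scala
object GuardedPushdownSketch {
  // Toy expressions: each one reports whether it is deterministic.
  sealed trait Expr { def deterministic: Boolean }
  final case class Col(name: String) extends Expr { val deterministic = true }
  final case class Rand(seed: Long) extends Expr { val deterministic = false }
  final case class GreaterThan(left: Expr, value: Int) extends Expr {
    def deterministic: Boolean = left.deterministic
  }

  // Toy logical plans.
  sealed trait Plan
  final case class Relation(name: String) extends Plan
  final case class Aggregate(
      grouping: Seq[Expr],
      aggregateExpressions: Seq[Expr],
      child: Plan) extends Plan
  final case class Filter(condition: Expr, child: Plan) extends Plan

  // Push the filter below the aggregate only when the guard holds;
  // otherwise the match falls through and the plan is returned unchanged.
  def pushDown(plan: Plan): Plan = plan match {
    case Filter(condition, aggregate: Aggregate)
        if aggregate.aggregateExpressions.forall(_.deterministic) &&
          condition.deterministic =>
      aggregate.copy(child = Filter(condition, aggregate.child))
    case other => other
  }
}
```

Because the guard is part of the `case`, an aggregate with a non-deterministic expression simply does not match and is left for later cases, which is why the position of this case relative to the others matters for readability.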

@gatorsmile
Member

LGTM except a minor comment

@SparkQA

SparkQA commented Apr 7, 2017

Test build #75604 has finished for PR 17562 at commit f254d5f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Apr 8, 2017
…ith non-deterministic expressions

## What changes were proposed in this pull request?

Similar to `Project`, when `Aggregate` has non-deterministic expressions, we should not push predicate down through it, as it will change the number of input rows and thus change the evaluation result of non-deterministic expressions in `Aggregate`.

## How was this patch tested?

new regression test

Author: Wenchen Fan <wenchen@databricks.com>

Closes #17562 from cloud-fan/filter.

(cherry picked from commit 7577e9c)
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
@gatorsmile
Member

Thanks! Merging to master/2.1/2.0

@asfgit asfgit closed this in 7577e9c Apr 8, 2017