-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-20246][SQL] should not push predicate down through aggregate with non-deterministic expressions #17562
Conversation
@@ -134,15 +134,20 @@ class FilterPushdownSuite extends PlanTest { | |||
comparePlans(optimized, correctAnswer) | |||
} | |||
|
|||
test("nondeterministic: can't push down filter with nondeterministic condition through project") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was wrong, actually we can push down nondeterministic filter through project, as long as the project list is all deterministic.
@@ -134,15 +134,20 @@ class FilterPushdownSuite extends PlanTest { | |||
comparePlans(optimized, correctAnswer) | |||
} | |||
|
|||
test("nondeterministic: can't push down filter with nondeterministic condition through project") { | |||
test("nondeterministic: can push down nondeterministic filter through project") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: through project with all deterministic fields.
LGTM except a minor comment. |
Test build #75597 has finished for PR 17562 at commit
|
Test build #75598 has finished for PR 17562 at commit
|
Test build #75600 has finished for PR 17562 at commit
|
@@ -792,7 +793,8 @@ object PushDownPredicate extends Rule[LogicalPlan] with PredicateHelper { | |||
filter | |||
} | |||
|
|||
case filter @ Filter(condition, aggregate: Aggregate) => | |||
case filter @ Filter(condition, aggregate: Aggregate) | |||
if aggregate.aggregateExpressions.forall(_.deterministic) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move this case above case filter @ Filter(condition, w: Window)
?
Based on the comment you add above, it becomes easier to follow by the readers.
LGTM except a minor comment |
Test build #75604 has finished for PR 17562 at commit
|
…ith non-deterministic expressions ## What changes were proposed in this pull request? Similar to `Project`, when `Aggregate` has non-deterministic expressions, we should not push predicate down through it, as it will change the number of input rows and thus change the evaluation result of non-deterministic expressions in `Aggregate`. ## How was this patch tested? new regression test Author: Wenchen Fan <wenchen@databricks.com> Closes #17562 from cloud-fan/filter. (cherry picked from commit 7577e9c) Signed-off-by: Xiao Li <gatorsmile@gmail.com>
…ith non-deterministic expressions ## What changes were proposed in this pull request? Similar to `Project`, when `Aggregate` has non-deterministic expressions, we should not push predicate down through it, as it will change the number of input rows and thus change the evaluation result of non-deterministic expressions in `Aggregate`. ## How was this patch tested? new regression test Author: Wenchen Fan <wenchen@databricks.com> Closes #17562 from cloud-fan/filter. (cherry picked from commit 7577e9c) Signed-off-by: Xiao Li <gatorsmile@gmail.com>
Thanks! Merging to master/2.1/2.0 |
What changes were proposed in this pull request?
Similar to
Project
, whenAggregate
has non-deterministic expressions, we should not push predicate down through it, as it will change the number of input rows and thus change the evaluation result of non-deterministic expressions inAggregate
.How was this patch tested?
new regression test