-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-43095][SQL] Avoid Once strategy's idempotence is broken for batch: Infer Filters
#40742
Conversation
cc @cloud-fan |
@@ -1430,6 +1430,10 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan] | |||
.union(constructIsNotNullConstraints(constraints, plan.output)) | |||
.filter { c => | |||
c.references.nonEmpty && c.references.subsetOf(plan.outputSet) && c.deterministic | |||
}.filterNot { | |||
// Avoid once strategy idempotence is broken. | |||
case a EqualNullSafe b => a.semanticEquals(b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we fix the place that generates this useless predicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Moved it to ConstraintHelper.inferAdditionalConstraints
.
@@ -75,7 +75,10 @@ trait ConstraintHelper { | |||
inferredConstraints ++= replaceConstraints(predicates - eq, l, r) | |||
case _ => // No inference | |||
} | |||
inferredConstraints -- constraints | |||
(inferredConstraints -- constraints).filterNot { | |||
case a EqualNullSafe b => a.semanticEquals(b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this PR. the generated EqualNullSafe
will be optimized by SimplifyBinaryComparison
.
@@ -75,7 +75,10 @@ trait ConstraintHelper { | |||
inferredConstraints ++= replaceConstraints(predicates - eq, l, r) | |||
case _ => // No inference | |||
} | |||
inferredConstraints -- constraints | |||
(inferredConstraints -- constraints).filterNot { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we update the pattern match above to add if !l.semanticEquals(r)
?
@@ -66,13 +66,13 @@ trait ConstraintHelper { | |||
val predicates = constraints.filterNot(_.isInstanceOf[IsNotNull]) | |||
predicates.foreach { | |||
case eq @ EqualTo(l: Attribute, r: Attribute) => | |||
val candidateConstraints = predicates - eq | |||
val candidateConstraints = predicates - eq - EqualNullSafe(l, r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if l.semanticEquals(r)
, we don't need to infer predicates at all, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. But the current situation is like this: l <=> l and r <=> r
is inferred from l === r and l <=> r
. At this time, l.semanticEquals(r)
is false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah got it. Let's add a code comment to explain it with this example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain it a bit more about why it makes the optimizer not idempotent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example:
Join Inner, (a#0 = a#7)
:- Project [a#0, a#0 AS xa#3]
: +- Join Inner, (a#0 = a#4)
: :- LocalRelation <empty>, [a#0, b#1, c#2]
: +- LocalRelation <empty>, [a#4, b#5, c#6]
+- LocalRelation <empty>, [a#7, b#8, c#9]
After InferFiltersFromConstraints
:
Join Inner, (a#0 = a#7)
:- Filter ((a#0 <=> a#0) AND (xa#3 <=> xa#3))
: +- Project [a#0, a#0 AS xa#3]
: +- Join Inner, (a#0 = a#4)
: :- Filter isnotnull(a#0)
: : +- LocalRelation <empty>, [a#0, b#1, c#2]
: +- Filter isnotnull(a#4)
: +- LocalRelation <empty>, [a#4, b#5, c#6]
+- Filter isnotnull(a#7)
+- LocalRelation <empty>, [a#7, b#8, c#9]
And run InferFiltersFromConstraints
again. The left side constraints:
(a#0 <=> xa#3)
(xa#3 = a#0)
(a#0 <=> a#0)
(xa#3 <=> xa#3)
The right side constraints:
(isnotnull(a#7))
Join condition is:
(a#0 = a#7)
Based on these constraints, the inferred result is:
(a#7 <=> xa#3)
(xa#3 = a#7)
(a#7 <=> a#7)
(a#7 <=> a#7)
is a valid constraints for right side. The result:
Join Inner, (a#0 = a#7)
:- Filter ((a#0 <=> a#0) AND (xa#3 <=> xa#3))
: +- Project [a#0, a#0 AS xa#3]
: +- Join Inner, (a#0 = a#4)
: :- Filter isnotnull(a#0)
: : +- LocalRelation <empty>, [a#0, b#1, c#2]
: +- Filter isnotnull(a#4)
: +- LocalRelation <empty>, [a#4, b#5, c#6]
+- Filter (a#7 <=> a#7)
+- Filter isnotnull(a#7)
+- LocalRelation <empty>, [a#7, b#8, c#9]
@@ -66,13 +66,13 @@ trait ConstraintHelper { | |||
val predicates = constraints.filterNot(_.isInstanceOf[IsNotNull]) | |||
predicates.foreach { | |||
case eq @ EqualTo(l: Attribute, r: Attribute) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not related to this PR, but shall we match Equality
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It seems ok.
@@ -66,13 +66,15 @@ trait ConstraintHelper { | |||
val predicates = constraints.filterNot(_.isInstanceOf[IsNotNull]) | |||
predicates.foreach { | |||
case eq @ EqualTo(l: Attribute, r: Attribute) => | |||
val candidateConstraints = predicates - eq | |||
// Also remove EqualNullSafe with the same l and r to avoid Once strategy's idempotence | |||
// is broken. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's mention l === r and l <=> r
can infer l <=> l and r <=> r
which is useless.
Merged to master. |
What changes were proposed in this pull request?
This PR makes it also remove
EqualNullSafe
when removingEqualTo
if their children are same when constructing candidate constraints inConstraintHelper.inferAdditionalConstraints
.For example:
l = r and l <=> r
. Before this PR,l = r and l <=> r
can inferl <=> l and r <=> r
which is useless. After This PR, it can't infer anything.Why are the changes needed?
Avoid Once strategy's idempotence is broken for batch:
Infer Filters
:Before this PR:
After this PR:
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit test.