-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-42500][SQL] ConstantPropagation supports more cases #40093
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -200,14 +200,20 @@ object ConstantPropagation extends Rule[LogicalPlan] { | |
|
|
||
| private def replaceConstants(condition: Expression, equalityPredicates: EqualityPredicates) | ||
| : Expression = { | ||
| val constantsMap = AttributeMap(equalityPredicates.map(_._1)) | ||
| val predicates = equalityPredicates.map(_._2).toSet | ||
| def replaceConstants0(expression: Expression) = expression transform { | ||
| val allConstantsMap = AttributeMap(equalityPredicates.map(_._1)) | ||
| val allPredicates = equalityPredicates.map(_._2).toSet | ||
| def replaceConstants0( | ||
| expression: Expression, constantsMap: AttributeMap[Literal]) = expression transform { | ||
| case a: AttributeReference => constantsMap.getOrElse(a, a) | ||
| } | ||
| condition transform { | ||
| case e @ EqualTo(_, _) if !predicates.contains(e) => replaceConstants0(e) | ||
| case e @ EqualNullSafe(_, _) if !predicates.contains(e) => replaceConstants0(e) | ||
| case b: BinaryComparison => | ||
| if (!allPredicates.contains(b)) { | ||
| replaceConstants0(b, allConstantsMap) | ||
| } else { | ||
| val excludedEqualityPredicates = equalityPredicates.filterNot(_._2.semanticEquals(b)) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Exclude current binary comparison to support the following case: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this doesn't look like constant propagation but a different optimization to simplify predicates:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This PR does not support There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we exclude it first? We should add a new optimizer rule later to do more advanced predicate simplification. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean only support There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nvm, the rule name is |
||
| replaceConstants0(b, AttributeMap(excludedEqualityPredicates.map(_._1))) | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To support other binary comparisons. For example:
>,<.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we look at the commit history and understand why constant propagation only supports equality predicates before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial commit did not support it. It seems other binary comparisons were not considered at the time:
#17993
Later we tried to enhance it. It seems it is not easy because there are too many changes:
#24553
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @peter-toth
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can replace constants in other expressions too, #24553 does the same (and more).
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I get the issue. If a complex expression is equal to a constant why we can't use that constant where the complex expression appears? I don't think it matters if some parts of the complex expression can also be replaced to another constat. The replace happens using
transformDownso larger complex expressions are replaced to constants first. E.g.a = 1 AND (a + b) = 1 AND (a + b) > 2->a = 1 AND (1 + b) = 1 AND 1 > 2->falsemakes sense to me.Yes we can. The reason why I didn't do those kind of shortcuts in
ConstantPropagationin my PR is because there are other, similar cases where it isn't that easy to do a shortcut (or simplification has been handled in other rules so why doing it here as well) . E.g.a = 1 AND a > 2->a = 1 AND 1 > 2->falsewhere the 2nd step (1 > 2->false) could also be handled inConstantPropagationbut those foldable expressions are already handled inConstantFoldingand then inBooleanSimplificationwhen contant propagation has been done.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me open a PR tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #40268.
I didn't want to overcomplicate the change so took only a few parts from #24553. But if you think 1. or 2. or any other feature from the original PR is also needed I can add it to the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan This is why conflicting equality predicates are not supported: #17993 (comment)
It is correct if it only optimize on the filter condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sidenote: In my old PR (https://github.com/apache/spark/pull/24553/files#diff-d43484d56a4d9991066b5c00d12ec2465c75131e055fc02ee7fb6dfd45b5006fR76-R79) I proposed #27309 to detect equijoins better and to allow constant propagation in
Joins too.