New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules #32686
Conversation
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #139029 has finished for PR 32686 at commit
|
Kubernetes integration test starting |
Test build #139031 has finished for PR 32686 at commit
|
Kubernetes integration test unable to build dist. exiting with code: 1 |
Kubernetes integration test status failure |
Test build #139054 has finished for PR 32686 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #139070 has finished for PR 32686 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status success |
@dbaliafroozeh @gengliangwang @hvanhovell @maryannxue this PR is ready for review. Thanks! |
Test build #139096 has finished for PR 32686 at commit
|
Kubernetes integration test starting |
Kubernetes integration test status success |
Test build #139104 has finished for PR 32686 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
One small ergonomic comment. I would be great if we can create some shorthand for the function closures. I would probably make the in individual value be matcher for itself (if Enumeration allows subclassing of the Value class), and create a bunch of functions that allow you to compose them, e.g.: |
@@ -423,7 +424,9 @@ class Analyzer(override val catalogManager: CatalogManager) | |||
*/ | |||
object ResolveAliases extends Rule[LogicalPlan] { | |||
private def assignAliases(exprs: Seq[NamedExpression]) = { | |||
exprs.map(_.transformUp { case u @ UnresolvedAlias(child, optGenAliasFunc) => | |||
exprs.map(_.transformUpWithPruning(_.containsPattern(UNRESOLVED_ALIAS)) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: the indention looks wired here. Shall we move the {
in the above line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -1876,7 +1879,7 @@ class Analyzer(override val catalogManager: CatalogManager) | |||
private def allowGroupByAlias: Boolean = conf.groupByAliases && !conf.ansiEnabled | |||
|
|||
override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUpWithPruning( | |||
AlwaysProcess.fn, ruleId) { | |||
_.containsAllPatterns(AGGREGATE, UNRESOLVED_ATTRIBUTE), ruleId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add comment saying that mayResolveAttrByAggregateExprs
requires the TreePattern UNRESOLVED_ATTRIBUTE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -3736,7 +3744,8 @@ object EliminateUnions extends Rule[LogicalPlan] { | |||
* rule can't work for those parameters. | |||
*/ | |||
object CleanupAliases extends Rule[LogicalPlan] with AliasHelper { | |||
override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp { | |||
override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUpWithPruning( | |||
_.containsPattern(ALIAS)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we need to set the Tree Pattern of MultiAlias
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for the catch!
@@ -117,6 +120,7 @@ case class AggregateExpression( | |||
UnresolvedAttribute(aggregateFunction.toString) | |||
} | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I'm not sure what the transformWithPruning interface exactly looks like. IIUC, transformWithPruning may still not be able to just take a Anyway, thanks for the suggestion. I'll think about whether there's a simpler approach and may address it subsequent PRs. |
Kubernetes integration test unable to build dist. exiting with code: 1 |
Thanks, merging to master |
Test build #139125 has finished for PR 32686 at commit
|
What changes were proposed in this pull request?
Added the following TreePattern enums:
Added tree pattern pruning to the following Analyzer rules:
Why are the changes needed?
Reduce the number of tree traversals and hence improve the query compilation latency.
How was this patch tested?
Existing tests.
Performance diff:
<style type="text/css"></style>