@sigmod sigmod commented May 7, 2021

What changes were proposed in this pull request?

Added the following TreePattern enums:

  • BOOL_AGG
  • COUNT_IF
  • CURRENT_LIKE
  • RUNTIME_REPLACEABLE

Added tree traversal pruning to the following rules:

  • ReplaceExpressions
  • RewriteNonCorrelatedExists
  • ComputeCurrentTime
  • GetCurrentDatabaseAndCatalog
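
As a rough illustration of what pruning buys, here is a minimal, self-contained sketch. It is a toy model and not Spark's implementation: the `Expr` hierarchy, the `Pattern` enumeration, the `visited` counter, and the `CountIf` to `Count` rewrite are all hypothetical stand-ins. The idea it shows is that each node caches the set of patterns found in its subtree, so a rule can skip any subtree that cannot contain a match.

```scala
// Toy model of pattern-based pruning. Every name here is a hypothetical
// stand-in for illustration, not one of Spark's real classes.
object PruningSketch {
  object Pattern extends Enumeration {
    val COUNT_IF, COUNT, LITERAL, ADD = Value
  }

  sealed trait Expr {
    def children: Seq[Expr]
    def nodePatterns: Set[Pattern.Value]
    def withNewChildren(cs: Seq[Expr]): Expr
    // Patterns present anywhere in this subtree, computed once and cached.
    lazy val treePatterns: Set[Pattern.Value] =
      nodePatterns ++ children.flatMap(_.treePatterns)
    def containsAnyPattern(ps: Pattern.Value*): Boolean =
      ps.exists(treePatterns.contains)
  }

  case class Lit(v: Int) extends Expr {
    def children: Seq[Expr] = Nil
    def nodePatterns: Set[Pattern.Value] = Set(Pattern.LITERAL)
    def withNewChildren(cs: Seq[Expr]): Expr = this
  }
  case class Add(l: Expr, r: Expr) extends Expr {
    def children: Seq[Expr] = Seq(l, r)
    def nodePatterns: Set[Pattern.Value] = Set(Pattern.ADD)
    def withNewChildren(cs: Seq[Expr]): Expr = Add(cs(0), cs(1))
  }
  case class CountIf(child: Expr) extends Expr {
    def children: Seq[Expr] = Seq(child)
    def nodePatterns: Set[Pattern.Value] = Set(Pattern.COUNT_IF)
    def withNewChildren(cs: Seq[Expr]): Expr = CountIf(cs.head)
  }
  case class Count(child: Expr) extends Expr {
    def children: Seq[Expr] = Seq(child)
    def nodePatterns: Set[Pattern.Value] = Set(Pattern.COUNT)
    def withNewChildren(cs: Seq[Expr]): Expr = Count(cs.head)
  }

  // Counts the nodes the traversal actually looks at.
  var visited = 0

  // Top-down transform that returns a subtree untouched when cond is false.
  def transformWithPruning(e: Expr, cond: Expr => Boolean)(
      rule: PartialFunction[Expr, Expr]): Expr =
    if (!cond(e)) e
    else {
      visited += 1
      val rewritten = rule.applyOrElse(e, (x: Expr) => x)
      rewritten.withNewChildren(
        rewritten.children.map(c => transformWithPruning(c, cond)(rule)))
    }

  // A toy rule: only descend into subtrees that contain COUNT_IF.
  def replaceCountIf(e: Expr): Expr =
    transformWithPruning(e, _.containsAnyPattern(Pattern.COUNT_IF)) {
      case CountIf(c) => Count(c)
    }
}
```

In this sketch, calling `replaceCountIf` on a tree with no `CountIf` node returns the same instance without visiting any node, while a tree such as `Add(CountIf(Lit(1)), Add(Lit(2), Lit(3)))` is rewritten after visiting only the root and the `CountIf` node.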

Why are the changes needed?

To reduce the number of tree traversals and thereby lower query compilation latency.

Performance improvement (org.apache.spark.sql.TPCDSQuerySuite):
| Rule name | Total Time (baseline) | Total Time (experiment) | experiment/baseline |
|---|---|---|---|
| ReplaceExpressions | 27546369 | 19753804 | 0.72 |
| RewriteNonCorrelatedExists | 17304883 | 2086194 | 0.12 |
| ComputeCurrentTime | 35751301 | 19984477 | 0.56 |
| GetCurrentDatabaseAndCatalog | 37230787 | 18874013 | 0.51 |

How was this patch tested?

Existing tests.

@github-actions github-actions bot added the SQL label May 7, 2021

SparkQA commented May 7, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42743/

SparkQA commented May 7, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42743/

@sigmod sigmod changed the title from "[WIP][SPARK-35146][SQL] Migrate to transformWithPruning or resolveWithPruning for rules in finishAnalysis.scala" to "[SPARK-35146][SQL] Migrate to transformWithPruning or resolveWithPruning for rules in finishAnalysis.scala" May 7, 2021

sigmod commented May 7, 2021

@hvanhovell @gengliangwang @dbaliafroozeh @maryannxue this PR is ready for review.

SparkQA commented May 7, 2021

Test build #138221 has finished for PR 32461 at commit 6fc4523.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented May 11, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42887/

SparkQA commented May 11, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42887/

@gengliangwang gengliangwang left a comment

Thanks for the work as always.

@gengliangwang

Thanks, merging to master

SparkQA commented May 11, 2021

Test build #138364 has finished for PR 32461 at commit a5f4fee.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • trait ShuffledJoin extends JoinCodegenSupport
