[SPARK-35077][SQL] Migrate to transformWithPruning for leftover optimizer rules #32721

Closed
sigmod wants to merge 11 commits

Contributor

@sigmod sigmod commented Jun 1, 2021

What changes were proposed in this pull request?

Migrate to transformWithPruning for the following optimizer rules:

  • SimplifyExtractValueOps
  • NormalizeFloatingNumbers
  • PushProjectionThroughUnion
  • PushDownPredicates
  • ExtractPythonUDFFromAggregate
  • ExtractPythonUDFFromJoinCondition
  • ExtractGroupingPythonUDFFromAggregate
  • ExtractPythonUDFs
  • CleanupDynamicPruningFilters
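
For readers unfamiliar with the pruning idea, here is a toy sketch in Python (not Spark's Scala code). It assumes each node caches the set of patterns present in its subtree and that the traversal skips any subtree whose set fails a caller-supplied condition. `Node`, `transform_with_pruning`, `rewrite_filter`, and the pattern names are all hypothetical stand-ins chosen for illustration.

```python
class Node:
    """Toy immutable plan node (hypothetical, not a Spark class)."""

    def __init__(self, pattern, children=()):
        self.pattern = pattern
        self.children = tuple(children)
        # Cache every pattern that occurs in this subtree, computed bottom-up.
        self.patterns = frozenset({pattern}).union(*(c.patterns for c in self.children))


def transform_with_pruning(node, cond, rule, visited):
    """Top-down rewrite that skips subtrees whose cached patterns fail `cond`."""
    if not cond(node.patterns):
        return node  # pruned: neither this node nor its descendants are visited
    visited.append(node.pattern)
    rewritten = rule(node)
    new_children = tuple(
        transform_with_pruning(child, cond, rule, visited) for child in rewritten.children
    )
    if all(new is old for new, old in zip(new_children, rewritten.children)):
        return rewritten  # nothing below changed, so reuse the node
    return Node(rewritten.pattern, new_children)


def rewrite_filter(node):
    """Example rule: rename FILTER nodes, leave everything else untouched."""
    if node.pattern == "FILTER":
        return Node("FILTER_REWRITTEN", node.children)
    return node


plan = Node("PROJECT", [Node("JOIN", [
    Node("FILTER", [Node("SCAN")]),
    Node("AGGREGATE", [Node("SCAN")]),
])])

pruned_visits, full_visits = [], []
pruned_result = transform_with_pruning(
    plan, lambda bits: "FILTER" in bits, rewrite_filter, pruned_visits
)
full_result = transform_with_pruning(
    plan, lambda bits: True, rewrite_filter, full_visits
)

print(len(pruned_visits), "nodes visited with pruning")
print(len(full_visits), "nodes visited without pruning")
```

In this toy plan the pruned run visits 3 of the 6 nodes, and the untouched AGGREGATE subtree is returned as the same object rather than rebuilt.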

Why are the changes needed?

Reduce the number of tree traversals and hence improve the query compilation latency.

How was this patch tested?

Existing tests.
Performance diff:

Rule                                     Baseline      Experiment    Experiment/Baseline
SimplifyExtractValueOps                  99367049      3679579       0.04
NormalizeFloatingNumbers                 24717928      20451094      0.83
PushProjectionThroughUnion               14130245      7913551       0.56
PushDownPredicates                       276333542     261246842     0.95
ExtractPythonUDFFromAggregate            6459451       2683556       0.42
ExtractPythonUDFFromJoinCondition        5695404       2504573       0.44
ExtractGroupingPythonUDFFromAggregate    5546701       1858755       0.34
ExtractPythonUDFs                        58726458      1598518       0.03
CleanupDynamicPruningFilters             26606652      15417936      0.58
OptimizeSubqueries                       3072287940    2876462708    0.94
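
The last column is Experiment divided by Baseline, so values below 1.0 mean a smaller number after the change. A small Python check of that arithmetic, using the figures reported above (the units are not stated in the description, so the sketch treats them as opaque counts):

```python
# (rule, baseline, experiment) rows copied from the performance diff above.
timings = [
    ("SimplifyExtractValueOps", 99367049, 3679579),
    ("NormalizeFloatingNumbers", 24717928, 20451094),
    ("PushProjectionThroughUnion", 14130245, 7913551),
    ("PushDownPredicates", 276333542, 261246842),
    ("ExtractPythonUDFFromAggregate", 6459451, 2683556),
    ("ExtractPythonUDFFromJoinCondition", 5695404, 2504573),
    ("ExtractGroupingPythonUDFFromAggregate", 5546701, 1858755),
    ("ExtractPythonUDFs", 58726458, 1598518),
    ("CleanupDynamicPruningFilters", 26606652, 15417936),
    ("OptimizeSubqueries", 3072287940, 2876462708),
]

# Ratio below 1.0 means the experiment number is smaller than the baseline.
ratios = {name: experiment / baseline for name, baseline, experiment in timings}

# Print the rules from largest to smallest relative reduction.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:<40} {ratio:.2f}")
```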

@github-actions github-actions bot added the SQL label Jun 1, 2021

SparkQA commented Jun 1, 2021

Test build #139133 has finished for PR 32721 at commit d95d332.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jun 1, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43653/

SparkQA commented Jun 1, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43655/

@github-actions github-actions bot added the PYTHON label Jun 1, 2021

SparkQA commented Jun 1, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43655/

SparkQA commented Jun 1, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43663/

SparkQA commented Jun 1, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43663/

SparkQA commented Jun 1, 2021

Test build #139135 has finished for PR 32721 at commit 864ee6f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jun 1, 2021

Test build #139143 has finished for PR 32721 at commit 64220b7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

sigmod changed the title from "[WIP][SPARK-35077][SQL] Migrate to transformWithPruning for leftover optimizer rules" to "[SPARK-35077][SQL] Migrate to transformWithPruning for leftover optimizer rules" on Jun 1, 2021

Contributor Author

sigmod commented Jun 1, 2021

@dbaliafroozeh @gengliangwang @hvanhovell @maryannxue this PR is ready for review.

SparkQA commented Jun 1, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43695/

SparkQA commented Jun 1, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43695/

SparkQA commented Jun 1, 2021

Test build #139175 has finished for PR 32721 at commit 58732f9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Member

Merging to master
