[WIP][SPARK-34884][SQL] Improve DPP evaluation #32097

wangyum · 2021-04-08T14:44:01Z

What changes were proposed in this pull request?

This PR avoids driver OOM when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly=false.

Why are the changes needed?

With this change, spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly can be safely disabled to improve query performance.

Benchmark result (spark.sql.adaptive.enabled=false):

| TPC-DS 5T query | Before (reuseBroadcastOnly=true) | After (reuseBroadcastOnly=false) |
| --- | --- | --- |
| 58 | 144 | 21 |
| 73 | 8 | 7 |
| 83 | 25 | 14 |
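
For reference, the two settings named in the benchmark for the "After" column could be applied to a session roughly as follows. This is only a sketch: it assumes a Spark 3.x build on the classpath, and the object name, app name, and local master are placeholders, not the benchmark's actual setup.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical driver object; only the two config keys come from the PR text.
object DppBenchmarkSettings {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dpp-reuse-broadcast-only-off") // placeholder name
      .master("local[*]")                       // placeholder master
      // Adaptive query execution was off for the benchmark.
      .config("spark.sql.adaptive.enabled", "false")
      // Allow DPP filters that do not reuse a broadcast exchange.
      .config("spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly", "false")
      .getOrCreate()

    // Read the value back to confirm what the session will use.
    println(spark.conf.get("spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly"))

    spark.stop()
  }
}
```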

Does this PR introduce any user-facing change?

No.

How was this patch tested?

// TODO

wangyum · 2021-04-08T14:44:39Z

...src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PlanDynamicPruningFilters.scala

-          // it is not worthwhile to execute the query, so we fall-back to a true literal
-          DynamicPruningExpression(Literal.TrueLiteral)
-        } else {
+        } else if (canBroadcastBySize(buildPlan, conf)) {
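
The added condition gates the branch on the estimated size of the build-side plan. As a rough, dependency-free sketch of what such a size gate checks, assuming it compares the plan's estimated size against the broadcast threshold (spark.sql.autoBroadcastJoinThreshold); PlanStats and BroadcastSizeCheck below are hypothetical stand-ins, not Spark classes:

```scala
// Hypothetical stand-in for a logical plan's statistics; not Spark's own class.
final case class PlanStats(sizeInBytes: BigInt)

object BroadcastSizeCheck {
  // True when the estimated size is non-negative and at or below the threshold.
  // A negative threshold (for example -1) can never be satisfied.
  def canBroadcastBySize(stats: PlanStats, thresholdBytes: Long): Boolean =
    stats.sizeInBytes >= 0 && stats.sizeInBytes <= thresholdBytes

  def main(args: Array[String]): Unit = {
    val threshold = 10L * 1024 * 1024 // 10 MB, a value assumed for this example

    // 5 MB build side: small enough to broadcast.
    println(canBroadcastBySize(PlanStats(BigInt(5L * 1024 * 1024)), threshold))
    // 50 MB build side: too large for the assumed threshold.
    println(canBroadcastBySize(PlanStats(BigInt(50L * 1024 * 1024)), threshold))
  }
}
```

A gate like this keeps a large build side from being collected on the driver, which appears to be the OOM risk the description refers to.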


cc @cloud-fan

SparkQA · 2021-04-08T16:03:10Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41662/

SparkQA · 2021-04-08T16:03:11Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41662/

SparkQA · 2021-04-08T18:52:32Z

Test build #137084 has finished for PR 32097 at commit 7c47937.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

Disable spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly

7c47937


github-actions bot added the SQL label Apr 8, 2021

wangyum closed this May 20, 2021
