@wangyum wangyum commented Apr 8, 2021

What changes were proposed in this pull request?

This PR avoids driver OOM when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly=false.

Why are the changes needed?

The change makes it safe to disable spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly, which improves query performance.

Benchmark result (spark.sql.adaptive.enabled=false):

| TPCDS 5T SQL | Before (reuseBroadcastOnly=true) | After (reuseBroadcastOnly=false) |
| --- | --- | --- |
| 58 | 144 | 21 |
| 73 | 8 | 7 |
| 83 | 25 | 14 |
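
For anyone trying to reproduce the comparison, the two settings named in the benchmark can be applied when building the session. This is a minimal sketch, assuming Spark is on the classpath and the job is launched through spark-submit or spark-shell; the application name is a hypothetical placeholder, and the TPC-DS data and queries are not shown.

```scala
import org.apache.spark.sql.SparkSession

// Session configured like the "After" column of the benchmark.
// Only the two config keys come from the PR description; the app name is made up.
val spark = SparkSession.builder()
  .appName("dpp-reuse-broadcast-only-check") // hypothetical name
  .config("spark.sql.adaptive.enabled", "false")
  .config("spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly", "false")
  .getOrCreate()

// Read the values back to confirm they took effect.
println(spark.conf.get("spark.sql.adaptive.enabled"))
println(spark.conf.get("spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly"))
```

Setting reuseBroadcastOnly back to "true" gives the "Before" configuration.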

Does this PR introduce any user-facing change?

No.

How was this patch tested?

// TODO

// it is not worthwhile to execute the query, so we fall-back to a true literal
DynamicPruningExpression(Literal.TrueLiteral)
- } else {
+ } else if (canBroadcastBySize(buildPlan, conf)) {
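
The effect of the changed branch can be illustrated with a simplified, self-contained model. This is only a sketch under stated assumptions: the types and the `choose` function are hypothetical and are not Spark APIs, `fitsBroadcastThreshold` is an assumed stand-in for the `canBroadcastBySize(buildPlan, conf)` check in the diff, and the final fall-back to a true literal is an assumption, since the hunk does not show the remaining branch.

```scala
// Hypothetical, simplified model of the pruning decision; not Spark code.
sealed trait PruningFilter
case object ReuseBroadcast extends PruningFilter // reuse the join's broadcast exchange
case object SubqueryFilter extends PruningFilter // evaluate the build side as a separate subquery
case object TrueLiteral extends PruningFilter    // no pruning: the filter is always true

final case class BuildSide(estimatedSizeInBytes: Long)

object PruningDecision {
  // Assumed stand-in for canBroadcastBySize: the build side must have a
  // known, non-negative size estimate that fits under the threshold.
  def fitsBroadcastThreshold(build: BuildSide, thresholdBytes: Long): Boolean =
    build.estimatedSizeInBytes >= 0 && build.estimatedSizeInBytes <= thresholdBytes

  def choose(
      canReuseExchange: Boolean,
      reuseBroadcastOnly: Boolean,
      build: BuildSide,
      thresholdBytes: Long): PruningFilter =
    if (canReuseExchange) ReuseBroadcast
    else if (reuseBroadcastOnly) TrueLiteral
    // New guard: only plan the subquery when the build side is small.
    else if (fitsBroadcastThreshold(build, thresholdBytes)) SubqueryFilter
    // Assumed fall-back: skip pruning rather than run a large subquery.
    else TrueLiteral
}
```

With a 10 MB threshold, `PruningDecision.choose(false, false, BuildSide(1024L), 10L * 1024 * 1024)` selects `SubqueryFilter`, while the same call with a 1 TB build side selects `TrueLiteral`, so a large build side no longer produces a pruning subquery in this model.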

SparkQA commented Apr 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41662/

SparkQA commented Apr 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41662/

@github-actions github-actions bot added the SQL label Apr 8, 2021

SparkQA commented Apr 8, 2021

Test build #137084 has finished for PR 32097 at commit 7c47937.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum closed this May 20, 2021