Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-34086][SQL][3.1] RaiseError generates too much code and may fails codegen in length check for char varchar #31168

Closed
wants to merge 2 commits into from

Conversation

yaooqinn
Copy link
Member

A backport for #31150 to branch 3.1

What changes were proposed in this pull request?

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133928/testReport/org.apache.spark.sql.execution/LogicalPlanTagInSparkPlanSuite/q41/

We can reduce more than 8000 bytes by removing the unnecessary CONCAT expression.

W/ this fix, for q41 in TPCDS with Using TPCDS original definitions for char/varchar columns applied, we can reduce the stage code-gen size from 22523 to 14369

14369  - 22523 = - 8154

Why are the changes needed?

fix the perf regression(we need other improvements for q41 works), there will be a huge performance regression if codegen fails

Does this PR introduce any user-facing change?

no

How was this patch tested?

modified uts

…odegen in length check for char varchar

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133928/testReport/org.apache.spark.sql.execution/LogicalPlanTagInSparkPlanSuite/q41/

We can reduce more than 8000 bytes by removing the unnecessary CONCAT expression.

W/ this fix, for q41 in TPCDS with [Using TPCDS original definitions for char/varchar columns](#31012) applied, we can reduce the stage code-gen size from 22523 to 14369
```
14369  - 22523 = - 8154
```

fix the perf regression(we need other improvements for q41 works), there will be a huge performance regression if codegen fails

no

modified uts

Closes #31150 from yaooqinn/SPARK-34086.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@github-actions github-actions bot added the SQL label Jan 13, 2021
@SparkQA
Copy link

SparkQA commented Jan 13, 2021

Test build #134010 has finished for PR 31168 at commit daf3781.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Copy link
Member Author

cc @cloud-fan thanks

@cloud-fan
Copy link
Contributor

thanks, merging to 3.1!

cloud-fan pushed a commit that referenced this pull request Jan 14, 2021
…ils codegen in length check for char varchar

A backport for #31150 to branch 3.1

### What changes were proposed in this pull request?

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133928/testReport/org.apache.spark.sql.execution/LogicalPlanTagInSparkPlanSuite/q41/

We can reduce more than 8000 bytes by removing the unnecessary CONCAT expression.

W/ this fix, for q41 in TPCDS with [Using TPCDS original definitions for char/varchar columns](#31012) applied, we can reduce the stage code-gen size from 22523 to 14369
```
14369  - 22523 = - 8154
```

### Why are the changes needed?

fix the perf regression(we need other improvements for q41 works), there will be a huge performance regression if codegen fails

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

modified uts

Closes #31168 from yaooqinn/SPARK-34086-31.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@cloud-fan cloud-fan closed this Jan 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants