[SPARK-34086][SQL][3.1] RaiseError generates too much code and may fails codegen in length check for char varchar #31168

yaooqinn · 2021-01-13T10:40:25Z

A backport for #31150 to branch 3.1

What changes were proposed in this pull request?

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133928/testReport/org.apache.spark.sql.execution/LogicalPlanTagInSparkPlanSuite/q41/

Removing the unnecessary CONCAT expression reduces the generated code by more than 8000 bytes.

With this fix, for q41 in TPCDS with "Using TPCDS original definitions for char/varchar columns" (#31012) applied, the stage code-gen size drops from 22523 to 14369:

14369 - 22523 = -8154
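As a quick check of the figures above, here is a minimal sketch that recomputes the saving and expresses it relative to the original size. The two sizes are the ones reported in this PR; the variable names are only illustrative, and the unit is assumed to be bytes as in the sentence about the CONCAT expression.

```python
# Stage code-gen sizes for TPCDS q41 as reported in this PR
# (assumed to be in bytes; names are illustrative only).
size_before = 22523
size_after = 14369

# Absolute saving and its share of the original size.
saved = size_before - size_after
relative = saved / size_before

print(f"saved: {saved}")
print(f"relative reduction: {relative:.1%}")
```

The relative figure is only a convenience for comparing against other queries; the PR itself reports the absolute difference.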

Why are the changes needed?

Fix the perf regression (other improvements are still needed for q41 to work). There will be a huge performance regression if codegen fails.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Modified unit tests.

Closes #31150 from yaooqinn/SPARK-34086.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

SparkQA · 2021-01-13T15:10:02Z

Test build #134010 has finished for PR 31168 at commit daf3781.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

yaooqinn · 2021-01-13T15:15:49Z

cc @cloud-fan thanks

cloud-fan · 2021-01-14T03:51:44Z

thanks, merging to 3.1!

Closes #31168 from yaooqinn/SPARK-34086-31.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

yaooqinn added 2 commits January 13, 2021 18:03

[SPARK-34086][SQL] RaiseError generates too much code and may fails c…

daf3781

…odegen in length check for char varchar

github-actions bot added the SQL label Jan 13, 2021

cloud-fan approved these changes Jan 14, 2021

cloud-fan closed this Jan 14, 2021
