[SPARK-34086][SQL][3.1] RaiseError generates too much code and may fail codegen in length check for char varchar #31168
A backport for #31150 to branch 3.1
What changes were proposed in this pull request?
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133928/testReport/org.apache.spark.sql.execution/LogicalPlanTagInSparkPlanSuite/q41/
We can reduce more than 8000 bytes by removing the unnecessary CONCAT expression.
With this fix, for q41 in TPCDS with "Using TPCDS original definitions for char/varchar columns" applied, the stage codegen size drops from 22523 to 14369.
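To illustrate why dropping the CONCAT shrinks the generated code, here is a toy sketch in Python. It is not Spark's codegen: the `Expr` classes, the `gen_code` method, the emitted pseudo-code, and the error message wording are all hypothetical and only model the idea that every extra expression node in the error message adds generated text.

```python
from dataclasses import dataclass
from typing import Sequence


class Expr:
    """Hypothetical expression node that can emit a code fragment."""

    def gen_code(self) -> str:
        raise NotImplementedError


@dataclass
class Column(Expr):
    name: str

    def gen_code(self) -> str:
        return f"row.get({self.name!r})"


@dataclass
class Literal(Expr):
    value: str

    def gen_code(self) -> str:
        return repr(self.value)


@dataclass
class Length(Expr):
    child: Expr

    def gen_code(self) -> str:
        return f"len({self.child.gen_code()})"


@dataclass
class ToStr(Expr):
    child: Expr

    def gen_code(self) -> str:
        return f"str({self.child.gen_code()})"


@dataclass
class Concat(Expr):
    children: Sequence[Expr]

    def gen_code(self) -> str:
        # Every child contributes its own fragment to the output.
        return "concat(" + ", ".join(c.gen_code() for c in self.children) + ")"


@dataclass
class RaiseError(Expr):
    message: Expr

    def gen_code(self) -> str:
        return f"raise_error({self.message.gen_code()})"


# Message built at run time from three pieces (illustrative wording).
before = RaiseError(Concat([
    Literal("input string of length "),
    ToStr(Length(Column("c"))),
    Literal(" exceeds char type length limitation: 5"),
]))

# Message known at planning time, so a single constant suffices.
after = RaiseError(Literal("Exceeds char type length limitation: 5"))

before_code = before.gen_code()
after_code = after.gen_code()
```

In this toy model `len(after_code)` is smaller than `len(before_code)` because the constant message needs no per-row length computation, cast, or concatenation. The trade-off is that the error no longer reports the actual input length.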
Why are the changes needed?
To fix a performance regression: if codegen fails, performance degrades severely. Note that q41 still needs other improvements before it works.
Does this PR introduce any user-facing change?
no
How was this patch tested?
Modified existing unit tests.