Skip to content

[SPARK-33619][SQL] Fix GetMapValueUtil code generation error#30560

Closed
leanken-zz wants to merge 4 commits intoapache:masterfrom
leanken-zz:leanken-SPARK-33619
Closed

[SPARK-33619][SQL] Fix GetMapValueUtil code generation error#30560
leanken-zz wants to merge 4 commits intoapache:masterfrom
leanken-zz:leanken-SPARK-33619

Conversation

@leanken-zz
Copy link
Contributor

@leanken-zz leanken-zz commented Dec 1, 2020

What changes were proposed in this pull request?

Code Gen bug fix that introduced by SPARK-33460

GetMapValueUtil

s"""throw new NoSuchElementException("Key " + $eval2 + " does not exist.");"""

SHOULD BE

s"""throw new java.util.NoSuchElementException("Key " + $eval2 + " does not exist.");"""

And the reason why SPARK-33460 failed to detect this bug via UT, it was because that checkExceptionInExpression did not work as expect like checkEvaluation which will try eval expression with BOTH CODEGEN_ONLY and NO_CODEGEN mode, and in this PR, will also fix this Test bug, too.

Why are the changes needed?

Bug Fix.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Add UT and Existing UT.

@github-actions github-actions bot added the SQL label Dec 1, 2020
@leanken-zz
Copy link
Contributor Author

Why were

  1. checkExceptionInExpression[Exception](expr, errMsg)
  2. sql/testOnly org.apache.spark.sql.SQLQueryTestSuite – -z ansi/map.sql
    in SPARK-33460 can't detect this Bug

It's because 
1. checkExceptionInExpression is some how with error, too, but i don't know of the history. It should wrap with 
withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key like CheckEvalulation, AND WE SHOULD FIX this later.
2. SQLQueryTestSuite ansi/map.sql failed to detect because of the Constant Folding rules, it is calling eval instead of code gen  in this case

@leanken-zz
Copy link
Contributor Author

@cloud-fan @HyukjinKwon FYI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leanken, can we add a test?

Copy link
Contributor Author

@leanken-zz leanken-zz Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leanken, can we add a test?

Currently, this case being missed in SPARK-33460 it is because checkExceptionInExpression is not as expected. And as i investigated, there are some cases would fail if I correct the behavior of checkExceptionInExpression by using withSQLConf set CodeGenFallBack to false, like ObjectExpressionsSuite and RegexpExpressionsSuite and etc.

image

So I need your advise. Should I

  1. Fix all these wrong behavior separately and then finally fix checkExceptionInExpression.
  2. Fix all these wrong behavior and checkExceptionInExpression with the same patch.

The reason why i submit this PR it's because this bug introduced by me and I know of the history, but all those other behavior is fresh to me, and the all-in-one fix might be big.

As for add a test to the java.util.NoSuchElementException, I think the offline manual verification is enough, If we fix checkExceptionInExpression, and make it reliable, then we don't need to check code gen behavior for every new added or updated operation, right?

Copy link
Contributor Author

@leanken-zz leanken-zz Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leanken, can we add a test?

maybe you can try reproduce ObjectExpressionsSuite and RegexpExpressionsSuite in your environment so that you can get clearer picture.

Copy link
Member

@maropu maropu Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And as i investigated, there are some cases would fail if I correct the behavior of checkExceptionInExpression by using withSQLConf set CodeGenFallBack to false, like ObjectExpressionsSuite and RegexpExpressionsSuite and etc.

How about making a PR to fix checkExceptionInExpression first? I'm currently not sure how many failures caused by it exist. If there are too many failures, we can separate the work to fix them then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for @maropu 's advice.

@HyukjinKwon HyukjinKwon changed the title [SPARK-33619] GetMapValueUtil code generation error. [SPARK-33619][SQL] GetMapValueUtil code generation error. Dec 1, 2020
…RK-33460].

Fix code style of [SPARK-33572].

Change-Id: I7744f8bf35e1a10afa7c155b917835a1facdd1ce
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-33619][SQL] GetMapValueUtil code generation error. [SPARK-33619][SQL] Fix GetMapValueUtil code generation error Dec 1, 2020
@leanken-zz
Copy link
Contributor Author

let's see how many error occurs after I change the checkExceptionInExpression

@maropu @HyukjinKwon @dongjoon-hyun @cloud-fan

@SparkQA
Copy link

SparkQA commented Dec 2, 2020

Test build #132019 has finished for PR 30560 at commit 8e4ca3d.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

Change-Id: Icdd9c60778b9fa155e4b2cb62e2e6d841a43c174
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should remove the entire function, but here we run an entire test to see if there are regressions.

@SparkQA
Copy link

SparkQA commented Dec 2, 2020

Test build #132015 has finished for PR 30560 at commit 6d4fb1d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class MakeDate(

@maropu
Copy link
Member

maropu commented Dec 2, 2020

 org.apache.spark.sql.catalyst.expressions.DateExpressionsSuite.Consistent error handling for datetime formatting and parsing functions	16 ms	1
 org.apache.spark.sql.catalyst.expressions.ObjectExpressionsSuite.SPARK-23594 GetExternalRowField should support interpreted execution	49 ms	1
 org.apache.spark.sql.catalyst.expressions.ObjectExpressionsSuite.SPARK-23595 ValidateExternalType should support interpreted execution	0.64 sec	1
 org.apache.spark.sql.catalyst.expressions.RegexpExpressionsSuite.RegexExtract	0.25 sec	1
 org.apache.spark.sql.catalyst.expressions.RegexpExpressionsSuite.RegexExtractAll

It seems the five errors? If so, IMO its okay to fix all of them at once. Anyway, nice catch!

@leanken-zz
Copy link
Contributor Author

leanken-zz commented Dec 2, 2020

 org.apache.spark.sql.catalyst.expressions.DateExpressionsSuite.Consistent error handling for datetime formatting and parsing functions	16 ms	1
 org.apache.spark.sql.catalyst.expressions.ObjectExpressionsSuite.SPARK-23594 GetExternalRowField should support interpreted execution	49 ms	1
 org.apache.spark.sql.catalyst.expressions.ObjectExpressionsSuite.SPARK-23595 ValidateExternalType should support interpreted execution	0.64 sec	1
 org.apache.spark.sql.catalyst.expressions.RegexpExpressionsSuite.RegexExtract	0.25 sec	1
 org.apache.spark.sql.catalyst.expressions.RegexpExpressionsSuite.RegexExtractAll

It seems the five errors? If so, IMO its okay to fix all of them at once. Anyway, nice catch!

it should be zero error after the final patch 16940ce. please ignore the former test result ^_^

Change-Id: I035eca72ebdc6d79676fe14cc79ac4ac2eb4430c
@SparkQA
Copy link

SparkQA commented Dec 2, 2020

Test build #132022 has finished for PR 30560 at commit 16940ce.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For simplicity, can we always test it with 2 modes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be OK. since setting mode does not affect evaluateWithoutCodegen

@maropu
Copy link
Member

maropu commented Dec 2, 2020

Could you update the title and the description, too?

@leanken-zz
Copy link
Contributor Author

and

OK

Change-Id: Ifcd5d186c16879159b84223dbc79fcf0ec628c81
@leanken-zz
Copy link
Contributor Author

Could you update the title and the description, too?

update desc and add another UT in ExpressionEvalHelperSuite to make sure the checkExceptionInExpression change work as expect.

@SparkQA
Copy link

SparkQA commented Dec 2, 2020

Test build #132025 has finished for PR 30560 at commit 6570d70.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

retest this please

@cloud-fan cloud-fan closed this Dec 2, 2020
@cloud-fan cloud-fan reopened this Dec 2, 2020
@SparkQA
Copy link

SparkQA commented Dec 2, 2020

Test build #132040 has finished for PR 30560 at commit 95bdd37.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class BadCodegenAndEvalExpression() extends LeafExpression

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 58583f7 Dec 2, 2020
@SparkQA
Copy link

SparkQA commented Dec 2, 2020

Test build #132047 has finished for PR 30560 at commit 95bdd37.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class BadCodegenAndEvalExpression() extends LeafExpression

@leanken-zz
Copy link
Contributor Author

retest this please

@dongjoon-hyun
Copy link
Member

Hi, All.
This seems to break Scala 2.13 UT. Please see https://issues.apache.org/jira/browse/SPARK-33948?focusedCommentId=17258217&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17258217 .

@HyukjinKwon
Copy link
Member

Thank you @dongjoon-hyun and @LuciferYang for investigating and checking this!

@LuciferYang
Copy link
Contributor

@HyukjinKwon @dongjoon-hyun I will give a pr try to fix this, It seems that there is some wrong with the generated code. But I am not sure why this pr triggered the problem ....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants

Comments