-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-13242] [SQL] Fall back to interpreting complex when
expressions
#11243
Conversation
@@ -182,11 +183,16 @@ case class CaseWhen(branches: Seq[(Expression, Expression)], elseValue: Option[E | |||
|
|||
generatedCode += "}\n" * cases.size | |||
|
|||
s""" | |||
if (generatedCode.size > 64 * 1000) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
64k is too large, JIT will stop to work after 8K bytes code. 1K?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you mean 10K? That makes sense. Should we modify the corresponding check in CodegenContext.splitExpressions?
@davies Thanks for taking a look at this. I've incorporated your fixes -- though I think you meant 10K rather than 1K as the source code threshold? |
I meant 1K source code for this expression. If a Java method jave more than 8K bytes code, JIT will not compile it, we should leave enough space for other staff. |
Jenkins, OK to test. |
Ah... I figured 10K of source code would compile to at most 4K bytecode. But I can see the case for requiring it to be smaller still. I'll adjust to 1K. |
@davies sorry, I think my extra commit may have confused Jenkins? |
@davies Hi, I guess this missed the window for 1.6.1? Codegen errors of this sort on complex expressions represent a regression with respect to 1.5.2 which are preventing us from being able to upgrade. Falling back to interpreting (as in this PR) is not an ideal solution, but at least it would get our code running again. Is there anything further I can do to help with this? |
@@ -182,11 +183,16 @@ case class CaseWhen(branches: Seq[(Expression, Expression)], elseValue: Option[E | |||
|
|||
generatedCode += "}\n" * cases.size | |||
|
|||
s""" | |||
// Methods that are too long cannot be compiled, so fall back to interpreting | |||
if (generatedCode.length > 1000) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do the switch based on number of branches? for example, 10 or 20 switches.
Because whole stage codegen can't work with CodegenFallback, If we do so, we can know when we will do the fallback, we can still support CaseWhen in whole stage codegen if only have a few branches.
@joehalliwell Do you mind me to take over this one? so we can merge this quickly. |
Not at all, please do! |
## What changes were proposed in this pull request? If there are many branches in a CaseWhen expression, the generated code could go above the 64K limit for single java method, will fail to compile. This PR change it to fallback to interpret mode if there are more than 20 branches. This PR is based on apache#11243 and apache#11221, thanks to joehalliwell Closes apache#11243 Closes apache#11221 ## How was this patch tested? Add a test with 50 branches. Author: Davies Liu <davies@databricks.com> Closes apache#11592 from davies/fix_when.
This follows @davies suggestion over on #11221 but rather than hard coding a limit on the number of branches, it does a heuristic check on the generated source code size.
This isn't ideal, but will work well in practice and mirrors the approach in #7076