Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-13242] [SQL] Fall back to interpreting complex when expressions #11243

Closed
wants to merge 6 commits into from

Conversation

joehalliwell
Copy link

This follows @davies suggestion over on #11221 but rather than hard coding a limit on the number of branches, it does a heuristic check on the generated source code size.

This isn't ideal, but will work well in practice and mirrors the approach in #7076

@@ -182,11 +183,16 @@ case class CaseWhen(branches: Seq[(Expression, Expression)], elseValue: Option[E

generatedCode += "}\n" * cases.size

s"""
if (generatedCode.size > 64 * 1000) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

64k is too large, JIT will stop to work after 8K bytes code. 1K?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you mean 10K? That makes sense. Should we modify the corresponding check in CodegenContext.splitExpressions?

@joehalliwell
Copy link
Author

@davies Thanks for taking a look at this. I've incorporated your fixes -- though I think you meant 10K rather than 1K as the source code threshold?

@davies
Copy link
Contributor

davies commented Feb 17, 2016

I meant 1K source code for this expression. If a Java method jave more than 8K bytes code, JIT will not compile it, we should leave enough space for other staff.

@davies
Copy link
Contributor

davies commented Feb 17, 2016

Jenkins, OK to test.

@joehalliwell
Copy link
Author

Ah... I figured 10K of source code would compile to at most 4K bytecode. But I can see the case for requiring it to be smaller still. I'll adjust to 1K.

@joehalliwell
Copy link
Author

@davies sorry, I think my extra commit may have confused Jenkins?

@joehalliwell
Copy link
Author

@davies Hi, I guess this missed the window for 1.6.1? Codegen errors of this sort on complex expressions represent a regression with respect to 1.5.2 which are preventing us from being able to upgrade. Falling back to interpreting (as in this PR) is not an ideal solution, but at least it would get our code running again. Is there anything further I can do to help with this?

@@ -182,11 +183,16 @@ case class CaseWhen(branches: Seq[(Expression, Expression)], elseValue: Option[E

generatedCode += "}\n" * cases.size

s"""
// Methods that are too long cannot be compiled, so fall back to interpreting
if (generatedCode.length > 1000) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do the switch based on number of branches? for example, 10 or 20 switches.

Because whole stage codegen can't work with CodegenFallback, If we do so, we can know when we will do the fallback, we can still support CaseWhen in whole stage codegen if only have a few branches.

@davies
Copy link
Contributor

davies commented Mar 8, 2016

@joehalliwell Do you mind me to take over this one? so we can merge this quickly.

@joehalliwell
Copy link
Author

Not at all, please do!

@asfgit asfgit closed this in 9634e17 Mar 9, 2016
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
## What changes were proposed in this pull request?

If there are many branches in a CaseWhen expression, the generated code could go above the 64K limit for single java method, will fail to compile. This PR change it to fallback to interpret mode if there are more than 20 branches.

This PR is based on apache#11243 and apache#11221, thanks to joehalliwell

Closes apache#11243
Closes apache#11221

## How was this patch tested?

Add a test with 50 branches.

Author: Davies Liu <davies@databricks.com>

Closes apache#11592 from davies/fix_when.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants