
[SPARK-21871][SQL] Check actual bytecode size when compiling generated code #19083

Closed
wants to merge 14 commits

Conversation

maropu
Member

@maropu maropu commented Aug 30, 2017

What changes were proposed in this pull request?

This PR adds code to check the actual bytecode size when compiling generated code. In #18810, we added code to give up code compilation and use interpreter execution in SparkPlan if the line count of generated functions goes over maxLinesPerFunction. But we already have code to collect metrics for compiled bytecode size in the CodeGenerator object, so we can easily reuse that code for this purpose.

How was this patch tested?

Added tests in WholeStageCodegenSuite.
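The check described above amounts to comparing the largest compiled method against HotSpot's default HugeMethodLimit. A minimal sketch of that logic in plain Java (class and method names here are illustrative, not Spark's actual identifiers):

```java
import java.util.Map;

// Sketch: collect per-method bytecode sizes after janino compilation and
// fall back from codegen when the largest method exceeds the HotSpot
// -XX:HugeMethodLimit default (8000 bytes), above which HotSpot's JIT
// will not compile a method unless -XX:-DontCompileHugeMethods is set.
public class BytecodeSizeCheck {
    static final int HUGE_METHOD_LIMIT = 8000;

    static int maxMethodSize(Map<String, Integer> methodBytecodeSizes) {
        return methodBytecodeSizes.values().stream()
            .mapToInt(Integer::intValue).max().orElse(0);
    }

    static boolean shouldFallBack(Map<String, Integer> methodBytecodeSizes) {
        return maxMethodSize(methodBytecodeSizes) > HUGE_METHOD_LIMIT;
    }

    public static void main(String[] args) {
        // e.g. a 9182-byte aggregate method trips the limit
        Map<String, Integer> sizes =
            Map.of("agg_doAggregateWithKeys", 9182, "processNext", 120);
        System.out.println(shouldFallBack(sizes));
    }
}
```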

@@ -30,8 +30,15 @@ SELECT a + 2, COUNT(b) FROM testData GROUP BY a + 1;
SELECT a + 1 + 1, COUNT(b) FROM testData GROUP BY a + 1;

-- Aggregate with nulls.
--
-- In SPARK-21871, we added code to check the bytecode size of gen'd methods. If the size
Member Author

If the issue in #19082 is fixed, we might remove this workaround. Or, we might use the new flag discussed in #19062 here.

@SparkQA

SparkQA commented Aug 30, 2017

Test build #81241 has finished for PR 19083 at commit 65eb028.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 30, 2017

Test build #81245 has finished for PR 19083 at commit 6a61393.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@rednaxelafx rednaxelafx left a comment

I like this one that checks the exact bytecode size instead of source-code-string-size-based heuristics. Some inline comments:

// Error: VM option 'HugeMethodLimit' is develop and is available only in debug version of VM.
// Error: Could not create the Java Virtual Machine.
// Error: A fatal exception has occurred. Program will exit.
private val hugeMethodLimit = 8000
Contributor

Actually this threshold is only meaningful on the HotSpot VM and some HotSpot-derived JVMs. Other JVMs, for instance IBM J9, don't use the same threshold. It'd be a bit too strict and biased to make this a non-configurable behavior for all JVMs.

I'd suggest that if we are to do this, at least centralize the JVM detection logic somewhere (e.g. unify with the JVM detection logic in org.apache.spark.util.SizeEstimator) and only set this kind of threshold based on the detected JVM. That way we're much less likely to regress on other JVMs, and/or even on future versions of the HotSpot VM where it'll get a new JIT compiler (Graal) with new JIT compilation heuristics that could be different from the current version.
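A hedged sketch of what such centralized detection could look like (the `java.vm.name` string checks mirror the style of the IBM-JVM detection in org.apache.spark.util.SizeEstimator; the class and constants here are hypothetical):

```java
// Hypothetical helper: decide whether the 8000-byte HugeMethodLimit is
// meaningful by looking at the VM name, instead of hard-coding it for all JVMs.
public class JvmDetect {
    static boolean isHotSpot(String vmName) {
        return vmName.contains("HotSpot");
    }

    static boolean isIbmJ9(String vmName) {
        return vmName.contains("IBM") || vmName.contains("J9");
    }

    // Returns the known JIT method-size limit for the VM, or -1 if unknown
    // (e.g. J9 has -Xjit:acceptHugeMethods but no published threshold).
    static int hugeMethodLimit(String vmName) {
        return isHotSpot(vmName) ? 8000 : -1;
    }

    public static void main(String[] args) {
        String vm = System.getProperty("java.vm.name", "");
        System.out.println(vm + " -> limit " + hugeMethodLimit(vm));
    }
}
```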

Member Author

@maropu maropu Aug 30, 2017

Thanks for your comment! yea, I agree. I'll rethink this pr based on your suggestion.

Member Author

@maropu maropu Aug 30, 2017

BTW, does IBM J9 have the same threshold for too-long functions? (I'm not familiar with the IBM JVM...)

Contributor

Let's ask @kiszk for that ^_^
With OpenJ9 coming out soon, we might be able to find out details on this kind of issue ourselves. But until then it's easier to just ask IBM folks for a reliable answer ;-)

Member Author

oh, you know @kiszk, haha.

Member

@kiszk kiszk Aug 31, 2017

While IBM JDK has a similar option, -Xjit:acceptHugeMethods, it unfortunately does not disclose its threshold.

My suggestion is to make this threshold configurable by using SQLConf. Is it possible to do this?

Member Author

I see. I feel detecting specific JVM implementations might go too far for this purpose. Yea, one option is to add an internal flag for this. WDYT? @rednaxelafx

Contributor

Sure, I'm fine with having a SQLConf conf flag for the threshold for now, with the option to move to a centralized JVM detection implementation later where we can still have a SQLConf flag to override the heuristic based on detection.
Setting such a threshold to 64KB will effectively turn the restriction off (i.e. same behavior as -XX:-DontCompileHugeMethods), so having a simple int conf should be good enough.
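The 64KB observation holds because the class-file format itself caps a single method's bytecode. A small sketch of why such a threshold is a no-op (hypothetical names):

```java
// A method's Code attribute is capped at 65535 bytes by the JVM class-file
// format, so a hugeMethodLimit at or above that value can never be exceeded
// by compilable code: the check is effectively disabled, matching the
// behavior of -XX:-DontCompileHugeMethods.
public class ThresholdOff {
    static final int JVM_MAX_METHOD_BYTECODE = 65535;

    static boolean checkEffectivelyOff(int hugeMethodLimit) {
        return hugeMethodLimit >= JVM_MAX_METHOD_BYTECODE;
    }

    public static void main(String[] args) {
        System.out.println(checkEffectivelyOff(65536)); // same as turning it off
        System.out.println(checkEffectivelyOff(8000));  // HotSpot default still enforced
    }
}
```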

Member Author

yea, ok! Thanks!

@@ -1091,6 +1111,7 @@ object CodeGenerator extends Logging {
}

// Then walk the classes to get at the method bytecode.
val methodsToByteCodeSize = mutable.ArrayBuffer[(String, Int)]()
Contributor

Since we know the number of methods right below (for each generated class, cf.methodInfos would give us the number of methods in that class, of which we expect almost all of them to have a Code attribute), making pre-sized arrays for each class and then concatenating those arrays into a Seq as the result would avoid potential resizing wastes from using a big default-sized ArrayBuffer.
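A sketch of that suggestion in plain Java collections (ClassFile and MethodInfo are stand-ins for janino's internal classes; the real field names may differ):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MethodSizes {
    // Stand-ins for janino's class-file model; codeSize is null when a
    // method has no Code attribute (e.g. abstract methods).
    record MethodInfo(String name, Integer codeSize) {}
    record ClassFile(List<MethodInfo> methodInfos) {}

    static List<Map.Entry<String, Integer>> collect(List<ClassFile> classes) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (ClassFile cf : classes) {
            // Pre-size per class: methodInfos gives the count up front,
            // so nothing is wasted on resizing a big default-sized buffer.
            List<Map.Entry<String, Integer>> perClass =
                new ArrayList<>(cf.methodInfos().size());
            for (MethodInfo m : cf.methodInfos()) {
                if (m.codeSize() != null) {
                    perClass.add(new SimpleEntry<>(m.name(), m.codeSize()));
                }
            }
            out.addAll(perClass);
        }
        return out;
    }

    public static void main(String[] args) {
        List<ClassFile> classes = List.of(new ClassFile(List.of(
            new MethodInfo("processNext", 120), new MethodInfo("abstractM", null))));
        System.out.println(collect(classes));
    }
}
```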

val errMsg = intercept[IllegalArgumentException] {
CodeGenerator.compile(code)
}.getMessage
assert(errMsg.contains("the size of GeneratedClass.agg_doAggregateWithKeys is 9182 and " +
Contributor

The hard-coded method name and size is a bit too strict. Can we relax that a little bit so that it's more resilient to the naming and the size of the generated classes/methods?

@@ -1079,7 +1099,7 @@ object CodeGenerator extends Logging {
/**
* Records the generated class and method bytecode sizes by inspecting janino private fields.
Member

Better to update the method comment if we change the method's purpose.

Member Author

oh, yea! Thanks! I'll update

throw new IllegalArgumentException(
s"the size of $clazzName.$methodName is $byteCodeSize and this value goes over " +
s"the HugeMethodLimit $hugeMethodLimit (JVM doesn't compile methods " +
"larger than this limit)")
Member

We designed it to show exception messages (see the handling of JaninoRuntimeException below) before falling back to non-whole-stage codegen execution. I think we should follow the same behavior.

This exception will be hidden from the user for now in WholeStageCodegenExec.

Member Author

ok

@maropu maropu force-pushed the SPARK-21871 branch 3 times, most recently from 6100734 to 9c58237 Compare September 1, 2017 08:53
@SparkQA

SparkQA commented Sep 1, 2017

Test build #81309 has finished for PR 19083 at commit 73090e8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 1, 2017

Test build #81313 has finished for PR 19083 at commit 6100734.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class CodeGenerationSuite extends PlanTest with ExpressionEvalHelper
  • class OrderingSuite extends PlanTest with ExpressionEvalHelper
  • class GeneratedProjectionSuite extends PlanTest

@SparkQA

SparkQA commented Sep 1, 2017

Test build #81312 has finished for PR 19083 at commit 78af6f4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class CodeGenerationSuite extends PlanTest with ExpressionEvalHelper
  • class OrderingSuite extends PlanTest with ExpressionEvalHelper
  • class GeneratedProjectionSuite extends PlanTest

@SparkQA

SparkQA commented Sep 1, 2017

Test build #81314 has finished for PR 19083 at commit 9c58237.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 2, 2017

Test build #81327 has finished for PR 19083 at commit d6add58.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Sep 2, 2017

@viirya @rednaxelafx @kiszk Could you check again?

strExpr = Decode(Encode(strExpr, "utf-8"), "utf-8")
}
// Set the max value at `WHOLESTAGE_HUGE_METHOD_LIMIT` to compile gen'd code by janino
withSQLConf(SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key -> Int.MaxValue.toString) {
Member

@kiszk kiszk Sep 6, 2017

Why do we need to change the value of WHOLESTAGE_HUGE_METHOD_LIMIT when this test is not for whole-stage codegen? This parameter does not seem to be related to whole-stage codegen.
Could we choose a better name for SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT?

Member Author

yea, you're right. I think CODEGEN_HUGE_METHOD_LIMIT is better?

@SparkQA

SparkQA commented Sep 7, 2017

Test build #81498 has finished for PR 19083 at commit 78653de.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Sep 7, 2017

retest this please

@SparkQA

SparkQA commented Sep 7, 2017

Test build #81507 has finished for PR 19083 at commit 78653de.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Sep 30, 2017

@gatorsmile if you get time, could you check this? Thanks.

logError(msg)
val maxLines = SQLConf.get.loggingMaxLinesForCodegen
logInfo(s"\n${CodeFormatter.format(code, maxLines)}")
throw new IllegalArgumentException(msg)
Member

Is this the best kind of exception?

Member Author

CompileException is better?

@kiszk
Member

kiszk commented Sep 30, 2017

LGTM except one comment

@SparkQA

SparkQA commented Sep 30, 2017

Test build #82348 has finished for PR 19083 at commit 76d5cb2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 30, 2017

Test build #82353 has finished for PR 19083 at commit 87140fb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


val CODEGEN_HUGE_METHOD_LIMIT = buildConf("spark.sql.codegen.hugeMethodLimit")
.internal()
.doc("The bytecode size of a single compiled Java function generated by whole-stage codegen." +
Member

The bytecode -> The maximum bytecode

.internal()
.doc("The bytecode size of a single compiled Java function generated by whole-stage codegen." +
"When the compiled function exceeds this threshold, " +
"the whole-stage codegen is deactivated for this subtree of the current query plan. " +
Member

This threshold is not for whole-stage only, right? It is misleading.

Member Author

ya, you're right. I'll brush up.

Member Author

updated

@maropu
Member Author

maropu commented Oct 1, 2017

retest this please.

@maropu
Member Author

maropu commented Oct 1, 2017

retest this please.

@gatorsmile
Member

@maropu Thanks for working on it. LGTM except two minor comments.

cc @rednaxelafx @kiszk @viirya @cloud-fan

if (maxCodeSize > sqlContext.conf.hugeMethodLimit) {
logWarning(s"Found too long generated codes: the bytecode size was $maxCodeSize and " +
s"this value went over the limit ${sqlContext.conf.hugeMethodLimit}. To avoid this, " +
s"you can the limit ${SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key} higher:\n$treeString")
Member

@viirya viirya Oct 4, 2017

you can set the limit ... higher...


// Check if compiled code has a too large function
if (maxCodeSize > sqlContext.conf.hugeMethodLimit) {
logWarning(s"Found too long generated codes: the bytecode size was $maxCodeSize and " +
Member

We'd better explain why too-long code is a problem, as before: Found too long generated codes and JIT optimization might not work: ....

// Check if compiled code has a too large function
if (maxCodeSize > sqlContext.conf.hugeMethodLimit) {
logWarning(s"Found too long generated codes: the bytecode size was $maxCodeSize and " +
s"this value went over the limit ${sqlContext.conf.hugeMethodLimit}. To avoid this, " +
Member

this value went over.... Whole-stage codegen disabled for this plan. To avoid this ....

@@ -151,7 +151,7 @@ class WholeStageCodegenSuite extends SparkPlanTest with SharedSQLContext {
}
}

def genGroupByCodeGenContext(caseNum: Int): CodegenContext = {
def genGroupByCodeGenContext(caseNum: Int): (CodegenContext, CodeAndComment) = {
Member

The returned CodegenContext is not used. We don't need to return it.

@viirya
Member

viirya commented Oct 4, 2017

A few minor comments, otherwise LGTM.

@maropu
Member Author

maropu commented Oct 4, 2017

Thanks, I'll update soon

@SparkQA

SparkQA commented Oct 4, 2017

Test build #82435 has finished for PR 19083 at commit dfde49b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Oct 4, 2017

fixed @gatorsmile


max function bytecode size: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
hugeMethodLimit = 8000 1043 / 1159 0.6 1591.5 1.0X
Member

The original codegen = F case is removed? I think it is reasonable to compare with it.

Member Author

yea, you're right; I'll update and thanks

import java.util.concurrent.ExecutionException

import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.codegen.{CodeAndComment, CodegenContext, CodeGenerator}
Member

We don't use CodegenContext anymore and can remove it.

@@ -151,7 +151,7 @@ class WholeStageCodegenSuite extends SparkPlanTest with SharedSQLContext {
}
}

def genGroupByCodeGenContext(caseNum: Int): CodegenContext = {
def genGroupByCodeGenContext(caseNum: Int): CodeAndComment = {
Member

genGroupByCodeGenContext -> genGroupByCode.

@SparkQA

SparkQA commented Oct 4, 2017

Test build #82437 has finished for PR 19083 at commit fca22b7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

if (maxCodeSize > sqlContext.conf.hugeMethodLimit) {
logWarning(s"Found too long generated codes and JIT optimization might not work: " +
s"the bytecode size was $maxCodeSize, this value went over the limit " +
s"${sqlContext.conf.hugeMethodLimit}, and the whole-stage codegen was disable " +
Member

disable -> disabled

logWarning(s"Found too long generated codes and JIT optimization might not work: " +
s"the bytecode size was $maxCodeSize, this value went over the limit " +
s"${sqlContext.conf.hugeMethodLimit}, and the whole-stage codegen was disable " +
s"for this plan. To avoid this, you can set the limit " +
Member

set -> raise; then remove higher

@SparkQA

SparkQA commented Oct 4, 2017

Test build #82442 has finished for PR 19083 at commit 433f13b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 4, 2017

Test build #82443 has finished for PR 19083 at commit 09ae105.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Oct 4, 2017

retest this please.

@SparkQA

SparkQA commented Oct 4, 2017

Test build #82445 has finished for PR 19083 at commit 09ae105.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 4, 2017

Test build #82446 has finished for PR 19083 at commit 09ae105.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

Thanks! Merged to master.

@asfgit asfgit closed this in 4a779bd Oct 4, 2017
rdblue pushed a commit to rdblue/spark that referenced this pull request Apr 3, 2019
…d code

This pr added code to check actual bytecode size when compiling generated code. In apache#18810, we added code to give up code compilation and use interpreter execution in `SparkPlan` if the line number of generated functions goes over `maxLinesPerFunction`. But, we already have code to collect metrics for compiled bytecode size in the `CodeGenerator` object. So, we could easily reuse the code for this purpose.

Added tests in `WholeStageCodegenSuite`.

Author: Takeshi Yamamuro <yamamuro@apache.org>

Closes apache#19083 from maropu/SPARK-21871.