[SPARK-21871][SQL] Check actual bytecode size when compiling generated code #19083
Conversation
@@ -30,8 +30,15 @@ SELECT a + 2, COUNT(b) FROM testData GROUP BY a + 1;
SELECT a + 1 + 1, COUNT(b) FROM testData GROUP BY a + 1;

-- Aggregate with nulls.
--
-- In SPARK-21871, we added code to check the bytecode size of gen'd methods. If the size
Test build #81241 has finished for PR 19083 at commit
Test build #81245 has finished for PR 19083 at commit
I like this approach of checking the exact bytecode size instead of the source-code-string-size-based heuristics. Some inline comments:
// Error: VM option 'HugeMethodLimit' is develop and is available only in debug version of VM.
// Error: Could not create the Java Virtual Machine.
// Error: A fatal exception has occurred. Program will exit.
private val hugeMethodLimit = 8000
Actually, this threshold is only meaningful on the HotSpot VM and some HotSpot-derived JVMs. Other JVMs, for instance IBM J9, don't use the same threshold. It'd be a bit too strict and biased to make this non-configurable behavior for all JVMs.
I'd suggest that if we are to do this, we at least centralize the JVM detection logic somewhere (e.g. unify with the JVM detection logic in org.apache.spark.util.SizeEstimator) and only set this kind of threshold based on the detected JVM. That way we're much less likely to regress on other JVMs, and/or even on future versions of the HotSpot VM, where it'll get a new JIT compiler (Graal) with JIT compilation heuristics that could differ from the current version.
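The centralized-detection idea could be sketched roughly as below. This is a hypothetical helper, not Spark's actual code; the class and method names are made up, and the VM-name substrings checked are illustrative of how detection via the `java.vm.name` system property tends to work:

```java
// Sketch (hypothetical, not Spark's actual code): pick a huge-method
// threshold based on which JVM we detect from the "java.vm.name" property.
final class HugeMethodLimits {
    // HotSpot and HotSpot-derived VMs typically report names containing
    // "HotSpot" or "OpenJDK"; J9-family VMs report "J9" or "OpenJ9".
    static boolean isHotSpotLike(String vmName) {
        return vmName.contains("HotSpot") || vmName.contains("OpenJDK");
    }

    // HotSpot refuses to JIT-compile methods larger than 8000 bytes of
    // bytecode by default (-XX:HugeMethodLimit). For VMs whose threshold
    // we don't know, fall back to the JVM spec's 64KB method-size cap,
    // which effectively disables the check.
    static int defaultLimit(String vmName) {
        return isHotSpotLike(vmName) ? 8000 : 65535;
    }
}
```

A caller would then seed the configurable threshold with `HugeMethodLimits.defaultLimit(System.getProperty("java.vm.name"))` and let a conf flag override it.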
Thanks for your comment! Yea, I agree. I'll rethink this PR based on your suggestion.
BTW, does IBM J9 have the same threshold for too-long functions? (I'm not familiar with the IBM JVM...)
oh, you know @kiszk, haha.
While the IBM JDK has a similar option, -Xjit:acceptHugeMethods, it unfortunately does not disclose its threshold.
My suggestion is to make this threshold configurable via SQLConf. Is it possible to do this?
I see. I feel detecting specific JVM implementations might go too far for this purpose. Yea, one option is to add an internal option for this. WDYT? @rednaxelafx
Sure, I'm fine with having a SQLConf flag for the threshold for now, with the option to move to a centralized JVM detection implementation later, where we could still have a SQLConf flag to override the heuristic based on detection.
Setting such a threshold to 64KB will effectively turn the restriction off (i.e. the same behavior as -XX:-DontCompileHugeMethods), so having a simple int conf should be good enough.
yea, ok! Thanks!
@@ -1091,6 +1111,7 @@ object CodeGenerator extends Logging {
}

// Then walk the classes to get at the method bytecode.
val methodsToByteCodeSize = mutable.ArrayBuffer[(String, Int)]()
Since we know the number of methods right below (for each generated class, cf.methodInfos would give us the number of methods in that class, of which we expect almost all to have a Code attribute), making pre-sized arrays for each class and then concatenating those arrays into a Seq as the result would avoid the potential resizing waste of a big default-sized ArrayBuffer.
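The pre-sizing suggestion, sketched in Java (Spark's actual code is Scala; the shape of the data here is hypothetical, with each inner list standing for one generated class's method bytecode sizes):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the reviewer's suggestion: since each class's method count is
// known up front (cf.methodInfos in the real code), the result list can be
// allocated at its exact final size instead of growing a default-sized buffer.
final class MethodSizes {
    static List<Integer> concat(List<List<Integer>> perClass) {
        int total = 0;
        for (List<Integer> methods : perClass) {
            total += methods.size(); // known before any copying happens
        }
        List<Integer> all = new ArrayList<>(total); // pre-sized: no resize waste
        for (List<Integer> methods : perClass) {
            all.addAll(methods);
        }
        return all;
    }
}
```

The win is avoiding the doubling-and-copying a dynamically grown buffer does when the final size is already computable.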
val errMsg = intercept[IllegalArgumentException] {
  CodeGenerator.compile(code)
}.getMessage
assert(errMsg.contains("the size of GeneratedClass.agg_doAggregateWithKeys is 9182 and " +
The hard-coded method name and size are a bit too strict. Can we relax them a little so the test is more resilient to the naming and size of the generated classes/methods?
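One way to relax the assertion, sketched as a hypothetical helper (the regex below matches the shape of the error message in this PR rather than an exact generated name and size):

```java
import java.util.regex.Pattern;

// Sketch: assert on the message's shape, not on the exact generated
// method name ("agg_doAggregateWithKeys") or the exact byte count (9182),
// so the test survives future changes to codegen naming and code size.
final class ErrMsgCheck {
    static final Pattern HUGE_METHOD_MSG =
        Pattern.compile("the size of \\S+ is \\d+ and this value goes over");

    static boolean looksLikeHugeMethodError(String msg) {
        return HUGE_METHOD_MSG.matcher(msg).find();
    }
}
```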
@@ -1079,7 +1099,7 @@ object CodeGenerator extends Logging {
/**
 * Records the generated class and method bytecode sizes by inspecting janino private fields.
Better to update the method comment if we change the method's purpose.
oh, yea! Thanks! I'll update
throw new IllegalArgumentException(
  s"the size of $clazzName.$methodName is $byteCodeSize and this value goes over " +
  s"the HugeMethodLimit $hugeMethodLimit (JVM doesn't compile methods " +
  "larger than this limit)")
We designed it to show exception messages (see the handling of JaninoRuntimeException below) before falling back to non-whole-stage codegen execution. I think we should follow the same behavior. This exception will be hidden from the user for now in WholeStageCodegenExec.
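The fallback behavior the reviewer describes can be sketched like this (hypothetical names; in Spark the catch lives inside the whole-stage codegen path, which reverts to interpreted execution rather than failing the query):

```java
import java.util.function.Supplier;

// Sketch of the log-and-fall-back pattern: try the compiled path; if the
// generated code is rejected (e.g. an oversized method), run the
// interpreted path instead of surfacing the exception to the user.
final class CodegenFallback {
    static <T> T compileOrFallback(Supplier<T> compiled, Supplier<T> interpreted) {
        try {
            return compiled.get(); // may throw on oversized/invalid generated code
        } catch (RuntimeException e) {
            // In the real code this is where the failure is logged before
            // falling back, mirroring the JaninoRuntimeException handling.
            return interpreted.get();
        }
    }
}
```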
ok
6100734 to 9c58237
Test build #81309 has finished for PR 19083 at commit
Test build #81313 has finished for PR 19083 at commit
Test build #81312 has finished for PR 19083 at commit
Test build #81314 has finished for PR 19083 at commit
Test build #81327 has finished for PR 19083 at commit
@viirya @rednaxelafx @kiszk Could you check again?
strExpr = Decode(Encode(strExpr, "utf-8"), "utf-8")
}
// Set the max value at `WHOLESTAGE_HUGE_METHOD_LIMIT` to compile gen'd code by janino
withSQLConf(SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key -> Int.MaxValue.toString) {
Why do we need to change the value of WHOLESTAGE_HUGE_METHOD_LIMIT while this test is not for whole-stage codegen? This parameter does not seem to be related to whole-stage codegen. Could we select a better name for SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT?
Yea, you're right. I think CODEGEN_HUGE_METHOD_LIMIT is better?
Test build #81498 has finished for PR 19083 at commit
retest this please
Test build #81507 has finished for PR 19083 at commit
@gatorsmile if you get time, could you check this? Thanks.
logError(msg)
val maxLines = SQLConf.get.loggingMaxLinesForCodegen
logInfo(s"\n${CodeFormatter.format(code, maxLines)}")
throw new IllegalArgumentException(msg)
Is this the best kind of exception?
CompileException is better?
LGTM except one comment
Test build #82348 has finished for PR 19083 at commit
Test build #82353 has finished for PR 19083 at commit
val CODEGEN_HUGE_METHOD_LIMIT = buildConf("spark.sql.codegen.hugeMethodLimit")
  .internal()
  .doc("The bytecode size of a single compiled Java function generated by whole-stage codegen." +
The bytecode -> The maximum bytecode
  .internal()
  .doc("The bytecode size of a single compiled Java function generated by whole-stage codegen." +
    "When the compiled function exceeds this threshold, " +
    "the whole-stage codegen is deactivated for this subtree of the current query plan. " +
This threshold is not for whole-stage only, right? It is misleading.
ya, you're right. I'll brush up.
updated
retest this please.
@maropu Thanks for working on it. LGTM except two minor comments.
if (maxCodeSize > sqlContext.conf.hugeMethodLimit) {
  logWarning(s"Found too long generated codes: the bytecode size was $maxCodeSize and " +
    s"this value went over the limit ${sqlContext.conf.hugeMethodLimit}. To avoid this, " +
    s"you can the limit ${SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key} higher:\n$treeString")
you can set the limit ... higher...
// Check if compiled code has a too large function
if (maxCodeSize > sqlContext.conf.hugeMethodLimit) {
  logWarning(s"Found too long generated codes: the bytecode size was $maxCodeSize and " +
We'd better explain why too-long code is a problem, as before: Found too long generated codes and JIT optimization might not work: ....
// Check if compiled code has a too large function
if (maxCodeSize > sqlContext.conf.hugeMethodLimit) {
  logWarning(s"Found too long generated codes: the bytecode size was $maxCodeSize and " +
    s"this value went over the limit ${sqlContext.conf.hugeMethodLimit}. To avoid this, " +
this value went over.... Whole-stage codegen disabled for this plan. To avoid this ...
@@ -151,7 +151,7 @@ class WholeStageCodegenSuite extends SparkPlanTest with SharedSQLContext {
}
}

def genGroupByCodeGenContext(caseNum: Int): CodegenContext = {
def genGroupByCodeGenContext(caseNum: Int): (CodegenContext, CodeAndComment) = {
The returned CodegenContext is not used. We don't need to return it.
Few minor comments, otherwise LGTM.
Thanks, I'll update soon
Test build #82435 has finished for PR 19083 at commit
fixed @gatorsmile
max function bytecode size: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------
hugeMethodLimit = 8000 1043 / 1159 0.6 1591.5 1.0X
The original codegen = F case is removed? I think it is reasonable to compare with it.
yea, you're right; I'll update and thanks
import java.util.concurrent.ExecutionException

import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.expressions.codegen.{CodeAndComment, CodegenContext, CodeGenerator}
We don't use CodegenContext anymore and can remove it.
@@ -151,7 +151,7 @@ class WholeStageCodegenSuite extends SparkPlanTest with SharedSQLContext {
}
}

def genGroupByCodeGenContext(caseNum: Int): CodegenContext = {
def genGroupByCodeGenContext(caseNum: Int): CodeAndComment = {
genGroupByCodeGenContext -> genGroupByCode
Test build #82437 has finished for PR 19083 at commit
if (maxCodeSize > sqlContext.conf.hugeMethodLimit) {
  logWarning(s"Found too long generated codes and JIT optimization might not work: " +
    s"the bytecode size was $maxCodeSize, this value went over the limit " +
    s"${sqlContext.conf.hugeMethodLimit}, and the whole-stage codegen was disable " +
disable -> disabled
logWarning(s"Found too long generated codes and JIT optimization might not work: " +
  s"the bytecode size was $maxCodeSize, this value went over the limit " +
  s"${sqlContext.conf.hugeMethodLimit}, and the whole-stage codegen was disable " +
  s"for this plan. To avoid this, you can set the limit " +
set -> raise. Then, remove higher.
Test build #82442 has finished for PR 19083 at commit
Test build #82443 has finished for PR 19083 at commit
retest this please.
Test build #82445 has finished for PR 19083 at commit
Test build #82446 has finished for PR 19083 at commit
Thanks! Merged to master.
Author: Takeshi Yamamuro <yamamuro@apache.org>
Closes apache#19083 from maropu/SPARK-21871.
What changes were proposed in this pull request?

This PR added code to check the actual bytecode size when compiling generated code. In #18810, we added code to give up code compilation and use interpreter execution in SparkPlan if the line count of generated functions goes over maxLinesPerFunction. But we already have code to collect metrics for compiled bytecode size in the CodeGenerator object, so we could easily reuse that code for this purpose.

How was this patch tested?

Added tests in WholeStageCodegenSuite.