Skip to content

Conversation

@lw-lin
Copy link
Contributor

@lw-lin lw-lin commented Jul 26, 2016

What changes were proposed in this pull request?

Ideally, we would wish codegen methods to be less than 8KB for bytecode size. Beyond 8K JIT won't compile and can cause performance degradation.

Instead of understanding the generated source code and automatically breaking large methods into smaller ones (which is also a little hard to do), it'd be better we discover large methods asap and then manually improve the source code.

This patch adds support for checking codegen method size on compile. We can specify for each method what is the expected behavior when it exceeds 8KB for bytecode size:

 /** No-op when the method exceeds 8K size. */
 case object NO_OP extends FunctionSizeHint

 /** Log a warning when the method exceeds 8K size. */
 case object WARN_IF_EXCEEDS_JIT_LIMIT extends FunctionSizeHint

 /**
  * Throw a compilation exception when the method exceeds 8K size.
  * Fail fast so that we can catch it asap; this is useful in testing corner/edge cases.
  */
 case object ERROR_IF_EXCEEDS_JIT_LIMIT extends FunctionSizeHint

This way we can test against some extreme case such as a 10000-columns-wide table, to see if the generated code is small enough to get a chance to be JITed at runtime.

Sample Usage

val codeBody =
    s"""
      public static void inc() {
        int i = 0;
        i++;
        i++;
        ... // up to 15000 i++ s here, making this inc() method exceed 8K size
      }
    """

// == prior to this patch ===================
ctx = new CodegenContext()
ctx.addNewFunction("inc", codeBody)
CodeGenerator.compile(
  new CodeAndComment(ctx.declareAddedFunctions(), emptyComments))

// == after this patch ======================
// Exception: failed to compile. Method org.apache.spark.sql.catalyst.expressions.GeneratedClass.inc
// should not exceed 8K size limit -- observed size is 45003
ctx = new CodegenContext()
ctx.addNewFunction("inc", codeBody, CodegenContext.ERROR_IF_EXCEEDS_JIT_LIMIT)
CodeGenerator.compile(
  new CodeAndComment(ctx.declareAddedFunctions(), emptyComments, ctx.getFuncToSizeHintMap))

How was this patch tested?

new unit test

@SparkQA
Copy link

SparkQA commented Jul 26, 2016

Test build #62890 has finished for PR 14370 at commit 19b6d1d.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class CodeAndComment(

@SparkQA
Copy link

SparkQA commented Jul 26, 2016

Test build #62891 has finished for PR 14370 at commit 946797d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ooq
Copy link
Contributor

ooq commented Jul 26, 2016

Thanks for the PR @lw-lin . I think this check would be useful and ideally WARN_IF_EXCEEDS_JIT_LIMIT should be on by default if the overhead is minimal.

@lw-lin
Copy link
Contributor Author

lw-lin commented Jul 26, 2016

@davies would you also take a look? Thanks!

@rxin
Copy link
Contributor

rxin commented Jul 27, 2016

Why are we enforcing this? Users can turn on compile huge methods at the VM level.

@cloud-fan
Copy link
Contributor

+1 on @rxin 's idea, I think these checks should belong to VM, the 8k JIT limit is not the case for all VMs right? and maybe configurable?

@lw-lin
Copy link
Contributor Author

lw-lin commented Jul 27, 2016

sorry for the ambiguity, but we're not enforcing this. There are 3 levels:
- when NO_OP is on (which is the fault), we do nothing at all to a huge method;
- when WARN_... is on, we'll just log a warning;
- when ERROR... is on, we fail fast.

So by default, this does not affect anything on end-user's side. We should turn on ERROR_IF_EXCEEDS_JIT_LIMIT only in test suits and only against critical methods, so that we can find performance issues asap because it fails fast.

This patch was mainly intended for unit tests; but almost all existing tests passed this JIT limit check. Thus I don't think there's much value in this, and I'm closing this.

Thanks! :)

@lw-lin lw-lin closed this Jul 27, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants