Skip to content

Conversation

@rednaxelafx
Copy link
Contributor

What changes were proposed in this pull request?

In Java, instance fields are guaranteed to be first initialized to their corresponding default values (zero values) before the constructor is invoked. Thus, explicitly code to initialize fields to their zero values is redundant and should be avoided.

It's usually harmless to have such code in hand-written constructors, but in the case of mechanically generating code, such code could contribute to a significant portion of the code size and cause issues.

This ticket is a step in reducing the likelihood of hitting the 64KB bytecode method size limit in the Java Class files. This PR uses simple heuristics to filter out redundant code of initializing mutable state to their default (zero) values in CodegenContext.addMutableState, by matching string patterns. Basically, if the initCode in the added mutable state is like:

[this.]${variableName} = ${defaultValue};

(where [this.] indicates it's optional)
Then it'll be replaced by an empty string instead.

This pattern will catch the most common cases. An example of drawn from production is:

/* 262651 */     value24709 = -1L;
/* 262652 */     isNull24710 = false;
/* 262653 */     value24710 = -1L;
/* 262654 */     isNull24718 = false;
/* 262655 */     value24718 = false;
/* 262656 */     isNull24719 = false;
/* 262657 */     value24719 = -1L;
/* 262658 */     isNull24720 = false;
/* 262659 */     value24720 = -1L;
/* 262660 */     isNull24728 = false;
/* 262661 */     value24728 = false;
/* 262662 */     isNull24729 = false;
/* 262663 */     value24729 = -1L;
/* 262664 */     isNull24730 = false;
/* 262665 */     value24730 = -1L;
/* 262666 */     isNull24738 = false;
...

where all of the isNullNNN = false; initialization code is redundant.

Alternatives are:

  • Manually go through all the places where redundant initialization code is specified in addMutableState() call sites, and change them to empty string. That's tedious and involves changing a lot of code. But if we do it this way, we could considering giving initCode an default value of "" in the declaration of addMutableState(), which could be a nice improvement too.
  • Instead of using a simple regular-expression-based heuristic, use an actual parse tree to verify the redundant code pattern. That would make it very precise and be able to catch all cases, but it'll involve a bigger change, potentially changing a bit of infrastructure in the codegen, e.g. codegen with parse tree (AST) rather than strings. I'd like to take baby steps first and move to more heavyweight options later.

How was this patch tested?

Ran SQL and Catalyst unit tests. Also added some unit tests for the new filtering heuristic.

…out redundant

default initialization code with a simple heuristic.
@rednaxelafx rednaxelafx closed this Jun 9, 2017
@SparkQA
Copy link

SparkQA commented Jun 9, 2017

Test build #77854 has finished for PR 18255 at commit de9ce5b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 9, 2017

Test build #3788 has finished for PR 18255 at commit de9ce5b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants