Remove boxing when loading and storing values in "def" fields/arrays, remove boxing on simple method calls of "def" methods #18359

uschindler · 2016-05-15T11:01:19Z

I finally managed to make stores to def fields and arrays stop boxing. The problem to solve was to hack into EChain.analyzeWrite and promote the type the other way round:

If it finds that the last node in the chain (the store link) accepts a def type, the code changes the return type of the expression whose result is to be stored and disables the usual casting. The return type of the expression is then promoted directly to the store node.
The code in write() of the field store and array store was adapted to use the promoted type in the descriptor (this change was already missing in my last PR).

In addition, this PR makes all invokedynamic calls use the GeneratorAdapter method name instead of the underlying visitor's method name. This makes all invokes consistently named throughout the codebase.

Another change is the removal of a useless List -> array clone.

uschindler · 2016-05-15T11:06:48Z

I dumped the bytecode of several array tests (field stores are hard at the moment; they are also completely untested, as far as I know):

First example:
def x = new int[4]; x[0] = 5; return x[0];

This creates the following invokedynamic:

  public execute(Ljava/util/Map;Lorg/apache/lucene/search/Scorer;Lorg/elasticsearch/search/lookup/LeafDocLookup;Ljava/lang/Object;)Ljava/lang/Object;
   L0
    LINENUMBER 1 L0
    ICONST_4
    NEWARRAY T_INT
    ASTORE 5
   L1
    LINENUMBER 1 L1
    ALOAD 5
    ICONST_0
    ICONST_5
    INVOKEDYNAMIC arrayStore(Ljava/lang/Object;II)V [
      // handle kind 0x6 : INVOKESTATIC
      org/elasticsearch/painless/DefBootstrap.bootstrap(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/CallSite;
      // arguments:
      4
    ]
   L2
[...]

As you can see, there is no boxing involved anymore; the value is pushed unmodified onto the stack.

It also works if the value to store is originally a String. That case does not really involve boxing, but you can see that the type passed to invokedynamic is preserved:

def x = new String[4]; x[0] = 'foobar'; return x[0];

  public execute(Ljava/util/Map;Lorg/apache/lucene/search/Scorer;Lorg/elasticsearch/search/lookup/LeafDocLookup;Ljava/lang/Object;)Ljava/lang/Object;
   L0
    LINENUMBER 1 L0
    ICONST_4
    ANEWARRAY java/lang/String
    ASTORE 5
   L1
    LINENUMBER 1 L1
    ALOAD 5
    ICONST_0
    LDC "foobar"
    INVOKEDYNAMIC arrayStore(Ljava/lang/Object;ILjava/lang/String;)V [
      // handle kind 0x6 : INVOKESTATIC
      org/elasticsearch/painless/DefBootstrap.bootstrap(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/CallSite;
      // arguments:
      4
    ]
   L2
[...]
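The effect of the typed call-site descriptors in the dumps above can be illustrated with plain java.lang.invoke. This is only a minimal sketch, not the DefBootstrap code: the class and method names (ArrayStoreSketch, storeInt, demo) are made up for illustration, and a real bootstrap would link the handle once per call site instead of looking it up on every call. It shows that a handle adapted to (Object,int,int)void accepts the primitive value directly, so the caller never has to box it.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ArrayStoreSketch {
    /** Stores value into receiver[index] through a handle typed (Object,int,int)void. */
    static void storeInt(Object receiver, int index, int value) {
        try {
            // Setter for the runtime array class; for int[] its type is (int[],int,int)void.
            MethodHandle setter = MethodHandles.arrayElementSetter(receiver.getClass());
            // Adapt to the call-site type seen in the dump: (Object,int,int)void.
            MethodHandle typed = setter.asType(
                MethodType.methodType(void.class, Object.class, int.class, int.class));
            // The int travels as a primitive; no Integer.valueOf at the call site.
            typed.invokeExact(receiver, index, value);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** Mirrors the script: def x = new int[4]; x[0] = 5; return x[0]; */
    static int demo() {
        Object x = new int[4];
        storeInt(x, 0, 5);
        return ((int[]) x)[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5
    }
}
```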

rmuir · 2016-05-15T14:16:31Z

Thanks for killing more boxing! I will look when i get home.

uschindler · 2016-05-15T15:46:46Z

Thanks @rmuir!
I am currently not sure if the code here is 100% correct, because it misuses the before field. I will change the PR to add a field storeType or similar. I can then also add an instanceof-check instead of the stupid "hack" I did now. The new code in analyzeWrite should really only be done for stores on DEF fields, so a superclass for LDefField and LDefArray should be introduced.
The current solution is a hack and may break in the future.

uschindler · 2016-05-15T16:34:53Z

New version, I nuked the old one. This is now clean.

The new superclass of LDef* Link nodes is also useful to handle return values correctly. For this we can have a similar approach that makes method calls or "loads" get the type that is really expected. This is also the reason why I adapted the superclass of unrelated LDefCall, too.

uschindler · 2016-05-15T17:53:38Z

I was also able to remove boxing from loads and method calls, as long as they are simple expressions. Compound statements (increments, ...) are still not solved. I first have to understand EChain.analyzeCompound(). But I am sure there is a way to find the simplest possible type.

The new code works similarly to before: it checks whether the link node used for the read is an LDefLink and whether the expression got an "expected" output type (which is set by the parent node). If so, it adapts the output type of the last node to the expected type, so no cast is needed.

uschindler · 2016-05-15T17:56:51Z

Example: def x = new HashMap(); x['abc'] = 5; int z = x.get('abc'); return z;

Bytecode for the part int z = x.get('abc'):

    ALOAD 5
    LDC "abc"
    INVOKEDYNAMIC get(Ljava/lang/Object;Ljava/lang/String;)I [
      // handle kind 0x6 : INVOKESTATIC
      org/elasticsearch/painless/DefBootstrap.bootstrap(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/CallSite;
      // arguments:
      0
    ]
    ISTORE 6
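A rough way to see what the I return type in that descriptor buys, again using only java.lang.invoke: the sketch below adapts Map.get to the call-site type (Object,String)int. The names TypedLoadSketch, getAsInt and demo are hypothetical, and this is not how DefBootstrap resolves methods. The value stored in the map is still an Integer; the point is that the unboxing happens inside the adapted handle, so the calling code needs no separate cast or unbox step before ISTORE.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

final class TypedLoadSketch {
    /** Calls receiver.get(key) through a handle whose return type is already int. */
    static int getAsInt(Object receiver, String key) {
        try {
            // Map.get as a method handle: (Map,Object)Object.
            MethodHandle get = MethodHandles.publicLookup().findVirtual(
                Map.class, "get", MethodType.methodType(Object.class, Object.class));
            // Adapt to the call-site type from the dump: (Object,String)int.
            MethodHandle typed = get.asType(
                MethodType.methodType(int.class, Object.class, String.class));
            return (int) typed.invokeExact(receiver, key);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    /** Mirrors the script: def x = new HashMap(); x['abc'] = 5; int z = x.get('abc'); */
    static int demo() {
        Map<String, Object> x = new HashMap<>();
        x.put("abc", 5);
        return getAsInt(x, "abc");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5
    }
}
```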

uschindler · 2016-05-15T21:02:20Z

I also fixed a bug with compound statements and made them work with def at all. The problem was that the size of LDefArray was wrong (0 instead of 2, as for LBrace), which caused an ASM error because stack dups were missing.

For compound statements, we unfortunately cannot propagate types, because we load from DEF and store to DEF, so there is no type information left. E.g., if you increment a DEF, it will always box. But that has nothing to do with this issue (invokedynamic).

The other thing not yet solved is missing type information for statements like int x = a[0] + 1 (where a is a DEF). Unfortunately the tree does not give enough information, because the expected type of the outer (int x) is not yet visible in the inner (a[0] + 1). I will think about that.

This PR now solves all the "simple" cases and also fixes a bug with def arrays and stack size.

rmuir · 2016-05-16T00:45:48Z

> I also fixed a bug with compound statements and made them work with def at all. Problem was: Size of LDefArray was missing (was 0 instead of 2 like for LBrace), which caused ASM error because stack dups were missing.

Thank you, I think i hit this one when experimenting the other day!

rmuir · 2016-05-16T02:13:54Z

> The other thing not yet solved is missing type information for statements like int x = a[0] + 1 (with a a DEF). Unfortunately the tree does not give enough information, because the expected type of the outer (int x) is not yet visible in the inner (a[0] + 1). I will think about that.

OK, this would be great to look into. Already this patch helps, for example our script in the nightly benchmark:

(Math.log(Math.abs(doc['population'].value)) + doc['elevation'].value * doc['latitude'].value)/_score

We see removal of boxing for the log(abs(population)), instead of:

LDC "population"
INVOKEINTERFACE java/util/Map.get (Ljava/lang/Object;)Ljava/lang/Object;
INVOKEDYNAMIC value(Ljava/lang/Object;)Ljava/lang/Object; [ ... ]
INVOKESTATIC org/elasticsearch/painless/Def.DefTodouble (Ljava/lang/Object;)D
INVOKESTATIC java/lang/Math.abs (D)D
INVOKESTATIC java/lang/Math.log (D)D

we now see:

LDC "population"
INVOKEINTERFACE java/util/Map.get (Ljava/lang/Object;)Ljava/lang/Object;
INVOKEDYNAMIC value(Ljava/lang/Object;)D [ ... ]
INVOKESTATIC java/lang/Math.abs (D)D
INVOKESTATIC java/lang/Math.log (D)D

Unfortunately the optimization ends there for now (but this is progress!). I think part of the problem is that for the larger expression, painless has no idea that it needs to return a double in this case, so at the end of the day it always returns def, and then runAsDouble() in the scripting api just unboxes that. The other part is the math operators, but it's related :)

I will look into trying to clean this up. I hate that we don't know our return type when being compiled; I think it's grossly unfair of the scripting api to do this. If we can fix it, then I think more type information would bubble up the tree and we'd kill more of the overhead.

uschindler · 2016-05-16T08:12:39Z

> I think part of the problem is that for the larger expression, painless has no idea that it needs to return a double in this case, so at the end of the day always returns def, and then runAsDouble() in the scripting api just unboxes that. The other part is the math operators, but its related

There are some parts missing in painless:

The first issue is order of evaluation: it first evaluates the inner expressions in EBinary, but does not pass expected to them, because it does not know it at that time. The inner EChain then sees expected==null and cannot apply the type optimization. For the method call and assignment cases, painless already passes expected, which is why it works there. This is also the reason why I added the null check; maybe we should add a comment that expected is not always given.
For type propagation, painless only does it inner->outer. So AnalyzerCaster can only take the operands and calculate the resulting type, e.g. for addition. But it has no code to do the same the other way round.

One idea to fix this would be a two-phase analysis: in the first phase, analyze, the code is checked for inconsistencies and every node records which input and output types it would be happy to get. In a second phase, optimize, the information collected in the first phase would be used to adapt types the other way round as well: if a sub node returns only Def after the first phase, but the acceptor reported that it may accept double or float, the inner node adapts its return type. The last step would then be writing the bytecode.
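The two-phase idea could look roughly like the toy tree below. Everything in it is hypothetical (TwoPassSketch, Node, DefLoad, Constant, Add, promote); none of these are Painless classes, and the sketch ignores the question of whether narrowing a def operand to the expected type is always semantically safe, which a real implementation would have to check. It models int x = a[0] + 1: after the inner->outer phase the addition is typed DEF, and after the outer->inner phase the dynamic load adopts the expected int, so the whole expression becomes INT.

```java
final class TwoPassSketch {
    enum Type { DEF, INT, DOUBLE }

    /** Hypothetical tree node; not a Painless class. */
    abstract static class Node {
        Type actual; // type this node produces

        /** Phase 1: compute types inner -> outer. */
        abstract void analyze();

        /** Phase 2: push the type the parent would accept outer -> inner. */
        abstract void optimize(Type expected);
    }

    static final class Constant extends Node {
        private final Type type;
        Constant(Type type) { this.type = type; }
        @Override void analyze() { actual = type; }
        @Override void optimize(Type expected) { /* constants keep their type */ }
    }

    /** A load from a def value, e.g. a[0]; its type is unknown after phase 1. */
    static final class DefLoad extends Node {
        @Override void analyze() { actual = Type.DEF; }
        @Override void optimize(Type expected) {
            // A dynamic load can be asked to return the expected type directly.
            if (expected != null && expected != Type.DEF) {
                actual = expected;
            }
        }
    }

    static final class Add extends Node {
        private final Node left;
        private final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        @Override void analyze() {
            left.analyze();
            right.analyze();
            actual = promote(left.actual, right.actual);
        }
        @Override void optimize(Type expected) {
            left.optimize(expected);
            right.optimize(expected);
            // Recompute the result type from the (possibly adapted) operands.
            actual = promote(left.actual, right.actual);
        }
    }

    /** Inner -> outer promotion: DEF wins, then DOUBLE, otherwise INT. */
    static Type promote(Type a, Type b) {
        if (a == Type.DEF || b == Type.DEF) return Type.DEF;
        if (a == Type.DOUBLE || b == Type.DOUBLE) return Type.DOUBLE;
        return Type.INT;
    }

    public static void main(String[] args) {
        Node expr = new Add(new DefLoad(), new Constant(Type.INT)); // a[0] + 1
        expr.analyze();
        System.out.println(expr.actual); // DEF
        expr.optimize(Type.INT);         // parent is "int x = ..."
        System.out.println(expr.actual); // INT
    }
}
```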

uschindler · 2016-05-16T08:15:33Z

> I think part of the problem is that for the larger expression, painless has no idea that it needs to return a double in this case, so at the end of the day always returns def, and then runAsDouble() in the scripting api just unboxes that

That's a separate issue. I would make two abstract Executables: one that returns Object and one that returns double. Painless checks while compiling which one to implement and returns the double one if possible. I think this should be possible to implement with some logic in the Writer if the "actual" type of the root node is primitive.
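As a loose illustration of the two-Executable idea, the sketch below uses invented names (ExecutableSketch, ObjectExecutable, DoubleExecutable, and a static runAsDouble helper that merely stands in for the scripting API's caller). It assumes the compiler would pick the double variant when the root expression is known to be a primitive double; only the Object variant then needs an unboxing step.

```java
final class ExecutableSketch {
    /** Hypothetical base class for scripts whose result type is not known when compiling. */
    abstract static class ObjectExecutable {
        abstract Object execute();
    }

    /** Hypothetical base class for scripts whose root expression is a primitive double. */
    abstract static class DoubleExecutable {
        abstract double execute();
    }

    /** Stand-in for a runAsDouble()-style caller: only the Object variant has to unbox. */
    static double runAsDouble(Object script) {
        if (script instanceof DoubleExecutable) {
            return ((DoubleExecutable) script).execute();
        }
        return ((Number) ((ObjectExecutable) script).execute()).doubleValue();
    }

    /** Builds the same toy script in either flavor and runs it. */
    static double demo(boolean rootIsPrimitive) {
        if (rootIsPrimitive) {
            return runAsDouble(new DoubleExecutable() {
                @Override double execute() { return Math.abs(-8.0) * 2; }
            });
        }
        return runAsDouble(new ObjectExecutable() {
            @Override Object execute() { return Double.valueOf(Math.abs(-8.0) * 2); }
        });
    }

    public static void main(String[] args) {
        System.out.println(demo(true));  // prints 16.0
        System.out.println(demo(false)); // prints 16.0
    }
}
```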

uschindler · 2016-05-16T08:32:55Z

Also a bit unrelated to this issue, but we may investigate it separately: if boxing is needed (we cannot avoid it everywhere, e.g. def i = 0; [...]; i++), the code boxes "constants" (in the increment case the constant 1) over and over. I know that hotspot optimizes that away in most cases, but it still looks bad and makes the bytecode unreadable.

We already have an EConstant node, so we could optimize it to add every boxed constant it sees in the code as a static final field on the script's class and refer to it with a static field load (here it would be named, e.g., BOXED_INT_1) instead of the box instruction. We would just need some code to collect them and write all of them using visitField after we wrote the bytecode. We may be able to put the whole logic for that into EConstant and keep the pool of constants in Variables or CompilerSettings!

This is not a cache - it is static, it just saves instances of all boxed constants found in code.
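Written as Java source instead of bytecode, the generated script class might look something like the following. This is a hand-written approximation under stated assumptions: BoxedConstantSketch and its add helper are invented for illustration (add stands in for a dynamic addition on def operands and only handles Integer), and BOXED_INT_1 is the example field name from the comment above.

```java
final class BoxedConstantSketch {
    // Hypothetical static final field, written once per distinct boxed constant in the script.
    static final Integer BOXED_INT_1 = Integer.valueOf(1);

    /** Stand-in for a dynamic add on two def operands (only Integer is handled here). */
    static Object add(Object left, Object right) {
        return Integer.valueOf(((Integer) left).intValue() + ((Integer) right).intValue());
    }

    /** Models i++ on a def variable: the constant is a field load, not a fresh box call. */
    static Object increment(Object i) {
        return add(i, BOXED_INT_1);
    }

    public static void main(String[] args) {
        Object i = 0;
        for (int n = 0; n < 3; n++) {
            i = increment(i);
        }
        System.out.println(i); // prints 3
    }
}
```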

jdconrad · 2016-05-16T16:49:47Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/node/LDefArray.java


    AExpression index;

    LDefArray(final int line, final String location, final AExpression index) {
-        super(line, location, 0);


Nice catch!

jdconrad · 2016-05-16T17:19:05Z

At Uwe, thanks for tackling this! I left a couple of comments.

jdconrad · 2016-05-16T17:26:35Z

@uschindler I really like your idea of the two-passes to collect all known type information!

uschindler · 2016-05-16T17:34:28Z

> I really like your idea of the two-passes to collect all known type information!

Yes, I think this should be a separate issue. This issue is just to improve the situation, but a 2 pass analyze might be the best option, although this won't help for all cases. Sometimes it is impossible to catch all.

jdconrad · 2016-05-16T17:37:31Z

Oh yeah, definitely a separate issue. :)

uschindler · 2016-05-16T18:33:24Z

Hi Jack, I fixed the issue with the "this.actual" value so that chained assignment works correctly. I needed to change this actual output to the (not) casted one. The next step in a chained expression is then casting the duped value afterwards. Now it passes.

jdconrad · 2016-05-16T18:42:15Z

@uschindler This looks great, if you're happy with it, I'm happy with it. +1 I'm happy to commit as-is unless there was anything else you wanted to add or change.

uschindler · 2016-05-16T18:42:58Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/node/EChain.java

        expression = expression.cast(settings, definition, variables);

        statement = true;
-        actual = read ? last.after : definition.voidType;


for the regular case it should be the same, but moved into the else block

Hi Jack, you see this was moved up to else block!

Yes! Thank you.

jdconrad · 2016-05-16T18:47:04Z

Sorry for the confusion on this.actual versus expression.actual. Took a bit for the caffeine to kick in!

jdconrad · 2016-05-16T18:53:56Z

@uschindler Are you ready for me to commit?

uschindler · 2016-05-16T18:54:16Z

No problem :)

Maybe we should change the assignments to use this as a prefix in the source code as well. As EChain is also an expression, this makes it clear whether the inner expression or this is meant by the assignments. I can fix this, OK?

jdconrad · 2016-05-16T18:56:16Z

@uschindler Yeah, please do. Just let me know whenever you're ready.

uschindler · 2016-05-16T19:07:47Z

OK, I am done. Thanks for help and finding the remaining issue with the cast.

We should really work on more tests for more corner cases.

In addition, I was talking with @rmuir about some way to check that "optimizations" in bytecode survive refactorings. One idea is to dump the bytecode to a String in tests using debugger class and look for patterns in it (e.g. for this issue make sure that it does not box by looking for "java/lang/Integer.valueOf" and fail test if it occurs in bytecode). Alternatively use forbiddenapis' API (as test dependency) and invoke de.forbiddenapis.Checker to forbid some method calls in the class :-)
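The pattern check on a bytecode dump could be as simple as the helper below. It is a sketch with invented names (BoxingPatternCheck, firstBoxingCall, assertNoBoxing) and a deliberately short, non-exhaustive pattern list; it assumes the test has already obtained the textual dump as a String by some other means.

```java
final class BoxingPatternCheck {
    // Substrings that indicate a boxing call in a textual bytecode dump (not exhaustive).
    private static final String[] BOXING_CALLS = {
        "java/lang/Integer.valueOf",
        "java/lang/Long.valueOf",
        "java/lang/Double.valueOf",
    };

    /** Returns the first boxing pattern found in the dump, or null if there is none. */
    static String firstBoxingCall(String bytecodeDump) {
        for (String pattern : BOXING_CALLS) {
            if (bytecodeDump.contains(pattern)) {
                return pattern;
            }
        }
        return null;
    }

    /** Fails the calling test if the dump contains any of the boxing patterns. */
    static void assertNoBoxing(String bytecodeDump) {
        String found = firstBoxingCall(bytecodeDump);
        if (found != null) {
            throw new AssertionError("unexpected boxing call in bytecode: " + found);
        }
    }

    public static void main(String[] args) {
        assertNoBoxing("INVOKEDYNAMIC arrayStore(Ljava/lang/Object;II)V");
        System.out.println(firstBoxingCall(
            "INVOKESTATIC java/lang/Integer.valueOf (I)Ljava/lang/Integer;"));
        // prints java/lang/Integer.valueOf
    }
}
```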

jdconrad · 2016-05-16T19:11:34Z

@uschindler Thanks again for working on this issue! I really like the idea of checking for the optimizations using the patterns from the bytecode. Definitely need to ensure these changes stay in the future.

painless: Make field stores not box; use GeneratorAdapter.invokeDynma…

604bcd9

…ic for consistency with other method calls

uschindler force-pushed the painless_removeStoreBoxing branch from ec712d4 to 604bcd9 Compare May 15, 2016 16:32

clintongormley added >enhancement review labels May 15, 2016

clintongormley changed the title ~~painless: remove boxing when storing values in "def" fields/arrays~~ Remove boxing when storing values in "def" fields/arrays May 15, 2016

painless: Also remove boxing for reads and method calls

d221cd1

uschindler changed the title ~~Remove boxing when storing values in "def" fields/arrays~~ Remove boxing when loading and storing values in "def" fields/arrays, remove boxing onsimple method calls of "def" methods May 15, 2016

painless: make compound statement like a[1]++ work with def. There wa…

65aca4f

…s also a bug with the size of LDefArray: fixed

rmuir mentioned this pull request May 16, 2016

Remove LeafSearchScript.runAsFloat(): Nothing calls it. #18364

Merged

jdconrad reviewed May 16, 2016

painless: Fix issue with dup and cast

3a5ef68

painless: Add Jack's test

b05ac87

uschindler reviewed May 16, 2016

jdconrad mentioned this pull request May 16, 2016

Ongoing Painless Improvements #17992

Closed

18 tasks

painless: Some reformatting in EChain to make it clear if "this" or i…

d6cbbde

…nner "expression" is affected

jdconrad merged commit a2c2628 into elastic:master May 16, 2016

uschindler deleted the painless_removeStoreBoxing branch May 16, 2016 19:19

clintongormley added the v5.0.0-alpha3 label May 17, 2016

clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018
