Remove boxing when loading and storing values in "def" fields/arrays, remove boxing on simple method calls of "def" methods #18359

Merged
merged 6 commits into from May 16, 2016

@uschindler uschindler commented May 15, 2016

I finally managed to make stores to def fields and arrays no longer box. The problem to solve was to hack into EChain.analyzeWrite and promote the type the other way round:

  • If the code finds that the last node in the chain (the store link) accepts a def type, it changes the return type of the expression whose result is to be stored and disables the usual casting. After that, the return type of the expression is promoted directly to the store node.
  • The code in the field store's and array store's write() was adapted to use the promoted type in the descriptor (this change was already missing in my last PR).

In addition, in this PR I made all invokedynamic calls use the GeneratorAdapter method name instead of the underlying visitor's method name. This makes all invokes named the same throughout the codebase.

Another change is removal of a useless List -> array clone.

uschindler commented May 15, 2016

I dumped the bytecode of several array tests (field stores are hard at the moment; they are also completely untested, as far as I know):

First example:
def x = new int[4]; x[0] = 5; return x[0];

This creates the following invokedynamic:

  public execute(Ljava/util/Map;Lorg/apache/lucene/search/Scorer;Lorg/elasticsearch/search/lookup/LeafDocLookup;Ljava/lang/Object;)Ljava/lang/Object;
   L0
    LINENUMBER 1 L0
    ICONST_4
    NEWARRAY T_INT
    ASTORE 5
   L1
    LINENUMBER 1 L1
    ALOAD 5
    ICONST_0
    ICONST_5
    INVOKEDYNAMIC arrayStore(Ljava/lang/Object;II)V [
      // handle kind 0x6 : INVOKESTATIC
      org/elasticsearch/painless/DefBootstrap.bootstrap(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/CallSite;
      // arguments:
      4
    ]
   L2
[...]

As you can see, there is no boxing involved anymore; the value is pushed unmodified onto the stack.
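
To see what the primitive call-site descriptor means outside of Painless, here is a minimal sketch using plain java.lang.invoke method handles. It is not the DefBootstrap linking code, and the names ArrayStoreSketch and storeInt are made up for illustration. The handle is adapted to the same shape as the arrayStore call site above, (Object,int,int)void, so the value travels as an int all the way to the array store.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical illustration; not part of Painless.
final class ArrayStoreSketch {

    // Same shape as the call site above: arrayStore(Ljava/lang/Object;II)V
    private static final MethodType CALL_SITE_TYPE =
        MethodType.methodType(void.class, Object.class, int.class, int.class);

    /** Stores value into an int[] that is only known as Object at the call site. */
    static void storeInt(Object array, int index, int value) {
        try {
            // Setter for the runtime array class, here (int[],int,int)void.
            MethodHandle setter = MethodHandles.arrayElementSetter(array.getClass());
            // Only the receiver is adapted (Object -> int[] cast); the value stays an int.
            MethodHandle adapted = setter.asType(CALL_SITE_TYPE);
            adapted.invokeExact(array, index, value);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Object x = new int[4];                // def x = new int[4];
        storeInt(x, 0, 5);                    // x[0] = 5;
        System.out.println(((int[]) x)[0]);   // prints 5
    }
}
```

Because invokeExact is used, the call would fail with a WrongMethodTypeException if the adapted handle did not have exactly the (Object,int,int)void type, which is a cheap way to confirm that no Integer is involved at the call site.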

It also works if the value to store is originally a String. That case does not really involve boxing, but you can see that the type passed to invokedynamic is preserved:

def x = new String[4]; x[0] = 'foobar'; return x[0];

  public execute(Ljava/util/Map;Lorg/apache/lucene/search/Scorer;Lorg/elasticsearch/search/lookup/LeafDocLookup;Ljava/lang/Object;)Ljava/lang/Object;
   L0
    LINENUMBER 1 L0
    ICONST_4
    ANEWARRAY java/lang/String
    ASTORE 5
   L1
    LINENUMBER 1 L1
    ALOAD 5
    ICONST_0
    LDC "foobar"
    INVOKEDYNAMIC arrayStore(Ljava/lang/Object;ILjava/lang/String;)V [
      // handle kind 0x6 : INVOKESTATIC
      org/elasticsearch/painless/DefBootstrap.bootstrap(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/CallSite;
      // arguments:
      4
    ]
   L2
[...]

rmuir commented May 15, 2016

Thanks for killing more boxing! I will look when i get home.

uschindler commented May 15, 2016

Thanks @rmuir!
I am currently not sure if the code here is 100% correct, because it misuses the before field. I will change the PR to add a field storeType or similar. I can then also add an instanceof check instead of the stupid "hack" I did now. The new code in analyzeWrite should really only run for stores on DEF fields, so a superclass for LDefField and LDefArray should be introduced.
The current solution is a hack and may break in the future.

uschindler commented May 15, 2016

New version, I nuked the old one. This is now clean.

The new superclass of LDef* Link nodes is also useful to handle return values correctly. For this we can have a similar approach that makes method calls or "loads" get the type that is really expected. This is also the reason why I adapted the superclass of unrelated LDefCall, too.

@clintongormley clintongormley changed the title painless: remove boxing when storing values in "def" fields/arrays Remove boxing when storing values in "def" fields/arrays May 15, 2016

uschindler commented May 15, 2016

I was also able to remove boxing from loads and method calls, as long as they are simple expressions. Compound statements (increments, ...) are still not solved. I first have to understand EChain.analyzeCompound(). But I am sure there is a solution to find the simplest possible type.

The new code works similarly to before: it checks whether the link node used for the read is an LDefLink and whether the expression got an "expected" output type (which is set by the parent node). If so, it adapts the output type of the last node to the expected type. Because of that, no cast is needed.

@uschindler uschindler changed the title Remove boxing when storing values in "def" fields/arrays Remove boxing when loading and storing values in "def" fields/arrays, remove boxing on simple method calls of "def" methods May 15, 2016

uschindler commented May 15, 2016

Example: def x = new HashMap(); x['abc'] = 5; int z = x.get('abc'); return z;

Bytecode for the part int z = x.get('abc'):

    ALOAD 5
    LDC "abc"
    INVOKEDYNAMIC get(Ljava/lang/Object;Ljava/lang/String;)I [
      // handle kind 0x6 : INVOKESTATIC
      org/elasticsearch/painless/DefBootstrap.bootstrap(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;I)Ljava/lang/invoke/CallSite;
      // arguments:
      0
    ]
    ISTORE 6
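
A rough way to reproduce the shape of this call site with plain method handles is sketched below. This is only an illustration and not how DefBootstrap links the call; TypedLoadSketch and getAsInt are hypothetical names. Map.get is adapted to get(Ljava/lang/Object;Ljava/lang/String;)I, so the caller receives an int directly. Note that the map still holds a boxed Integer in this sketch, so the unboxing has not vanished; it has moved into the handle adaptation instead of being a separate cast after the call.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the DefBootstrap implementation.
final class TypedLoadSketch {

    /** Calls map.get(key) through a handle shaped like get(Ljava/lang/Object;Ljava/lang/String;)I. */
    static int getAsInt(Object receiver, String key) {
        try {
            // Original type: (Map,Object)Object
            MethodHandle get = MethodHandles.publicLookup().findVirtual(
                Map.class, "get", MethodType.methodType(Object.class, Object.class));
            // Adapt receiver, argument and return type to what the call site expects.
            MethodHandle adapted = get.asType(
                MethodType.methodType(int.class, Object.class, String.class));
            return (int) adapted.invokeExact(receiver, key);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> x = new HashMap<>();   // def x = new HashMap();
        x.put("abc", 5);                           // x['abc'] = 5;
        int z = getAsInt(x, "abc");                // int z = x.get('abc');
        System.out.println(z);                     // prints 5
    }
}
```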

…s also a bug with the size of LDefArray: fixed

uschindler commented May 15, 2016

I also fixed a bug with compound statements and made them work with def at all. Problem was: Size of LDefArray was missing (was 0 instead of 2 like for LBrace), which caused ASM error because stack dups were missing.

For compound statements, we unfortunately cannot propagate types, because we load from DEF and store to DEF, so there is no more type information. E.g., if you increment a DEF, it will always box. But that has nothing to do with this issue (invokedynamic).

The other thing not yet solved is missing type information for statements like int x = a[0] + 1 (with a being a DEF). Unfortunately the tree does not give enough information, because the expected type of the outer (int x) is not yet visible in the inner (a[0] + 1). I will think about that.

This PR now solves all the "simple" cases and also fixes a bug with def arrays and stack size.

rmuir commented May 16, 2016

I also fixed a bug with compound statements and made them work with def at all. Problem was: Size of LDefArray was missing (was 0 instead of 2 like for LBrace), which caused ASM error because stack dups were missing.

Thank you, I think i hit this one when experimenting the other day!

rmuir commented May 16, 2016

The other thing not yet solved is missing type information for statements like int x = a[0] + 1 (with a being a DEF). Unfortunately the tree does not give enough information, because the expected type of the outer (int x) is not yet visible in the inner (a[0] + 1). I will think about that.

OK, this would be great to look into. Already this patch helps, for example our script in the nightly benchmark:

(Math.log(Math.abs(doc['population'].value)) + doc['elevation'].value * doc['latitude'].value)/_score

We see removal of boxing for the log(abs(population)), instead of:

LDC "population"
INVOKEINTERFACE java/util/Map.get (Ljava/lang/Object;)Ljava/lang/Object;
INVOKEDYNAMIC value(Ljava/lang/Object;)Ljava/lang/Object; [ ... ]
INVOKESTATIC org/elasticsearch/painless/Def.DefTodouble (Ljava/lang/Object;)D
INVOKESTATIC java/lang/Math.abs (D)D
INVOKESTATIC java/lang/Math.log (D)D

we now see:

LDC "population"
INVOKEINTERFACE java/util/Map.get (Ljava/lang/Object;)Ljava/lang/Object;
INVOKEDYNAMIC value(Ljava/lang/Object;)D [ ... ]
INVOKESTATIC java/lang/Math.abs (D)D
INVOKESTATIC java/lang/Math.log (D)D

Unfortunately the optimization ends there for now (but this is progress!). I think part of the problem is that for the larger expression, painless has no idea that it needs to return a double in this case, so at the end of the day it always returns def, and then runAsDouble() in the scripting api just unboxes that. The other part is the math operators, but it's related :)

I will look into trying to clean this up. I hate that we don't know our return type when being compiled; I think it's grossly unfair of the scripting api to do this. If we can fix it, then I think more type information would bubble up the tree and we'd kill more of the overhead.

@uschindler
Contributor Author

I think part of the problem is that for the larger expression, painless has no idea that it needs to return a double in this case, so at the end of the day always returns def, and then runAsDouble() in the scripting api just unboxes that. The other part is the math operators, but its related

There are some parts missing in painless:

  • The first issue is the order of evaluation: painless first evaluates the inner expressions in EBinary, but does not pass expected to them, because it does not know it at that time. The inner EChain then sees expected==null and cannot apply the type optimization. For the method call and assignment case, painless already passes expected, which is why it works. This is also the reason why I added the null check; maybe we should add a comment that expected is not always given.
  • For type propagation, painless only has the inner->outer direction. So AnalyzerCaster can only take the operands and calculate the type that comes out, e.g. for addition. But it has no code to do the same the other way round.

One idea to fix this would be a two-phase analysis: in the first phase, analyze, the code is checked for inconsistencies and every node records which input and output types it would be happy to get. In a second phase, optimize, the information collected in the first phase is used to adapt types the other way round as well: if a sub node returns only Def after the first phase, but the acceptor reported that it may accept double or float, the inner node adapts its return type. The last step would then be writing the bytecode.
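
The two-phase idea can be sketched on a toy expression tree. Everything below is hypothetical: the node classes (DefLoad, Constant, Add) and the promote helper are invented for illustration and are not the Painless node classes or AnalyzerCaster. The sketch also ignores whether adopting the expected type is always semantically safe; it only shows type information flowing inner->outer first and outer->inner second.

```java
// Toy model of a two-phase analysis; all names are hypothetical.
final class TwoPhaseSketch {

    enum Type { DEF, INT, DOUBLE }

    abstract static class Node {
        Type actual;                            // type this node produces
        abstract void analyze();                // phase 1: inner -> outer
        abstract void optimize(Type expected);  // phase 2: outer -> inner
    }

    /** A literal with a statically known type. */
    static final class Constant extends Node {
        Constant(Type type) { this.actual = type; }
        @Override void analyze() { }
        @Override void optimize(Type expected) { }
    }

    /** A dynamic load such as a[0] where a is def. */
    static final class DefLoad extends Node {
        @Override void analyze() { actual = Type.DEF; }
        @Override void optimize(Type expected) {
            // The dynamic call site can return whatever the acceptor asks for.
            if (expected != null && actual == Type.DEF) {
                actual = expected;
            }
        }
    }

    /** Binary addition. */
    static final class Add extends Node {
        final Node left;
        final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        @Override void analyze() {
            left.analyze();
            right.analyze();
            actual = promote(left.actual, right.actual);
        }
        @Override void optimize(Type expected) {
            left.optimize(expected);
            right.optimize(expected);
            actual = promote(left.actual, right.actual); // recompute with refined operands
        }
    }

    /** Simplified binary promotion: def wins, then double, then int. */
    static Type promote(Type a, Type b) {
        if (a == Type.DEF || b == Type.DEF) return Type.DEF;
        if (a == Type.DOUBLE || b == Type.DOUBLE) return Type.DOUBLE;
        return Type.INT;
    }

    public static void main(String[] args) {
        Node expr = new Add(new DefLoad(), new Constant(Type.INT)); // a[0] + 1
        expr.analyze();
        System.out.println(expr.actual);   // prints DEF
        expr.optimize(Type.INT);           // int x = a[0] + 1
        System.out.println(expr.actual);   // prints INT
    }
}
```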

@uschindler
Contributor Author

I think part of the problem is that for the larger expression, painless has no idea that it needs to return a double in this case, so at the end of the day always returns def, and then runAsDouble() in the scripting api just unboxes that

That's a separate issue. I would make two abstract Executables: one that returns Object and one that returns double. Painless checks while compiling which one to implement and returns the double one if possible. I think this should be possible to implement with some logic in the Writer if the "actual" type of the root node is primitive.
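
As a rough sketch of that idea, assuming two hypothetical base classes (ObjectExecutable and DoubleExecutable, which are not the real Painless Executable) and a stand-in compile method that decides based on the root node's "actual" type:

```java
// Hypothetical sketch; these are not the real Painless Executable classes.
final class ExecutableSketch {

    /** Variant for scripts whose root node produces a reference (or def). */
    abstract static class ObjectExecutable {
        abstract Object execute();
    }

    /** Variant for scripts whose root node has the primitive type double. */
    abstract static class DoubleExecutable {
        abstract double execute();
    }

    /** Stand-in for the compiler: picks the variant from the root node's "actual" type. */
    static Object compile(Class<?> rootActual, final double value) {
        if (rootActual == double.class) {
            return new DoubleExecutable() {
                @Override double execute() { return value; }
            };
        }
        return new ObjectExecutable() {
            @Override Object execute() { return value; } // boxes to Double
        };
    }

    /** Caller side: only the Object variant needs unboxing. */
    static double runAsDouble(Object script) {
        if (script instanceof DoubleExecutable) {
            return ((DoubleExecutable) script).execute();
        }
        return ((Number) ((ObjectExecutable) script).execute()).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(runAsDouble(compile(double.class, 2.5)));  // primitive path
        System.out.println(runAsDouble(compile(Object.class, 2.5)));  // boxed path
    }
}
```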

uschindler commented May 16, 2016

Also a bit unrelated to this issue, but we may investigate it in a separate issue: if boxing is needed (we cannot avoid it everywhere, e.g. def i = 0; [...]; i++), the code boxes "constants" (in the increment case the constant 1) over and over. I know that hotspot optimizes that away in most cases, but it still looks bad and makes the bytecode unreadable.

We already have an EConstant node, so we may optimize it to add all boxed constants it sees in the code as static final fields to the script's class and just refer to them with a static field load (here it would be named, e.g., BOXED_INT_1) instead of the box instruction. We would just need some code to collect them and write all of them using visitField after we wrote the bytecode. We may be able to put the whole logic for that into EConstant and the pool of constants into Variables or CompilerSettings!

This is not a cache - it is static, it just saves instances of all boxed constants found in code.
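
The bookkeeping for such a pool could look roughly like the sketch below. It only covers collecting constants and naming the fields; it emits no bytecode, and the class BoxedConstantPoolSketch and its naming scheme are assumptions made for illustration. In Java source terms, the generated script class would then behave as if it declared something like static final Integer BOXED_INT_1 = Integer.valueOf(1); and loaded that field wherever the constant is needed in boxed form.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the bookkeeping only; it does not write any bytecode.
final class BoxedConstantPoolSketch {

    // boxed value -> name of the static final field that would hold it
    private final Map<Object, String> fields = new LinkedHashMap<>();

    /** Returns the field name to load for this constant, registering it on first use. */
    String fieldFor(Object boxed) {
        return fields.computeIfAbsent(boxed, BoxedConstantPoolSketch::nameOf);
    }

    /** All constants seen so far, in first-seen order, ready to be written as fields. */
    Map<Object, String> fields() {
        return Collections.unmodifiableMap(fields);
    }

    /** Derives a name such as BOXED_INT_1 from the wrapper type and the value. */
    static String nameOf(Object boxed) {
        String type = boxed.getClass().getSimpleName().toUpperCase(Locale.ROOT);
        if ("INTEGER".equals(type)) {
            type = "INT";
        }
        String value = String.valueOf(boxed).replaceAll("[^A-Za-z0-9]", "_");
        return "BOXED_" + type + "_" + value;
    }

    public static void main(String[] args) {
        BoxedConstantPoolSketch pool = new BoxedConstantPoolSketch();
        System.out.println(pool.fieldFor(1));       // prints BOXED_INT_1
        System.out.println(pool.fieldFor(1));       // same field again, nothing new registered
        System.out.println(pool.fields().size());   // prints 1
    }
}
```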


AExpression index;

LDefArray(final int line, final String location, final AExpression index) {
super(line, location, 0);
Contributor

Nice catch!

@jdconrad
Contributor

At Uwe, thanks for tackling this! I left a couple of comments.

@jdconrad
Contributor

@uschindler I really like your idea of the two-passes to collect all known type information!

@uschindler
Contributor Author

I really like your idea of the two-passes to collect all known type information!

Yes, I think this should be a separate issue. This issue is just to improve the situation, but a 2 pass analyze might be the best option, although this won't help for all cases. Sometimes it is impossible to catch all.

@jdconrad
Contributor

Oh yeah, definitely a separate issue. :)

@uschindler
Contributor Author

Hi Jack, I fixed the issue with the "this.actual" value so chained assignment works correctly. For that I had to change the "this.actual" output to the (not) casted one. The next step in a chained expression is then casting the duped value afterwards. Now it passes.

jdconrad commented May 16, 2016

@uschindler This looks great, if you're happy with it, I'm happy with it. +1 I'm happy to commit as-is unless there was anything else you wanted to add or change.

expression = expression.cast(settings, definition, variables);

statement = true;
actual = read ? last.after : definition.voidType;
Contributor Author

for the regular case it should be the same, but moved into the else block

Hi Jack, you see this was moved up to else block!

Contributor

Yes! Thank you.

@jdconrad
Contributor

Sorry for the confusion on this.actual versus expression.actual. Took a bit for the caffeine to kick in!

@jdconrad
Contributor

@uschindler Are you ready for me to commit?

@uschindler
Contributor Author

No problem :)

Maybe we should change the assignments to use this as a prefix in the source code, too. As EChain is also an expression, this makes clear whether the inner expression or this is meant by the assignments. I can fix this, OK?

@jdconrad jdconrad mentioned this pull request May 16, 2016
18 tasks
@jdconrad
Contributor

@uschindler Yeah, please do. Just let me know whenever you're ready.

@uschindler
Contributor Author

OK, I am done. Thanks for the help and for finding the remaining issue with the cast.

We should really work on more tests for more corner cases.

In addition, I was talking with @rmuir about some way to check that "optimizations" in bytecode survive refactorings. One idea is to dump the bytecode to a String in tests using the debugger class and look for patterns in it (e.g., for this issue, make sure that it does not box by looking for "java/lang/Integer.valueOf" and fail the test if it occurs in the bytecode). Alternatively, use forbiddenapis' API (as a test dependency) and invoke de.forbiddenapis.Checker to forbid some method calls in the class :-)
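
The string-pattern variant could be as small as the helper sketched below. It assumes the test already has a textual bytecode dump as a String; how that dump is produced is out of scope here. BoxingPatternCheck and its list of patterns are hypothetical and would need tuning for real tests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical test helper; producing the bytecode dump is not shown.
final class BoxingPatternCheck {

    // Owner.method pairs as they appear in a textual bytecode dump.
    private static final List<String> BOXING_CALLS = Arrays.asList(
        "java/lang/Boolean.valueOf",
        "java/lang/Byte.valueOf",
        "java/lang/Short.valueOf",
        "java/lang/Character.valueOf",
        "java/lang/Integer.valueOf",
        "java/lang/Long.valueOf",
        "java/lang/Float.valueOf",
        "java/lang/Double.valueOf");

    /** Returns every boxing call pattern that occurs in the dump. */
    static List<String> findBoxing(String bytecodeDump) {
        List<String> hits = new ArrayList<>();
        for (String pattern : BOXING_CALLS) {
            if (bytecodeDump.contains(pattern)) {
                hits.add(pattern);
            }
        }
        return hits;
    }

    /** Fails if the dump contains any boxing call. */
    static void assertNoBoxing(String bytecodeDump) {
        List<String> hits = findBoxing(bytecodeDump);
        if (!hits.isEmpty()) {
            throw new AssertionError("unexpected boxing in bytecode: " + hits);
        }
    }

    public static void main(String[] args) {
        String dump = "ALOAD 5\nICONST_0\nICONST_5\n"
            + "INVOKEDYNAMIC arrayStore(Ljava/lang/Object;II)V";
        assertNoBoxing(dump); // passes: the value is stored without boxing
    }
}
```

A plain substring match is crude (it would also flag boxing that a script legitimately needs), so a real test would probably restrict the check to the statement under test.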

@jdconrad
Contributor

@uschindler Thanks again for working on this issue! I really like the idea of checking for the optimizations using the patterns from the bytecode. Definitely need to ensure these changes stay in the future.

@jdconrad jdconrad merged commit a2c2628 into elastic:master May 16, 2016
@uschindler uschindler deleted the painless_removeStoreBoxing branch May 16, 2016 19:19
@clintongormley clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement v5.0.0-alpha3
4 participants