New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Java 9 Indy String Concats, if available #18400

Merged
merged 4 commits into from May 18, 2016

Conversation

Projects
None yet
3 participants
@uschindler
Contributor

uschindler commented May 17, 2016

This PR adds support to painless for indyfied string concats, which is a new feature of Java 9. Instead of creating a StringBuilder in byte code and then appending all parts, the recommended way for bytecode in Java 9 is to use use invokedynamic on java.lang.invoke.StringConcatFactory. This allows many improvements in the JVM:

  • Hotspot is better in optimizing this, as the code is just a single method call with many (primitive or reference) arguments
  • It knows the size of the result array before (e.g. by summing up the lengths of all parts)
  • It can directly create a "compact string", while StringBuilder still uses a char[] inbetween

See http://openjdk.java.net/jeps/280 and http://cr.openjdk.java.net/~shade/8085796/notes.txt for more details. Not all of the above optimizations are implemented in JDK/Hotspot yet, but the old code will no longer get improvements in JDK, as it is too complicated to analyze and breaks easily.

The new code checks if the StringConcatFactory is available at run time with correct signature. Otherwise the byte code is generated as before. If indy string concats are enabled, all parts are pushed with their original types to stack and the MethodWriter's concat method just records the types, so it can build the descriptor for the single method call afterwards.

As number of arguments to concat is limited to 200, the code calls the invokedyanmic always after 200 items and just leaves the return value on stack as first param for next round. While testing this I found a parser slowness bug in painless (see #18398).

@uschindler

This comment has been minimized.

Show comment
Hide comment
@uschindler

uschindler May 17, 2016

Contributor

Here is the code gerenated from the following script: String s = \"cat\"; return s + true + 'abc' + null;

  public execute(Ljava/util/Map;Lorg/apache/lucene/search/Scorer;Lorg/elasticsearch/search/lookup/LeafDocLookup;Ljava/lang/Object;)Ljava/lang/Object;
   L0
    LINENUMBER 1 L0
    LDC "cat"
    ASTORE 5
   L1
    LINENUMBER 1 L1
    ALOAD 5
    ICONST_1
    LDC "abc"
    ACONST_NULL
    INVOKEDYNAMIC concat(Ljava/lang/String;ZLjava/lang/String;Ljava/lang/Object;)Ljava/lang/String; [
      // handle kind 0x6 : INVOKESTATIC
      java/lang/invoke/StringConcatFactory.makeConcat(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
      // arguments: none
    ]
    ARETURN
    MAXSTACK = 4
    MAXLOCALS = 6
Contributor

uschindler commented May 17, 2016

Here is the code gerenated from the following script: String s = \"cat\"; return s + true + 'abc' + null;

  public execute(Ljava/util/Map;Lorg/apache/lucene/search/Scorer;Lorg/elasticsearch/search/lookup/LeafDocLookup;Ljava/lang/Object;)Ljava/lang/Object;
   L0
    LINENUMBER 1 L0
    LDC "cat"
    ASTORE 5
   L1
    LINENUMBER 1 L1
    ALOAD 5
    ICONST_1
    LDC "abc"
    ACONST_NULL
    INVOKEDYNAMIC concat(Ljava/lang/String;ZLjava/lang/String;Ljava/lang/Object;)Ljava/lang/String; [
      // handle kind 0x6 : INVOKESTATIC
      java/lang/invoke/StringConcatFactory.makeConcat(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
      // arguments: none
    ]
    ARETURN
    MAXSTACK = 4
    MAXLOCALS = 6
@rmuir

This comment has been minimized.

Show comment
Hide comment
@rmuir

rmuir May 17, 2016

Contributor

this looks fantastic. i will look into the parsing performance issue.

Contributor

rmuir commented May 17, 2016

this looks fantastic. i will look into the parsing performance issue.

@rmuir

This comment has been minimized.

Show comment
Hide comment
@rmuir

rmuir May 17, 2016

Contributor

This looks fine to me: great to have the fastest strings possible :)

The parsing performance issue is fixed but it caused a conflict because I ported your test to master already. If you merge up I will push this.

Contributor

rmuir commented May 17, 2016

This looks fine to me: great to have the fastest strings possible :)

The parsing performance issue is fixed but it caused a conflict because I ported your test to master already. If you merge up I will push this.

Merge branch 'master' into painless_java9StringConcats
# Conflicts:
#	modules/lang-painless/src/test/java/org/elasticsearch/painless/StringTests.java
@uschindler

This comment has been minimized.

Show comment
Hide comment
@uschindler

uschindler May 17, 2016

Contributor

Hi @rmuir
I merged: the test is fast - thanks!
I will later add some checks to the visiteEnd method of the MethodWariter class to ensure state machine is in sane state.
I will also change the test to also test the limits (exact number of 200 args,...).
I will inform you once I found

Maybe you can look at benchmarking the string concats. I don't know how optimized that all is, but it should not be slower than without.

Maybe add a compile option to enable/disable indy string concats?

Contributor

uschindler commented May 17, 2016

Hi @rmuir
I merged: the test is fast - thanks!
I will later add some checks to the visiteEnd method of the MethodWariter class to ensure state machine is in sane state.
I will also change the test to also test the limits (exact number of 200 args,...).
I will inform you once I found

Maybe you can look at benchmarking the string concats. I don't know how optimized that all is, but it should not be slower than without.

Maybe add a compile option to enable/disable indy string concats?

@rmuir

This comment has been minimized.

Show comment
Hide comment
@rmuir

rmuir May 17, 2016

Contributor

unfortunately i dont have a benchmark that can be easily used out of box. Even for the search case, our benchmark has at least 20% noise :(

Currently instead of benchmarking, when reviewing I simply look at whether its better bytecode or not. This is not perfect but better than any benchmark we have. You can of course make a "stupid" benchmark that just does a bunch of concat and runs the script millions of times to try to see (e.g. from a test) if you have concerns, but I don't have any concerns :)

As far as compile option, do you have concerns about correctness? From my perspective it is a task for jenkins and we don't need it.

Contributor

rmuir commented May 17, 2016

unfortunately i dont have a benchmark that can be easily used out of box. Even for the search case, our benchmark has at least 20% noise :(

Currently instead of benchmarking, when reviewing I simply look at whether its better bytecode or not. This is not perfect but better than any benchmark we have. You can of course make a "stupid" benchmark that just does a bunch of concat and runs the script millions of times to try to see (e.g. from a test) if you have concerns, but I don't have any concerns :)

As far as compile option, do you have concerns about correctness? From my perspective it is a task for jenkins and we don't need it.

@uschindler

This comment has been minimized.

Show comment
Hide comment
@uschindler

uschindler May 17, 2016

Contributor

Hi,
I added one more check in the visitEnd() method of MethodWriter to check that the concat stack is empty. I also added a new test for nested concats (a concat has another concat indirectly hidden behind a method call in one of the parts).

I also found out that there is another slowdown with the long string concats if you use a cast to def instead of toString() or plain string literal. I will add this to the other slowness issue!

I think this is ready.

Contributor

uschindler commented May 17, 2016

Hi,
I added one more check in the visitEnd() method of MethodWriter to check that the concat stack is empty. I also added a new test for nested concats (a concat has another concat indirectly hidden behind a method call in one of the parts).

I also found out that there is another slowdown with the long string concats if you use a cast to def instead of toString() or plain string literal. I will add this to the other slowness issue!

I think this is ready.

@uschindler

This comment has been minimized.

Show comment
Hide comment
@uschindler

uschindler May 17, 2016

Contributor

As far as compile option, do you have concerns about correctness? From my perspective it is a task for jenkins and we don't need it.

No concerns. I just say: Java 9 is not yet finalized. The current code only activates concats if the method signature matches 100%, otherwise it uses string buffer, so risk is low. On Java 8 it will never be used, as StringConcatFactory does not exist there.

The configuration option was just meant as a way to disable it, if it really breaks with a later JDK9 EA release (e.g. if then change the magic number 200 shortly before release to say 150...). Currently its documented to be 200 in Javadocs, but those are not yet finalized ("Please note that the specifications and other information contained herein are not final and are subject to change"). I will suggest on the Mailing list to have a integer constant with that number in the StringConcatFactory.

Contributor

uschindler commented May 17, 2016

As far as compile option, do you have concerns about correctness? From my perspective it is a task for jenkins and we don't need it.

No concerns. I just say: Java 9 is not yet finalized. The current code only activates concats if the method signature matches 100%, otherwise it uses string buffer, so risk is low. On Java 8 it will never be used, as StringConcatFactory does not exist there.

The configuration option was just meant as a way to disable it, if it really breaks with a later JDK9 EA release (e.g. if then change the magic number 200 shortly before release to say 150...). Currently its documented to be 200 in Javadocs, but those are not yet finalized ("Please note that the specifications and other information contained herein are not final and are subject to change"). I will suggest on the Mailing list to have a integer constant with that number in the StringConcatFactory.

@rmuir

This comment has been minimized.

Show comment
Hide comment
@rmuir

rmuir May 17, 2016

Contributor

No concerns. I just say: Java 9 is not yet finalized.

Yes it is a risk but painless is documented as experimental at the moment. Therefore we shall experiment :)

Contributor

rmuir commented May 17, 2016

No concerns. I just say: Java 9 is not yet finalized.

Yes it is a risk but painless is documented as experimental at the moment. Therefore we shall experiment :)

@clintongormley clintongormley changed the title from painless: use Java 9 Indy String Concats, if available to Use Java 9 Indy String Concats, if available May 18, 2016

@rmuir rmuir merged commit 1022123 into elastic:master May 18, 2016

1 check passed

CLA Commit author has signed the CLA
Details
@rmuir

This comment has been minimized.

Show comment
Hide comment
@rmuir

rmuir May 18, 2016

Contributor

Thanks @uschindler !

Contributor

rmuir commented May 18, 2016

Thanks @uschindler !

@uschindler uschindler deleted the uschindler:painless_java9StringConcats branch May 18, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment