Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compilation horrible slow on some huge statements #18398

Closed
uschindler opened this issue May 17, 2016 · 9 comments

Comments

Projects
None yet
3 participants
@uschindler
Copy link
Contributor

commented May 17, 2016

While working on indify string concats for painless, I noticed that some statements take veeery long to compile. Some background:

Indy string concats only allow a maximum of 200 parts per invokedynamic. To test the code to allow more (using intermediate results), I added a simple statement with 211 items and let painless parse it:

String s = "cat"; return s + "000".toString() + "001".toString() + ... + "209".toString();

This takes on painless on my computer up to 300 seconds. If I remove the .toString() calls, it gets a bit faster, but the result is that strings are already merged while compiling (so the .toString() is just a workaround to prevent concats on compilation: I did not find a test for that, its cool that painless does this!). It gets even slower if you instead cast every single string to (def), its still works and works and works since 1000s.

Looks like there is some exponential runtime problem.

This is my test code:

    public void testAppendMany() {
        StringBuilder script = new StringBuilder("String s = \"cat\"; return s");
        StringBuilder result = new StringBuilder("cat");
        for (int i = 0; i < WriterConstants.MAX_INDY_STRING_CONCAT_ARGS + 10; i++) {
            final String s = String.format(Locale.ROOT,  "%03d", i);
            script.append(" + '").append(s).append("'.toString()");
            result.append(s);
        }
        System.out.println(Debugger.toString(script.toString()));
        //assertEquals(result.toString(), exec(script.toString()));
    }
@uschindler

This comment has been minimized.

Copy link
Contributor Author

commented May 17, 2016

FYI, the resulting bytecode with Java 9 and indyfied string concat looks like - (I will open a PR about that soon!):

  public execute(Ljava/util/Map;Lorg/apache/lucene/search/Scorer;Lorg/elasticsearch/search/lookup/LeafDocLookup;Ljava/lang/Object;)Ljava/lang/Object;
   L0
    LINENUMBER 1 L0
    LDC "cat"
    ASTORE 5
   L1
    LINENUMBER 1 L1
    ALOAD 5
    LDC "000"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "001"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "002"
[...]
    LDC "198"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    INVOKEDYNAMIC concat(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String; [
      // handle kind 0x6 : INVOKESTATIC
      java/lang/invoke/StringConcatFactory.makeConcat(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
      // arguments: none
    ]
    LDC "199"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "200"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "201"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "202"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "203"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "204"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "205"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "206"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "207"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "208"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    LDC "209"
    INVOKEVIRTUAL java/lang/String.toString ()Ljava/lang/String;
    INVOKEDYNAMIC concat(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String; [
      // handle kind 0x6 : INVOKESTATIC
      java/lang/invoke/StringConcatFactory.makeConcat(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
      // arguments: none
    ]
    ARETURN
    MAXSTACK = 200
    MAXLOCALS = 6

@clintongormley clintongormley changed the title painless: Compilation horrible slow on some huge statements Compilation horrible slow on some huge statements May 17, 2016

@uschindler

This comment has been minimized.

Copy link
Contributor Author

commented May 17, 2016

@clintongormley this is a bug not an enhancement!

@clintongormley clintongormley added >bug and removed >enhancement labels May 17, 2016

@clintongormley

This comment has been minimized.

Copy link
Member

commented May 17, 2016

Dutifully updated :)

@uschindler

This comment has been minimized.

Copy link
Contributor Author

commented May 17, 2016

Thanks!

@rmuir

This comment has been minimized.

Copy link
Contributor

commented May 17, 2016

I ran profiler and it looks to me the issue is caused by the grammar. I added DiagnosticErrorListener and ambiguity detection and it prints:

line 1:18 reportAttemptingFullContext d=15 (statement), input='Strings="cat";return'
line 1:16 reportAmbiguity d=15 (statement): ambigAlts={5, 11}, input='Strings="cat";'
line 1:4016 reportAttemptingFullContext d=28 (expression), input='+'000'.toString()+'001'.toString()+'002'.toString()+...+'209.toString()

The "..." are mine but it shows all 210 strings.

@rmuir

This comment has been minimized.

Copy link
Contributor

commented May 17, 2016

The test also runs in 3-5 seconds instead of hundreds of seconds if i set SLL prediction mode. I don't think we should solve it that way (even though tests pass), instead we should figure out how to fix the ambiguity...

@rmuir

This comment has been minimized.

Copy link
Contributor

commented May 17, 2016

The issue is caused by making semicolons optional everywhere in the grammar. Fixing that makes the test run instantly.

@rmuir

This comment has been minimized.

Copy link
Contributor

commented May 17, 2016

Also by fixing the semicolon issue, i get no more ambiguity warnings with this hack. So this is the real bad guy. I will make a PR to fix this. We can make semicolons mandatory and be fast. If we want to make them optional in some way, we should do it in a way that is not a performance killer, or we don't do it. This isn't groovy.

diff --git a/modules/lang-painless/src/main/java/org/elasticsearch/painless/antlr/Walker.java b/modules/lang-painless/src/main/java/org/elasticsearch/painless/antlr/Walker.java
index 4f6e2f5..c776245 100644
--- a/modules/lang-painless/src/main/java/org/elasticsearch/painless/antlr/Walker.java
+++ b/modules/lang-painless/src/main/java/org/elasticsearch/painless/antlr/Walker.java
@@ -21,7 +21,9 @@ package org.elasticsearch.painless.antlr;

 import org.antlr.v4.runtime.ANTLRInputStream;
 import org.antlr.v4.runtime.CommonTokenStream;
+import org.antlr.v4.runtime.DiagnosticErrorListener;
 import org.antlr.v4.runtime.ParserRuleContext;
+import org.antlr.v4.runtime.atn.PredictionMode;
 import org.elasticsearch.painless.Operation;
 import org.elasticsearch.painless.Variables.Reserved;
 import org.elasticsearch.painless.antlr.PainlessParser.AfterthoughtContext;
@@ -139,8 +141,10 @@ public final class Walker extends PainlessParserBaseVisitor<ANode> {
         final PainlessParser parser = new PainlessParser(new CommonTokenStream(lexer));
         final ParserErrorStrategy strategy = new ParserErrorStrategy();

-        lexer.removeErrorListeners();
-        parser.removeErrorListeners();
+        //lexer.removeErrorListeners();
+        //parser.removeErrorListeners();
+        parser.addErrorListener(new DiagnosticErrorListener());
+        parser.getInterpreter().setPredictionMode(PredictionMode.LL_EXACT_AMBIG_DETECTION);
         parser.setErrorHandler(strategy);

         return parser.source();
@rmuir

This comment has been minimized.

Copy link
Contributor

commented May 17, 2016

Thanks for reporting this @uschindler !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.