Allow painless to implement more interfaces #22983

nik9000 · 2017-02-05T15:18:20Z

Generalizes three previously hard coded things in painless into
generic concepts:

The "main method" is no longer hardcoded to:

public abstract Object execute(Map<String, Object> params,
        Scorer scorer, LeafDocLookup doc, Object value);

Instead Painless's compiler takes an interface and implements it. It looks like:

public interface SomeScript {
    // Argument names we expose to Painless scripts
    String[] ARGUMENTS = new String[] {"a", "b"};
    // Method implemented by Painless script. Must be named execute but can have any parameters or return any value.
    Object execute(String a, int b);
    // Is the "a" argument used by the script?
    boolean uses$a();
}
SomeScript script = scriptEngine.compile(SomeScript.class, null, "the_script_here", emptyMap());
Object result = script.execute("a", 1);

PainlessScriptEngine now compiles all scripts to the new
GenericElasticsearchScript interface by default for compatibility
with the rest of Elasticsearch until it is able to use this new
ability.

_score and ctx are no longer hardcoded to be extracted from
#score and params respectively. Instead Painless's default
implementation of Elasticsearch scripts uses the uses$_score and
uses$ctx methods to determine whether they are used and gives them
dummy values if they are not.

Throwing the ScriptException is now handled by the Painless
script itself. That way Painless doesn't have to leak the metadata
that is required to build the fancy stack trace. And all painless scripts
get the fancy stack trace.
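To make the `uses$a()` idea concrete, here is a minimal sketch of how a caller might skip building an expensive argument when the script never reads it. Everything in it is an assumption made for illustration: `IgnoresA` is a hand-written stand-in for the class Painless would generate, and `UsesDemo`, `expensiveA`, and `run` are hypothetical names that do not come from this PR.

```java
class UsesDemo {
    // Counts how often the costly argument was actually built.
    static int expensiveCalls = 0;

    // Pretend this is costly to compute, like a document score.
    static String expensiveA() {
        expensiveCalls++;
        return "expensive";
    }

    // Only build "a" when the script reads it; otherwise pass a dummy value.
    static Object run(SomeScript script, int b) {
        String a = script.uses$a() ? expensiveA() : null;
        return script.execute(a, b);
    }

    public static void main(String[] args) {
        System.out.println(run(new IgnoresA(), 1));
        System.out.println(expensiveCalls);
    }
}

// Script interface in the shape described above.
interface SomeScript {
    String[] ARGUMENTS = new String[] {"a", "b"};
    Object execute(String a, int b);
    boolean uses$a();
}

// Hand-written stand-in (an assumption) for the class generated from the
// script source "b + 1". It never reads "a", so uses$a() returns false.
class IgnoresA implements SomeScript {
    @Override
    public Object execute(String a, int b) {
        return b + 1;
    }

    @Override
    public boolean uses$a() {
        return false;
    }
}
```

The caller never needs compilation metadata: the generated class answers the "is it used?" question itself.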

Generalizes two previously hard coded things in painless into generic concepts:

1. The "main method" is no longer hardcoded to:

```
public abstract Object execute(Map<String, Object> params,
        Scorer scorer, LeafDocLookup doc, Object value);
```

Instead Painless's compiler takes a functional interface and implements that. It looks like:

```
@FunctionalInterface
public interface NoArgs {
    Object test();
}
NoArgs script = scriptEngine.compile(NoArgs.class, null, "the_script_here", emptyMap());
```

`PainlessScriptEngine` now compiles all scripts to the new `GenericElasticsearchScript` interface by default for compatibility with the rest of Elasticsearch until it is able to use this new ability.

2. `_score` and `ctx` are no longer hardcoded to be extracted from `#score` and `params` respectively. Instead the compiler has the concept of a `DerivedArgument` which is an argument to the script that is derived from the others, immutable, and only declared and set if used. Using it looks like:

```
ManyArgs script = scriptEngine.compile(ManyArgs.class, null, "a2", emptyMap(),
    new DerivedArgument(Definition.INT_TYPE, "a2", (writer, locals) -> {
        // final int a2 = 2 * a;
        Variable a = locals.getVariable(null, "a");
        Variable a2 = locals.getVariable(null, "a2");
        writer.push(2);
        writer.visitVarInsn(Opcodes.ILOAD, a.getSlot());
        writer.math(GeneratorAdapter.MUL, Definition.INT_TYPE.type);
        writer.visitVarInsn(Opcodes.ISTORE, a2.getSlot());
    }));
assertEquals(2, script.test(1, 0, 0, 0));
```

These are obviously fairly advanced to create but they allow us to remove hard coding elasticsearch specifics into Painless's internals. And they give us a clear path away from `GenericElasticsearchScript` when Elasticsearch is ready to be done with that.

nik9000 · 2017-02-05T15:19:36Z

I believe this should help with the "script context" concept for Elasticsearch. It is, at least, how I was envisioning it. Ulterior motive: one day I'd like to be able to use Painless as an embedded oriented language outside of Elasticsearch. One day.

nik9000 · 2017-02-05T15:21:09Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/Arg.java

+     * The whitelisted type of the parameter. The default if unspecified is to look for an exact match for the type from the whitelist. If
+     * there is no exact match then the compilation will fail.
+     */
+    String type() default "$infer$";


I'm torn here. Supporting $infer$ can be a little weird but it saves quite a bit of thought when you have an interface with a whitelisted type - especially for something like int.

nik9000 · 2017-02-05T15:22:24Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/Arg.java

+@Documented
+@Retention(RetentionPolicy.RUNTIME)
+@Target(ElementType.PARAMETER)
+public @interface Arg {


I wanted to make this an annotation so it was right next to that parameter on the method in the functional interface. I feel like it makes the interfaces much more readable. I chose a short name because you type it a lot when you have many arguments. I can certainly be convinced to go with a more obvious name like PainlessArg or something.

nik9000 · 2017-02-05T15:22:39Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/Compiler.java

@@ -43,7 +44,7 @@
    /**
     * The maximum number of characters allowed in the script source.
     */
-    static int MAXIMUM_SOURCE_LENGTH = 16384;
+    static final int MAXIMUM_SOURCE_LENGTH = 16384;


I think this was just an oversight so I fixed it while I was here.

nik9000 · 2017-02-05T15:25:50Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/Compiler.java


        try {
            Class<? extends Executable> clazz = loader.define(CLASS_NAME, root.getBytes());
            java.lang.reflect.Constructor<? extends Executable> constructor =
                    clazz.getConstructor(String.class, String.class, BitSet.class);

-            return constructor.newInstance(name, source, root.getStatements());
+            return iface.cast(constructor.newInstance(name, source, root.getStatements()));


It is super convenient to just return the thing of the provided type.
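As an illustration of why `iface.cast` is convenient here, the sketch below instantiates a class reflectively and hands it back as the caller's requested type without an unchecked cast. `Greeter`, `GeneratedGreeter`, and `instantiate` are hypothetical names for this example only; the real compiler defines the class from generated bytecode and uses a different constructor signature.

```java
import java.lang.reflect.Constructor;

class CastDemo {
    // Hypothetical script interface a caller might ask for.
    interface Greeter {
        String greet(String name);
    }

    // Stand-in for a generated class; the real one is defined from bytecode.
    public static class GeneratedGreeter implements Greeter {
        private final String prefix;

        public GeneratedGreeter(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public String greet(String name) {
            return prefix + name;
        }
    }

    // Instantiate reflectively and return the instance as the caller's type.
    // Class.cast throws ClassCastException if clazz does not implement iface.
    static <T> T instantiate(Class<T> iface, Class<?> clazz, String prefix) {
        try {
            Constructor<?> constructor = clazz.getConstructor(String.class);
            return iface.cast(constructor.newInstance(prefix));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not instantiate " + clazz.getName(), e);
        }
    }

    public static void main(String[] args) {
        Greeter greeter = instantiate(Greeter.class, GeneratedGreeter.class, "hello ");
        System.out.println(greeter.greet("painless"));
    }
}
```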

nik9000 · 2017-02-05T15:26:37Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/GenericElasticsearchScript.java

+
+    DerivedArgument[] DERIVED_ARGUMENTS = new DerivedArgument[] {
+        new DerivedArgument(Definition.DOUBLE_TYPE, "_score", (writer, locals) -> {
+            // If _score is used in the script then run this before any user code:


These are just lifted from SSource. The only change is that the strings are no longer constants in Locals because this isn't really part of Painless's core anymore.

nik9000 · 2017-02-05T15:30:13Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/PainlessScriptEngineService.java

+        return compile(GenericElasticsearchScript.class, scriptName, scriptSource, params, GenericElasticsearchScript.DERIVED_ARGUMENTS);
+    }
+
+    <T> T compile(Class<T> iface, String scriptName, final String scriptSource, final Map<String, String> params,


Package private because I want to test with this. At some point I think we should yank some of this code into another spot that isn't Elasticsearch specific.

But that can totally wait.

nik9000 · 2017-02-05T15:31:12Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/antlr/Walker.java

-        return new Walker(sourceName, sourceText, settings, debugStream).source;
+    public static SSource buildPainlessTree(MainMethod mainMethod, String sourceName, String sourceText, CompilerSettings settings,
+            Printer debugStream) {
+        return new Walker(mainMethod, sourceName, sourceText, settings, debugStream).source;


The walker needs to know stuff about the main method so it can figure out the variables in scope.

nik9000 · 2017-02-05T15:31:59Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/node/SSource.java

-            return Locals.MAIN_KEYWORDS.contains(name);
-        }
-
-        public boolean usesScore() {


Now part of usedDerivedArguments.

nik9000 · 2017-02-05T15:32:31Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/node/SSource.java


        void setMaxLoopCounter(int max);
        int getMaxLoopCounter();
    }

    public static final class MainMethodReserved implements Reserved {
-        private boolean score = false;
-        private boolean ctx = false;
+        private final Map<String, DerivedArgument> usedDerivedArguments = new LinkedHashMap<>();


Note that we might change the order of the derived arguments now - it used to be they'd be in the declared order. Now they are in the order of first use (as seen by walking the AST).
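The ordering change follows from `LinkedHashMap` keeping keys in insertion order. A small sketch, with the hypothetical helper `firstUseOrder` standing in for the AST walk, shows that a name is recorded the first time it is seen and later uses do not move it:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FirstUseOrderDemo {
    // Record each name the first time it is seen; later uses do not move it.
    static List<String> firstUseOrder(List<String> usesInWalkOrder) {
        Map<String, Boolean> used = new LinkedHashMap<>();
        for (String name : usesInWalkOrder) {
            used.putIfAbsent(name, Boolean.TRUE);
        }
        return new ArrayList<>(used.keySet());
    }

    public static void main(String[] args) {
        // Declared order would be [_score, ctx]; this script touches ctx first.
        System.out.println(firstUseOrder(Arrays.asList("ctx", "_score", "ctx")));
    }
}
```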

nik9000 · 2017-02-05T15:34:19Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/node/SSource.java

-        if (reserved.usesScore()) {
-            // if the _score value is used, we do this once:
-            // final double _score = scorer.score();
-            Variable scorer = mainMethod.getVariable(null, Locals.SCORER);


Move to GenericElasticsearchScript.

nik9000 · 2017-02-05T15:45:58Z

Hmmm.... I'm not really sure how either the annotation or DerivedArguments would work in a world where Elasticsearch has a bunch of these interfaces - I guess the Elasticsearch Painless plugin would have companion interfaces and adapt them to the Elasticsearch ones. I'm not sure! But it feels to me like this is a start at least.

nik9000 · 2017-02-07T23:00:18Z

@jdconrad it looks like this is a good place to read about getting the parameter names. It looks like they aren't stored by default but you can have javac store them. I'm not sure what we should do there.


nik9000 · 2017-02-08T17:45:09Z

@jdconrad - I had another go at this this morning. I removed the need for DerivedArgument and the type parameter on the annotation and the NeedsScore marker interface. All by adding a getUsedVariables method to the metadata that returns the variables used in the main method. Now you can find out if a parameter is used just by checking if it is in the Set<String> returned by that method. This is way simpler than I had it before.

nik9000 · 2017-02-08T17:53:54Z

@jdconrad - I had another go at this this morning. I removed the need for DerivedArgument and the type parameter on the annotation and the NeedsScore marker interface. All by adding a getUsedVariables method to the metadata that returns the variables used in the main method. Now you can find out if a parameter is used just by checking if it is in the Set<String> returned by that method. This is way simpler than I had it before.

nik9000 · 2017-02-08T22:25:23Z

Notes from talking to @jdconrad and @rjernst:

We would like to drop the @FunctionalInterface requirement and instead make the interface look like:

interface SomeScript {
  static final String[] PARAMS = new String[] {"a", "b"};
  int execute(int a, int b);
  boolean uses$a();
}

The execute method name is special. It is the only method that Painless will accept as the main method.
The parameter names to the execute method are in the PARAMS array.
The uses$a like methods are for finding out if an argument is used.
No other methods are allowed on the interface.
We'd like to whitelist LeafDocLookup and move the generics cleanups on it outside of this PR.
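The rules in these notes can be checked with plain reflection. The sketch below is only an assumption about how such a check could look, not the validation Painless actually performs; `check`, `Adder`, and `Broken` are hypothetical names.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class InterfaceShapeDemo {
    // Example interface in the shape from the notes above.
    interface Adder {
        String[] PARAMS = new String[] {"a", "b"};
        int execute(int a, int b);
        boolean uses$a();
    }

    // Interface that breaks the rules: it declares an extra method.
    interface Broken {
        String[] PARAMS = new String[] {"a"};
        int execute(int a);
        void somethingElse();
    }

    // Returns null when the interface follows the rules, else a description of a problem.
    static String check(Class<?> iface) {
        String[] params;
        try {
            params = (String[]) iface.getField("PARAMS").get(null);
        } catch (ReflectiveOperationException | ClassCastException e) {
            return "missing String[] PARAMS constant";
        }
        Set<String> names = new HashSet<>(Arrays.asList(params));
        int executeCount = 0;
        for (Method method : iface.getMethods()) {
            String name = method.getName();
            if (name.equals("execute")) {
                // The main method: one name in PARAMS per parameter.
                executeCount++;
                if (method.getParameterCount() != params.length) {
                    return "execute takes " + method.getParameterCount() + " parameters but PARAMS names " + params.length;
                }
            } else if (name.startsWith("uses$")) {
                // uses$name must be a no-arg boolean method for a known parameter.
                if (method.getReturnType() != boolean.class || method.getParameterCount() != 0) {
                    return name + " must take no parameters and return boolean";
                }
                if (!names.contains(name.substring("uses$".length()))) {
                    return name + " does not match any name in PARAMS";
                }
            } else {
                return "unexpected method " + name;
            }
        }
        return executeCount == 1 ? null : "expected exactly one execute method but found " + executeCount;
    }

    public static void main(String[] args) {
        System.out.println(check(Adder.class));
        System.out.println(check(Broken.class));
    }
}
```

Parameter names are read from the `PARAMS` constant rather than from the method itself because javac does not store them unless asked to.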


nik9000 · 2017-02-15T22:55:37Z

> We'd like to whitelist LeafDocLookup and move the generics cleanups on it outside of this PR.

This is actually fairly complicated to get right and I'd kind of like to delay that until the next PR....

nik9000 · 2017-02-15T23:06:45Z

@rjernst and @jdconrad, I think this is ready for another look. I did almost all the things we talked about. I skipped whitelisting LeafDocLookup because it turns out that is a little more difficult than it ought to be to get right and I think it should wait for the next PR.

I also moved the exception handling from ScriptImpl into the generated script. I did that because it felt more right and because it allowed me to make all of the metadata about statements and stuff private.
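A rough sketch of what "exception handling in the generated script" could look like if the generated class were written by hand. All names are assumptions for illustration: `ScriptFailure` stands in for Elasticsearch's `ScriptException`, and `BaseScript`, `GeneratedDivider`, and `describeFailure` are hypothetical.

```java
class ExceptionWrappingDemo {
    // Hypothetical stand-in for Elasticsearch's ScriptException.
    static class ScriptFailure extends RuntimeException {
        ScriptFailure(String scriptName, Throwable cause) {
            super("error in script [" + scriptName + "]: " + cause.getMessage(), cause);
        }
    }

    interface Divider {
        int execute(int a, int b);
    }

    // The base class keeps what is needed to build the error private to the script.
    abstract static class BaseScript {
        private final String name;

        BaseScript(String name) {
            this.name = name;
        }

        protected ScriptFailure convert(Throwable t) {
            return new ScriptFailure(name, t);
        }
    }

    // Hand-written equivalent of a generated script for the source "a / b".
    static class GeneratedDivider extends BaseScript implements Divider {
        GeneratedDivider(String name) {
            super(name);
        }

        @Override
        public int execute(int a, int b) {
            try {
                return a / b;
            } catch (RuntimeException e) {
                // The script body itself translates the failure.
                throw convert(e);
            }
        }
    }

    // Caller that only ever sees the translated exception type.
    static String describeFailure(Divider script, int a, int b) {
        try {
            return String.valueOf(script.execute(a, b));
        } catch (ScriptFailure e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        Divider script = new GeneratedDivider("div");
        System.out.println(describeFailure(script, 6, 3));
        System.out.println(describeFailure(script, 1, 0));
    }
}
```

Because the try/catch lives inside `execute`, every interface the script implements gets the translated error, and nothing outside the script needs access to `name` or any other metadata.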

jdconrad · 2017-02-16T16:58:08Z

I'll try to take a look today.

nik9000 · 2017-02-17T00:16:13Z

> I'll try to take a look today.

Thanks!

I realized while I was out today that this still isn't fixed for void execute methods. It also doesn't optimize the primitive returns. Would those be ok in a followup?

nik9000 · 2017-02-17T13:27:13Z

Updated the PR description to be accurate for what the PR now implements. I'm keeping the (inaccurate) name for now though.


jdconrad · 2017-02-17T16:49:49Z

@nik9000 Just want you to know that I'm not ignoring this review. I just want some more time to think about it again because it is pretty fundamental, and I don't want to deal with any backcompat issues down the road here.

nik9000 · 2017-02-17T16:56:43Z

> @nik9000 Just want you to know that I'm not ignoring this review. I just want some more time to think about it again because it is pretty fundamental, and I don't want to deal with any backcompat issues down the road here.

++

Fine by me. I found something else to have a look at in the meantime. I think it is safe. Whether or not it is what we want to do is another matter. I'm certainly willing to rework as needed.

jdconrad

LGTM. Left a few comments.

jdconrad · 2017-02-21T16:27:13Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/GenericElasticsearchScript.java

 /**
- * Marker interface that a generated {@link Executable} uses the {@code _score} value
+ * Generic script interface that all scripts that elasticsearch uses implement.


Should this comment be modified until this class is moved out to scripting in ES?

I think the comment is accurate until we get contexts. Maybe it should read "Generic script interface that Painless implements for all Elasticsearch scripts." or something like that. One day we'll be able to remove this entirely. I think.

That seems good just for now at least.

jdconrad · 2017-02-21T16:38:10Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/PainlessScript.java

+    /**
+     * Metadata about the script.
+     */
+    public static class ScriptMetadata {


I would leave this like how it was before without this extra internal class. It's not actually used in any new places outside of Painless from what I can tell (maybe I missed something), so these things are literally just part of the PainlessScript still.

I'll move it back. When I had this implementing arbitrarily named methods it made sense to try and minimize the number of methods on the superclass.

jdconrad · 2017-02-21T16:54:50Z

modules/lang-painless/src/main/java/org/elasticsearch/painless/node/SSource.java

@@ -324,6 +315,38 @@ void write(MethodWriter writer, Globals globals) {
            writer.visitInsn(Opcodes.ACONST_NULL);
            writer.returnValue();
        }
+
+        writer.mark(endTry);


Is there a specific reason for making this part of the ASM? It makes it much more difficult to read about the types of exceptions we catch, especially if anyone not familiar with this project has to do some weekend/on-call debugging. As far as I can tell this was simply cut from the previous location it was at. However, if the reason for this is because we want all implementations to use this, I don't think this is the correct approach. I think we want this to remain on a case by case basis, especially if this project ends up outside of ES.

I did move it out for the reason you mentioned. The error messages become very difficult to make sense of without the exception interpretation. I agree that it is unfortunate that this references a class in Elasticsearch core, but I think that is a thing we can fix later.

The other thing I like about this is that it makes the internals required to generate the right exception not publicly facing.

I'm happy to back it out and propose it separately. Without it the exceptions that don't go through ScriptImpl are wonky though.

I'm definitely conflicted here; you make good arguments. I would just leave this as you have it for now then and it's something we will have to address when we split off down the road. Maybe users pick the exceptions that should be caught as part of a script, but that's a discussion for a different PR for sure.

nik9000 · 2017-02-21T19:09:53Z

Thanks for all your reviews @jdconrad! Thanks for brainstorming about this, @rjernst!

master: 9105672
5.4: 34cceff

Generalizes three previously hard coded things in painless into generic concepts:

1. The "main method" is no longer hardcoded to:

```
public abstract Object execute(Map<String, Object> params,
        Scorer scorer, LeafDocLookup doc, Object value);
```

Instead Painless's compiler takes an interface and implements it. It looks like:

```
public interface SomeScript {
    // Argument names we expose to Painless scripts
    String[] ARGUMENTS = new String[] {"a", "b"};
    // Method implemented by Painless script. Must be named execute but can have any parameters or return any value.
    Object execute(String a, int b);
    // Is the "a" argument used by the script?
    boolean uses$a();
}
SomeScript script = scriptEngine.compile(SomeScript.class, null, "the_script_here", emptyMap());
Object result = script.execute("a", 1);
```

`PainlessScriptEngine` now compiles all scripts to the new `GenericElasticsearchScript` interface by default for compatibility with the rest of Elasticsearch until it is able to use this new ability.

2. `_score` and `ctx` are no longer hardcoded to be extracted from `#score` and `params` respectively. Instead Painless's default implementation of Elasticsearch scripts uses the `uses$_score` and `uses$ctx` methods to determine if it is used and gives them dummy values if they are not used.

3. Throwing the `ScriptException` is now handled by the Painless script itself. That way Painless doesn't have to leak the metadata that is required to build the fancy stack trace. And all painless scripts get the fancy stack trace.

Fixes Painless to properly implement scripts that return primitives and void. Adds some simple tests that we emit sane opcodes and some other tests that we implement primitives as expected.

Mostly this is just a fix following up from #22983 but there is one thing I did that is really worth talking about, I think. Before this change Painless scripts could only ever return Object and they would always return null for paths that didn't return any values. Now that they can return primitives the question is "what should Painless return from paths that don't return any values?" And I answered that with "whatever the JLS default value is". So 0/0L/0f/0d/false.
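For reference, the JLS default values can be observed directly from Java. This sketch relies on the fact that a freshly allocated array holds the default value of its element type; the helper name `defaultValue` is made up for the example and is unrelated to how Painless emits the return opcodes.

```java
import java.lang.reflect.Array;

class DefaultValuesDemo {
    // A freshly allocated array holds the default value for its element type,
    // so reading element 0 yields that default (boxed for primitives).
    static Object defaultValue(Class<?> type) {
        return Array.get(Array.newInstance(type, 1), 0);
    }

    public static void main(String[] args) {
        Class<?>[] types = {int.class, long.class, float.class, double.class, boolean.class, Object.class};
        for (Class<?> type : types) {
            System.out.println(type.getSimpleName() + " -> " + defaultValue(type));
        }
    }
}
```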

nik9000 added :Plugin Lang Painless >enhancement v5.3.0 v6.0.0-alpha1 labels Feb 5, 2017

nik9000 added 3 commits February 5, 2017 10:24

Javadoc and drop <T> when not needed

926ddc0

Remove execute constant

0f64e85

Javadoc

931f6bc


nik9000 requested a review from jdconrad February 5, 2017 15:35

clintongormley added v5.4.0 and removed v5.3.0 labels Feb 7, 2017

nik9000 added 7 commits February 7, 2017 18:12

Merge branch 'master' into painless_interfaces

9b1ebdd

Remove one usage of type

38cf24e

Replace DerivedArguments with getUsedVariables

a49779a

Instead of implementing fancy derived arguments which require some asm knowledge, the script now has `getUsedVariables` which you can use to skip expensive computations required to build arguments to the painless script.

Round out the PainlessScript interface

c10aefc

fix javadoc

58d11af

Rename Executable so it is more clear it is the base class of all scr…

a56bd9f

…ipts Also, it doesn't have the execute method on it any more....

Remove interfaces

f3b0865

* Removes the PainlessScript interface and use an abstract base class instead.
* Remove the NeedsScript marker interface. Now that we have `getUsedVariables()` we don't need it.

nik9000 added 5 commits February 15, 2017 12:57

Merge branch 'master' into painless_interfaces

e203a81

Start support for uses$varName

65d0a94

Implements uses$varName methods

aaf4f53

Remove used variables from metadata

5b2aa41

Now we use the uses$argName methods to get that information.

Move exception handling into generated script

5cb78aa

That means we no longer have to make compilation metadata available outside of Painless at all.

nik9000 added 3 commits February 15, 2017 17:26

Switch from annotation to constant

926f82a

Remove we don't need

b0a3b44

Remove extra

1d2ef7d

nik9000 added 3 commits February 15, 2017 17:57

Rename variable

962492b

Make method protected

8dd2c9c

No need to leak it.

Add misisng comment

69d451f

Add test for returning void

93f3da9

Why does this work, actually?

jdconrad approved these changes Feb 21, 2017


nik9000 added 3 commits February 21, 2017 13:37

Merge branch 'master' into painless_interfaces

5d6e49a

Make javadoc more clear

0080437

Remove ScriptMetdata

09b2c70

nik9000 merged commit 9105672 into elastic:master Feb 21, 2017

nik9000 changed the title ~~Allow painless to implement arbitrary interfaces~~ Allow painless to implement more interfaces Feb 21, 2017

nik9000 mentioned this pull request Feb 21, 2017

Fix Painless's implementation of interfaces returning primitives #23298

Merged

clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018
