Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow painless to implement more interfaces #22983

Merged
merged 26 commits into from
Feb 21, 2017

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented Feb 5, 2017

Generalizes two previously hard coded things in painless into
generic concepts:

  1. The "main method" is no longer hardcoded to:
public abstract Object execute(Map<String, Object> params,
        Scorer scorer, LeafDocLookup doc, Object value);

Instead Painless's compiler takes an and implements that. It looks like:

public interface SomeScript {
    // Argument names we expose to Painless scripts
    String[] ARGUMENTS = new String[] {"a", "b"};
    // Method implemented by Painless script. Must be named execute but can have any parameters or return any value.
    Object execute(String a, int b);
    // Is the "a" argument used by the script?
    boolean uses$a();
}
SomeScript script = scriptEngine.compile(SomeScript.class, null, "the_script_here", emptyMap());
Object result = script.execute("a", 1);

PainlessScriptEngine now compiles all scripts to the new
GenericElasticsearchScript interface by default for compatibility
with the rest of Elasticsearch until it is able to use this new
ability.

  1. _score and ctx are no longer hardcoded to be extracted from
    #score and params respectively. Instead Painless's default
    implementation of Elasticsearch scripts uses the uses$_score and
    uses$ctx methods to determine if it is used and gives them
    dummy values if they are not used.

  2. Throwing the ScriptException is now handled by the Painless
    script itself. That way Painless doesn't have to leak the metadata
    that is required to build the fancy stack trace. And all painless scripts
    get the fancy stack trace.

Generalizes two previously hard coded things in painless into
generic concepts:

1. The "main method" is no longer hardcoded to:
```
public abstract Object execute(Map<String, Object> params,
        Scorer scorer, LeafDocLookup doc, Object value);
```
Instead Painless's compiler takes a functional interface and
implements that. It looks like:
```
@FunctionalInterface
public interface NoArgs {
    Object test();
}
NoArgs script = scriptEngine.compile(NoArgs.class, null, "the_script_here", emptyMap());
```

`PainlessScriptEngine` now compiles all scripts to the new
`GenericElasticsearchScript` interface by default for compatibility
with the rest of Elasticsearch until it is able to use this new
ability.

2. `_score` and `ctx` are no longer hardcoded to be extracted from
`#score` and `params` respectively. Instead the compiler has the
concept of a `DerivedArgument` which is an argument to the script
that is derived from the others, immutable, and only declared and
set if used. Using it looks like:
```
ManyArgs script = scriptEngine.compile(ManyArgs.class, null, "a2", emptyMap(),
        new DerivedArgument(Definition.INT_TYPE, "a2", (writer, locals) -> {
            // final int a2 = 2 * a;
            Variable a = locals.getVariable(null, "a");
            Variable a2 = locals.getVariable(null, "a2");

            writer.push(2);
            writer.visitVarInsn(Opcodes.ILOAD, a.getSlot());
            writer.math(GeneratorAdapter.MUL, Definition.INT_TYPE.type);
            writer.visitVarInsn(Opcodes.ISTORE, a2.getSlot());
        }));
assertEquals(2, script.test(1, 0, 0, 0));
```

These are obviously fairly advanced to create but they allow us to
remove hard coding elasticsearch specifics into Painless's interals.
And they give us a clear path away from `GenericElasticsearchScript`
when Elasticsearch is ready to be done with that.
@nik9000
Copy link
Member Author

nik9000 commented Feb 5, 2017

I believe this should help with the "script context" concept for Elasticsearch. It is, at least, how I was envisioning it. Ulterior motive: one day I'd like to be able to use Painless as an embedded oriented language outside of Elasticsearch. One day.

* The whitelisted type of the parameter. The default if unspecified is to look for an exact match for the type from the whitelist. If
* there is no exact match then the compilation will fail.
*/
String type() default "$infer$";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm torn here. Supporting $infer$ can be a little weird but it saves quite a bit of thought when you have an interface with a whitelisted type - especially for something like int.

@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
public @interface Arg {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to make this an annotation so it was right next to that parameter on the method in the functional interface. I feel like it makes the interfaces much more readable. I chose a short name because you type it a lot when you have many arguments. I can certainly be convinced to go with a more obvious name like PainlessArg or something.

@@ -43,7 +44,7 @@
/**
* The maximum number of characters allowed in the script source.
*/
static int MAXIMUM_SOURCE_LENGTH = 16384;
static final int MAXIMUM_SOURCE_LENGTH = 16384;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was just an oversight so I fixed it while I was here.


try {
Class<? extends Executable> clazz = loader.define(CLASS_NAME, root.getBytes());
java.lang.reflect.Constructor<? extends Executable> constructor =
clazz.getConstructor(String.class, String.class, BitSet.class);

return constructor.newInstance(name, source, root.getStatements());
return iface.cast(constructor.newInstance(name, source, root.getStatements()));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is super convenient to just return the thing of the provided type.


DerivedArgument[] DERIVED_ARGUMENTS = new DerivedArgument[] {
new DerivedArgument(Definition.DOUBLE_TYPE, "_score", (writer, locals) -> {
// If _score is used in the script then run this before any user code:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just lifted from SSource. The only change is that the string are no longer constants in Locals because this isn't really part of Painless's core anymore.

return compile(GenericElasticsearchScript.class, scriptName, scriptSource, params, GenericElasticsearchScript.DERIVED_ARGUMENTS);
}

<T> T compile(Class<T> iface, String scriptName, final String scriptSource, final Map<String, String> params,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Package private because I want to test with this. At some point I think we should yank some of this code into another spot that isn't Elasticsearch specific.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that can totally wait.

return new Walker(sourceName, sourceText, settings, debugStream).source;
public static SSource buildPainlessTree(MainMethod mainMethod, String sourceName, String sourceText, CompilerSettings settings,
Printer debugStream) {
return new Walker(mainMethod, sourceName, sourceText, settings, debugStream).source;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The walker needs to know stuff about the main method so it can figure out the variables in scope.

return Locals.MAIN_KEYWORDS.contains(name);
}

public boolean usesScore() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now part of usedDerivedArguments.


void setMaxLoopCounter(int max);
int getMaxLoopCounter();
}

public static final class MainMethodReserved implements Reserved {
private boolean score = false;
private boolean ctx = false;
private final Map<String, DerivedArgument> usedDerivedArguments = new LinkedHashMap<>();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we might change the order of the derived argument now - it used to be they'd be in the declared order. Now they are in the order of first used (as seen by walking the AST).

if (reserved.usesScore()) {
// if the _score value is used, we do this once:
// final double _score = scorer.score();
Variable scorer = mainMethod.getVariable(null, Locals.SCORER);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to GenericElasticsearchScript.

@nik9000 nik9000 requested a review from jdconrad February 5, 2017 15:35
@nik9000
Copy link
Member Author

nik9000 commented Feb 5, 2017

Hmmm.... I'm not really sure how either the annotation or DerivedArguments would work in a world where Elasticsearch has a bunch of these interfaces - I guess the Elasticsearch Painless plug would have companion interfaces and adapt them to the Elasticsearch ones. I'm not sure! But it feels to me like this is a start at least.

@nik9000
Copy link
Member Author

nik9000 commented Feb 7, 2017

@jdconrad it looks like this is a good place to read about getting the parameter names. It looks like they aren't stored by default but you can have javac store them. I'm not sure what we should do there.

Instead of implementing fancy derived arguments which require some
asm knowledge, the script now has `getUsedVariables` which you can
use to skip expensive computations required to build arguments to
the painless script.
…ipts

Also, it doesn't have the execute method on it any more....
* Removes the PainlessScript interface and use an abstract base
class instead.
* Remove the NeedsScript marker interface. Now that we have
`getUsedVariables()` we don't need it.
@nik9000
Copy link
Member Author

nik9000 commented Feb 8, 2017

@jdconrad - I had another go at this this morning. I removed the need for DerivedArgument and the type parameter on the annotation and the NeedsScore marker interface. All by adding a getUsedVariables method to the metadata that returns the variables used in the main method. Now you can find out if a parameter is used just by checking if it is in the Set<String> returned by that method. This is way simpler than I had it before.

1 similar comment
@nik9000
Copy link
Member Author

nik9000 commented Feb 8, 2017

@jdconrad - I had another go at this this morning. I removed the need for DerivedArgument and the type parameter on the annotation and the NeedsScore marker interface. All by adding a getUsedVariables method to the metadata that returns the variables used in the main method. Now you can find out if a parameter is used just by checking if it is in the Set<String> returned by that method. This is way simpler than I had it before.

@nik9000
Copy link
Member Author

nik9000 commented Feb 8, 2017

Notes from talking to @jdconrad and @rjernst:

  1. We would like to drop the @FunctionalInterface requirement and instead make the interface look like:
interface {
  static final String[] PARAMS = new String[] {"a", "b"};
  int execute(int a, int b);
  boolean uses$a();
}
  1. The execute method name is special. It is the only method that Painless will accept as the main method.
  2. The parameter names to the execute method are in the PARAMS array.
  3. The uses$a like methods are for finding out if an argument is used.
  4. No other methods are allowed on the interface.
  5. We'd like to whitelist LeafDocLookup and move the generics cleanups on it outside of this PR.

Now we use the uses$argName methods to get that information.
That means we no longer have to make compilation metadata
available outside of Painless at all.
@nik9000
Copy link
Member Author

nik9000 commented Feb 15, 2017

We'd like to whitelist LeafDocLookup and move the generics cleanups on it outside of this PR.

This is actually failure complicated to get right and I'd kind of like to delay that until the next PR....

@nik9000
Copy link
Member Author

nik9000 commented Feb 15, 2017

@rjernst and @jdconrad, I think this is ready for another look. I did almost all the things we talked about. I skipped whitelisting LeafDocLookup because it turns out that is a little more difficult than it ought to be to get right and I think it should wait for the next PR.

I also moved the exception handling from ScriptImpl into the generated script. I did that because it felt more right and because it allowed me to make all of the metadata about statements and stuff private.

@jdconrad
Copy link
Contributor

I'll try to take a look today.

@nik9000
Copy link
Member Author

nik9000 commented Feb 17, 2017

I'll try to take a look today.

Thanks!

I realized while I was out today that this still isn't fixed for void execute methods. It also doesn't optimize the primitive returns. Would those be ok in a followup?

@nik9000
Copy link
Member Author

nik9000 commented Feb 17, 2017

Updated the PR description to be accurate for what the PR now implements. I'm keeping the (inaccurate) name for now though.

Why does this work, actually?
@jdconrad
Copy link
Contributor

@nik9000 Just want you to know that I'm not ignoring this review. I just want some more time to think about it again because it is pretty fundamental, and I don't want to deal with any backcompat issues down the road here.

@nik9000
Copy link
Member Author

nik9000 commented Feb 17, 2017

@nik9000 Just want you to know that I'm not ignoring this review. I just want some more time to think about it again because it is pretty fundamental, and I don't want to deal with any backcompat issues down the road here.

++

Fine by me. I found something else to have a look at in the mean time. I think it is safe. Whether or not it is what we want to do is another matter. I'm certainly willing to rework as needed.

Copy link
Contributor

@jdconrad jdconrad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left a few comments.

/**
* Marker interface that a generated {@link Executable} uses the {@code _score} value
* Generic script interface that all scripts that elasticsearch uses implement.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this comment be modified until this class is moved out to scripting in ES?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment is accurate until we get contexts. Maybe it should read "Generic script interface that Painless implements for all Elasticsearch scripts." or something like that. One day we'll be able to remove this entirely. I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems good just for now at least.

/**
* Metadata about the script.
*/
public static class ScriptMetadata {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave this like how it was before without this extra internal class. It's not actually used in any new places outside of Painless from what I can tell (maybe I missed something), so these things are literally just part of the PainlessScript still.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move it back. When I had this implementing arbitrarily named methods it made sense to try and minimize the number of methods on the superclass.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@@ -324,6 +315,38 @@ void write(MethodWriter writer, Globals globals) {
writer.visitInsn(Opcodes.ACONST_NULL);
writer.returnValue();
}

writer.mark(endTry);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason for making this part of the ASM because it makes it much more difficult to read about the types of exceptions we catch, especially if anyone not familiar with this project has to do some weekend/on-call debugging? As far as I can tell this was simply cut from the previous location it was at. However, if the reason for this is because we want all implementations to use this, I don't think this is the correct approach. I think we want to this to remain on a case by case basis especially if this projects ends up outside of ES.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did move it out for the reason you mentioned. The error messages become very difficult to make sense of without the exception interpretation. I agree that it is unfortunate that this references a class in Elasticsearch core, but I think that is a thing we can fix later.

The other thing I like about this is that it makes the internals required to generate the right exception not publicly facing.

I'm happy to back it out propose it separately. Without it the exceptions that don't go through ScriptImpl are wonky though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely conflicted here; you make good arguments. I would just leave this as you have it for now then and it's something we will have to address when we split off down the road. Maybe users pick the exceptions that should be caught as part of a script, but that's a discussion for a different PR for sure.

@nik9000 nik9000 merged commit 9105672 into elastic:master Feb 21, 2017
@nik9000
Copy link
Member Author

nik9000 commented Feb 21, 2017

Thanks for all your reviews @jdconrad! Thanks for brainstorming about this, @rjernst!

master: 9105672
5.4: 34cceff

nik9000 added a commit that referenced this pull request Feb 21, 2017
Generalizes three previously hard coded things in painless into
generic concepts:

1. The "main method" is no longer hardcoded to:
```
public abstract Object execute(Map<String, Object> params,
        Scorer scorer, LeafDocLookup doc, Object value);
```
Instead Painless's compiler takes an interface and implements it. It looks like:
```
public interface SomeScript {
    // Argument names we expose to Painless scripts
    String[] ARGUMENTS = new String[] {"a", "b"};
    // Method implemented by Painless script. Must be named execute but can have any parameters or return any value.
    Object execute(String a, int b);
    // Is the "a" argument used by the script?
    boolean uses$a();
}
SomeScript script = scriptEngine.compile(SomeScript.class, null, "the_script_here", emptyMap());
Object result = script.execute("a", 1);
```

`PainlessScriptEngine` now compiles all scripts to the new
`GenericElasticsearchScript` interface by default for compatibility
with the rest of Elasticsearch until it is able to use this new
ability.

2. `_score` and `ctx` are no longer hardcoded to be extracted from
`#score` and `params` respectively. Instead Painless's default
implementation of Elasticsearch scripts uses the `uses$_score` and
`uses$ctx` methods to determine if it is used and gives them
dummy values if they are not used.

3. Throwing the `ScriptException` is now handled by the Painless
script itself. That way Painless doesn't have to leak the metadata
that is required to build the fancy stack trace. And all painless scripts
get the fancy stack trace.
@nik9000 nik9000 changed the title Allow painless to implement arbitrary interfaces Allow painless to implement more interfaces Feb 21, 2017
nik9000 added a commit that referenced this pull request Feb 21, 2017
)

Fixes Painless to properly implement scripts that return primitives
and void. Adds some simple tests that we emit sane opcodes and some
other tests that we implement primitives as expected.

Mostly this is just a fix following up from #22983 but there is one
thing I did really worth talking about, I think. So, before this script
Painless scripts could only ever return Object and they did would always
return null for paths that didn't return any values. Now that they
can return primitives the question is "what should Painless return
from paths that don't return any values?" And I answered that with
"whatever the JLS default value is". So 0/0L/0f/0d/false.
nik9000 added a commit that referenced this pull request Feb 21, 2017
)

Fixes Painless to properly implement scripts that return primitives
and void. Adds some simple tests that we emit sane opcodes and some
other tests that we implement primitives as expected.

Mostly this is just a fix following up from #22983 but there is one
thing I did really worth talking about, I think. So, before this script
Painless scripts could only ever return Object and they did would always
return null for paths that didn't return any values. Now that they
can return primitives the question is "what should Painless return
from paths that don't return any values?" And I answered that with
"whatever the JLS default value is". So 0/0L/0f/0d/false.
@clintongormley clintongormley added :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache and removed :Plugin Lang Painless labels Feb 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement v5.4.0 v6.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants