ESQL: enhance SHOW FUNCTIONS command #99736

luigidellaquila · 2023-09-21T08:25:34Z

Enhance the SHOW FUNCTIONS command to return as much structured information as possible about the function signature, i.e.:

function name
return type
param names
param types
param descriptions

For now, as an example, the annotations are used only on sin() and date_parse() functions; if we agree on this approach, I'll proceed to

enhance all the currently implemented functions with the needed information
improve the function tests to verify that every new implemented function provides meaningful information

This feature can be useful for the end user, but the main goal is to give Kibana an easy way to produce in-line documentation (contextual messages, autocomplete) for functions

Similar to the current implementation, which has a @Named("paramName") annotation for function parameters, this PR introduces two more annotations, @Param(name, type, description, optional) and @FunctionInfo(), to provide information about individual parameters and functions.
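
A minimal sketch of how the two annotations could be declared and read back by reflection. The element types (for example `String[] type()`), the retention and target settings, the stand-in `Sin` class and the "An angle" parameter description are assumptions for illustration, not the PR's actual code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Parameter;

public class AnnotationSketch {

    // Hypothetical shape of the per-parameter annotation; the real element types may differ.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    public @interface Param {
        String name();
        String[] type();
        String description() default "";
        boolean optional() default false;
    }

    // Hypothetical shape of the per-function annotation, placed on the constructor.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.CONSTRUCTOR)
    public @interface FunctionInfo {
        String[] returnType();
        String description() default "";
    }

    // Stand-in for a function class; not the real Sin implementation.
    public static class Sin {
        @FunctionInfo(returnType = "double", description = "Returns the trigonometric sine of an angle")
        public Sin(@Param(name = "n", type = { "integer", "long", "double" }, description = "An angle") Object n) {}
    }

    // Reads the metadata from the class alone, without creating a function instance.
    public static String describe(Class<?> functionClass) {
        Constructor<?> constructor = functionClass.getConstructors()[0];
        FunctionInfo info = constructor.getAnnotation(FunctionInfo.class);
        StringBuilder sb = new StringBuilder();
        // Fall back to "?" when the function has not been annotated yet.
        sb.append(info == null ? "?" : String.join("|", info.returnType()));
        for (Parameter parameter : constructor.getParameters()) {
            Param param = parameter.getAnnotation(Param.class);
            if (param != null) {
                sb.append(' ').append(param.name()).append(':').append(String.join("|", param.type()));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(Sin.class));
    }
}
```

Runtime retention is what makes the annotations visible to reflection; with the default class-file retention, `getAnnotation` would return null.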

The result of SHOW FUNCTIONS query will have the following columns:

name (keyword): the function name
synopsis (keyword): the full signature of the function, e.g. double sin(n:integer|long|double)
argNames (keyword MV): the function argument names
argTypes (keyword MV): the function argument types
argDescriptions (keyword MV): a textual description of each function argument
returnType (keyword): the return type of the function
description (keyword): a textual description of the function

Open questions:

How structured should types be? E.g. should we have a strict @Typed("keyword")/@Typed({"keyword", "text"}) or should we have a more generic type description, e.g. @Typed("numeric"), @Typed("any")? The first one is more useful for API consumption but it's hard with our complex type system (type classes, custom types, unsupported and so on); the second one is less structured, but probably more useful for documentation, which is the most immediate use case of this feature. All the types are listed explicitly.
~~we have alternatives for the synopsis, eg.~~
- ~~functionName(<paramName>:<paramType>, ...): <returnType>~~
- ~~<returnType> functionName(<paramName>:<paramType>, ...)~~
- ~~<returnType> functionName(<paramType> <paramName>, ...)~~
  Using <returnType> functionName(<paramName>:<paramType>, ...) for now. If multiple types are supported, then they will be separated by pipes, eg. double sin(n:integer|long|double).
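
As a rough sketch of the chosen synopsis format, the string can be assembled from a function's name, arguments and return type. The record names are borrowed from the PR's diff but with simplified fields (types are modelled as a list of names here), so treat this as an illustration rather than the actual implementation.

```java
import java.util.List;
import java.util.stream.Collectors;

public class SynopsisSketch {

    // Simplified stand-ins for the PR's records; the field layout is an assumption.
    public record ArgSignature(String name, List<String> types) {}

    public record FunctionSignature(String name, List<ArgSignature> args, String returnType) {}

    // Builds "<returnType> functionName(<paramName>:<paramType>, ...)",
    // joining alternative types of one parameter with pipes.
    public static String synopsis(FunctionSignature signature) {
        String args = signature.args()
            .stream()
            .map(arg -> arg.name() + ":" + String.join("|", arg.types()))
            .collect(Collectors.joining(", "));
        return signature.returnType() + " " + signature.name() + "(" + args + ")";
    }

    public static void main(String[] args) {
        FunctionSignature sin = new FunctionSignature(
            "sin",
            List.of(new ArgSignature("n", List.of("integer", "long", "double"))),
            "double"
        );
        System.out.println(synopsis(sin)); // double sin(n:integer|long|double)
    }
}
```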

github-actions · 2023-09-21T08:25:49Z

Documentation preview:

✨ Changed pages

elasticsearchmachine · 2023-09-21T08:25:59Z

Pinging @elastic/es-ql (Team:QL)

elasticsearchmachine · 2023-09-21T08:26:00Z

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

elasticsearchmachine · 2023-09-21T08:26:22Z

Hi @luigidellaquila, I've created a changelog YAML for you.

dej611 · 2023-09-21T09:20:38Z

Cool PR @luigidellaquila ! That would be great to use it in Kibana.

Few comments:

I would prefer the functionName(<paramName>[?]:<paramType>, ...): <returnType> annotation type for the synopsis (just a personal preference)
For our use case the argDescriptions and description fields would probably be discarded as we need to translate them anyway. So not sure if it's worth keeping them rather than publish them on the doc (which would be great)
I think what is missing here is the variadic function definition, which some functions have (i.e. concat): I think a flag like variadic here would work for us.
As for optional args, rather than the (Optional) text in the description, I would add a ? either at the end of the arg name (dateString?) or at the beginning of the type (?keyword) to indicate that the argument is optional. In terms of synopsis, a notation such as dateString ?: keyword would work

Taking as example the date_parse definition, I would propose something like:

| name:keyword | synopsis:keyword | argNames:keyword | argTypes:keyword | variadic:boolean | returnType:keyword |
|---|---|---|---|---|---|
| date_parse | date_parse(datePattern:keyword, dateString?:keyword): date | [datePattern, dateString?] | [keyword, keyword] | false | date |

nik9000 · 2023-09-21T11:50:28Z

> So not sure if it's worth keeping them rather than publish them on the doc (which would be great)

The docs actually use the show functions infrastructure to generate bits of themselves. Something like that should be possible.

alex-spies

I think this is very cool! If others are happy with this approach as well, this LGTM.

alex-spies · 2023-09-21T12:10:04Z

...ql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/DateParse.java

@@ -40,7 +43,13 @@ public class DateParse extends ScalarFunction implements OptionalArgument, Evalu
    private final Expression field;
    private final Expression format;

-    public DateParse(Source source, Expression first, Expression second) {
+    @Described("Parses a string into a date value")
+    @Typed("date")


thought (maybe for the future): I think it's really good that type information is showing up in the function class; currently, we generate docs by inferring supported types from the tested types - which is great in terms of automation, but comes with a bit of indirection and is not so explicit.

Maybe, we could evolve this by having a proper method on each function that returns the supported input and output types; this would be more explicit and the automatically generated tests could pick that up as well.

With this PR, input/output types are kinda defined in two places now: one is here, via annotations, and the other is in the abstract function test case and is inferred from the tested types. There's also a chance of this drifting apart and being inconsistent.

alex-spies · 2023-09-21T12:11:33Z

.../plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/show/ShowFunctions.java

-        List<String> args = new ArrayList<>(params.length);
+        Constructor<?> constructor = constructors[0];
+        Described functionDescAnn = constructor.getAnnotation(Described.class);
+        String funcitonDescription = functionDescAnn == null ? "" : functionDescAnn.value();


Suggested change

-        String funcitonDescription = functionDescAnn == null ? "" : functionDescAnn.value();
+        String functionDescription = functionDescAnn == null ? "" : functionDescAnn.value();

astefan · 2023-09-21T13:16:55Z

> How structured should types be? E.g. should we have a strict @Typed("keyword")/@Typed({"keyword", "text"}) or should we have a more generic type description, e.g. @Typed("numeric"), @Typed("any")? The first one is more useful for API consumption but it's hard with our complex type system (type classes, custom types, unsupported and so on); the second one is less structured, but probably more useful for documentation, which is the most immediate use case of this feature.

I think it would be great if we can synchronize the parameter types with the error messages we generate whenever a wrong parameter type is used. For example, in ToVersion when resolving the types we generate an error of the form argument of [] must be [version, keyword or text], found value [..... Imo, whatever form we use in these errors the same must be used in show functions.

luigidellaquila · 2023-09-21T13:24:38Z

I'm not pushing the railroad diagrams because the generation does not seem to work very well on my machine. It's something we'll have to check before merging.

dej611 · 2023-09-21T13:40:11Z

Reporting feedback as someone who had to go through all the ESQL documentation to work out the args, types and return types of every function: I would much prefer the synopsis snippet above in the documentation over the railroad diagram.

luigidellaquila · 2023-09-21T13:56:59Z

Thanks everybody for the feedback

> Maybe, we could evolve this by having a proper method on each function that returns the supported input and output types; this would be more explicit and the automatically generated tests could pick that up as well.

I thought a bit about this, the real problem is that we don't have an instance of the function (all this is working by reflection) so the best we can do is have a static method.

In general, having the tests decide the expected output of a function is not an optimal solution IMHO: we should be aware of what the function accepts/returns and the tests should verify it, so having a stricter way of defining types is preferable.

> For example, in ToVersion when resolving the types we generate an error of the form argument of [] must be [version, keyword or text], found value [..... Imo, whatever form we use in these errors the same must be used in show functions.

The number of possible options scares me a bit here, especially for the documentation, but I agree that it would be good to be consistent between docs and error messages

luigidellaquila · 2023-09-21T13:57:14Z

@elasticmachine run elasticsearch-ci/part-1

luigidellaquila · 2023-09-21T13:58:39Z

@elasticmachine update branch

luigidellaquila · 2023-09-21T14:42:56Z

@elasticmachine run elasticsearch-ci/part-1

nik9000 · 2023-09-21T15:06:21Z

> Reporting feedback as someone who had to go through all the ESQL documentation to work out the args, types and return types of every function: I would much prefer the synopsis snippet above in the documentation over the railroad diagram.

We can absolutely create these and stuff them somewhere so kibana could pick them up. I'd have no problem just making markdown files and landing them somewhere. The railroad diagram was a request from someone else and it's ok with me. There's also some type information at the bottom of some of the functions now. But neither the type information nor the railroad diagrams is complete at the moment.

costin · 2023-09-21T20:23:33Z

...ql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/DateParse.java

+    public DateParse(
+        Source source,
+        @Named("datePattern") @Typed("keyword") @Described("A valid date pattern") @Optional Expression first,
+        @Named("dateString") @Typed("keyword") @Described("A string representing a date") Expression second
+    ) {


Instead of having multiple annotations (which can have different order, might be missing, etc..) use one with multiple fields:
@Param(name="datePattern", type="keyword", description="A valid date pattern", optional=true/false)

Separately it's worth checking if the field name is not already available in the debug info (method.getParameterNames()) - this would avoid the name attribute and use the field name instead.
This will force us to actually give them better names such as datePattern instead of first.

👍

Debug info might not be available in all cases, but I'll give it a try
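
For reference, a small sketch of what reflection offers for parameter names. In standard Java the names are exposed through `Parameter.getName()`, and they are only the declared ones when the class was compiled with `-parameters`; `Parameter.isNamePresent()` reports whether that is the case. The stand-in `DateParse` class and the `argN` fallback naming are assumptions for illustration.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.List;

public class ParameterNamesSketch {

    // Stand-in for a function class; not the real DateParse.
    public static class DateParse {
        public DateParse(Object datePattern, Object dateString) {}
    }

    // Returns the declared names when the class file carries them,
    // otherwise positional fallbacks such as arg1, arg2.
    public static List<String> parameterNames(Constructor<?> constructor) {
        List<String> names = new ArrayList<>();
        Parameter[] parameters = constructor.getParameters();
        for (int i = 0; i < parameters.length; i++) {
            Parameter parameter = parameters[i];
            // isNamePresent() is false unless the class was compiled with -parameters
            names.add(parameter.isNamePresent() ? parameter.getName() : "arg" + (i + 1));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(parameterNames(DateParse.class.getConstructors()[0]));
    }
}
```

Because the result depends on a compiler flag, an explicit name in the annotation is the more predictable option, at the cost of some duplication.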

costin · 2023-09-21T20:25:38Z

...gin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/math/Sin.java

+    @Described("Returns the trigonometric sine of an angle")
+    @Typed("double")


The type information is already available in getDataType() - better to get it from there to avoid getting out of sync.

dataType() method is not an option here unfortunately, it's an instance method and here we don't have function instances, but only classes and reflection.
Even worse, some functions return different types depending on the input, so calling that method makes sense only in the context of a query.

costin

This looks good Luigi! Left some comments.
It would be something nice to have for 8.11 however please time box it - I expect in time a lot of minor tweaks here and there but for now, time is of the essence.

costin · 2023-09-21T20:34:11Z

.../plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/show/ShowFunctions.java

 import static org.elasticsearch.xpack.ql.type.DataTypes.KEYWORD;

 public class ShowFunctions extends LeafPlan {

    private final List<Attribute> attributes;

+    public record ArgSignature(String name, String type, String description, boolean optional) {}
+
+    public record FunctionSignature(String name, List<ArgSignature> args, String returnType, String description, boolean variadic) {


Better to 'promote' this to FunctionDescription and move that as part of the EsqlFunctionRegistry so that it gets created each time a function definition gets created.
Move the reflection/annotation logic there along with some fallbacks in case the annotations are missing (which we should catch through some basic tests that check the functions registered and that they all have the proper description/annotation).

ShowFunctions itself should just call the function registry and iterate over the data.

Good point, this logic belongs to EsqlFunctionRegistry, I'll move it there for now.
Moving it into FunctionDescription and generating it only once is a bit more complicated, because that class is part of QL. It's probably a good candidate for a follow-up PR, to keep this one small.

> however please time box it

Definitely!

I think you meant FunctionDefinition, FunctionDescription would be a new class specific to ESQL.

Yes, sorry, FunctionDefinition

astefan · 2023-09-22T06:51:07Z

> > For example, in ToVersion when resolving the types we generate an error of the form argument of [] must be [version, keyword or text], found value [..... Imo, whatever form we use in these errors the same must be used in show functions.

> The number of possible options scares me a bit here, especially for the documentation, but I agree that it would be good to be consistent between docs and error messages

I'm not following, what "possible options"?
Please note that the error message for some of these functions is built automatically. ToVersion, for example, has an EVALUATORS constant that defines a list of data types and their corresponding evaluator/converter. The error message for improper data type parameters comes from the keys of this EVALUATORS Map.
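
To make the idea concrete, here is a hedged sketch of deriving both the list of supported types and the error text from the keys of a single map. The map contents, the value type and the helper names are hypothetical stand-ins, not the real ToVersion code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class EvaluatorTypesSketch {

    // Hypothetical stand-in for an EVALUATORS constant: data type name -> converter.
    static final Map<String, Function<Object, Object>> EVALUATORS = new LinkedHashMap<>();
    static {
        EVALUATORS.put("version", Function.identity());
        EVALUATORS.put("keyword", Function.identity());
        EVALUATORS.put("text", Function.identity());
    }

    // The supported types are just the map keys, so a function listing
    // and the error text could both be derived from the same source.
    public static List<String> supportedTypes() {
        return new ArrayList<>(EVALUATORS.keySet());
    }

    // Formats the types as "version, keyword or text".
    public static String expectedTypes(List<String> types) {
        if (types.size() < 2) {
            return String.join("", types);
        }
        int last = types.size() - 1;
        return String.join(", ", types.subList(0, last)) + " or " + types.get(last);
    }

    public static void main(String[] args) {
        System.out.println("must be [" + expectedTypes(supportedTypes()) + "]");
    }
}
```

A LinkedHashMap is used so the order of the types in the message follows the order in which they were registered.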

luigidellaquila · 2023-09-22T06:54:46Z

> I'm not following, what "possible options"?

I mean the number of supported data types for a parameter. A function like to_string() will accept all the available data types; should we return a list of tens of types? Or should we just return "any" or something similar? Maybe the first is better for API development, but the second one will be easier to understand for docs.

luigidellaquila · 2023-09-26T11:59:36Z

@astefan I went with your suggestion to have the full list of supported types, and IMHO the result is pretty good.

I also enhanced the tests to check that types defined in the annotations are the same as those actually tested (and that generate the type tables for documentation), so that we are sure that the three (annotations, actual unit test coverage and docs) are aligned.

For now the test is lenient (ie. if a function is not annotated, the checks are skipped) but as soon as all the functions are correctly annotated we can easily enable a more strict test logic.
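
A rough sketch of the lenient comparison described above: per argument position, the types declared in the annotations are compared with the types the tests exercise, and positions that are not annotated yet are skipped. Using "?" as the marker for a missing annotation and the method name are assumptions for illustration.

```java
import java.util.List;
import java.util.Set;

public class SignatureCheckSketch {

    // Compares, per argument position, annotated types with tested types.
    // Returns null when everything matches, otherwise a message for the first mismatch.
    public static String firstMismatch(List<Set<String>> annotationTypes, List<Set<String>> testedTypes) {
        for (int i = 0; i < annotationTypes.size(); i++) {
            Set<String> declared = annotationTypes.get(i);
            if (declared.equals(Set.of("?"))) {
                continue; // lenient mode: this argument is not annotated yet, so skip it
            }
            Set<String> tested = i < testedTypes.size() ? testedTypes.get(i) : Set.<String>of();
            if (declared.equals(tested) == false) {
                return "argument " + i + ": declared " + declared + " but tested " + tested;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Set<String>> declared = List.of(Set.of("integer", "long", "double"));
        List<Set<String>> tested = List.of(Set.of("double", "integer", "long"));
        System.out.println(firstMismatch(declared, tested) == null ? "aligned" : "mismatch");
    }
}
```

Switching to the strict mode would then amount to removing the `continue` branch, so that a "?" entry fails the comparison like any other mismatch.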

luigidellaquila · 2023-09-26T12:09:16Z

...src/test/java/org/elasticsearch/xpack/esql/expression/function/AbstractFunctionTestCase.java

+                continue; // TODO remove this eventually, so that all the functions will have to provide signature info
+            }
+            Set<String> signatureTypes = typesFromSignature.get(i);
+            assertEquals(annotationTypes, signatureTypes);


Assertions in @AfterClass are just terrible, but it's the only place where I have all the signatures.

astefan

It's looking good I believe. Left some small comments.

Another comment, which I mentioned previously and for which I didn't get any input:

> Please note that the error message for some of these functions is built automatically. ToVersion, for example, has an EVALUATORS constant that defines a list of data types and their corresponding evaluator/converter. The error message for improper data type parameters comes from the keys of this EVALUATORS Map.

The fact that we manually add the parameter types in annotations for each function makes it a fragile mechanism. I haven't looked at all our functions to see if all of them use the evaluators I mentioned in my previous comment, but if they do (or we adapt them all to use these evaluators inside a Map) then the show functions command would use the same mechanism that's already used to list the allowed data types in our functions' error messages. And this is a much more robust mechanism, covered also by testing. I would, at least, look into this as a possibility to have something that's:

centralized
automatic
less error prone

astefan · 2023-09-26T12:24:39Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/show.csv-spec

+       name:keyword      |                        synopsis:keyword                |       argNames:keyword  | argTypes:keyword |             argDescriptions:keyword                |  returnType:keyword   |  description:keyword  |   optionalArgs:boolean |  variadic:boolean
+is_finite                |? is_finite(arg1:?)                                     |arg1                     |?                 |  ""                                                  |?              | ""                      | false                | false
+is_infinite              |? is_infinite(arg1:?)                                   |arg1                     |?                 |  ""                                                  |?              | ""                      | false                | false
+is_nan                   |? is_nan(arg1:?)                                        |arg1                     |?                 |  ""                                                  |?              | ""                      | false                | false
 ;


Personal preference: have a query show functions | keep synopsis to make it easier in reviews to see the actual changes in synopsis.

astefan · 2023-09-26T13:49:48Z

...sql/src/main/java/org/elasticsearch/xpack/esql/expression/function/EsqlFunctionRegistry.java

+                if (List.class.isAssignableFrom(params[i].getType())) {
+                    variadic = true;
+                }


Suggested change

-                if (List.class.isAssignableFrom(params[i].getType())) {
-                    variadic = true;
-                }
+                variadic |= List.class.isAssignableFrom(params[i].getType());
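
A self-contained sketch of the same check, for context: a function is treated as variadic when any constructor parameter is a List, and `|=` keeps the flag set once it has become true. The `Concat` and `Sin` classes are simplified stand-ins, and that a variadic function takes a List in its constructor is an assumption drawn from the diff above.

```java
import java.lang.reflect.Constructor;
import java.util.List;

public class VariadicSketch {

    // Simplified stand-ins for function classes; not the real implementations.
    public static class Concat {
        public Concat(Object first, List<Object> rest) {}
    }

    public static class Sin {
        public Sin(Object n) {}
    }

    // A function is treated as variadic when any constructor parameter is a List.
    public static boolean isVariadic(Constructor<?> constructor) {
        boolean variadic = false;
        for (Class<?> type : constructor.getParameterTypes()) {
            // |= accumulates: once true, later non-List parameters cannot reset it
            variadic |= List.class.isAssignableFrom(type);
        }
        return variadic;
    }

    public static void main(String[] args) {
        System.out.println(isVariadic(Concat.class.getConstructors()[0])); // true
        System.out.println(isVariadic(Sin.class.getConstructors()[0]));    // false
    }
}
```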

astefan · 2023-09-26T13:55:45Z

...sql/src/main/java/org/elasticsearch/xpack/esql/expression/function/EsqlFunctionRegistry.java

+        Constructor<?> constructor = constructors[0];
+        FunctionInfo functionInfo = constructor.getAnnotation(FunctionInfo.class);
+        String functionDescription = functionInfo == null ? "" : functionInfo.description();
+        String[] returnType = functionInfo == null ? new String[] { "?" } : functionInfo.returnType();


What's the meaning of ? as a return parameter type? If it's a function that's yet to have its proper definition added to it, I suggest an empty string for now.

Yes, it's for functions without a proper definition.
It's a transient solution though; my intention is to annotate all the functions correctly and then enable a stricter test strategy to verify that ? is never returned.

luigidellaquila · 2023-09-27T07:35:32Z

> parameter types in annotations for each function makes it a fragile mechanism. [...] see if all of them use these evaluators I mentioned in my previous comment, but if they are [...] then the show functions command would use the same mechanism that's already used to list the allowed data types

That would be really convenient, but unfortunately it's not the case. to_*() functions are the only functions that use this mechanism, most of (probably all) the other functions use custom logic to validate the inputs and to generate the evaluators, and this logic in some cases depends on the query context, so it's not something we can use in SHOW FUNCTIONS.
I tried to find a better way to infer the types, but I didn't find a solution that works for all the functions and that is robust enough with the current function implementations.

luigidellaquila · 2023-09-27T09:41:43Z

@elasticmachine run elasticsearch-ci/docs

luigidellaquila · 2023-09-27T12:37:01Z

@elasticmachine update branch

luigidellaquila · 2023-09-27T14:23:17Z

@elasticmachine run elasticsearch-ci/docs

luigidellaquila · 2023-09-29T07:32:59Z

@costin @astefan any additional comments on this PR?
It would be cool to merge it today so that I have time to annotate all the functions.

astefan

LGTM

bpintea

I have one (late) question: the initial attempt at obtaining this functionality (in ESQL-835) was turned down for similar reasons noted already, but also internationalisation. If this is no longer a concern and Kibana could generate better autocomplete with this, it LGTM.

luigidellaquila · 2023-10-02T10:01:47Z

About internationalization, Kibana won't be able to use our descriptions anyway (for that exact reason), so we could decide to drop them, unless we decide to use them for docs later.
Param names are a different story, they were there already and they were not internationalized (and they should not be IMHO)

dej611 · 2023-10-02T12:37:00Z

> Param names are a different story, they were there already and they were not internationalized (and they should not be IMHO)

Agree here. The only things that would make sense to localize are the type strings, but we can take care of them on the Kibana side as well.

As for the documentation, it would be nice to have a shared documentation place for both the Elasticsearch and Kibana ESQL areas, internationalisation included, from which both can pull and always stay in sync.

costin

Thanks Luigi - this is a good improvement; however, I'm afraid there's too much information for this to be useful for human users. I've raised an issue for discussion at #100146
Let's get this in for 8.11 and we'll sort things out later.

…s' into esql/function_signatures

ESQL: enhance SHOW FUNCTIONS command

1e55d19

luigidellaquila added >enhancement :Analytics/ES|QL AKA ESQL ES|QL-ui Impacts ES|QL UI labels Sep 21, 2023

elasticsearchmachine added Team:QL (Deprecated) Meta label for query languages team v8.11.0 labels Sep 21, 2023

Update docs/changelog/99736.yaml

9d0e758

alex-spies approved these changes Sep 21, 2023

luigidellaquila added 2 commits September 21, 2023 15:23

Add optionalArgs and variadic

4bae7e0

Merge remote-tracking branch 'luigidellaquila/esql/function_signature…

65c819c

…s' into esql/function_signatures

Merge branch 'main' into esql/function_signatures

9e3404d

costin reviewed Sep 21, 2023

costin requested changes Sep 21, 2023

luigidellaquila added 4 commits September 25, 2023 14:38

Merge branch 'main' into esql/function_signatures

60d5730

Add tests (not complete yet)

aa4b6ad

Enable validation and fix tests

d51dfc3

Merge remote-tracking branch 'luigidellaquila/esql/function_signature…

6c7d35f

…s' into esql/function_signatures

luigidellaquila commented Sep 26, 2023

Delete unused classes

14dc8e7

luigidellaquila requested review from costin and not-napoleon September 26, 2023 12:22

astefan reviewed Sep 26, 2023

Implement review suggestions

4d8aa1d

Merge branch 'main' into esql/function_signatures

54ed932

astefan approved these changes Sep 29, 2023

bpintea reviewed Oct 2, 2023

costin mentioned this pull request Oct 2, 2023

ESQL: Show functions ergonomics #100146

Closed

costin approved these changes Oct 2, 2023

luigidellaquila added 3 commits October 2, 2023 18:56

Merge branch 'main' into esql/function_signatures

de854e3

Fix tests

b712f8f

Merge remote-tracking branch 'luigidellaquila/esql/function_signature…

47ccf11

…s' into esql/function_signatures

luigidellaquila added the auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Oct 2, 2023

elasticsearchmachine merged commit 6e79013 into elastic:main Oct 2, 2023
12 checks passed

luigidellaquila deleted the esql/function_signatures branch October 2, 2023 17:57

dej611 mentioned this pull request Oct 11, 2023

[ES|QL] New show aggs command for stats supported functions #99853

Closed
