New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: enhance SHOW FUNCTIONS command #99736
ESQL: enhance SHOW FUNCTIONS command #99736
Conversation
Documentation preview: |
Pinging @elastic/es-ql (Team:QL) |
Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL) |
Hi @luigidellaquila, I've created a changelog YAML for you. |
Cool PR @luigidellaquila ! That would be great to use it in Kibana. Few comments:
Taking as example the
|
The docs actually use the show functions infrastructure to generate bits of themselves. Something like that should be possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is very cool! If others are happy with this approach as well, this LGTM.
@@ -40,7 +43,13 @@ public class DateParse extends ScalarFunction implements OptionalArgument, Evalu | |||
private final Expression field; | |||
private final Expression format; | |||
|
|||
public DateParse(Source source, Expression first, Expression second) { | |||
@Described("Parses a string into a date value") | |||
@Typed("date") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought (maybe for the future): I think it's really good that type information is showing up in the function class; currently, we generate docs by inferring supported types from the tested types - which is great in terms of automation, but comes with a bit of indirection and is not so explicit.
Maybe, we could evolve this by having a proper method on each function that returns the supported input and output types; this would be more explicit and the automatically generated tests could pick that up as well.
With this PR, input/output types are kinda defined in two places now: one is here, via annotations, and the other is in the abstract function test case and is inferred from the tested types. There's also a chance of this drifting apart and being inconsistent.
List<String> args = new ArrayList<>(params.length); | ||
Constructor<?> constructor = constructors[0]; | ||
Described functionDescAnn = constructor.getAnnotation(Described.class); | ||
String funcitonDescription = functionDescAnn == null ? "" : functionDescAnn.value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String funcitonDescription = functionDescAnn == null ? "" : functionDescAnn.value(); | |
String functionDescription = functionDescAnn == null ? "" : functionDescAnn.value(); |
I think it would be great if we can synchronize the parameter types with the error messages we generate whenever a wrong parameter type is used. For example, in |
…s' into esql/function_signatures
I'm not pushing the railroad diagrams because the generation does not seem to work very well on my machine. In case, it's something we'll have to check before merging |
Reporting feedback as one who had to go thru all the ESQL documentation to workout arg, types and returning types of every function, I would rather prefer that synopsis snippet above in the documentation rather than the railroad diagram. |
Thanks everybody for the feedback
I thought a bit about this, the real problem is that we don't have an instance of the function (all this is working by reflection) so the best we can do is have a static method. In general, having the tests decide the expected output of a function is not an optimal solution IMHO: we should be aware of what the function accepts/returns and the tests should verify it, so having a more strict way of defining types is preferrable.
The number of possible options scares me a bit here, especially for the documentation, but I agree that it would be good to be consistent between docs and error messages |
@elasticmachine run elasticsearch-ci/part-1 |
@elasticmachine update branch |
@elasticmachine run elasticsearch-ci/part-1 |
We can absolutely create these and stuff them somewhere so kibana could pick them up. I'd have no problem just making markdown files and landing them somewhere. The railroad diagram was a request from someone else and it's ok with me. There's also some type information at the bottom of some of the functions now. But neither the type information nor the railroad diagrams is complete at the moment. |
public DateParse( | ||
Source source, | ||
@Named("datePattern") @Typed("keyword") @Described("A valid date pattern") @Optional Expression first, | ||
@Named("dateString") @Typed("keyword") @Described("A string representing a date") Expression second | ||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of having multiple annotations (which can have different order, might be missing, etc..) use one with multiple fields:
@Param(name="datePattern", type="keyword", description="A valid date pattern", optional=true/false)
Separately it's worth checking if the field name is not already available in the debug info (method.getParameterNames()) - this would avoid the name
attribute and use the field name instead.
This will force us to actually give them better names such as datePattern instead of first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Debug info could not be available in all the cases, but I'll give it a try
@Described("Returns the trigonometric sine of an angle") | ||
@Typed("double") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type information is already available in getDataType() - better to get it from there to avoid getting out of sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dataType()
method is not an option here unfortunately, it's an instance method and here we don't have function instances, but only classes and reflection.
Even worse, some functions return different types depending on the input, so calling that method makes sense only in the context of a query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good Luigi! Left some comments.
It would be something nice to have for 8.11 however please time box it - I expect in time a lot of minor tweaks here and there but for now, time is of the essence.
import static org.elasticsearch.xpack.ql.type.DataTypes.KEYWORD; | ||
|
||
public class ShowFunctions extends LeafPlan { | ||
|
||
private final List<Attribute> attributes; | ||
|
||
public record ArgSignature(String name, String type, String description, boolean optional) {} | ||
|
||
public record FunctionSignature(String name, List<ArgSignature> args, String returnType, String description, boolean variadic) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to 'promote' this to FunctionDescription
and move that as part of the EsqlFunctionRegistry
so that it gets created each time a function definition gets created.
Move the reflection/annotation logic there along with some fallbacks in case the annotations are missing (which we should catch through some basic tests that check the functions registered and that they all have the proper description/annotation).
ShowFunction itself should just call the functionregistry and iterate over the data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, this logic belongs to EsqlFunctionRegistry
, I'll move it there for now.
Moving it into FunctionDescription
and generating it only once is a bit more complicated, because that class is part of QL. It's probably a good candidate for a follow-up PR, to keep this one small.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
however please time box it
Definitely!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant FunctionDefinition, FunctionDescription would be a new class specific to ESQL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry, FunctionDefinition
I'm not following, what "possible options"? |
I mean the number of supported data types for a parameter. A funciton like |
@astefan I went with your suggestion to have the full list of supported types, and IMHO the result is pretty good. I also enhanced the tests to check that types defined in the annotations are the same as those actually tested (and that generate the type tables for documentation), so that we are sure that the three (annotations, actual unit test coverage and docs) are aligned. For now the test is lenient (ie. if a function is not annotated, the checks are skipped) but as soon as all the functions are correctly annotated we can easily enable a more strict test logic. |
continue; // TODO remove this eventually, so that all the functions will have to provide singature info | ||
} | ||
Set<String> signatureTypes = typesFromSignature.get(i); | ||
assertEquals(annotationTypes, signatureTypes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertions in @AfterClass
are just terrible, but it's the only place where I have all the signatures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking good I believe. Left some small comments.
Another comment, that I mentioned previously and where I didn't get any input for:
Please, note that the error message for some of these functions is automatically built. ToVersion for example, has an EVALUATORS constant that defines a list of data types and their correspondent evaluator/converter. The error message for improper data type parameters comes from the key of this EVALUATORS Map.
The fact that we manually add the parameter types in annotations for each function makes it a fragile mechanism. I haven't looked at all our functions to see if all of them use these evaluators I mentioned in my previous comment, but if they are (or we adapt them all to use these evaluators inside a Map) then the show functions
command would use the same mechanism that's already used to list the allowed data types in our functions' error messages. And this is a much more robust mechanism, covered also by testing. I would, at least, look into this as a possibility to have something that's:
- centralized
- automatic
- less error prone
name:keyword | synopsis:keyword | argNames:keyword | argTypes:keyword | argDescriptions:keyword | returnType:keyword | description:keyword | optionalArgs:boolean | variadic:boolean | ||
is_finite |? is_finite(arg1:?) |arg1 |? | "" |? | "" | false | false | ||
is_infinite |? is_infinite(arg1:?) |arg1 |? | "" |? | "" | false | false | ||
is_nan |? is_nan(arg1:?) |arg1 |? | "" |? | "" | false | false | ||
; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personal preference: have a query show functions | keep synopsis
to make it easier in reviews to see the actual changes in synopsis.
if (List.class.isAssignableFrom(params[i].getType())) { | ||
variadic = true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (List.class.isAssignableFrom(params[i].getType())) { | |
variadic = true; | |
} | |
variadic |= List.class.isAssignableFrom(params[i].getType()); |
Constructor<?> constructor = constructors[0]; | ||
FunctionInfo functionInfo = constructor.getAnnotation(FunctionInfo.class); | ||
String functionDescription = functionInfo == null ? "" : functionInfo.description(); | ||
String[] returnType = functionInfo == null ? new String[] { "?" } : functionInfo.returnType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the meaning of ?
as a return parameter type? If it's a function that's yet to have its proper definition added to it, I suggest an empty string for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's for functions without a proper definition.
It's a transient solution though, my intention to annotate all the functions correctly and then enable a more strict test strategy to verify that ?
is never returned.
That would be really convenient, but unfortunately it's not the case. |
@elasticmachine run elasticsearch-ci/docs |
@elasticmachine update branch |
@elasticmachine run elasticsearch-ci/docs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one (late) question: the initial attempt at obtaining this functionality (in ESQL-835) was turned down for similar reasons noted already, but also internationalisation. If this is no longer a concern and Kibana could generate better autocomplete with this, it LGTM.
About internationalization, Kibana won't be able to use our descriptions anyway (for that exact reason), so we could decide to drop them, unless we decide to use them for docs later. |
Agree here. The only thing which would make sense to localize are type string, but we can take of them on the kibana side as well. As for the documentation, it would be nice to have a shared documentation place for both Elasticsearch and Kibana ESQL area, internationalisation included where both can pull it and always stay in sync. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Luigi - this is good improvement however I'm afraid there's too much information for this to be useful for human users; I've raised an issue for discussion at #100146
Let's get this in for 8.11 and we'll sort things out later.
…s' into esql/function_signatures
Fixes #99507
Enhance SHOW FUNCTIONS command to return as structured information as possible about the function signature, ie.
For now, as an example, the annotations are used only on
sin()
anddate_parse()
functions; if we agree on this approach, I'll proceed toThis feature can be useful for the end user, but the main goal is to give Kibana an easy way to produce in-line documentation (contextual messages, autocomplete) for functions
Similar to current implementation, that has a
@Named("paramName")
annotation for function parameters, this PR introduces two more annotations@Param(name, type, description, optional)
and@FunctionInfo()
to provide information about single parameters and functions.The result of
SHOW FUNCTIONS
query will have the following columns:double sin(n:integer|long|double)
Open questions:
how structured shoud types be? Eg. should we have a strictAll the types are listed explicitly@Typed("keyword")
/@Typed({"keyword", "text"})
or should we have a more generic type description, eg.@Typed("numeric")
,@Typed("any")
? The first one is more useful for API consumption but it's hard with our complex type system (type classes, custom types, unsupported and so on); the second one is less structured, but probably more useful for documentation, that is the most immediate use case of this feature.we have alternatives for the synopsis, eg.functionName(<paramName>:<paramType>, ...): <returnType>
~<returnType> functionName(<paramName>:<paramType>, ...)
<returnType> functionName(<paramType> <paramName>, ...)
Using
<returnType> functionName(<paramName>:<paramType>, ...)
for now. If multiple types are supported, then they will be separated by pipes, eg.double sin(n:integer|long|double)
.