-
Notifications
You must be signed in to change notification settings - Fork 24.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Add OPTIONS clause to FROM command #106636
Conversation
This adds an OPTIONS clause to FROM, allowing to specify search or index resolution options, such as: preference, allow_no_indices or ignore_unavailable.
Hi @bpintea, I've created a changelog YAML for you. |
requestFilter, | ||
concreteIndices, | ||
originalIndices, | ||
esSourceOptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only added this parameter, diff grew because of breaking down of arguments line, due to line limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about CCQ? Does it support having these preferences sent over?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they're part of the EsRelation, which is part of the Fragment which is sent across. I've added some REST tests and CSV test that's executed part of CCQ.
Pinging @elastic/es-analytical-engine (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general, thanks @bpintea !
Added a couple of remarks, many of which are minor. The major remarks are (summarized):
- test coverage for the preferences options
- docs update
- wondering whether the right hand side of
"some_option"="some_value"
should be in quotes, even if it is expected to be e.g. a boolean wondering whether we need to register warnings in case of e.g. a shard preference for a non-existent shard@astefan chimed in and said this is consistent with what ES does in case of_search
, so please disregard.
...ugin/esql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/RestEsqlTestCase.java
Outdated
Show resolved
Hide resolved
...ugin/esql/qa/server/src/main/java/org/elasticsearch/xpack/esql/qa/rest/RestEsqlTestCase.java
Show resolved
Hide resolved
@@ -99,7 +99,20 @@ field | |||
; | |||
|
|||
fromCommand | |||
: FROM fromIdentifier (COMMA fromIdentifier)* metadata? | |||
: FROM fromIdentifier (COMMA fromIdentifier)* fromOptions? metadata? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I think we need to update the docs for FROM as well to reflect the grammar change.
- I think we should mention in the docs that
FROM
can use backticks, e.g. in the edge case that an index is calledoptions
ormetadata
- but that's maybe out of scope for this PR.
FROM `metadata`,`options` ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the docs will follow in separate PR -- I wanted to ensure we want to keep the format.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Show resolved
Hide resolved
} | ||
|
||
static void writeEsRelation(PlanStreamOutput out, EsRelation relation) throws IOException { | ||
assert relation.children().size() == 0; | ||
out.writeNoSource(); | ||
writeEsIndex(out, relation.index()); | ||
writeAttributes(out, relation.output()); | ||
if (out.getTransportVersion().onOrAfter(TransportVersions.ESQL_ES_SOURCE_OPTIONS)) { | ||
if (relation.esSourceOptions() != null) { | ||
out.writeBoolean(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is marking the existence of the source option with a boolean really necessary?
We encode the options as 3 optional strings; that handles the optionality already, why repeat this logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is marking the existence of the source option with a boolean really necessary? We encode the options as 3 optional strings
Without this boolean, how would the remote side know if it should attempt to read three (optional) strings or not? Reading an optional strings involves consuming at least a boolean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant to say: StreamOutput.writeOptionalString
already prepends a boolean to say "there's a string!" or "there's no string!". We're doubling this logic.
The serialization code in this PR, however, optimizes for the (common?) case where there's no options. My point is that this is unnecessary, because we'd only be sending 1 byte (false
) instead of 3 bytes (false, false, false
), at the cost of slightly more complex code (additional null-handling at the level of EsSourceOptions
).
I don't feel strongly about this, but I'd prefer if
- the
EsSourceOptions
were not nullable (each of the options that are encoded in them is nullable already, so there's two ways to express that no options were set) - and if the serialization always sent the
EsSourceOptions
as 3 optional strings, rather than specialcasing for when theEsSourceOptions
isnull
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, you were still addressing the nullability of the EsSourceOptions
instance. Ok, took on your suggestion and applied it.
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/analyzer/TableInfo.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/options/EsSourceOptions.java
Show resolved
Hide resolved
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/parser/StatementParserTests.java
Outdated
Show resolved
Hide resolved
; | ||
|
||
configOption | ||
: string ASSIGN string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it's a bit unfortunate that this leads to quotes around e.g. booleans. In particular, one has to write ... OPTIONS "allow_no_indices"="true"
instead of "allow_no_indices"=true
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, ideally one would simply write smth like OPTIONS allow_no_indices=true
.
But I think the rational was that (1) it might require explaining when to use value quotes and when not (not that difficult, but would come with some load) and (2) that we can relax the quoting later, if needed.
assertEquals(expectedTextBody("txt", count, null), runEsqlAsTextWithFormat(builder.apply(null), "txt", null)); | ||
|
||
// returns nothing (0 for count), given the non-existing shard as preference | ||
String option = "\"preference\"=\"_shards:666\""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we add more tests that demonstrate that the other possible preference options work correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We treat the values in a opaque way and this tests that these "makes it through". Testing it exhaustively involves multiple nodes, which seemed an overkill just for this feature (and I assume there are already other tests for all these values).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review in progress. Found that FROM employees OPTIONS "preference"=""
(empty string) returns org.elasticsearch.ElasticsearchException$1: Index 0 out of bounds for length 0
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/options/EsSourceOptions.java
Show resolved
Hide resolved
|
||
public void addOption(String name, String value) { | ||
switch (name) { | ||
case "allow_no_indices" -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could take the name of this option, at least, from IndicesOptions.WildcardOptions.ALLOW_NO_INDICES
and use this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, fixed.
(I started from "preference", which is used as hardcoded string everywhere, so didn't explore further.)
} | ||
preference = value; | ||
} | ||
default -> throw new IllegalArgumentException("unknown option named [" + name + "]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about offering some suggestions? StringUtils.findSimilar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, added.
public void testFromOptionsInvalidIndicesOptionValue() { | ||
expectError( | ||
FROM + " options \"allow_no_indices\"=\"foo\"", | ||
"line 1:20: invalid options provided: Could not convert [allow_no_indices] to boolean" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here we can be more strict. From my tests, we only allow true
and false
(not even uppercase versions of them). So, unless we offer more options (like, t
/f
, 1
/0
, TRUE
/FALSE
) and I believe we shouldn't and we must follow the same behavior ES does, we can improve this message (and potentially use the same ES generates). ES itself provides more info:
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "Could not convert [allow_no_indices] to boolean"
}
],
"type": "illegal_argument_exception",
"reason": "Could not convert [allow_no_indices] to boolean",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Failed to parse value [TRUE] as only [true] or [false] are allowed."
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
The parsing of the options is delegated to IndicesOptions
, so we follow those rules. But forgot to pipe the cause through, it's now fixed. And I also incorporated your later observation, though IMO this unpacking of the causes should be left to the client (but maybe here it's OK to do it).
requestFilter, | ||
concreteIndices, | ||
originalIndices, | ||
esSourceOptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about CCQ? Does it support having these preferences sent over?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I've added one suggestion and, also, I would be more comfortable if there would be a csv-spec test, as well. And here I am more thinking about the mixed-cluster and multi-clusters IT tests (the csv tests are used there as well; with EsqlSpecTestCase
base class). RestEsqlTestCase
(the one you have in PR) is for single-node environment only.
@@ -220,7 +220,7 @@ public LogicalPlan visitFromCommand(EsqlBaseParser.FromCommandContext ctx) { | |||
try { | |||
esSourceOptions.addOption(name, value); | |||
} catch (IllegalArgumentException iae) { | |||
throw new ParsingException(source(nameContext), "invalid options provided: " + iae.getMessage()); | |||
throw new ParsingException(iae, source(nameContext), "invalid options provided: " + iae.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant with my previous suggestion was something like this:
throw new ParsingException(iae, source(nameContext), "invalid options provided: " + iae.getMessage()); | |
var cause = iae.getCause() != null ? "; " + iae.getCause().getMessage() : ""; | |
throw new ParsingException(iae, source(nameContext), "invalid options provided: " + iae.getMessage() + cause); |
to have an error message like "reason": "line 1:56: invalid options provided: Could not convert [allow_no_indices] to boolean; Failed to parse value [TRUE] as only [true] or [false] are allowed.",
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, left only minor remarks. Thanks @bpintea !
@@ -186,8 +186,10 @@ FROM_OPENING_BRACKET : OPENING_BRACKET -> type(OPENING_BRACKET); | |||
FROM_CLOSING_BRACKET : CLOSING_BRACKET -> type(CLOSING_BRACKET); | |||
FROM_COMMA : COMMA -> type(COMMA); | |||
FROM_ASSIGN : ASSIGN -> type(ASSIGN); | |||
FROM_QUTED_STRING : QUOTED_STRING -> type(QUOTED_STRING); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FROM_QUTED_STRING : QUOTED_STRING -> type(QUOTED_STRING); | |
FROM_QUOTED_STRING : QUOTED_STRING -> type(QUOTED_STRING); |
@@ -99,7 +99,20 @@ field | |||
; | |||
|
|||
fromCommand | |||
: FROM fromIdentifier (COMMA fromIdentifier)* metadata? | |||
: FROM fromIdentifier (COMMA fromIdentifier)* fromOptions? metadata? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized, it will not be possible to write FROM index METADATA ... OPTIONS ...
because we only allow the fixed order FROM
->OPTIONS
->METADATA
.
Probably okay for now, but we might want to generalize that later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I had that thought too (it's just a grammar line more). But had a look at SQL (which doesn't allow this swapping where it could) and thought of how we now request quotes (i.e. no flexibility) and thought I'd keep it inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this - I'm fine with the order being fixed however I would keep the metadata? first, e.g.:
FROM index METADATA ... OPTIONS...
since in general I expect OPTIONS to be the last item in a command mainly because there can be a long list of parameters and because METADATA and others could be considered specialized OPTIONS and thus should take precedence.
convertFromDatetimeWithOptions#[skip:-8.13.99, reason:OPTIONS added in 8.14] | ||
FROM employees OPTIONS "allow_no_indices"="false","preference"="_shards:0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if floats.csv-spec
is the best place for this; we're adding the test to test the OPTIONS
, no? Maybe it's time for a from.csv-spec
? This could be consolidated with id.csv-spec
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion, thanks.
List<String> matches = StringUtils.findSimilar(name, List.of(ALLOW_NO_INDICES, IGNORE_UNAVAILABLE, OPTION_PREFERENCE)); | ||
if (matches.isEmpty() == false) { | ||
String suggestions = matches.size() == 1 ? "[" + matches.get(0) + "]" : "any of " + matches; | ||
message += ", did you mean " + suggestions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
message += ", did you mean " + suggestions; | |
message += ", did you mean " + suggestions + "?"; |
This adds an
OPTIONS
clause toFROM
command, allowing to specify search or index resolution options.For now, the following options are supported:
preference
allow_no_indices
ignore_unavailable
.The
OPTIONS
clause take a comma-separated list of quoted AVPs. Example: