-
Notifications
You must be signed in to change notification settings - Fork 25.3k
ES|QL: No plain strings in Literal #129399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ES|QL: No plain strings in Literal #129399
Conversation
Pinging @elastic/es-analytical-engine (Team:Analytics) |
} else if (dataType == VERSION && value instanceof BytesRef br) { | ||
str = new Version(br).toString(); | ||
// } else if (dataType == IP && value instanceof BytesRef ip) { | ||
// str = DocValueFormat.IP.format(ip); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case, is this intentionally commented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... unfortunately it is, for two reasons:
- IP conversion from/to BytesRefs could trigger assertion errors, and it's pretty annoying in tests
- We are explicitly using strings for IPs in some cases, see CobineDisjunctions; we'll have to change that before we can force IPs as BytesRefs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add this as a comment/todo referencing an issue to resolve it eventually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created an issue and I'm linking it in a TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
if (dataType == KEYWORD || dataType == TEXT) { | ||
str = BytesRefs.toString(value); | ||
} else if (dataType == VERSION && value instanceof BytesRef br) { | ||
str = new Version(br).toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an improvement? Or were Versions a readable String before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had some cases (mostly unit tests) where we passed plain strings to VERSION literals.
Even for tests it's not a good thing, as we are testing something that is different than the actual production execution.
@@ -398,7 +398,7 @@ eth0 |gamma |fe80::cae2:65ff:fece:feb9 | |||
lo0 |gamma |fe80::cae2:65ff:fece:feb9 |fe81::cae2:65ff:fece:feb9 | |||
; | |||
|
|||
cdirMatchEqualsInsOrsIPs | |||
cicdirMatchEqualsInsOrsIPs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cicdirMatchEqualsInsOrsIPs | |
cidrMatchEqualsInsOrsIPs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I would replace every cast+utf8ToString() to BytesRefs.toString()
, if possible
@@ -182,8 +183,10 @@ private static String readEvaluator() { | |||
} | |||
|
|||
private static String stringEvaluator(ToIp.LeadingZeros leadingZeros) { | |||
if (leadingZeros == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this was here before, and in jdk21 the case null
works (right?), do we need this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My IntelliJ (Java 21) keeps complaining, but maybe it's some wrong setting on my side
if (value instanceof String) { | ||
return false; | ||
} | ||
if (value instanceof Collection<?> c) { | ||
return c.stream().allMatch(x -> noPlainStrings(x, dataType)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: consider pattern matching here
if (value instanceof String) { | |
return false; | |
} | |
if (value instanceof Collection<?> c) { | |
return c.stream().allMatch(x -> noPlainStrings(x, dataType)); | |
} | |
return switch (value) { | |
case String s -> false; | |
case Collection<?> c -> c.stream().allMatch(x -> noPlainStrings(x, dataType)); | |
default -> true; | |
} |
} | ||
|
||
public static Literal l(Object value, DataType type) { | ||
if (value instanceof String && (type == DataType.TEXT || type == DataType.KEYWORD)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: type == DataType.TEXT || type == DataType.KEYWORD
theck looks a bit cheaper. Do you mind making it first (before instanceof)?
@@ -39,6 +42,10 @@ public static Literal of(Source source, Object value) { | |||
if (value instanceof Literal) { | |||
return (Literal) value; | |||
} | |||
DataType type = DataType.fromJava(value); | |||
if (value instanceof String && (type == TEXT || type == KEYWORD)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
@@ -116,7 +117,7 @@ public ExpressionEvaluator.Factory toEvaluator(ToEvaluator toEvaluator) { | |||
Object folded = field.fold(toEvaluator.foldCtx()); | |||
if (folded instanceof Integer num) { | |||
checkNumber(num); | |||
return toEvaluator.apply(new Literal(source(), " ".repeat(num), KEYWORD)); | |||
return toEvaluator.apply(new Literal(source(), BytesRefs.toBytesRef(" ".repeat(num)), KEYWORD)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to repeat a lot, I wonder if it is worth introducing:
pusblic static Literal.keyword(Source source, String literal) {
return new Literal(source, BytesRefs.toBytesRef(literal), KEYWORD);
}
to simplify construction of string literals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I think it makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Thanks for the feedback folks! |
💔 Backport failed
You can use sqren/backport to manually backport by running |
Manual backport #129702 |
Fixes: #129322
Making sure that all the
TEXT
,KEYWORD
andVERSION
Literals contain a BytesRef (and not a plainjava.lang.String
).The mix of the two was the reason for some ClassCastExceptions, eg. #129322
Steps: