ESQL: Load values a different way #101235
Pinging @elastic/es-ql (Team:QL)
Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)
Hi @nik9000, I've created a changelog YAML for you.
Here's the speedup from decoding ordinals in a more sensible way:
One thing that's crying out to me - we really want to load from stored fields like
- match: { values.0.14: 20 }
- match: { values.0.15: null }
- match: { values.0.15: 20 }
These are mostly supported now. I suppose they should move out.
Wow, this is awesome.
Just use the ordinals properly and everything is faster! Seriously, though, if we had a BytesRefBlock that allowed an extra layer of indirection we could resolve the ordinals one time and let them flow through the system. That'd be pretty sweet!
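To make "resolve the ordinals one time" concrete, here is a minimal sketch of the general idea. It is not the code from this PR and it does not use `BytesRefBlock` or any Lucene API; every name in it (`OrdinalDecodeSketch`, `CountingDictionary`, `decodePerDoc`, `decodeOncePerOrdinal`) is hypothetical. It assumes each document carries an integer ordinal into a small term dictionary and that looking a term up is the expensive step.

```java
import java.util.function.IntFunction;

/** Hypothetical illustration only; none of these names exist in Elasticsearch. */
final class OrdinalDecodeSketch {

    /** A toy term dictionary that counts how often it is consulted. */
    static final class CountingDictionary implements IntFunction<String> {
        private final String[] terms;
        int lookups = 0;

        CountingDictionary(String... terms) {
            this.terms = terms;
        }

        @Override
        public String apply(int ord) {
            lookups++;
            return terms[ord];
        }
    }

    /** Naive decoding: one dictionary lookup per document. */
    static String[] decodePerDoc(int[] ords, IntFunction<String> dict) {
        String[] out = new String[ords.length];
        for (int i = 0; i < ords.length; i++) {
            out[i] = dict.apply(ords[i]);
        }
        return out;
    }

    /** Resolve each distinct ordinal once, then reuse the resolved term. */
    static String[] decodeOncePerOrdinal(int[] ords, int dictSize, IntFunction<String> dict) {
        String[] resolved = new String[dictSize];
        boolean[] seen = new boolean[dictSize];
        String[] out = new String[ords.length];
        for (int i = 0; i < ords.length; i++) {
            int ord = ords[i];
            if (!seen[ord]) {
                resolved[ord] = dict.apply(ord);
                seen[ord] = true;
            }
            out[i] = resolved[ord];
        }
        return out;
    }
}
```

With many documents and few distinct terms, the second method consults the dictionary once per distinct ordinal instead of once per document, which is the kind of saving the comment above is pointing at.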
Beautiful! I think we can leverage the sequential stored-fields reader after this change too. Thank you, Nik!
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/package-info.java
.../esql/compute/src/main/java/org/elasticsearch/compute/operator/exchange/ExchangeService.java
...l/compute/src/main/java/org/elasticsearch/compute/operator/exchange/ExchangeSinkHandler.java
...compute/src/main/java/org/elasticsearch/compute/operator/exchange/ExchangeSourceHandler.java
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/X-Block.java.st
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/ComputeService.java
.../plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/TransportEsqlQueryAction.java
 * {@link #beginPositionEntry} followed by two or more {@code append<Type>}
 * calls, and then {@link #endPositionEntry}.
 */
interface Builder {
I think we should make this Builder Releasable and release it if we hit a breaker when reading values. Let's do this in a follow-up.
++. I think we have to do it to properly track memory on blocks built by field loading. We just aren't doing that yet.
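For readers following the thread, here is a rough sketch of the builder protocol from the Javadoc above, combined with the "release it if something goes wrong" idea. It is an illustration under assumptions, not the `Builder` from this PR: the class name `MultiValueBuilderSketch`, the `appendInt` method, and the `List`-based storage are all hypothetical, and `AutoCloseable` stands in for Elasticsearch's `Releasable`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a block builder; not the real ESQL Builder. */
final class MultiValueBuilderSketch implements AutoCloseable {
    private final List<List<Integer>> positions = new ArrayList<>();
    private List<Integer> open = null; // non-null while inside a position entry
    private boolean closed = false;

    /** Appends a value: its own position, or part of the open multi-value entry. */
    MultiValueBuilderSketch appendInt(int value) {
        if (open != null) {
            open.add(value);
        } else {
            positions.add(List.of(value));
        }
        return this;
    }

    /** Starts a multi-valued position; expect two or more appends before the end. */
    MultiValueBuilderSketch beginPositionEntry() {
        if (open != null) {
            throw new IllegalStateException("already inside a position entry");
        }
        open = new ArrayList<>();
        return this;
    }

    /** Finishes the multi-valued position started by beginPositionEntry. */
    MultiValueBuilderSketch endPositionEntry() {
        if (open == null) {
            throw new IllegalStateException("no position entry to end");
        }
        positions.add(List.copyOf(open));
        open = null;
        return this;
    }

    /** Returns an immutable snapshot of everything appended so far. */
    List<List<Integer>> build() {
        if (open != null) {
            throw new IllegalStateException("unfinished position entry");
        }
        return List.copyOf(positions);
    }

    /** Drops buffered state; a real builder would give memory back to a breaker here. */
    @Override
    public void close() {
        positions.clear();
        open = null;
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}
```

Wrapping the builder in try-with-resources means `close()` runs even if an append throws partway through, which is roughly the behavior being asked for when a circuit breaker trips during loading.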
I'd love to rework stored fields a bit before trying that. I think there should be a "load stored fields" operator. And we can do that by expanding
I talked to @dnhatn and @ChrisHegarty about the funny The interface we expose is fairly small. Not actually small. But, like, small-ish. We may one day pull
This removes a no longer used java file from ESQL. We stopped using it in elastic#101235.
This also significantly lowered the per-field overhead of loading fields. Especially empty fields. It turns out that sometimes this is a huge deal.
This changes how we load values in ESQL, delegating to the `MappedFieldType` like we do with doc values and synthetic source. This allows a much more OO way of getting the loads working, which makes that path much easier to read. And! It means those code paths look like doc values. So there's symmetry. It's like it rhymes.

There are a few side effects here:
- `_source`. Everything that can be loaded from `_source` in scripts will load from `_source` in ESQL.
- `_source` no longer sorts the fields. Same for stored fields. Now we keep them in whatever order they were stored in. This is a pretty marginal time save because loading from `_source` is so much more time consuming than the sort. But it's something.
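
The "delegate to the field type" shape described above can be sketched roughly as follows. This is a toy model, not the `MappedFieldType` API: `FieldLoadingSketch`, `Doc`, `Loader`, `FieldType`, `DocValuesBacked`, and `SourceBacked` are all hypothetical names, and a document is reduced to two in-memory maps. The point is only that the caller asks each field type for its loader instead of branching on where the value lives, and that results keep the order in which fields were requested rather than being sorted.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of field-type-driven loading; not Elasticsearch code. */
final class FieldLoadingSketch {

    /** A document reduced to two maps: columnar doc values and the original _source. */
    static final class Doc {
        final Map<String, Object> docValues;
        final Map<String, Object> source;

        Doc(Map<String, Object> docValues, Map<String, Object> source) {
            this.docValues = docValues;
            this.source = source;
        }
    }

    /** Knows how to read one field's value out of a document. */
    interface Loader {
        Object load(Doc doc);
    }

    /** Each field type decides for itself where its values come from. */
    interface FieldType {
        String name();

        Loader loader();
    }

    /** Field type whose values are read from doc values. */
    static final class DocValuesBacked implements FieldType {
        private final String name;

        DocValuesBacked(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public Loader loader() {
            return doc -> doc.docValues.get(name);
        }
    }

    /** Field type whose values are read from _source. */
    static final class SourceBacked implements FieldType {
        private final String name;

        SourceBacked(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public Loader loader() {
            return doc -> doc.source.get(name);
        }
    }

    /** The caller never branches on storage; it asks each field type for a loader. */
    static Map<String, Object> loadAll(List<? extends FieldType> fields, Doc doc) {
        Map<String, Object> row = new LinkedHashMap<>(); // keeps request order, no sorting
        for (FieldType field : fields) {
            row.put(field.name(), field.loader().load(doc));
        }
        return row;
    }
}
```

Adding another storage strategy in this model means adding another `FieldType` implementation, leaving `loadAll` untouched.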