ESQL: Fuse MV_MIN and MV_MAX and document process #138029

nik9000 · 2025-11-13T15:28:27Z

Fuses the MV_MIN and MV_MAX functions into block loading for a few types, and documents the fusion process so folks doing similar work have a recipe to follow. Like LENGTH, this is not often going to be a super hot function, but it is an "obvious" one to fuse because it reduces the data we load and it's fairly easy to implement on top of the loaders we already have. Queries like:

FROM test | STATS SUM(MV_MAX(i))

over ten million docs go from ~84ms to ~56ms. So, like, 33% improvement.
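
To make "fusing" concrete, here is a minimal sketch of the idea in plain Java. It is not the actual BlockLoader code: `FusedMvMaxSketch`, `sumOfMaxUnfused`, and `sumOfMaxFused` are hypothetical names made up for illustration, and the sketch assumes each doc's values arrive sorted ascending, so the max is simply the last value.

```java
import java.util.List;

/** Hypothetical illustration only; none of these names exist in Elasticsearch. */
public class FusedMvMaxSketch {

    /** Unfused: copy every value of every doc into a block, then run MV_MAX, then SUM. */
    public static long sumOfMaxUnfused(List<long[]> docs) {
        // Step 1: the "loader" materializes all values, even ones we will throw away.
        long[][] block = new long[docs.size()][];
        for (int d = 0; d < docs.size(); d++) {
            block[d] = docs.get(d).clone();
        }
        // Step 2: a separate MV_MAX pass reduces each position to one value.
        long sum = 0;
        for (long[] values : block) {
            if (values.length == 0) {
                continue; // a doc with no values contributes nothing
            }
            long max = values[0];
            for (long v : values) {
                max = Math.max(max, v);
            }
            sum += max;
        }
        return sum;
    }

    /** Fused: the "loader" itself emits one value per doc and never copies the rest. */
    public static long sumOfMaxFused(List<long[]> docs) {
        long sum = 0;
        for (long[] sortedValues : docs) {
            if (sortedValues.length == 0) {
                continue;
            }
            // Assumption: values are sorted ascending, so the last one is the max.
            sum += sortedValues[sortedValues.length - 1];
        }
        return sum;
    }

    public static void main(String[] args) {
        List<long[]> docs = List.of(new long[] { 1, 5, 9 }, new long[] {}, new long[] { 3 }, new long[] { -4, 2 });
        System.out.println(sumOfMaxUnfused(docs) + " " + sumOfMaxFused(docs));
    }
}
```

Both paths produce the same sum; the fused one just skips the intermediate copy, which is the "reduces the data we load" part of the description above.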

Adds REST tests for the `percentiles_bucket` pipeline bucket aggregation. This gives us forwards and backwards compatibility tests for these aggs as well as mixed version cluster tests for these aggs. Relates to elastic#26220

elasticsearchmachine · 2025-11-13T15:28:53Z

Hi @nik9000, I've created a changelog YAML for you.

nik9000 · 2025-11-13T15:29:05Z

I've flipped this to draft because I haven't done a full survey of the csv-spec functions for coverage around this issue.

elasticsearchmachine · 2025-11-24T18:49:03Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

nik9000 · 2025-11-25T13:13:37Z

I've removed this from draft because I've added a bunch more MV_MIN and MV_MAX tests. We could really use more around half_float, short, float, and byte, which I added as a note to #137679.

martijnvg

I really like how this works and it gives a very good speedup. I left just one comment, and someone else should review too. But other than that LGTM

martijnvg · 2025-11-25T13:30:33Z

muted-tests.yml

 - class: org.elasticsearch.xpack.esql.heap_attack.HeapAttackLookupJoinIT
  method: testLookupExplosionBigString
  issue: https://github.com/elastic/elasticsearch/issues/138510
+- class: org.elasticsearch.xpack.esql.qa.single_node.GenerativeForkIT


Is this intended to be muted?

Yes. Sorry. Had a pending comment explaining it that I hadn't posted.

Those queries are now "optimized incorrectly". I've got one fix for the rule in #138544, so I'm going to get that and this in and then work on that query. I'm calling this mute another thing that blocks release of the field fusion work.

nik9000 · 2025-11-25T13:14:36Z

muted-tests.yml

  issue: https://github.com/elastic/elasticsearch/issues/138510
+- class: org.elasticsearch.xpack.esql.qa.single_node.GenerativeForkIT
+  method: test {csv-spec:inlinestats.MvMinMvExpand}
+  issue: https://github.com/elastic/elasticsearch/issues/137679


This is tracked in the meta-issue blocking deployment of pushing functions to field loads: #137679

carlosdelest

Very very nice docs. Thank you.

carlosdelest · 2025-11-25T17:32:46Z

server/src/main/java/org/elasticsearch/index/mapper/BlockLoader.java

- * Interface for loading data in a block shape. Instances of this class
- * must be immutable and thread safe.
+ * Loads values from a chunk of lucene documents into a "Block" for the compute engine.
+ * <p>


My past self would have loved this. Thanks for the comments, will help my future self when my present self forgets about this.

carlosdelest · 2025-11-25T17:48:39Z

...java/org/elasticsearch/xpack/esql/expression/function/blockloader/BlockLoaderExpression.java

+ *     <li>{@code STATS} or another function - these <strong>won't</strong> use your new code</li>
+ * </ul>
+ * <p>
+ *     It's <strong>fairly</strong> likely we already have tests for all these cases.


What about FORK, LOOKUP JOIN and subqueries? Are we expecting them to work?

I think that, once we fix FORK, LOOKUP JOIN, and subqueries, the same solution will apply to pushing all functions. So we won't need a test for every single function. We'll need some with a couple of representative functions. But I wrote this from the perspective of someone writing a new fused expression a few weeks from now. That's..... aspirational!

carlosdelest · 2025-11-25T17:56:33Z

...rc/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/EsqlScalarFunction.java

+ *     "fused" into field loading using {@link BlockLoaderExpression}. Functions like {@code V_COSINE}
+ *     can use the vector search index to compute the result. Functions like {@code MV_MIN} can


This is not exactly true - we avoid loading the vector into blocks by calculating just the single float coming out of a similarity function. The function is not calculated as part of the data structure.

Oh, I thought that was the plan and hadn't looked super close. I remember Ben saying something about being able to push down to lucene somehow if we knew the doc ids were in ascending order.
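
For readers following this thread, a minimal sketch of the distinction being drawn: instead of handing the whole vector to the compute engine, the load step emits only the single similarity float per doc. This is plain Java with hypothetical names (`CosinePerDocSketch`, `cosine`, `similarityBlock`); it assumes in-memory `float[]` vectors and is not how the real loader reads them.

```java
/** Hypothetical illustration only; not the Elasticsearch vector loading code. */
public class CosinePerDocSketch {

    /** Standard cosine similarity; returns NaN if either vector is all zeros. */
    public static float cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0;
        double normA = 0;
        double normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (float) (dot / Math.sqrt(normA * normB));
    }

    /** Emits one float per doc; the stored vectors never reach the caller. */
    public static float[] similarityBlock(float[][] storedVectors, float[] query) {
        float[] out = new float[storedVectors.length];
        for (int d = 0; d < storedVectors.length; d++) {
            out[d] = cosine(storedVectors[d], query);
        }
        return out;
    }
}
```

The output has one entry per doc regardless of the vector dimension, which is where the savings would come from.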

nik9000 added 8 commits June 24, 2022 14:13

REST tests for percentiles_bucket agg

5a06fcc

Adds REST tests for the `percentiles_bucket` pipeline bucket aggregation. This gives us forwards and backwards compatibility tests for these aggs as well as mixed version cluster tests for these aggs. Relates to elastic#26220

ESQL: Fush MV_MIN and MV_MAX into field loading

bc37bb1

Merge branch 'main' into esql_fuse_next

6581d4e

explain

d485863

Merge branch 'main' into esql_fuse_next

9e677d3

More

2ff3ca3

Instructions for BlockLoader

3c46443

Docs

f7517f4

nik9000 added >feature :Analytics/ES|QL AKA ESQL v9.3.0 labels Nov 13, 2025

nik9000 marked this pull request as draft November 13, 2025 15:28

Update docs/changelog/138029.yaml

cd46d68

nik9000 added 10 commits November 17, 2025 15:32

Merge branch 'main' into esql_fuse_next

3e1ea1b

words

4962474

Merge branch 'main' into esql_fuse_next

c32fe1a

Fix merge

6b9962b

Merge branch 'main' into esql_fuse_next

e80d609

Merge branch 'main' into esql_fuse_next

1b65487

Merge branch 'main' into esql_fuse_next

91bcf0b

More tests

5273bdb

Merge branch 'main' into esql_fuse_next

ad33027

Merge remote-tracking branch 'nik9000/esql_fuse_next' into esql_fuse_next

a2c7425

nik9000 marked this pull request as ready for review November 24, 2025 18:48

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Nov 24, 2025

nik9000 added 2 commits November 24, 2025 17:06

Merge branch 'main' into esql_fuse_next

741f80a

Merge branch 'main' into esql_fuse_next

7e78824

nik9000 requested review from carlosdelest, limotova and martijnvg November 25, 2025 13:11

martijnvg approved these changes Nov 25, 2025

nik9000 commented Nov 25, 2025

nik9000 added 5 commits November 25, 2025 09:19

Merge branch 'main' into esql_fuse_next

3806255

Merge remote-tracking branch 'nik9000' into esql_fuse_next

94cb7ca

Merge remote-tracking branch 'nik9000/esql_fuse_next' into esql_fuse_next

36a4c3a

Remove extra

5b80d78

Merge branch 'main' into esql_fuse_next

bd86626

nik9000 enabled auto-merge (squash) November 25, 2025 16:39

nik9000 added 2 commits November 25, 2025 12:10

fixup

a61e7a7

Merge branch 'main' into esql_fuse_next

bdcf316

carlosdelest approved these changes Nov 25, 2025

nik9000 added 3 commits November 25, 2025 13:11

Merge branch 'main' into esql_fuse_next

c4008e8

Merge branch 'main' into esql_fuse_next

979c072

Merge remote-tracking branch 'nik9000/esql_fuse_next' into esql_fuse_next

57a2730

nik9000 disabled auto-merge November 25, 2025 18:14

nik9000 enabled auto-merge (squash) November 25, 2025 18:18

Undo accident

35e7cdc

didn't mean to have this in this PR

nik9000 merged commit e6c5dcc into elastic:main Nov 25, 2025
34 checks passed
