T digest csv test support #138391

not-napoleon · 2025-11-20T22:28:02Z

Wire up the t-digest field type to the ESQL CSV tests. Mostly this involves adding the element type and plumbing it through, with some additional work filling out the block builder behaviors.

kkrik-es · 2025-11-27T13:09:42Z

More test failures?

JonasKunz · 2025-11-27T14:15:00Z

x-pack/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/type/DataType.java

    ),

+    /*
+    TDIGEST(


Is this commented-out code intentionally in this PR?

JonasKunz · 2025-11-27T14:19:37Z

x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/TDigestArrayBlock.java

    }
+
+    @Override
+    public TDigestHolder getTDigestHolder(int offset, BytesRef scratch) {


I don't see the point of passing in the BytesRef scratch here.
This method still allocates one TDigestHolder per entry when iterating over the block.
In fact, TDigestHolder is the bigger object compared to BytesRef.

I'd recommend starting simple and improving in a follow-up:
Don't pass a scratch in; simply allocate the BytesRef in this method.
Then later add a custom type for your scratch (or make TDigestHolder reusable) and pass that in as the second argument.
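
To make the two steps concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: `ReusableDigestHolder` and `DigestBlockSketch` are simplified stand-ins, not the real TDigestHolder or TDigestArrayBlock, and the column layout is an assumption made only for illustration.

```java
// Hypothetical stand-in for TDigestHolder, made reusable: one instance can be
// re-pointed at a new entry instead of allocating a holder per position.
final class ReusableDigestHolder {
    byte[] encodedDigest;
    double min;
    double max;
    double sum;
    long count;

    ReusableDigestHolder reset(byte[] encodedDigest, double min, double max, double sum, long count) {
        this.encodedDigest = encodedDigest;
        this.min = min;
        this.max = max;
        this.sum = sum;
        this.count = count;
        return this;
    }
}

// Hypothetical stand-in for the block: one column per component (assumed layout).
final class DigestBlockSketch {
    private final byte[][] encodedDigests;
    private final double[] mins;
    private final double[] maxes;
    private final double[] sums;
    private final long[] counts;

    DigestBlockSketch(byte[][] encodedDigests, double[] mins, double[] maxes, double[] sums, long[] counts) {
        this.encodedDigests = encodedDigests;
        this.mins = mins;
        this.maxes = maxes;
        this.sums = sums;
        this.counts = counts;
    }

    int positionCount() {
        return counts.length;
    }

    // Step 1 (start simple): no scratch parameter, allocate a holder per call.
    ReusableDigestHolder get(int offset) {
        return get(offset, new ReusableDigestHolder());
    }

    // Step 2 (follow-up): the caller owns the scratch, so a loop allocates only once.
    ReusableDigestHolder get(int offset, ReusableDigestHolder scratch) {
        return scratch.reset(encodedDigests[offset], mins[offset], maxes[offset], sums[offset], counts[offset]);
    }

    // Example loop: sums the counts across the block using a single scratch holder.
    static long totalCount(DigestBlockSketch block) {
        ReusableDigestHolder scratch = new ReusableDigestHolder();
        long total = 0;
        for (int i = 0; i < block.positionCount(); i++) {
            total += block.get(i, scratch).count;
        }
        return total;
    }
}
```

The trade-off of the second variant is the usual one for scratch objects: the returned holder is only valid until the next call with the same scratch, so callers that need to keep a value must copy it.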

JonasKunz · 2025-11-27T14:25:56Z

x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/TDigestHolder.java

+        return encodedDigest;
+    }
+
+    // TODO - compute these if they're not given? or do that at object creation time, maybe.


Is there any possibility that these are not populated?
I think if users don't provide them at ingest time, we estimate them then.
Does this mean that at query time they are always present?

I'd recommend adding an assert assertInvariants() call to the TDigestArrayBlock constructor, plus the corresponding assertInvariants() method, similar to the one I added for exponential histogram blocks.
In that method, I'd check that min, max, sum and count are present or null according to our expectations.

Having this check there saved me quite a bit of headache during the exponential histogram work.
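
A rough sketch of the pattern, under loudly stated assumptions: `DigestColumnsSketch` is a hypothetical stand-in for the block, boxed arrays with null entries model absent values, and the invariants listed in the comment are guesses for illustration, not the agreed contract for t-digest blocks. The check returns a boolean so the constructor can wrap it in `assert`, which is the same call shape as `assert assertInvariants()`.

```java
// Hypothetical stand-in for the block's component columns.
// A null entry models a value that is absent at that position.
final class DigestColumnsSketch {
    private final Long[] counts;
    private final Double[] mins;
    private final Double[] maxes;
    private final Double[] sums;

    DigestColumnsSketch(Long[] counts, Double[] mins, Double[] maxes, Double[] sums) {
        this.counts = counts;
        this.mins = mins;
        this.maxes = maxes;
        this.sums = sums;
        // Only evaluated when assertions are enabled (java -ea), so it costs nothing otherwise.
        assert invariantsHold(counts, mins, maxes, sums) : "t-digest block components are inconsistent";
    }

    // ASSUMED invariants, for illustration only:
    //  - all columns have the same length
    //  - count and sum are always present, and count is non-negative
    //  - min and max are present exactly when count > 0, and then min <= max
    static boolean invariantsHold(Long[] counts, Double[] mins, Double[] maxes, Double[] sums) {
        if (counts.length != mins.length || counts.length != maxes.length || counts.length != sums.length) {
            return false;
        }
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] == null || sums[i] == null || counts[i] < 0) {
                return false;
            }
            if (counts[i] == 0) {
                // An empty digest has no extremes.
                if (mins[i] != null || maxes[i] != null) {
                    return false;
                }
            } else if (mins[i] == null || maxes[i] == null || mins[i] > maxes[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Whatever the real rules turn out to be, encoding them in one place means a malformed block fails at construction time in tests rather than surfacing later as a confusing query result.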

JonasKunz · 2025-11-27T14:26:50Z

...ck/plugin/esql/compute/test/src/main/java/org/elasticsearch/compute/test/BlockTestUtils.java


+    public static TDigestHolder randomTDigest() {
+        // TODO: This is mostly copied from TDigestFieldMapperTests; refactor it.
+        Map<String, Object> value = new LinkedHashMap<>();


Suggested change

Map<String, Object> value = new LinkedHashMap<>();

unused code

JonasKunz · 2025-11-27T14:31:09Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/tdigest.csv-spec

+Make sure we can even load tdigest data
+required_capability: tdigest_field_type_basic_functionality
+
+FROM tdigest_standard_index | KEEP @timestamp,instance;


Suggested change

FROM tdigest_standard_index | KEEP @timestamp,instance;

FROM tdigest_standard_index | WHERE responseTime IS NOT NULL | KEEP @timestamp,instance;

I don't think that your original query would even load the responseTime, would it?
If I'm wrong here, I'd appreciate an explanation.

JonasKunz · 2025-11-27T14:33:18Z

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java

         */
        EXPONENTIAL_HISTOGRAM_PRE_TECH_PREVIEW_V4(EXPONENTIAL_HISTOGRAM_FEATURE_FLAG),

+        TDIGEST_FIELD_TYPE_BASIC_FUNCTIONALITY(T_DIGEST_ESQL_SUPPORT),


FYI, I got annoyed at having to add a new capability every time I make a small change while still behind a feature flag.

That's why I changed my approach:
I now simply increment the version suffix of EXPONENTIAL_HISTOGRAM_PRE_TECH_PREVIEW_V4 on every change that would otherwise wrongly break the bwc tests. That avoids ending up with a mess of capabilities that we have to clean up later before removing the feature flag.

not-napoleon added 3 commits November 20, 2025 09:51

wiring up CSV tests (in progress)

8c41366

I'm not proud of this, but it works and it only took an hour

b63668b

wire up the parser, now that it's visible

c205ddb

not-napoleon added >non-issue WIP :StorageEngine/ES|QL Timeseries / metrics / logsdb capabilities in ES|QL v9.3.0 labels Nov 20, 2025

not-napoleon mentioned this pull request Nov 20, 2025

Add support for storing and querying T-Digest sketches #137649

Open

42 tasks

elasticsearchmachine and others added 16 commits November 20, 2025 22:36

[CI] Auto commit changes from spotless

0b3b329

checkpoint, doesn't compile

f6022dc

checkpoint, doesn't compile

fc5e497

Merge branch 'main' into t-digest-csv-test-support

0105c0b

Conflicts: x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java

element type wired up enough to compile the tests

7108827

escape commas in the CSV

bc252a5

[CI] Auto commit changes from spotless

40e6415

minor test tweaks that don't fix anything

bb7174b

parsing works. It's awful, but it works

840cc55

better parsing solution, thanks Dima!

7ea1073

actually implement building the block

bd72b1a

Merge remote-tracking branch 'refs/remotes/not-napoleon/t-digest-csv-…

d7f463d

…test-support' into t-digest-csv-test-support

[CI] Auto commit changes from spotless

f0a629e

turn no commits into todos

b89bcef

Merge remote-tracking branch 'refs/remotes/not-napoleon/t-digest-csv-…

24e0bf9

…test-support' into t-digest-csv-test-support

Merge branch 'main' into t-digest-csv-test-support

ae17da7

not-napoleon marked this pull request as ready for review November 26, 2025 21:20

not-napoleon requested a review from a team as a code owner November 26, 2025 21:20

not-napoleon requested review from JonasKunz and kkrik-es November 26, 2025 22:12

JonasKunz reviewed Nov 27, 2025

3 participants
