@not-napoleon
Member

@not-napoleon not-napoleon commented Nov 20, 2025

Wire up the t-digest field type to the ESQL CSV tests. Mostly this involves adding the element type and plumbing it through, with some additional work filling out the block builder behaviors.

@not-napoleon not-napoleon marked this pull request as ready for review November 26, 2025 21:20
@not-napoleon not-napoleon requested a review from a team as a code owner November 26, 2025 21:20
@kkrik-es
Contributor

More test failures?

),

/*
TDIGEST(
Contributor

Is this commented-out code intentionally in this PR?

}

@Override
public TDigestHolder getTDigestHolder(int offset, BytesRef scratch) {
Contributor

I don't see the point of passing in the BytesRef scratch here?
This method still allocates one TDigestHolder per entry when iterating over the block.
In fact, TDigestHolder is actually the bigger object compared to BytesRef.

I'd recommend starting simple and improving in a follow-up:
Don't pass a scratch in for now; simply allocate the BytesRef inside this method.
Later, add a custom type for your scratch (or make TDigestHolder reusable) and pass that in as the second argument.
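
The two steps suggested above could look roughly like the sketch below. SketchDigestHolder and SketchDigestBlock are hypothetical stand-ins, not the real TDigestHolder or block classes, and the fields they carry are an assumption made only to show the reuse pattern.

```java
// Hypothetical stand-in for TDigestHolder: a mutable view that can be re-pointed at new data.
final class SketchDigestHolder {
    byte[] encoded;
    long count;

    void set(byte[] encoded, long count) {
        this.encoded = encoded;
        this.count = count;
    }
}

// Hypothetical stand-in for the block; it stores one encoded digest and one count per position.
final class SketchDigestBlock {
    private final byte[][] encoded;
    private final long[] counts;

    SketchDigestBlock(byte[][] encoded, long[] counts) {
        this.encoded = encoded;
        this.counts = counts;
    }

    // Step 1 (start simple): no scratch argument, one fresh holder per call.
    SketchDigestHolder get(int offset) {
        return get(offset, new SketchDigestHolder());
    }

    // Step 2 (follow-up): the caller owns the holder and reuses it across positions,
    // so iterating over the block allocates nothing per entry.
    SketchDigestHolder get(int offset, SketchDigestHolder scratch) {
        scratch.set(encoded[offset], counts[offset]);
        return scratch;
    }
}
```

With the second overload, a loop over the block creates one holder up front and passes it to every call; the caller must not keep a reference to the returned holder across iterations, since the next call overwrites it.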

return encodedDigest;
}

// TODO - compute these if they're not given? or do that at object creation time, maybe.
Contributor

Is there any possibility that these are not populated?
I think if users don't provide them at ingest time, we estimate them then.
This means that at query time, they are always present?

I'd recommend adding an assert assertInvariants() call to the TDigestArrayBlock constructor, plus a corresponding assertInvariants() method similar to the one I added for exponential histogram blocks.
In that method, I'd check that min, max, sum and count are present or null according to our expectations.

Having this check there saved me quite a bit of headache during the exponential histogram work.
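
A minimal sketch of that pattern, under loud assumptions: SketchSummaryBlock is a hypothetical class, not TDigestArrayBlock, and the rule it enforces (min, max, sum and count are all present exactly when the digest at that position is non-null) is only one guess at what the expected invariant might be.

```java
// Hypothetical block; boxed arrays are used only so that "null" is representable in a sketch.
final class SketchSummaryBlock {
    private final byte[][] digests; // a null entry marks a null position
    private final Double[] mins;
    private final Double[] maxes;
    private final Double[] sums;
    private final Long[] counts;

    SketchSummaryBlock(byte[][] digests, Double[] mins, Double[] maxes, Double[] sums, Long[] counts) {
        this.digests = digests;
        this.mins = mins;
        this.maxes = maxes;
        this.sums = sums;
        this.counts = counts;
        // Skipped entirely when assertions are disabled (no -ea), active in tests.
        assert assertInvariants();
    }

    // Returns true so it can sit behind the assert keyword; violations throw with a message.
    boolean assertInvariants() {
        int n = digests.length;
        if (mins.length != n || maxes.length != n || sums.length != n || counts.length != n) {
            throw new AssertionError("all sub-arrays must have " + n + " positions");
        }
        for (int i = 0; i < n; i++) {
            boolean present = digests[i] != null;
            check(i, "min", mins[i] != null, present);
            check(i, "max", maxes[i] != null, present);
            check(i, "sum", sums[i] != null, present);
            check(i, "count", counts[i] != null, present);
        }
        return true;
    }

    private static void check(int position, String name, boolean actual, boolean expected) {
        if (actual != expected) {
            throw new AssertionError(name + " at position " + position + " should be " + (expected ? "present" : "null"));
        }
    }
}
```

Because the method returns a boolean and is only invoked through assert, production runs pay nothing for it, while any test that builds an inconsistent block fails at construction time rather than somewhere downstream.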


public static TDigestHolder randomTDigest() {
// TODO: This is mostly copied from TDigestFieldMapperTests; refactor it.
Map<String, Object> value = new LinkedHashMap<>();
Contributor

Suggested change (delete this line):
- Map<String, Object> value = new LinkedHashMap<>();

Unused code.

Make sure we can even load tdigest data
required_capability: tdigest_field_type_basic_functionality

FROM tdigest_standard_index | KEEP @timestamp,instance;
Contributor

Suggested change
- FROM tdigest_standard_index | KEEP @timestamp,instance;
+ FROM tdigest_standard_index | WHERE responseTime IS NOT NULL | KEEP @timestamp,instance;

I don't think your original query would even load the responseTime field, would it?
If I'm wrong here, I'd be keen on an explanation.

*/
EXPONENTIAL_HISTOGRAM_PRE_TECH_PREVIEW_V4(EXPONENTIAL_HISTOGRAM_FEATURE_FLAG),

TDIGEST_FIELD_TYPE_BASIC_FUNCTIONALITY(T_DIGEST_ESQL_SUPPORT),
Contributor

FYI I got annoyed having to add a new capability every time I do a small change while still behind a feature flag.

That's why I changed my approach:
I now simply increment the version suffix of EXPONENTIAL_HISTOGRAM_PRE_TECH_PREVIEW_V4 on every change that would wrongly break the bwc tests. That helps with not ending up with a mess of capabilities which we have to clean up later before removing the feature flag.
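
Why bumping the suffix is enough can be seen in a simplified model of capability gating. This sketch assumes a test runs only when every node in the cluster advertises the required capability name; CapabilityGate and shouldRun are hypothetical names, not part of the ES|QL test framework.

```java
import java.util.List;
import java.util.Set;

// Hypothetical model: each node advertises a set of capability names,
// and a test that names a required capability runs only if all nodes have it.
final class CapabilityGate {
    static boolean shouldRun(String required, List<Set<String>> capabilitiesPerNode) {
        for (Set<String> nodeCapabilities : capabilitiesPerNode) {
            if (!nodeCapabilities.contains(required)) {
                return false; // at least one (older) node lacks it, so the test is skipped
            }
        }
        return true;
    }
}
```

In a mixed-version cluster, the old node still advertises the previous suffix, so a test that requires the bumped name is skipped there without introducing a second, separately named capability.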

Labels

>non-issue, :StorageEngine/ES|QL (Timeseries / metrics / logsdb capabilities in ES|QL), v9.3.0, WIP

3 participants