Contributor

@JonasKunz JonasKunz commented Nov 20, 2025

Followup for #138177.

Prior to this PR, we defined the SUM of an empty histogram to be 0.0 in ES|QL.
As a result, computing the average as SUM/COUNT = 0.0/0.0 rightfully yielded a division-by-zero warning.
Empty histograms are an edge case, but they will become more common once we support cumulative histograms:
the "delta" of an unchanged counter histogram will then be an empty histogram.

In this PR, we change the definition of SUM to be null for empty histograms so that the average computation is null/0.0 instead, which correctly returns null without warnings.
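
To illustrate the difference, here is a minimal Java sketch. It is not the actual ES|QL code: the class and method names (`EmptyHistogramAverage`, `sumOld`, `sumNew`, `divide`) are made up for this example, and the division semantics (a null operand propagates silently, a zero divisor produces null plus a warning) are an assumption based on the behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names exist in Elasticsearch.
public class EmptyHistogramAverage {

    // Old definition: the sum of an empty histogram is 0.0.
    static Double sumOld(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    // New definition: the sum of an empty histogram is null.
    static Double sumNew(double[] values) {
        return values.length == 0 ? null : sumOld(values);
    }

    // Assumed division semantics: a null operand propagates silently,
    // while a zero divisor yields null and records a warning.
    static Double divide(Double sum, long count, List<String> warnings) {
        if (sum == null) {
            return null;
        }
        if (count == 0) {
            warnings.add("division by zero");
            return null;
        }
        return sum / count;
    }

    public static void main(String[] args) {
        double[] empty = new double[0];
        List<String> warnings = new ArrayList<>();

        // Old behavior: 0.0 / 0 records a warning.
        Double oldAvg = divide(sumOld(empty), empty.length, warnings);
        System.out.println("old avg=" + oldAvg + ", warnings=" + warnings);

        // New behavior: null / 0 short-circuits to null without a warning.
        warnings.clear();
        Double newAvg = divide(sumNew(empty), empty.length, warnings);
        System.out.println("new avg=" + newAvg + ", warnings=" + warnings);
    }
}
```

In both cases the average is null; the only difference is that the new definition no longer produces a warning for empty histograms.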

To stay efficient, we also apply this change to the doc values when we store exponential histograms. This saves us from doing any conversion on the sum block when loading: we can load it as-is, because the invariants of the doc values and the block match.

Because this changes the invariants of the block, it is inherently backwards-incompatible. This is not a problem because the field is hidden behind a feature flag, but we still run BWC tests with feature flags enabled.
For this reason, I replaced the existing ES|QL capabilities with a single new one, which essentially disables the BWC tests for this feature against all versions prior to this change.

@elasticsearchmachine elasticsearchmachine added the v9.3.0 and external-contributor labels Nov 20, 2025
@JonasKunz JonasKunz changed the title from "Exphisto null sum" to "Exponential histograms: Define sum of empty histograms as null instead of 0.0" Nov 20, 2025
@JonasKunz JonasKunz changed the title from "Exponential histograms: Define sum of empty histograms as null instead of 0.0" to "Exponential histograms: define sum of empty histograms as null instead of 0.0" Nov 20, 2025
@JonasKunz JonasKunz changed the title from "Exponential histograms: define sum of empty histograms as null instead of 0.0" to "Define sum of empty exponential histograms as null instead of 0.0" Nov 20, 2025
@JonasKunz JonasKunz marked this pull request as ready for review November 20, 2025 11:19
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics label Nov 20, 2025
required_capability: enrich_load
required_capability: fix_replace_missing_field_with_null_duplicate_name_id_in_layout
required_capability: dense_vector_agg_metric_double_if_fns
required_capability: exponential_histogram
Contributor Author

This is not needed anymore, because in EsqlSpecTestCase we already ensure that we only load the histogram-CSV data into an index if the capability is present.
The same applies to lookup-join.csv-spec.

@JonasKunz JonasKunz requested a review from dnhatn November 20, 2025 11:21
Member

@dnhatn dnhatn left a comment

LGTM.

@JonasKunz JonasKunz merged commit 2d2362d into elastic:main Nov 21, 2025
34 checks passed
@JonasKunz JonasKunz deleted the exphisto-null-sum branch November 21, 2025 08:30
not-napoleon added a commit that referenced this pull request Nov 24, 2025
Part of #137988
Implement #138349 for the t-digest field type.

This changes the behavior of the sum sub-field of t-digest when the digest is empty. Prior to this PR, we treated it as 0; this PR changes it to null. This avoids a division-by-zero error in ESQL when calculating the average of an empty histogram. We are adopting the same behavior for the exponential histogram field (see the PR linked above), which is important for keeping the semantics of the two fields as close as possible.

Labels

:Analytics/ES|QL (AKA ESQL), external-contributor (pull request authored by a developer outside the Elasticsearch team), >non-issue, Team:Analytics (meta label for analytical engine team: ESQL/Aggs/Geo), v9.3.0