Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle more arithmetic operators for native histograms #12262

Merged
merged 14 commits into from May 16, 2023

Conversation

zenador
Copy link
Contributor

@zenador zenador commented Apr 13, 2023

Addresses #12250

Works in unit tests and when manually testing on the UI with scalars and float metrics. This fixes the issue we observed with query sharding histograms with instant query splitting, however the float imprecision accumulates and the final result is very different in unit tests (tolerance was originally 1e-12 for floats but the histogram ones only passed with 1e-2).

Also added avg for native histograms as in #11973, to be discussed there.

@zenador zenador force-pushed the histogram-promql-operators branch 2 times, most recently from 390be44 to e9e2fa2 Compare April 21, 2023 18:12
@zenador zenador marked this pull request as ready for review April 21, 2023 18:43
promql/engine.go Outdated Show resolved Hide resolved
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
@zenador zenador force-pushed the histogram-promql-operators branch from 97ac586 to 47404c4 Compare May 5, 2023 12:01
…oats

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Just some not-so-substantial comments.

Special thanks for the testing. This should be much easier once #11170 is done.

Two other things to do:

  1. Amend the documentation to explain what works now.
  2. I think we can/should also fix other aggregation operators (stddev etc.). Some of them can be made work with histograms, others should ignore them cleanly (currently, we count them, which might throw some results off).

I think (1) needs to be included in this PR, but (2) can be done in a separate PR.

model/histogram/float_histogram.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
promql/engine.go Outdated Show resolved Hide resolved
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
@zenador zenador force-pushed the histogram-promql-operators branch from b6d895d to e5ebdb8 Compare May 11, 2023 18:35
@zenador
Copy link
Contributor Author

zenador commented May 11, 2023

Besides what you mentioned, noticed a few more issues:

  • Mul is an exact duplicate of Scale which is used for rate. Removed Mul. Please let me know if you prefer to rename all the Scales to Mul or something.
  • When adding, subtracting, and aggregating, we do not compact the histogram, ending up with ugly redundant buckets (but this problem doesn't happen with sum/avg_over_time, so we had a lot of redundancy in TestNativeHistogram_Sum_Count_Add_AvgOperator, with some expected histograms compacted and some not. I added the compact step to fix that, but not sure if this is unnecessary overhead if we don't actually care if the result is compacted.

Updated std var/dev to ignore histograms for now so it won't throw the results off for floats.

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
@zenador zenador force-pushed the histogram-promql-operators branch from e5ebdb8 to 31353ed Compare May 11, 2023 18:54
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
@beorn7
Copy link
Member

beorn7 commented May 11, 2023

Good catch with the Scale vs Mul. Now that we have Div and Add and Sub, maybe we should call Scale Mul indeed.

I'll get to the other points ASAP.

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
@zenador zenador force-pushed the histogram-promql-operators branch from 87ac94a to 012b2e7 Compare May 11, 2023 21:04
Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much. That's a huge chunk of work done.

I hope it's OK if I squash the 14 commits into one.

Comment on lines +336 to +340
All other operators (and unmentioned cases for the above operators) do not
behave in a meaningful way. They either treat the histogram sample as if it
were a float sample of value 0, or (in case of arithmetic operations between a
scalar and a vector) they leave the histogram sample unchanged. This behavior
will change to a meaningful one before native histograms are a stable feature.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this isn't quite true anymore (due to my earlier giant PR). The intention is that native histograms are just ignored, but it needs some research to verify if that's actually the case. Mostly saying this as a mental note to myself. We don't need to fix this paragraph in this PR.

@beorn7 beorn7 merged commit 191bf90 into prometheus:main May 16, 2023
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants