Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a histogram_quantiles function #1547

Closed
grobie opened this Issue Apr 12, 2016 · 11 comments

Comments

Projects
None yet
5 participants
@grobie
Copy link
Member

grobie commented Apr 12, 2016

We often calculate 3 quantiles for our metrics, 0.5, 0.9 and 0.99. Instead of having to duplicate definitions and metric evaluations, it'd be beneficial to have a variadic function to calculate all at once. That would save operations in rule evaluations.

histgram_quantiles(<vector>, ...<quantile>)

@grobie grobie changed the title [Feature Request] Add a histogram_quantiles function Add a histogram_quantiles function Apr 12, 2016

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Apr 12, 2016

I'm still open to add variadic arguments.
This of course reverses our regular argument order, which is not exactly intuitive.

This in itself won't be usable since histogram_quantiles doesn't attach a quantile label to the result metrics or similar.
So you couldn't differentiate between the different quantiles in your result.

@grobie

This comment has been minimized.

Copy link
Member Author

grobie commented Apr 12, 2016

This function would generate one timeseries with a label {quantile="<quantile>"} for each of the <quantile> arguments.

@matthiasr

This comment has been minimized.

Copy link
Contributor

matthiasr commented Apr 12, 2016

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Apr 12, 2016

I'm not sure about this.

Duplication in rules files isn't an argument, as that can be handled by configuration management. It'd also be the first function to return multiple results, which seems a bit messy to me.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Apr 12, 2016

I had the discussion about quantile labels with @beorn7 in the past.
I wanted to assign them in recording rules to be symmetric with summaries. Apparently there has been a discussion a long time ago that the label was a mistake in the first place. The argument being, that every label should be meaningful to aggregate over.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Apr 12, 2016

Indeed. I Find the {quantile="..."} label highly problematic. It's handy to plot multiple quantile graphs, but semantically, it would make more sense to have them in the label name (e.g. job:request_duration_seconds:median, job:request_duration_seconds:99perc).

Similar to @brian-brazil , I don't really have a problem with duplication in rules. We have that anyway, and deliberately.

I can see the argument of saving computational resources, but I'd only take that as a last resort. We had problems with rule evaluation times in the past (especially when it came to histograms), but with 0.18, that's pretty much solved (and further improvements are still possible with time-based indices).

As a more general note, I believe many users calculate quantiles in situations where they really want something like "percent of queries served within 100ms" (or an Apdex score, which is very similar). Those numbers are way more natural to calculate with Prometheus, and often more meaningful. I assume that the love for quantiles comes from a history where quantiles were more natural to compute than an Apdex score. Rather than making it easy to mass-calculate quantiles, I'd first like to encourage people to think if they really want a quantile.

@grobie

This comment has been minimized.

Copy link
Member Author

grobie commented Apr 12, 2016

You're right. I got so used to our quantile labels that it never occurred to me that they violate our label guideline.

@grobie grobie closed this Apr 12, 2016

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Apr 12, 2016

@beorn7 by now I agree that it's better off in the name.

Since client_golang will experience breaking changes, would it make sense to switch summaries from the label to the :99perc suffix? The asymmetry is weird to me – and if it's bad practice anyway...

I realize that's quite invasive. But most other clients don't even have client-side percentiles so it wouldn't cause too much friction across languages.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Apr 12, 2016

:99perc suffix

Colons should only really be added when aggregating, as we don't know the relevant aggregation levels down in instrumentation. _99p or something would be better.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Apr 12, 2016

The quantile labels are everywhere. They are essentially the way we flatten the complex "summary" metric type into individual time series:

  • See summary representation of the text format.
  • See how the Prometheus server breaks down a summary into time series.

So we needed to change many places. The "proper" solution would be to make complex metric types like "summary" and "histogram" first class citizens (and make the flattening into elementary time series an internal implementation detail).

@lock

This comment has been minimized.

Copy link

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 24, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.