Implement `quantile` and `quantile_over_time`. #552
Comments
beorn7 added the feature-request label on Feb 23, 2015
beorn7 self-assigned this on Mar 3, 2015
@matthiasr just reported a use case for this. (Just adding it here to get an impression of how much this feature is actually needed...)
More requests for this have reached me. Since it is easy to implement, we should add it.
Just got one more request for this by email.
Yea, it'd be great for us for federation. Buckets and bucket-related metadata are easily aggregatable (just …
fabxc added the kind/enhancement label and removed the feature request label on Apr 28, 2016
For …
I would think so, for `_over_time`. Without weighting it's really `_over_samples`.
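To make the unweighted per-sample semantics concrete, here is a minimal Python sketch (my own illustration using the common linear-interpolation quantile definition, not actual Prometheus code): the quantile is taken over the raw sample values in a window, with each sample counting equally regardless of how the samples are spaced in time.

```python
def quantile_over_samples(q, values):
    """q-quantile of the raw sample values in a window, every sample
    counting equally -- the spacing between samples plays no role."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")
    vals = sorted(values)
    rank = q * (len(vals) - 1)          # fractional rank
    lower = int(rank)
    upper = min(lower + 1, len(vals) - 1)
    frac = rank - lower
    # Linear interpolation between the two closest ranks.
    return vals[lower] * (1 - frac) + vals[upper] * frac

print(quantile_over_samples(0.5, [3.0, 1.0, 2.0, 4.0]))  # 2.5
```

A time-weighted variant would instead have to weight each value by how long it was "in effect", which is exactly the design question debated below.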
I assume you mean that samples with a larger gap to other samples should have a stronger weight?
Yes, that's the question.
Sounds good to me, but it might be tricky to implement at the start and end of a series. I guess we need a heuristic similar to the one used for the `rate` function.
Previously you were against such things for …
The use cases I saw so far for …

But frankly, if I take the SoundCloud use case, there are very few uses of …

Different aspect of the …

In the …

Conclusion: I think we need to go back to use cases for …
One major use case for …: We have a recording rule that, for every system, records whether it is "available" (within acceptable latency and error bounds). We record the availability over a time period from this using …
For clarity, the "is available" metric is 0/1 valued, so the …
The other question is whether the two methods will produce notably different results in standard use cases. For a small gap, weighting is unlikely to make much of a difference over a day/week, whereas the semantics to use for, say, an hour-long gap will depend on the use case. This may be a case where going with simplicity is the best option.
@matthiasr Good point with our SLO calculation. The query is in a different repo, which is why I didn't find it in my little survey. That's probably the most legitimate use case, and you are right: if rule evaluations were irregular, a weighted average would be in order.

@brian-brazil Indeed, I assume that in regular cases the results wouldn't differ, as a constant scrape and rule-evaluation period is the regular case. One might argue that any irregularity could be seen as "no data", so that data around a gap should not get more weight.

Let's expand on that: Imagine we have a 1h gap in our availability metric (which is 0 or 1, depending on whether a microservice considers itself within SLO or not). The 1h gap was created because the collecting Prometheus server had a downtime. If our query now averages over the whole day (to get availability for that day), we would currently simply disregard the 1h gap. With weighting of the samples, the last sample before the gap and the first sample after the gap would each get the weight of half an hour, while all the other samples only have the weight of the usual rule-evaluation period (e.g. 30s). If one of those samples happens to be exceptional (like the only 0 among all 1s otherwise), it would bias the result.

The opposite case would be a change of the rule-evaluation period during the averaging interval, e.g. 1m until noon, 30s after noon. Now it would be sane to give the samples before noon double weight in the average.

A possible approach to cover both cases would be to apply the start/stop heuristics within the interval, too, i.e. treat a single longer gap as a real gap while compensating for changes in sample spacing that look more systematic. But that would be a lot of complexity. At this point, I tend towards simplicity, i.e. declaring any changes in sample spacing as irregularities that the …
I'm for simplicity here too.
Sounds like we have agreement. I have created prometheus/docs#470 to clarify the behavior in the docs.
brian-brazil referenced this issue on Jul 8, 2016: Merged — Implement quantile and quantile_over_time #1799
brian-brazil closed this in #1799 on Jul 21, 2016
lock bot commented on Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
beorn7 commented Feb 23, 2015

Implement the `quantile` aggregator and the `quantile_over_time` function.

See https://docs.google.com/a/soundcloud.com/document/d/1uSenXRDjDaJLV3qnSD09GqgPdEEDPjER0mVsnGaCYF0/edit#heading=h.vuk4s3dgrhjr