(prometheus) Added the possibility to generate gauge metrics from distribution based metrics #1011
As usual, great work!
Hey @pnerg! Thanks again for the dedication and contributions man 😍 Regarding this comment:
The sum will not really be "how many of something" in this case, especially not with a range sampler. For example, if you were measuring the number of open connections to a database with a range sampler and during an entire reporting interval there were exactly 10 open connections, the reported sum would be 9000 (summing the value "10" three times every 200ms for the whole minute). I'm working on a blog post related to range samplers here: https://github.com/kamon-io/kamon.io/blob/c825fb60e0a3646804246496b4c0af38663408a8/_posts/2021-04-29-monitoring-queues-and-resources-with-kamon-range-samplers.md and even though it needs some polishing, I'm sure it will help you understand the logic behind range samplers. Probably adding Have you considered turning range samplers into a summary with q=0 for the min, q=1 for the max and maybe even a couple of percentiles instead? I'm not saying that should be done, but it is an idea that crossed my mind as I'm reasoning about this PR and how it fits the data and Prometheus practices.
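The arithmetic behind the 9000 figure can be checked with a small simulation. This is only a sketch in Python, not Kamon code: the 200 ms sampling interval, the three values recorded per sample, and the one-minute reporting interval are all taken from the comment above, and the function name is made up for illustration.

```python
# Assumed values, taken from the comment above rather than from Kamon's code.
SAMPLE_INTERVAL_MS = 200        # how often the range sampler is sampled
VALUES_PER_SAMPLE = 3           # values recorded into the distribution per sample
REPORTING_INTERVAL_MS = 60_000  # one-minute reporting interval
OPEN_CONNECTIONS = 10           # constant for the whole interval


def simulate_range_sampler(current_value):
    """Return every value recorded into the distribution during one interval."""
    samples = REPORTING_INTERVAL_MS // SAMPLE_INTERVAL_MS  # 300 samples per minute
    return [current_value] * (samples * VALUES_PER_SAMPLE)


recorded = simulate_range_sampler(OPEN_CONNECTIONS)
print(len(recorded))                 # 900
print(sum(recorded))                 # 9000
print(min(recorded), max(recorded))  # 10 10
```

Under these assumptions the sum grows with the number of samples, not with the number of connections, while min and max both stay at 10.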
And one more thing: it seems like this PR would report the same metric both as histogram and as a group of gauges. Wouldn't that mean that the |
There goes my lack of understanding of statistical distributions and how to interpret them... 😊 It's great that you're drafting a blog post describing range samplers, us noobs really need that... 👍 Yes, I played around with summaries and histograms, and I get that histograms and range samplers are for statistical reporting over time, but I also need something to measure the here and now. Having read your excellent blog post, I'm wondering if the PR makes sense if I remove the |
So, I guess you implement that, and we're ready to merge :D
a177b61 to 7d42c95
Updated the PR and replaced |
Done, merged, good job @pnerg!
This is hopefully the answer to several discussions (#899, #914) around why some of the measurements use a range sampler rather than a gauge.
This PR adds the possibility to configure (by name pattern) which distribution-based (range, time, histogram) metrics should also generate gauge metrics.
E.g. configuring something like this:
Would also render gauge metrics for all http-server metrics. The existing histogram/summary metrics will still remain; these gauges are add-ons.
For each distribution metric that matches the configured filter, three gauges are created:
- min: representing the min value seen in the distribution of the snapshot
- max: representing the max value seen in the distribution of the snapshot
- sum: representing the sum of all values seen in the distribution of the snapshot, which is relevant to understand how many of something (e.g. connections) has been measured in the interval

E.g. running with akka-http and enabling gauge metrics according to the above configuration, I'd get something like this for the range metric http.server.connection.open:
When I stop running requests, the gauges eventually drop to zero.
I think this is a non-intrusive way of providing more suitable metric formats for Prometheus for some of the use cases where one just wants an easy way to see what is going on in the app here and now.
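The min/max/sum gauges described above can be sketched in a few lines of Python. This is not the PR's implementation: the function name, the glob-style pattern `http.server.*`, and the choice to report zeros for an empty snapshot are assumptions made for illustration, since the actual configuration and output are not shown here.

```python
from fnmatch import fnmatchcase

# Hypothetical filter; the real configuration key and pattern syntax are not shown here.
GAUGE_PATTERNS = ["http.server.*"]


def gauges_for(metric_name, snapshot_values, patterns=GAUGE_PATTERNS):
    """Return min/max/sum gauges for a distribution snapshot, or {} if filtered out."""
    if not any(fnmatchcase(metric_name, pattern) for pattern in patterns):
        return {}
    if not snapshot_values:
        # Assumption: an empty snapshot reports zeros, in line with the
        # "gauges eventually drop to zero" behaviour described above.
        return {"min": 0, "max": 0, "sum": 0}
    return {
        "min": min(snapshot_values),
        "max": max(snapshot_values),
        "sum": sum(snapshot_values),
    }


busy = gauges_for("http.server.connection.open", [2, 5, 3])
idle = gauges_for("http.server.connection.open", [])
other = gauges_for("jvm.memory.used", [100, 200])
print(busy)   # {'min': 2, 'max': 5, 'sum': 10}
print(idle)   # {'min': 0, 'max': 0, 'sum': 0}
print(other)  # {}
```

Metrics that do not match the filter produce no gauges at all, which mirrors the add-on nature of the feature: the histogram/summary output is left untouched either way.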