Histogram bucket label value gets converted to float somehow #8367
Comments
I don't see anything here that indicates a bug in Prometheus, as all the output you've provided is via 3rd party systems. If you have evidence that Prometheus ingestion is mangling data, please share it in the form of /query or /query_range output. I would presume that the data is like this as that is what the target is exposing, and thus should be taken up with them. |
Here's the output of a query against
The target is exposing the values as "1" and not as "1.0". |
You're absolutely sure that 10.100.16.9:8080 is exposing This sounds like something to take up with Grafana. |
Can you try |
@SteffenDE the "problem" is that Caddy activates OpenMetrics here. The client_golang documentation explains what's happening:
However, if you look at a
In your case, I assume some of your Caddy instances are configured differently and do not expose the OpenMetrics format; see caddyserver/caddy#3944 for how that might be possible.
This whole issue is a bit unfortunate and has a lot of historical context. Treating what's essentially a floating-point number as a string creates this kind of shenanigans. You are not the first to run into it. My vision has always been that Prometheus will eventually become aware of bucket boundaries not being floats, and at least there should be some sanitation within Prometheus. (It was there in Prometheus 1.x. It got dropped in Prometheus 2.x, arguably by accident. And my attempt to at least make it a requirement for OpenMetrics was not successful.) If you want to learn more in a somewhat entertaining way, you could watch me telling the story.
I'm not sure how Grafana's heatmap is dealing with it. I could imagine it's fine as long as the labels don't get mixed. In any case, a PromQL
I guess, for now, you just have to make sure that your Caddy instances either all expose OpenMetrics or all do not. You have my full sympathy that this is annoying and that it could have been avoided by a more thoughtful design. |
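A possible query-side workaround, sketched under assumptions rather than taken from this thread: PromQL's `label_replace` function can rewrite `le` values that end in `.0`, so that both exposition styles end up with the same label value before aggregation. The metric name and selectors are copied from the query in the issue. The regex `(.*)\\.0` is an assumption that only covers whole-number boundaries such as "1.0", "5.0", and "10.0"; values like "0.005" and "+Inf" are left untouched because `label_replace` anchors the regex at both ends.

```promql
# Sketch only: strip a trailing ".0" from le so that "1.0" and "1" aggregate together.
sum by (le, instance) (
  label_replace(
    increase(caddy_http_request_duration_seconds_bucket{handler=~"reverse_proxy", server="srv0"}[5m]),
    "le", "$1", "le", "(.*)\\.0"
  )
)
```

This does not change what is stored, only what the query returns. If a single instance exposed both label forms within the lookback window, the rewrite could yield duplicate series and the query may fail, so treat it as a stopgap until all instances expose the same format.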
What did you do?
I'm tracking Caddy metrics (https://caddyserver.com/) with Prometheus (latest Docker image) and am experiencing a strange problem: I've got three instances reporting metrics, and for one instance the bucket label le="1" is saved as le="1.0":
(query:
sum(increase(caddy_http_request_duration_seconds_bucket{handler=~"reverse_proxy", server="srv0", le=~"1|1.0"}[5m])) by (le, instance)
)
I've checked the HTTP output of the specific Caddy instance, and every "le" label is correctly formatted as an integer. Only when querying Prometheus do I get a float value:
It also happens to the other buckets (le="5", le="10").
(This really messes up the Grafana heatmap when the affected instance is included: [screenshot] vs [screenshot])
What did you expect to see?
The value should be the same le="1" as with the other instances.
What did you see instead? Under which circumstances?
The value is a float instead of an integer.
Environment