Gaps with `histogram_quantile` if no count change #2662

Closed
Dominik-K opened this Issue Apr 27, 2017 · 6 comments

2 participants
@Dominik-K
Contributor

Dominik-K commented Apr 27, 2017

What did you do?
Our service only occasionally receives requests via gRPC. To check the 99th-percentile request handling time, I run this query:
histogram_quantile(0.99, sum(rate(grpc_server_handling_seconds_bucket[5m])) by (le))

What did you expect to see?
A solid line at 0 where there are gaps now.

What did you see instead? Under which circumstances?
Gaps between the requests. Hovering over the gaps with the mouse shows no 0 values.
[Screenshot, 2017-04-27 19:03:08]

I checked that the service was being scraped and that the counter grpc_server_handling_seconds_bucket is present in its metrics (I fetched the metrics manually). The service was up the whole time.

I get gaps over the same time ranges when a query divides by zero:
[Screenshot, 2017-04-27 19:18:32]

Environment

  • System information:
    Official latest prom/prometheus Docker image.

  • Prometheus version:
    prometheus, version 1.6.1 (branch: master, revision: 4666df5)

@juliusv

Member

juliusv commented Apr 27, 2017

When all rate buckets are 0, histogram_quantile() returns NaN (can you check in the returned AJAX data that this is the case?). The graph will not show 0 values for NaN, but should just show a gap, as you mention. This makes sense to me, as there were not any requests at all in the considered time window, so no quantiles can be calculated.
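To illustrate why an idle window yields NaN, here is a rough Python sketch of a bucket-based quantile estimate. It is a simplified model and not Prometheus's actual implementation: the function name, the `(upper_bound, rate)` tuple layout, and the sample bucket values are all assumptions made for this example.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, rate) buckets.

    Simplified model: linear interpolation inside the bucket that contains
    the target rank. The last bucket is assumed to have upper bound +Inf.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the +Inf bucket covers every observation
    if total == 0:
        # No observations in the window, so there is nothing to rank.
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        # Skip buckets that added no observations of their own.
        if count >= rank and count > prev_count:
            if math.isinf(bound):
                # Rank lies beyond the highest finite bound; report that bound.
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan

# Hypothetical per-bucket rates for a busy window and an idle window.
busy = [(0.1, 2.0), (0.5, 4.0), (math.inf, 4.0)]
idle = [(0.1, 0.0), (0.5, 0.0), (math.inf, 0.0)]

print(histogram_quantile(0.5, busy))
print(histogram_quantile(0.99, idle))  # nan
```

With every bucket rate at 0 the total is 0, so no quantile can be computed and the result is NaN, which the graph renders as a gap rather than as 0.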

@Dominik-K

Contributor Author

Dominik-K commented May 8, 2017

@juliusv You're right. It actually makes sense to have a gap if there are no requests. We were just confused because we were used to a gap in Prometheus/Grafana meaning downtime (until we had a service with this few requests). But for downtime there's the up metric.

How can I check the Ajax return data?

@juliusv

Member

juliusv commented May 8, 2017

> How can I check the Ajax return data?

In Chrome: open Chrome inspector -> "Network" tab -> (refresh a graph) -> click on one of the query_range requests -> "Response"
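The same check can be scripted instead of done by eye. The sketch below scans a query_range response body for NaN samples. The response text is a hand-written, hypothetical example in the shape of a matrix result (timestamps and values are made up), and `nan_timestamps` is a helper name invented for this illustration.

```python
import json
import math

# Hypothetical response body, hand-written for this example.
SAMPLE_RESPONSE = """
{"status": "success",
 "data": {"resultType": "matrix",
          "result": [{"metric": {},
                      "values": [[1493312400, "0.095"],
                                 [1493312460, "NaN"],
                                 [1493312520, "NaN"]]}]}}
"""

def nan_timestamps(body):
    """Return the timestamps of all samples whose value is NaN."""
    payload = json.loads(body)
    found = []
    for series in payload["data"]["result"]:
        for timestamp, value in series["values"]:
            # Sample values are encoded as strings, so NaN arrives as "NaN".
            if math.isnan(float(value)):
                found.append(timestamp)
    return found

print(nan_timestamps(SAMPLE_RESPONSE))
```

To use it on real data, paste the body copied from the "Response" tab in place of `SAMPLE_RESPONSE`.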

@juliusv

Member

juliusv commented May 8, 2017

Cool, closing here then.

@juliusv juliusv closed this May 8, 2017

@Dominik-K

Contributor Author

Dominik-K commented May 11, 2017

They are NaN. Thanks for your answer.

@lock


lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 23, 2019
