
nginx_ingress_controller_requests metrics seems way off #3256

Closed
paalkr opened this issue Oct 16, 2018 · 4 comments


paalkr commented Oct 16, 2018

NGINX Ingress controller version:
0.20.0

Kubernetes version:
1.10.5

It seems like the Prometheus statistics scraped from the ingress controller and the metrics from the NLB in front of my K8s cluster do not agree on the load, and I happen to know that the NLB is displaying correct statistics. The old ingress controller version using VTS stats does agree with the NLB on the load.

This is my Grafana query:

```promql
round(sum(irate(nginx_ingress_controller_requests[1m])) by (status), 1.0)
```

The real load is around 100 to 2000 requests per minute, as the NLB correctly states.

Please have a look at the images below.

[Screenshot: NLB metrics]

[Screenshot: ingress controller metrics]

Metrics are scraped and collected by Prometheus every 30 seconds.


towolf commented Oct 20, 2018

In Prometheus, rate() and irate() are "per second", even if the range vector is [1m]. You may want to try increase() instead.
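
For example, to get requests per minute you could either scale the per-second rate by 60 or take increase() over a one-minute window (a rough sketch using your metric; untested):

```promql
# Per-second request rate scaled to requests per minute, grouped by status code
sum(rate(nginx_ingress_controller_requests[1m])) by (status) * 60

# Equivalent idea: the absolute counter increase over the last minute
sum(increase(nginx_ingress_controller_requests[1m])) by (status)
```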


paalkr commented Oct 21, 2018

Thanks for the reply. I have tried increase(), but in that case the metrics seem to be too high (compared with the NLB metrics). I'm not too familiar with the Prometheus query language, so any help or suggestions are welcome.


paalkr commented Oct 21, 2018

It seems that what I'm trying to achieve may not be so simple:
prometheus/prometheus#3746


towolf commented Oct 21, 2018

The bug you link to is a deficiency in Prometheus, IMO, but probably not related to your issue.

Your Grafana graph seems to average around 100 req/s, which would be 6,000 req/min. Maybe this includes more requests than what the LB sees?
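
If so, a per-ingress breakdown might show where the extra traffic comes from (a sketch assuming the metric carries an ingress label, as it does on recent controller versions):

```promql
# Requests per minute broken down by ingress resource
sum(rate(nginx_ingress_controller_requests[1m])) by (ingress) * 60
```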

When I correlate counted requests from access log in Elasticsearch with nginx_ingress_controller_requests, it does seem to match.

paalkr closed this as completed Oct 24, 2018