
Using irate to see connection rate (req/sec) is not showing what is expected #2268

Closed
sudhirpandey opened this Issue Dec 9, 2016 · 7 comments

sudhirpandey commented Dec 9, 2016

What did you do?
Collected metrics from the haproxy frontend using the haproxy exporter.
What did you expect to see?
We would like to see the connection rate, i.e. the average req/sec.

What did you see instead? Under which circumstances?
The graph shows the connection counter total, which resets to a null value at a given interval.

Environment

  • System information:

    Linux 3.10.0-327.4.5.el7.x86_64 x86_64

  • Prometheus version:

   prometheus, version 1.3.1 (branch: master, revision: be476954e80349cb7ec3ba6a3247cd712189dfcb)
  build user:       root@37f0aa346b26
  build date:       20161104-20:24:03
  go version:       go1.7.3
  • Graphs are shown here:
    Trying to see req/sec; the interval is 5m.
    With irate:
    [screenshot: 2016-12-09 at 1:33 pm]

Without irate:
[screenshot: 2016-12-09 at 1:37 pm]

What I was expecting was that irate would take two data points within the 5-minute window, take the difference, and divide it by the number of seconds in the 5-minute interval.

For example, at timestamp t1 we have 65000 requests_total, and at t2 = t1 + 5min we have 75000; I was under the impression that req/s would yield a data point of
(75000 - 65000) / 300 = 33.33 req/sec.

But the numbers I am getting in the graphs are way too high. Could you please explain how it is being calculated?

A 24hr graph cycle for the request rate that I am trying to graph looks like this:
[screenshot: 2016-12-09 at 2:28 pm]

This does not make sense to me at all. By the way, I have a 24hr retention time. Are there any docs etc. with info I am missing?
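
For reference, the calculation described above, (last - first) / window seconds, is closer to what rate() does. irate() uses only the last two samples inside the range window, so it behaves as an instantaneous rate rather than a window average. A minimal PromQL comparison, reusing the metric and label from the query later in this thread (a sketch, not the exact expressions used in the graphs above):

    # Average per-second increase over the whole 5m window,
    # roughly (last sample - first sample) / window length, with extrapolation.
    rate(haproxy_frontend_http_requests_total{frontend="public"}[5m])

    # Per-second increase based on only the LAST TWO samples in the 5m window,
    # so it reacts to (and amplifies) short spikes between two scrapes.
    irate(haproxy_frontend_http_requests_total{frontend="public"}[5m])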

brian-brazil commented Dec 9, 2016

Can you share the raw data points for a 5m period in which irate is producing the wrong answer?

sudhirpandey commented Dec 9, 2016

Sure, here it goes.

{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"frontend":"public","instance":"172.30.227.154:9101","job":"haproxy"},"values":[[1481292030.781,"20991.033333333333"],[1481292045.781,"20991.033333333333"],[1481292060.781,"208.5"],[1481292075.781,"208.5"],[1481292090.781,"221.13333333333333"],[1481292105.781,"221.13333333333333"],[1481292120.781,"675.5"],[1481292135.781,"675.5"],[1481292150.781,"21931.966666666667"],[1481292165.781,"21931.966666666667"],[1481292180.781,"608.5666666666667"],[1481292195.781,"608.5666666666667"],[1481292210.781,"202.76666666666668"],[1481292225.781,"202.76666666666668"],[1481292240.781,"273.9"],[1481292255.781,"273.9"],[1481292270.781,"254.8"],[1481292285.781,"254.8"],[1481292300.781,"220.76666666666668"],[1481292315.781,"220.76666666666668"],[1481292330.781,"252.63333333333333"],[1481292345.781,"252.63333333333333"],[1481292360.781,"252.86666666666667"],[1481292375.781,"252.86666666666667"],[1481292390.781,"23825.633333333335"],[1481292405.781,"23825.633333333335"],[1481292420.781,"694.5666666666667"],[1481292435.781,"694.5666666666667"],[1481292450.781,"279.26666666666665"],[1481292465.781,"279.26666666666665"],[1481292480.781,"236.9"],[1481292495.781,"236.9"],[1481292510.781,"232.1"],[1481292525.781,"232.1"],[1481292540.781,"151.56666666666666"],[1481292555.781,"151.56666666666666"],[1481292570.781,"25231.633333333335"],[1481292585.781,"25231.633333333335"],[1481292600.781,"811.7333333333333"]]}]}}

generated via

api/v1/query_range?query=irate(haproxy_frontend_http_requests_total{frontend='public'}[5m])&start=2016-12-09T14:00:30.781Z&end=2016-12-09T14:10:00.781Z&step=15s

The problem is that the irate graph increases to some maximum for a period of time and then resets.

However, the new sessions graph seems to show what we are expecting to see.

[screenshot: 2016-12-09 at 3:26 pm]

Also, the frontend connection rate shows the behavior we expect, and it is in line with the new session rate.

[screenshot: 2016-12-09 at 3:22 pm]

Meanwhile, the backend connection_rate graph shows similar problematic behavior: it grows to a certain point and then drops, like some kind of counter.
[screenshot: 2016-12-09 at 3:23 pm]

This cannot be related to the front-end connection rate graph at all.

brian-brazil commented Dec 9, 2016

That data looks like a mix of data from 2-3 exporters. Are you sure each instance represents a single haproxy?

sudhirpandey commented Dec 9, 2016

So the setup is: we have 3 haproxy containers (pods) with the same haproxy config, running under a service. There is only one exporter scraping from the haproxy service in OpenShift Origin, as follows:

scrape_configs:
- job_name:  "haproxy"
  static_configs:
    - targets:
      - "172.30.227.154:9101"

where the above IP is the haproxy service IP that has three haproxy pods underneath.

brian-brazil commented Dec 9, 2016

That won't work, and will have oddness as you've discovered. You need a haproxy exporter per haproxy, and you need to scrape all 3 independently.
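
A minimal sketch of what scraping each exporter independently could look like with static_configs (the three pod addresses below are placeholders, not real ones; in OpenShift/Kubernetes, a service-discovery config that targets the individual pods would be the more usual approach):

scrape_configs:
- job_name: "haproxy"
  static_configs:
    - targets:
      # placeholder addresses: one haproxy exporter per haproxy pod
      - "10.1.2.3:9101"
      - "10.1.2.4:9101"
      - "10.1.2.5:9101"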

sudhirpandey commented Dec 9, 2016

Thanks a lot.
I have now adjusted the config to scrape each haproxy exporter individually, and it now shows exactly what is expected :)
