Please follow Prometheus best practices #81

Closed
SuperQ opened this issue Apr 13, 2019 · 30 comments
Labels
2.0 This issue affects the HAProxy 2.0 stable branch. 2.1 This issue affects the HAProxy 2.1 stable branch. 2.2 This issue affects the HAProxy 2.2 stable branch. 2.3 This issue affects the HAProxy 2.3 stable branch. dev This issue affects the HAProxy development branch. severity: minor This issue is of MINOR severity.

Comments

@SuperQ

SuperQ commented Apr 13, 2019

It's great that there is now a Prometheus metrics endpoint in haproxy. However, a number of the metrics fail to follow best practices. I understand the desire to expose everything possible, but this is not always good for users.

For example, haproxy_frontend_connections_rate_current is extraneous because it duplicates rate(haproxy_frontend_connections_total[1m]).

These duplicates take up resources and confuse users as to which metrics are important to watch.
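
For illustration, a rough sketch of the duplication in PromQL (the 1m window is just an example):

  # Instantaneous rate exported by haproxy as a separate gauge
  haproxy_frontend_connections_rate_current

  # Roughly the same information, derived at query time from the counter
  rate(haproxy_frontend_connections_total[1m])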

@SuperQ SuperQ added status: needs-triage This issue needs to be triaged. type: bug This issue describes a bug. labels Apr 13, 2019
@lukastribus
Member

CC'ing @capflam

@capflam
Member

capflam commented Apr 15, 2019

I'm not a Prometheus expert and honestly have no time to review all the metrics to figure out which ones should be kept. If you have a list of metrics to remove, it would help me a lot. Note, though, that the official exporter, which should follow best practices, is not entirely consistent either. On one hand, only haproxy_frontend_connections_total appears, with no rate, as you said. But on the other hand, haproxy_frontend_sessions_total and haproxy_frontend_current_session_rate are both provided.

@SuperQ
Author

SuperQ commented Apr 15, 2019

Yes, it seems like there are a few mistakes in the official exporter. I can see about building a cleanup list.

@capflam
Member

capflam commented Apr 15, 2019

Thanks

@wtarreau
Member

Well, I've never used Prometheus, but I definitely see how exporting an instant rate is fundamentally different from having a remote component try to measure it over a long period. With your proposed change above, you'll only get an averaged rate over one minute. When you export instant values, you have them at the internal resolution (the second), which is particularly important when you want to report min/max values. This is typically used to size equipment: if you run an instance which hits peaks of 1000 SSL handshakes per second but your averaged value shows 60, you will have a hard time figuring out why some users complain about slow response times.

@brian-brazil

In either case, you'd need to scrape once a second to see it.

@SuperQ
Author

SuperQ commented Apr 15, 2019

@wtarreau That was just an example. It depends heavily on the frequency at which you scrape metrics. What you're suggesting is statistically sketchy: since the "connection current rate" only covers the last second, you have only a one-in-scrape-interval chance of catching that peak.

Prometheus is perfectly capable of scraping at a 1s interval, so rate(foo[1s]) is possible. If you really need 1s peak resolution, you have to scrape every second anyway.
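
As a rough sketch (assuming a Prometheus version with subquery support; the window sizes are illustrative), the closest query-side approximation of a peak is still an averaged value:

  # Highest 1-minute average connection rate seen over the last hour.
  # Still an average per window, not haproxy's internal per-second peak.
  max_over_time(rate(haproxy_frontend_connections_total[1m])[1h:1m])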

I currently scrape metrics from haproxy via mtail parsing every 5 seconds. We use mtail for haproxy metrics because it can generate histogram data for response latency.

@wtarreau
Member

I understand well that if you don't poll it every second you'll only retrieve statistical values, which is why I said it does have merit when combined with the mins and maxes also exported by haproxy from the measurement point. Also, when polling every second, network time (and sometimes even CPU time) can start to induce significant variations that make averages wrong. Simply spending 200ms in a system's buffer on a small retransmit, or renegotiating the TLS key once in a while when retrieving data, can make your measurements vary by 20%. I'm not saying it's terrible, just that it's something to be aware of, as people who poll often are generally the ones who expect accurate metrics.

@SuperQ
Author

SuperQ commented Apr 15, 2019

@wtarreau This is one of the reasons why Prometheus works the way it does. The time in transit to collect metrics doesn't affect accuracy; only the time the process takes to gather the snapshot of the data does. Prometheus sets the timestamp of the data to the millisecond of the GET /metrics. If the process snapshots the metrics in 2ms, and then takes 200ms to format it, gzip it, and send it to Prometheus, that part doesn't matter.

@SuperQ
Author

SuperQ commented Apr 15, 2019

Here's a first pass at some issues with the current data.

Remove, as they duplicate counters:

  • INF_CONN_RATE
  • INF_SESS_RATE
  • INF_SSL_RATE

Incorrect types:

  • INF_NBTHREAD - gauge
  • INF_NBPROC - gauge
  • INF_PROCESS_NUM - gauge
  • INF_MEMMAX_MB - gauge
  • INF_ULIMIT_N - gauge
  • INF_MAX_SSL_CONNS - gauge
  • INF_MAXPIPES - gauge
  • INF_CONN_RATE_LIMIT - gauge
  • INF_MAX_CONN_RATE - gauge
  • INF_SESS_RATE_LIMIT - gauge
  • INF_MAX_SESS_RATE - gauge
  • INF_SSL_RATE_LIMIT - gauge
  • INF_MAX_SSL_RATE - gauge

Don't calculate uptime

  • INF_UPTIME_SEC - expose start_date.tv_sec raw (process_start_time_seconds); see the query sketch at the end of this comment

These expose the wrong units; use bytes. They are also gauges with _total in the name.

  • INF_MEMMAX_MB (haproxy_max_memory_bytes)
  • INF_POOL_ALLOC_MB (haproxy_pool_allocated_bytes)
  • INF_POOL_USED_MB (haproxy_pool_used_bytes)

There are probably others. I recommend passing the exporter output through promtool check metrics to validate things.
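
For the uptime item above, a minimal sketch of the query-side derivation, assuming the metric follows the process_start_time_seconds convention and exposes the raw start timestamp:

  # Uptime computed at query time from the raw start timestamp
  time() - haproxy_process_start_time_seconds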

@SuperQ
Author

SuperQ commented Apr 15, 2019

Another issue I found: some current-rate metrics are missing corresponding counters. This is something that needs to be fixed in haproxy's internal metrics first.

  • INF_SSL_FRONTEND_KEY_RATE
  • INF_SSL_BACKEND_KEY_RATE

Similarly, the "average over the last 1024 requests" metrics are missing cumulative counters. This makes them not very useful for high-traffic servers.

  • ST_F_QTIME
  • ST_F_CTIME
  • ST_F_RTIME
  • ST_F_TTIME

Really, it would be nice to have the ability to get histogram data out of haproxy. The lack of metrics in haproxy for this kind of data is one of the reasons why I'm trying to find a replacement. 😞
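
For reference, a rough sketch of how such histogram data would typically be consumed, with a hypothetical metric name since haproxy doesn't expose one:

  # 95th percentile request duration over 5 minutes, aggregated across instances.
  # haproxy_http_request_duration_seconds_bucket is hypothetical; no such metric exists yet.
  histogram_quantile(0.95, sum by (le) (rate(haproxy_http_request_duration_seconds_bucket[5m])))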

@brian-brazil

haproxy_max_memory_bytes

haproxy_memory_bytes_max would be more typical. This keeps related things together alphabetically.

@wtarreau
Member

@SuperQ:

Another issue I found: some current-rate metrics are missing corresponding counters. This is something that needs to be fixed in haproxy's internal metrics first.

* INF_SSL_FRONTEND_KEY_RATE
* INF_SSL_BACKEND_KEY_RATE

The equivalent counters indeed do not exist and could be added. Historically they were not considered useful, as the main point was only to detect abnormal situations and rate-limit them if needed.

Similarly, the "average over the last 1024 requests" metrics are missing cumulative counters. This makes them not very useful for high-traffic servers.

* ST_F_QTIME
* ST_F_CTIME
* ST_F_RTIME
* ST_F_TTIME

These are times, and as such do not add together. They are gauges, just like CPU usage or memory usage for example.

Really, it would be nice to have the ability to get histogram data out of haproxy. The lack of metrics in haproxy for this kind of data is one of the reasons why I'm trying to find a replacement. 😞

It's always a trade-off. Quite a number of people have already complained that the stats we produce are too large for their use case and that it takes insane amounts of bandwidth and/or CPU to retrieve them. Conversely, others would like to have more. I'm personally always in favor of adding more counters as long as they are 1) relevant, 2) accurate enough, 3) reasonably inexpensive to produce, and 4) reasonably easy to exploit.

Just propose. Better yet, propose some patches.

@SuperQ
Author

SuperQ commented Apr 15, 2019

@wtarreau You are mistaken. Times do add together; this is a very common monitoring pattern.

For example, CPU usage is a counter at its core: it is the number of CPU seconds. When you look at /proc/stat or /proc/self/stat, you get back a counter. Prometheus and other monitoring systems handle this. Prometheus users expect these raw counter values to be available, so they can be processed over time without loss of accuracy.

For example: rate(http_request_duration_seconds_total[30s]) / rate(http_requests_total[30s]) can be used to give you the average latency over an arbitrary time range, rather than "the last 1024 requests".

By having raw counters, we're robust against sample loss, and the monitoring operator can choose the resolution they need. 1024 samples may be too many for low-traffic instances, but too few for high-traffic instances. It's a lose-lose situation for people monitoring these metrics.

For example, these are the counter values for user, nice, system, idle, etc.

cpu0 6270690 29894 1490770 83931268 114822 0 2211384 0 0 0
cpu1 6266823 27478 1469958 1795122 2988 0 1135348 0 0 0
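
Applied to haproxy, the same pattern would look roughly like this (both metric names are hypothetical examples, since the cumulative time counters don't exist yet):

  # Average total request time per request over an arbitrary window
  rate(haproxy_backend_total_time_seconds_total[5m]) / rate(haproxy_backend_http_requests_total[5m])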

@SuperQ
Author

SuperQ commented Apr 15, 2019

You could say the same thing about byte counters as about time counters. They both measure different dimensions of a request, and both are valid as counters.

Worse yet, by having the "last 1024 requests" tracked internally, they're actually more expensive to produce. They take more memory and consume more CPU time to track than simply exposing the raw counter data.

@wtarreau
Member

Well, I know how CPU time is accounted, thank you. The difference here is that we're speaking of multiplying time by the number of concurrent requests, since the same time will be counted for each request. I know this is usually done to detect expensive URLs, for example (halog does this, for instance, to sort URLs by total wasted time). But here I don't find it that relevant, even though I see the value when your tool is able to perform such operations.

Plus, your argument about the current implementation using more memory doesn't stand; it's exactly the opposite :-) Right now we make do with a single u32 per history counter, an addition, and a bit shift for the operations. Keeping the total values requires doubling the counter size.

Don't get me wrong, I know we're nit-picking here. In practice you're currently facing the accumulation of 18 years of counters that people have used every day to monitor their production without requiring any external tool. You're starting to discover what needs to be added for better consumption by external tools. It's normal and expected that such data are missing, since they were not even usable previously.

However, adding all these extra counters will take some time, because multiple places will have to be modified to produce the data, and then the Prometheus exporter will need to be adapted to retrieve and export them. Anyway, unless someone is willing to assign some time to do this carefully, it's unlikely to move by itself.

@SuperQ
Author

SuperQ commented Apr 15, 2019

Cumulative time counters should require no more overhead than the cumulative byte counters. We need these desperately: manually looking at the haproxy status page is not useful to us, as we have many instances of haproxy and need to alert on both aggregate statistics and individual instances automatically.

I've been using external tools for decades. I've been using SNMP for a very long time, and Nagios since it was new. The problem is that these tools have reached their limits, hence new tools like Graphite, InfluxDB, and Prometheus have been built in the last 10 years. It's not like Prometheus is that new a tool either, having been used actively in production environments since 2014.

The haproxy_exporter for Prometheus was one of the first ones written in 2013.

External monitoring tools have been around longer than haproxy. 😁

I understand the difficulty of adding this additional support. I looked at the haproxy code for a couple of days several years ago, wanting to add Prometheus support, but I ended up not being able to justify the time it would take to do the work. At the time there wasn't a business case for me to spend the time on it.

With the addition of this new plugin, I can probably justify some time to work on things.

@capflam
Member

capflam commented Apr 17, 2019

Sorry, I was busy fixing annoying bugs. I've now had a quick look. For redundant or duplicated metrics, I found:

  • haproxy_frontend_connections_rate_current (ST_F_CONN_RATE)
  • haproxy_frontend_http_requests_rate_current (ST_F_REQ_RATE)
  • haproxy_*_current_session_rate (ST_F_RATE)

The others seem fine. For INF_CONN_RATE, INF_SESS_RATE and INF_SSL_RATE, there is no cumulative counter from which to deduce these rates. For instance, INF_CUM_CONN will not be incremented if global.maxsock is reached, while INF_CONN_RATE will still be updated.

For the metric types (gauge instead of counter), I will check. But I used the counter type when the metric is never decremented; for me, a constant is a counter. For the _MB metrics, fair enough.

Finally, for changes to the stats inside HAProxy itself, please create corresponding feature requests instead.

@capflam capflam added dev This issue affects the HAProxy development branch. severity: minor This issue is of MINOR severity. and removed status: needs-triage This issue needs to be triaged. type: bug This issue describes a bug. labels Apr 17, 2019
@brian-brazil

For me a constant is a counter.

The question is: do you care about the absolute value, or about how fast it is increasing? This helps distinguish counters from gauges in cases where there are no decrements.
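
To illustrate the distinction in PromQL (metric names here are just examples):

  # Gauge: the absolute value is what you care about, so query it directly
  haproxy_process_max_memory_bytes

  # Counter: the rate of increase is what you care about
  rate(haproxy_process_ssl_cache_lookups_total[5m])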

@SuperQ
Author

SuperQ commented Apr 17, 2019

@capflam Thanks, that seems correct.

+1 to what @brian-brazil says.

haproxy-mirror pushed a commit that referenced this issue Apr 18, 2019
Following metrics have been removed:

  * haproxy_frontend_connections_rate_current (ST_F_CONN_RATE)
  * haproxy_frontend_http_requests_rate_current (ST_F_REQ_RATE)
  * haproxy_*_current_session_rate (ST_F_RATE)

These rates can be deduced using the total value with this kind of formula:

  rate(haproxy_frontend_connections_total[1m])

No backport needed. See issue #81 on github.
haproxy-mirror pushed a commit that referenced this issue Apr 18, 2019
…able

Some metrics have been renamed and their type adapted to be more usable in
Prometheus:

  * haproxy_process_uptime_seconds -> haproxy_process_start_time_seconds
  * haproxy_process_max_memory -> haproxy_process_max_memory_bytes
  * haproxy_process_pool_allocated_total -> haproxy_process_pool_allocated_bytes
  * haproxy_process_pool_used_total -> haproxy_process_pool_used_bytes
  * haproxy_process_ssl_cache_lookups -> haproxy_process_ssl_cache_lookups_total
  * haproxy_process_ssl_cache_misses -> haproxy_process_ssl_cache_misses_total

No backport needed. See issue #81 on github.
haproxy-mirror pushed a commit that referenced this issue Apr 18, 2019
…cs type

In short, _total metrics are now counters and others are gauges.

No backport needed. See issue #81 on github.
@capflam
Member

capflam commented Apr 18, 2019

I pushed some fixes to the exporter. Is it good for you, or is something missing?

@brian-brazil

I've left some comments on the commit.

@capflam
Member

capflam commented Apr 18, 2019

Thanks. But honestly it is tedious for me to keep iterating on this exporter, especially as I'm not a Prometheus user. I don't really know what is expected, what is ambiguous, or whatever. I really appreciate your help and your interest in it, but it would be easier to have an exhaustive request covering all the changes to make at once, or better, a patch. Feel free to send it to the mailing list; I'll be happy to merge it then.

@SuperQ
Author

SuperQ commented Apr 18, 2019

@capflam Do you take PRs on github?

@capflam
Member

capflam commented Apr 18, 2019

PRs are not merged on GitHub directly; it is just a read-only mirror. But a bot sends all PRs to the mailing list to be reviewed, so if that's easier for you, it's not a problem. Also, take a look at CONTRIBUTING.

@TimWolla TimWolla added the 2.0 This issue affects the HAProxy 2.0 stable branch. label Jun 17, 2019
@TimWolla TimWolla added the 2.2 This issue affects the HAProxy 2.2 stable branch. label Jul 9, 2020
@TimWolla TimWolla added 2.3 This issue affects the HAProxy 2.3 stable branch. 2.1 This issue affects the HAProxy 2.1 stable branch. labels Jan 10, 2021
@capflam
Member

capflam commented Feb 5, 2021

I'm closing this issue; it is a bit old now. @wdauchy has already made many changes to the Prometheus exporter and will continue to improve it. If anyone has ideas for improving the exporter, feel free to open a new issue based on its current state.

@capflam capflam closed this as completed Feb 5, 2021
@SuperQ
Author

SuperQ commented Feb 5, 2021

Yes, things are looking very good now. I'm very happy to see the continued improvement in Prometheus support in haproxy. I'm probably going to deprecate the haproxy_exporter soon. 🎉

Now we just need to get histogram request duration data from haproxy so I can stop using mtail to get this info. 😄

@SuperQ
Author

SuperQ commented Feb 5, 2021

I filed #1104 about histograms.

@capflam
Member

capflam commented Feb 5, 2021

Thanks !

@wdauchy
Contributor

wdauchy commented Feb 5, 2021

thanks @SuperQ !
