More detailed metrics #156

keis · 2016-01-08T13:52:59Z

It would be great if traefik could provide some more detailed metrics, e.g.:

Response time quantiles, to avoid skew from one or two outliers (like Prometheus does it).
Breakdown by backend:
- request count
- response time
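
To illustrate why a quantile is less sensitive to outliers than a mean, here is a minimal Python sketch. The latency values are made up for illustration and are not measurements from traefik.

```python
import statistics

# Hypothetical response times in milliseconds: 98 fast requests and 2 outliers.
latencies_ms = [20.0] * 98 + [5000.0, 8000.0]

mean_ms = statistics.mean(latencies_ms)
median_ms = statistics.median(latencies_ms)
# quantiles(n=100) returns the 99 cut points p1..p99, so index 94 is p95.
p95_ms = statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]

print(f"mean={mean_ms:.1f} ms, median={median_ms:.1f} ms, p95={p95_ms:.1f} ms")
```

The two outliers pull the mean far above what a typical request sees, while the median and p95 stay at the typical value.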

thawkins · 2016-05-21T07:32:15Z

Definitely +1 on this.

I would also add response status to that list.

keis · 2016-05-23T07:27:57Z

Most of the heavy lifting for this is now done by the logger middleware. It should be pretty simple to add some Prometheus counters to that. I'm not sure whether that's the best place for it, or whether Prometheus is the right tool. I do quite like it though :)
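
A rough sketch of the idea, in Python rather than traefik's Go, and not the actual logger middleware: a wrapper that already sees every request can update per-backend counters as a side effect. The names `RequestStats` and `logging_middleware`, and the handler signature, are hypothetical.

```python
import time
from collections import Counter


class RequestStats:
    """Counts requests and sums response time per (backend, status class)."""

    def __init__(self):
        self.request_count = Counter()
        self.duration_seconds_sum = Counter()

    def observe(self, backend, status_code, duration_seconds):
        # Group status codes into classes such as "2xx" or "5xx".
        key = (backend, f"{status_code // 100}xx")
        self.request_count[key] += 1
        self.duration_seconds_sum[key] += duration_seconds


def logging_middleware(handler, stats, clock=time.monotonic):
    """Wrap a handler the way an access logger would, recording stats too."""

    def wrapped(backend, request):
        start = clock()
        status_code = handler(backend, request)
        stats.observe(backend, status_code, clock() - start)
        return status_code

    return wrapped


# Usage with a stand-in handler that always answers 200.
stats = RequestStats()
handler = logging_middleware(lambda backend, request: 200, stats)
handler("backend-a", "GET /")
print(dict(stats.request_count))
```

Keeping the counters next to the logging code means the request count, status breakdown, and response time come from the same measurement.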

samber · 2016-05-25T10:38:14Z

I think the most interesting metrics are:

traefik build id
per entrypoint:
- number of active connections
per backend type (Consul Catalog, file, etcd, Kubernetes...):
- number of successful backend updates
- number of failed backend updates
- number of caught updates
- number of effective reloads
per service:
- circuit breaker:
  - number of available backends
  - number of failing backends
- frontend:
  - response time (including Traefik processing)
  - number of successful requests (1xx, 2xx, 3xx)
  - number of failing requests (4xx, 5xx)
  - request size
  - response size
- per backend:
  - response time (excluding Traefik processing)
  - number of successful requests (1xx, 2xx, 3xx)
  - number of failing requests (4xx, 5xx)
  - retry count
  - circuit breaker:
    - is available (bool)
    - expression (example: NetworkErrorRatio() > 0.5)
    - number of network errors

Prometheus is probably the "trendiest" time series DB. IMO, that's why we should add a /metrics route to the web provider, using the Prometheus data format.

Prometheus can aggregate per-backend stats to get the average or sum of each metric, so adding "total" counts would only be useful for /health requests.
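
As a sketch of what such a /metrics response could contain, the Python below renders per-backend counters in the Prometheus text exposition format. The metric name `traefik_requests_total`, the label names, and the sample values are assumptions for illustration, not what traefik exposes. Label-value escaping is left out for brevity.

```python
def render_counter(name, help_text, samples):
    """Render one counter family in the Prometheus text exposition format.

    samples maps a tuple of (label_name, label_value) pairs to a count.
    Escaping of quotes, backslashes and newlines in values is omitted.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        label_text = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical per-backend request counts.
requests_total = {
    (("backend", "api"), ("code", "200")): 120,
    (("backend", "api"), ("code", "500")): 3,
    (("backend", "web"), ("code", "200")): 45,
}

print(render_counter(
    "traefik_requests_total",
    "Requests handled, by backend and status code.",
    requests_total,
))

# The overall total does not need its own metric: it is the sum of the
# labelled series, which Prometheus can compute at query time with sum().
total = sum(requests_total.values())
print(total)
```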

thawkins · 2016-05-25T17:48:54Z

+1 on Prometheus integration. Prometheus will be available out of the box in the next release of Kubernetes. Having good integration would be excellent.

samber · 2016-05-31T09:47:27Z

Last week, I made a standalone Prometheus exporter: https://github.com/iadvize/traefik-exporter

I will combine that work with the Traefik binary very soon.

As an example of the limitations of an external exporter: we cannot use Prometheus summaries to measure the duration of the slowest 1% of requests.

therc · 2016-06-02T15:42:19Z

It would be great to also have lists and rates for the top N client IPs or for other fields defined in the configuration (top Host:, Authentication: headers). Maybe even have an endpoint to add IPs and headers to track manually, even if they're not in the top N. This would allow users to run a DDoS detection/mitigation system or a throttler alongside Traefik. I doubt most people will want either of those built in, or that a built-in version would meet all their needs.

brian-brazil · 2016-09-07T07:14:32Z

Top N isn't really well suited to Prometheus, as it'll cause too much label churn, particularly on this type of system, which may be talked to by the entire internet.

All the other use cases mentioned are a good fit for Prometheus; maybe add last update time too? The exposition format is open, and there are parsers currently available in Go and Python, so it's possible to integrate with other monitoring systems too.
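
A small Python sketch of the cardinality problem, using made-up traffic: every distinct combination of label values becomes its own time series, so a client IP label produces roughly one series per client. The helper `series_count` and the request records are hypothetical.

```python
def series_count(requests, label_names):
    """Number of distinct time series produced when labelling by label_names."""
    return len({tuple(request[name] for name in label_names) for request in requests})


# Hypothetical traffic: 10,000 requests, each from a different client IP,
# spread over two backends.
requests = [
    {
        "backend": "api" if i % 2 else "web",
        "client_ip": f"10.0.{i // 256}.{i % 256}",
    }
    for i in range(10_000)
]

by_backend = series_count(requests, ["backend"])
by_backend_and_ip = series_count(requests, ["backend", "client_ip"])

print(f"series when labelled by backend: {by_backend}")
print(f"series when labelled by backend and client IP: {by_backend_and_ip}")
```

Labelling by backend keeps the series count bounded by the configuration, whereas labelling by client IP lets it grow with the number of clients.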

tlvenn · 2016-11-05T13:12:33Z

HAProxy has a pretty useful dashboard when it comes to metrics, which is probably a good path to follow:

http://demo.haproxy.org/

It would be very nice to have something similar out of the box with Traefik.

arehmandev · 2017-01-17T11:41:39Z

Looks like metrics have now been implemented:

175659a

enxebre · 2017-01-19T14:15:20Z

This is now partially covered by #1022 and #1042, which should serve as a foundation for adding more detailed metrics.

timoreimann · 2017-04-04T20:59:06Z

I'm going to close this issue as basic support has shipped.

The list in #156 (comment) and @brian-brazil's suggestion in #156 (comment) regarding uptime should serve as a great basis for future additions. Please file dedicated issues for additional metrics.

emilevauge added the kind/enhancement label (a new or improved feature) Jan 8, 2016

samber mentioned this issue May 31, 2016

feat(prometheus): Adding a Prometheus exporter on /metrics. #424

Closed

therc mentioned this issue Jun 2, 2016

DDOS detection/mitigation support #431

Open

9 tasks

timoreimann closed this as completed Apr 4, 2017

andrzejressel mentioned this issue Apr 9, 2017

Add more prometheus metrics #1405

Closed

traefik locked and limited conversation to collaborators Sep 1, 2019

traefiker added the status/5-frozen-due-to-age label Sep 1, 2019
