Add more prometheus metrics #1405

andrzejressel · 2017-04-09T19:46:27Z

This MR implements more of the metrics mentioned here.

It implements more detailed backend metrics (now including the backend server), frontend metrics (see below), and config reload metrics.

Unfortunately, there doesn't seem to be a clean way to get more information about the frontend without rewriting route management.

I'm also not sure about the frontend/backend times. I added debug start/stop messages for the backend and the frontend, and these are the results:

backend start
backend start
frontend start
DEBU[2017-04-09T21:03:27+02:00] Round trip: http://httpbin.org/, code: 200, duration: 165.885424ms 
frontend stop
backend stop
backend stop

This is very weird because the backend is invoked before the frontend, and I don't know how to add time metrics that correctly measure frontend and backend response times.
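One possible reading of the log above, assuming both metrics wrappers are ordinary nested `http.Handler` middlewares: whichever wrapper sits outermost logs its start first and its stop last, and its measured duration includes everything inside it. The sketch below uses only the standard library; `traced`, `run`, and the event names are hypothetical and do not reflect how Traefik actually wires its middlewares.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// traced wraps next and records when the wrapper is entered and left.
// It is an illustrative helper, not part of Traefik or oxy.
func traced(name string, events *[]string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) {
		*events = append(*events, name+" start")
		next.ServeHTTP(rw, r)
		*events = append(*events, name+" stop")
	})
}

// run serves one request through a chain in which the "backend" wrapper
// is the outer one and the "frontend" wrapper is the inner one.
func run() []string {
	var events []string
	// Stand-in for the actual proxy round trip.
	proxy := http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) {
		events = append(events, "round trip")
		rw.WriteHeader(http.StatusOK)
	})
	chain := traced("backend", &events, traced("frontend", &events, proxy))
	req := httptest.NewRequest("GET", "http://example.com/", nil)
	chain.ServeHTTP(httptest.NewRecorder(), req)
	return events
}

func main() {
	for _, e := range run() {
		fmt.Println(e)
	}
}
```

If this is what happens in the PR, swapping the wrapping order would swap the start/stop order as well. It does not explain why the backend messages appear twice.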

timoreimann · 2017-04-09T21:42:46Z

@jereksel you cannot change vendored code directly; you have to make a PR against upstream and afterwards revendor the updated library in Traefik (i.e., use glide to bump our referenced version).

For the time being though, I think it's okay to pretend this is the oxy change we want, do the review, and eventually update containous/oxy (our fork of vulcand/oxy) should we all agree to the suggested path.

timoreimann

Great to have more metrics, appreciated!

Left a few comments. One thing I'm not clear on is whether the oxy change can be implemented differently (i.e., without touching the oxy package). I'm just not too deep into that part of the code.
@containous/traefik any thoughts on this one?

timoreimann · 2017-04-11T05:25:14Z

middlewares/metrics.go

@@ -19,13 +24,41 @@ type Metrics interface {
 // given Metrics implementation to expose and monitor Traefik metrics
 type MetricsWrapper struct {
 	Impl Metrics
+


Please remove the blank line.

timoreimann · 2017-04-14T17:17:18Z

server.go

+	server.backendReloadCounter = prometheus.NewCounterFrom(
+		stdprometheus.CounterOpts{
+			Name: "traefik_backend_reload_total",
+			Help: "How many backend requests, partitioned by state and type.",


Description seems wrong?

timoreimann · 2017-04-14T17:20:52Z

middlewares/metrics.go

-func NewMetricsWrapper(impl Metrics) *MetricsWrapper {
+func NewBackendMetricsWrapper(impl Metrics) *MetricsWrapper {
+
+	var f = func(r *http.Request) string {


Can we do f := here?

timoreimann · 2017-04-14T17:26:41Z

middlewares/metrics.go

@@ -19,13 +24,41 @@ type Metrics interface {
 // given Metrics implementation to expose and monitor Traefik metrics
 type MetricsWrapper struct {
 	Impl Metrics
+
+	//frontend/backend
+	typ     string


Apart from the fact that typ isn't proper English, I also find it too generic. We should try to come up with a better name.

Maybe "phase"?

I thought about naming it type, but it's a reserved keyword. Phase sounds good though.

timoreimann · 2017-04-14T17:27:26Z

middlewares/metrics.go

+
+	//frontend/backend
+	typ     string
+	getName fn


This should be more specific. What kind of name? getServerName maybe?

timoreimann · 2017-04-15T07:00:58Z

middlewares/transformer.go

+	"net/http"
+)
+
+//Transformer is a hacky way to get metrics middleware with frontend


Space between the leading double slashes and the first word.

Not sure I get why we need the transformer. Could you explain please?

timoreimann · 2017-04-15T07:01:39Z

middlewares/transformer.go

+	return &Transformer{next, metrics}
+}
+
+func (sb *Transformer) ServeHTTP(rw http.ResponseWriter, r *http.Request) {


Why the name sb?

I copied it from another file 😄

timoreimann · 2017-04-15T08:44:49Z

server.go

@@ -88,6 +92,13 @@ func NewServer(globalConfiguration GlobalConfiguration) *Server {
 		// leadership creation if cluster mode
 		server.leadership = cluster.NewLeadership(server.routinesPool.Ctx(), globalConfiguration.Cluster)
 	}
+	server.backendReloadCounter = prometheus.NewCounterFrom(
+		stdprometheus.CounterOpts{
+			Name: "traefik_backend_reload_total",


This should say "reloads" (plural) according to the Prometheus naming guidelines.

timoreimann · 2017-04-15T08:48:21Z

server.go

@@ -219,17 +230,22 @@ func (server *Server) listenProviders(stop chan bool) {
 			return
 		case configMsg, ok := <-server.configurationChan:
 			if !ok {
+				server.backendReloadCounter.With("state", "failed", "type", configMsg.ProviderName).Add(1)


This is the case where the channel gets closed, which (AFAIU) could be for legitimate reasons. Hence, we should not count this as a failed reload event.
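A minimal sketch of the behaviour suggested here, using a plain map instead of the Prometheus counter. Note that a receive from a closed channel yields the zero value, so in the `!ok` branch there would be no provider name to use as a label anyway. `configMessage`, its `Valid` field, and `consume` are hypothetical stand-ins, not Traefik types.

```go
package main

import "fmt"

// configMessage is a hypothetical stand-in for the provider message type.
type configMessage struct {
	ProviderName string
	Valid        bool // assumed flag: whether the configuration is usable
}

// consume drains ch and counts reload outcomes per state.
// A closed channel ends the loop without touching any counter.
func consume(ch <-chan configMessage) map[string]int {
	counts := map[string]int{}
	for {
		msg, ok := <-ch
		if !ok {
			// Channel closed: msg is the zero value, so there is
			// nothing meaningful to count or label.
			return counts
		}
		if msg.Valid {
			counts["success"]++
		} else {
			counts["failed"]++
		}
	}
}

func main() {
	ch := make(chan configMessage, 2)
	ch <- configMessage{ProviderName: "file", Valid: true}
	ch <- configMessage{ProviderName: "docker", Valid: false}
	close(ch)
	fmt.Println(consume(ch))
}
```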

timoreimann · 2017-04-15T09:01:45Z

server.go

 			server.defaultConfigurationValues(configMsg.Configuration)
 			currentConfigurations := server.currentConfigurations.Get().(configs)
 			jsonConf, _ := json.Marshal(configMsg.Configuration)
 			log.Debugf("Configuration received from provider %s: %s", configMsg.ProviderName, string(jsonConf))
 			if configMsg.Configuration == nil || configMsg.Configuration.Backends == nil && configMsg.Configuration.Frontends == nil {
 				log.Infof("Skipping empty Configuration for provider %s", configMsg.ProviderName)
+				labels = []string{"state", "failed", "type", configMsg.ProviderName}


The only parameter that changes between the different occasions where we assign the slice is the state. Can we initialize the slice with the other parameters in line 236 right away and just complete the state in the subsequent lines, respectively?
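Roughly what this could look like, assuming the counter takes alternating label name/value strings as the `With(...)` call in the diff suggests. `reloadLabels` and its `empty` parameter are hypothetical; the real code would complete the state inline rather than in a helper.

```go
package main

import "fmt"

// reloadLabels builds the label pairs for one configuration message.
// The fixed part is initialised once; only the state value changes.
func reloadLabels(providerName string, empty bool) []string {
	// Index 1 holds the state value and defaults to "success".
	labels := []string{"state", "success", "type", providerName}
	if empty {
		labels[1] = "failed"
	}
	return labels
}

func main() {
	fmt.Println(reloadLabels("docker", false))
	fmt.Println(reloadLabels("file", true))
}
```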

timoreimann · 2017-04-15T09:09:43Z

middlewares/metrics.go

@@ -19,13 +24,41 @@ type Metrics interface {
 // given Metrics implementation to expose and monitor Traefik metrics
 type MetricsWrapper struct {
 	Impl Metrics
+
+	//frontend/backend


The comment should maybe be a bit more elaborate.

Also, space between slash and first word.

ldez · 2017-04-24T12:52:15Z

@jereksel do you have the time to update this PR before the upcoming feature freeze for the 1.3 release?

andrzejressel · 2017-04-24T12:56:49Z

Sure, I'll try to fix this MR today.

andrzejressel · 2017-04-24T13:09:54Z

@ldez @timoreimann Maybe I'll close this and redo it with #1408 and #1485.

ldez · 2017-04-24T13:21:18Z

@jereksel simply rebase your branch; that way you'll already have the content of #1408.

timoreimann · 2017-04-24T13:47:17Z

Yep, rebase is the way to go.

We have also restructured/moved some packages. (For instance, /server.go now lives in /server/server.go.) My git client got a bit confused by that and tried to merge files into integration/. I managed to solve this by specifying the rename threshold like this: git rebase -X find-renames=25 upstream/master.

andrzejressel · 2017-04-24T18:10:18Z

I'll wait for #1485 - I think I have an idea how to combine it with Prometheus metrics.

andrzejressel added 5 commits April 8, 2017 18:51

Add generic metric for successfull/failing requests

9909e66

Metric for configuration update

d76fe12

Backend metrics have server now

c90a36d

Frontend metrics

9435c78

Fix travis

01942e3

timoreimann suggested changes Apr 15, 2017

timoreimann reviewed Apr 15, 2017

ldez added contributor/needs-resolve-conflicts contributor/waiting-for-corrections labels Apr 19, 2017

Merge remote-tracking branch 'upstream/master' into more_metrics

82e3b74

andrzejressel closed this Apr 24, 2017

ldez removed contributor/needs-resolve-conflicts contributor/waiting-for-corrections labels Apr 24, 2017

ldez added the area/middleware/metrics label May 25, 2017
