Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more prometheus metrics #1405

Closed
wants to merge 6 commits into from
Closed

Add more prometheus metrics #1405

wants to merge 6 commits into from

Conversation

andrzejressel
Copy link

This MR implement more metrics mentioned here.

It implements more detailed backend metrics (it now includes backend server), frontend metrics (see below) and config reload metrics.

Unfortunatelly there doesn't seem to be a clean way to get more information about frontend without rewriting route management.

I'm also not sure about frontend/backend times. I've included debug start/stop messages for backend and frontend and these are the results:

backend start
backend start
frontend start
DEBU[2017-04-09T21:03:27+02:00] Round trip: http://httpbin.org/, code: 200, duration: 165.885424ms 
frontend stop
backend stop
backend stop

Which is very weird because backend is invoked before frontend and I don't know how to add time metrics that will correctly calculate frontend and backend response times.

@timoreimann
Copy link
Contributor

@jereksel you cannot change vendored code directly; you have to make a PR against upstream and afterwards revendor the updated library in Traefik (i.e., use glide to bump our referenced version).

For the time being though, I think it's okay to pretend this is the oxy change we want, do the review, and eventually update containous/oxy (our fork of vulcand/oxy) should we all agree to the suggested path.

Copy link
Contributor

@timoreimann timoreimann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to have more metrics, appreciated!

Left a few comments. One thing I'm not clear on is whether the oxy change can be implemented differently (i.e., without touching the oxy package). I'm just not too deep into that part of the code.
@containous/traefik any thoughts on this one?

@@ -19,13 +24,41 @@ type Metrics interface {
// given Metrics implementation to expose and monitor Traefik metrics
type MetricsWrapper struct {
Impl Metrics

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the blank line.

server.go Outdated
server.backendReloadCounter = prometheus.NewCounterFrom(
stdprometheus.CounterOpts{
Name: "traefik_backend_reload_total",
Help: "How many backend requests, partitioned by state and type.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description seems wrong?

func NewMetricsWrapper(impl Metrics) *MetricsWrapper {
func NewBackendMetricsWrapper(impl Metrics) *MetricsWrapper {

var f = func(r *http.Request) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do f := here?

@@ -19,13 +24,41 @@ type Metrics interface {
// given Metrics implementation to expose and monitor Traefik metrics
type MetricsWrapper struct {
Impl Metrics

//frontend/backend
typ string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the fact that typ isn't proper English, I also find it too generic. We should try to come up with a better name.

Maybe "phase"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about naming it type, but it's reserved keyword. Phase sound good thought.


//frontend/backend
typ string
getName fn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be more specific. What kind of name? getServerName maybe?

"net/http"
)

//Transformer is a hacky way to get metrics middleware with frontend
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Space between leading double dashes and first word.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I get why we need the transformer. Could you explain please?

return &Transformer{next, metrics}
}

func (sb *Transformer) ServeHTTP(rw http.ResponseWriter, r *http.Request) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the name sb?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied it from another file 😄

server.go Outdated
@@ -88,6 +92,13 @@ func NewServer(globalConfiguration GlobalConfiguration) *Server {
// leadership creation if cluster mode
server.leadership = cluster.NewLeadership(server.routinesPool.Ctx(), globalConfiguration.Cluster)
}
server.backendReloadCounter = prometheus.NewCounterFrom(
stdprometheus.CounterOpts{
Name: "traefik_backend_reload_total",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should say "reloads" (plural) according to the Prometheus naming guidelines.

server.go Outdated
@@ -219,17 +230,22 @@ func (server *Server) listenProviders(stop chan bool) {
return
case configMsg, ok := <-server.configurationChan:
if !ok {
server.backendReloadCounter.With("state", "failed", "type", configMsg.ProviderName).Add(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the case where the channel gets closed, which (AFAIU) could be for legitimate reasons. Hence, we should not count this as a failed reload event.

server.go Outdated
server.defaultConfigurationValues(configMsg.Configuration)
currentConfigurations := server.currentConfigurations.Get().(configs)
jsonConf, _ := json.Marshal(configMsg.Configuration)
log.Debugf("Configuration received from provider %s: %s", configMsg.ProviderName, string(jsonConf))
if configMsg.Configuration == nil || configMsg.Configuration.Backends == nil && configMsg.Configuration.Frontends == nil {
log.Infof("Skipping empty Configuration for provider %s", configMsg.ProviderName)
labels = []string{"state", "failed", "type", configMsg.ProviderName}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only parameter that changes between the different occasions where we assign the slice is the state. Can we initialize the slice with the other parameters in line 236 right away and just complete the state in the subsequent lines, respectively?

@@ -19,13 +24,41 @@ type Metrics interface {
// given Metrics implementation to expose and monitor Traefik metrics
type MetricsWrapper struct {
Impl Metrics

//frontend/backend
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment should maybe be a bit more elaborate.

Also, space between slash and first word.

@ldez
Copy link
Member

ldez commented Apr 24, 2017

@jereksel do you the time to update this PR before the upcoming feature freeze for the 1.3 release?

@andrzejressel
Copy link
Author

Sure, I'll try to fix this MR today.

@andrzejressel
Copy link
Author

andrzejressel commented Apr 24, 2017

@idez @timoreimann Maybe I'll close this and redo it with #1408 and #1485

@ldez
Copy link
Member

ldez commented Apr 24, 2017

@jereksel simply rebase your branch, like that you'll already have the content of the #1408

@timoreimann
Copy link
Contributor

timoreimann commented Apr 24, 2017

Yep, rebase is the way to go.

We have also restructured/moved some packages. (For instance, /server.go now lives in /server/server.go.) My git client got a bit confused by that and tried to merge files into integration/. I managed to solve this by specifying the rename threshold like this: git rebase -X find-renames=25 upstream/master.

@andrzejressel
Copy link
Author

andrzejressel commented Apr 24, 2017

I'll wait for #1485 - I think I have an idea how to combine it with Prometheus metrics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants