
Add metrics output to docker #25820

Merged

merged 1 commit into moby:master from crosbymichael:prom Oct 27, 2016

Conversation

@crosbymichael
Contributor

crosbymichael commented Aug 17, 2016

This adds metrics support to docker via the prometheus output. There is a new /metrics endpoint where the prometheus handler is installed.

This adds metrics to the daemon package for basic container, image, and daemon operations.

The client package being used is located here:

https://github.com/docker/go-metrics

This PR is just the beginning; many more packages need to be instrumented, but help/suggestions from the individual subsystem maintainers are needed to collect relevant and useful information.
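For reference, here is a minimal sketch of the pattern described above, using the docker/go-metrics package linked below; the namespace, metric name, and port are illustrative, not necessarily what this PR uses:

```go
package main

import (
	"log"
	"net/http"

	metrics "github.com/docker/go-metrics"
)

var (
	// A namespace groups related metrics; the names here are invented
	// for the example.
	ns             = metrics.NewNamespace("engine", "daemon", nil)
	containerStart = ns.NewCounter("container_starts", "Number of containers started")
)

func main() {
	// Expose everything registered with go-metrics on /metrics in the
	// prometheus text format.
	metrics.Register(ns)
	http.Handle("/metrics", metrics.Handler())

	containerStart.Inc() // increments the counter by 1

	log.Fatal(http.ListenAndServe(":9323", nil))
}
```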

@jhorwit2
Contributor

jhorwit2 commented Aug 18, 2016

Thoughts on adding an expvar collector if the daemon is in debug mode?

❤️ this PR! 😃

// UpdateSince will add the duration from the provided starting time to the
// timer's summary with the precision that was used in the creation of the timer
UpdateSince(time.Time)

@stevvooe
Contributor

stevvooe Aug 19, 2016

StartTimer costs a closure allocation. This interface is the most performant, but simple stuff can use the sugar.
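To make the tradeoff concrete, a sketch of both styles against the go-metrics Timer interface; the timer name is invented for the example:

```go
package main

import (
	"time"

	metrics "github.com/docker/go-metrics"
)

var (
	// Hypothetical timer for the example; not from the PR itself.
	ns        = metrics.NewNamespace("example", "daemon", nil)
	applyTime = ns.NewTimer("apply", "Time taken to apply a change")
)

// Sugared form: StartTimer captures the start time and returns a done
// func, at the cost of one closure allocation per call.
func sugared() {
	defer metrics.StartTimer(applyTime)()
	time.Sleep(10 * time.Millisecond) // stand-in for real work
}

// Direct form: UpdateSince avoids the closure, which matters on hot paths.
func direct() {
	start := time.Now()
	time.Sleep(10 * time.Millisecond) // stand-in for real work
	applyTime.UpdateSince(start)
}

func main() {
	sugared()
	direct()
}
```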

@crosbymichael
Contributor

crosbymichael commented Aug 25, 2016

@xbglowx you might be interested in this PR

@cpuguy83
Contributor

cpuguy83 commented Aug 26, 2016

So cool. Design LGTM

@discordianfish
Contributor

discordianfish commented Aug 26, 2016

@crosbymichael Ha, nice! You might want to look at #9130 for the naming I proposed there, which follows the Prometheus best practices more closely. If in doubt, see: https://prometheus.io/docs/practices/naming/
It would be really great if these metrics followed the best practices.

@fabxc You might be interested in this as well.

@stevvooe
Contributor

stevvooe commented Aug 27, 2016

It would be really great if these metrics followed the best practices.

@discordianfish We've read through this in detail, a few times now, and we think we are following them. It would be more constructive if you could identify the areas where we don't follow best practices, or suggest how doing things differently could help.


@crosbymichael crosbymichael added this to the 1.13.0 milestone Aug 31, 2016

@discordianfish
Contributor

discordianfish commented Sep 1, 2016

@stevvooe I referred to my PR for specific naming suggestions. Besides that, from quickly skimming through:

  • Standardize on seconds everywhere
  • Use self-explanatory metric names (not network_tx but network_tx_packets, for example; you could also try to use similar/same names as in node_exporter or container_exporter)

Since we're migrating away from Docker, I can't really put much more time into this. Still, if you have specific questions I'm happy to answer.
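To illustrate what those two suggestions look like in practice, here is a hypothetical declaration using the upstream prometheus Go client; the metric and label names are invented for the example:

```go
package main

import "github.com/prometheus/client_golang/prometheus"

var (
	// Durations in base units (seconds), with an explicit unit suffix.
	actionDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
		Namespace: "engine",
		Subsystem: "daemon",
		Name:      "container_action_duration_seconds",
		Help:      "Time taken to process each container action.",
	}, []string{"action"})

	// Self-explanatory name: network_tx_packets_total, not network_tx.
	txPackets = prometheus.NewCounterVec(prometheus.CounterOpts{
		Namespace: "engine",
		Subsystem: "daemon",
		Name:      "network_tx_packets_total",
		Help:      "Total packets transmitted, per interface.",
	}, []string{"interface"})
)

func main() {
	prometheus.MustRegister(actionDuration, txPackets)
	actionDuration.WithLabelValues("start").Observe(0.42)
	txPackets.WithLabelValues("eth0").Add(128)
}
```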

@icecrime
Contributor

icecrime commented Sep 9, 2016

Ping @LK4D4: PTAL!

@LK4D4
Contributor

LK4D4 commented Sep 26, 2016

@crosbymichael I wonder how heavy all these registrations in init are. Reexec is still used in the overlay2 storage driver, in chrootarchive, and in multiple places in libnetwork.

@LK4D4
Contributor

LK4D4 commented Sep 27, 2016

OK, seems like it's quite heavy, but that's the supposed way to use the prometheus client.
@crosbymichael needs a rebase.

@crosbymichael
Contributor

crosbymichael commented Sep 29, 2016

Rebased and placed this feature behind the experimental flag.

@LK4D4 as for the initialization, these counters are not heavy at all and have no out-of-process calls or allocations. They are just initializing types, and they are small.
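As a rough illustration of why this is cheap (names assumed, not the PR's actual code): registration at package init only builds small in-memory structs, even though it also runs in reexec'd helper processes:

```go
package daemon

import metrics "github.com/docker/go-metrics"

var (
	// Constructing a namespace and a counter allocates a couple of
	// small structs; nothing touches the network or disk here.
	ns            = metrics.NewNamespace("engine", "daemon", nil)
	eventsCounter = ns.NewCounter("events", "Number of events logged")
)

func init() {
	// Register only appends the namespace to the in-process registry.
	metrics.Register(ns)
}
```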

@stevvooe
Contributor

stevvooe commented Oct 25, 2016

Vim-go isn't sufficient for this sort of use case, you need an actual IDE.

I'd recommend you give it another look. It does most things an IDE can. VSCode is also an option if you're struggling with Go code navigation. Anyway, it allows me to bounce around from file to file without these issues. I haven't tried it in a while, but SourceGraph can help with Go navigation on GitHub.

@crosbymichael
Contributor

crosbymichael commented Oct 25, 2016

oh no, not the editor debate ;)

@brian-brazil

I'm exclusively a vim user for writing code, as it happens. SourceGraph looks like the sort of tool I want when debugging, thanks. It doesn't directly support opening links in new tabs via middle click, though, which would slow me down a fair bit when debugging.

Add basic prometheus support
This adds a metrics package that creates additional metrics. Add the
metrics endpoint to the docker API server under `/metrics`.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add metrics to daemon package

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

api: use standard way for metrics route

Also add "type" query parameter

Signed-off-by: Alexander Morozov <lk4d4@docker.com>

Convert timers to ms

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
@LK4D4

LK4D4 approved these changes Oct 27, 2016

@LK4D4
Contributor

LK4D4 commented Oct 27, 2016

LGTM

@LK4D4
Contributor

LK4D4 commented Oct 27, 2016

ping @thaJeztah

@thaJeztah
Member

thaJeztah commented Oct 27, 2016

Since this is a new API, we need a new section in the docs for it. Let's do that in a follow-up before 1.13 is released.

/cc @mstanleyjones

@thaJeztah thaJeztah merged commit 33474a1 into moby:master Oct 27, 2016

5 checks passed

docker/dco-signed: All commits signed
experimental: Jenkins build Docker-PRs-experimental 25556 has succeeded
janky: Jenkins build Docker-PRs 34155 has succeeded
vendor: Jenkins build Docker-PRs-vendor 2237 has succeeded
windowsRS1: Jenkins build Docker-PRs-WoW-RS1 5037 has succeeded
@thaJeztah
Member

thaJeztah commented Oct 27, 2016

opened #27843 for tracking docs

@crosbymichael crosbymichael deleted the crosbymichael:prom branch Oct 28, 2016

@outcoldman
Contributor

outcoldman commented Oct 29, 2016

Am I right that in the current implementation only prometheus can be used for collecting metrics? Why not have extensibility as with log drivers? The current implementation is tied really tightly to the prometheus client library, without any abstractions.

@brian-brazil

brian-brazil commented Oct 30, 2016

Am I right that in the current implementation only prometheus can be used for collecting metrics?

For clarity, I define "collection" to mean instrumentation, and "ingestion" as the process of getting the collected metric data into a monitoring system.

In that context, the answer is yes, as I understand things on the Docker side.

Why not have extensibility as with log drivers? The current implementation is tied really tightly to the prometheus client library, without any abstractions.

The challenge is that metric instrumentation is nowhere near as standardised as logs. Indeed, Prometheus client libraries can be viewed as one standardisation option that allows for extensibility, as they are designed to be an open ecosystem.

You could add an abstraction at the instrumentation level, as I believe you are suggesting. You'd likely end up with a lowest-common-denominator library, which would lose you the benefits of things like labels and floating-point numbers, take a performance hit, and not be idiomatic for any monitoring system. In short, you'd gain abstraction at the cost of good instrumentation. I personally do not believe that is a good tradeoff, and I have never seen it work out well when attempted.

With Prometheus client libraries there's abstraction at the ingestion level. The Prometheus approach is to instrument fully with Prometheus client libraries, and then use a small shim (called a "bridge") to output to other systems like Graphite (currently in code review to be included out of the box for the Go library; it already exists for Java/Python), New Relic, InfluxDB, CloudWatch, etc. There's no need for a user to run a Prometheus server in such a scenario; see https://www.robustperception.io/exporting-to-graphite-with-the-prometheus-python-client/ for example. There are other ways you can plumb that too.

Prometheus client libraries put all the smarts in the server rather than in the instrumentation, and have a data model that's on the powerful end of the scale, so it's almost always possible to automagically convert to something sensible for other monitoring/instrumentation systems. The reverse is unfortunately rarely true.
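As a concrete example of the bridge approach: a sketch using the Graphite bridge for the Go client. The comment above says this bridge was still in code review at the time, so the exact API here is an assumption, and the Graphite address is made up:

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/graphite"
)

func main() {
	// Push everything in the default registry to a Carbon endpoint
	// every 10 seconds; no Prometheus server is involved.
	b, err := graphite.NewBridge(&graphite.Config{
		URL:      "carbon.example.com:2003", // assumed Graphite address
		Gatherer: prometheus.DefaultGatherer,
		Prefix:   "docker",
		Interval: 10 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	b.Run(context.Background()) // blocks, pushing on each interval
}
```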

@peterbourgon
Contributor

peterbourgon commented Nov 1, 2016

You could add an abstraction at the instrumentation level . . .

Self-plug: one candidate is go-kit/kit/metrics, which provides exactly this abstraction and suffers none of the drawbacks you mention. I'm not necessarily advocating for it in this circumstance (it seems exceedingly unlikely that Docker should or would use a different metrics backend than Prometheus), but at least the option is there.
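For readers curious what such an abstraction looks like, a minimal sketch using go-kit's metrics interfaces and its Prometheus provider; the metric names are invented for the example:

```go
package main

import (
	"github.com/go-kit/kit/metrics"
	kitprom "github.com/go-kit/kit/metrics/prometheus"
	stdprom "github.com/prometheus/client_golang/prometheus"
)

// Instrumented code depends only on the go-kit interface...
func recordStart(starts metrics.Counter) {
	starts.With("action", "start").Add(1)
}

func main() {
	// ...while the concrete backend is chosen at wiring time; here the
	// Prometheus provider, but a Graphite or statsd provider would
	// satisfy the same interface.
	starts := kitprom.NewCounterFrom(stdprom.CounterOpts{
		Namespace: "engine",
		Name:      "container_starts_total",
		Help:      "Containers started.",
	}, []string{"action"})
	recordStart(starts)
}
```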
