Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker metrics (read metrics from cgroups for specified container) #8886

Closed
wants to merge 1 commit into from
Closed

docker metrics (read metrics from cgroups for specified container) #8886

wants to merge 1 commit into from

Conversation

monsterzz
Copy link
Contributor

Fixes #8842.

This is first and very simple implementation of docker metrics command. A lot of work needed to push it to production.

Roadmap:

  • make find of cgroup paths more reliable (not just try to search predefined locations, but get exact value from procfs)
  • specify set of metrics (from cpuacct, memory, blkio, ... controllers)
  • store metrics in struct instead of map[string]string (typed data)
  • -f FORMAT support
  • documentation

PS. PR created for further discussion and this branch will be constantly updated.

/cc @shykes @tobegit3hub @thaJeztah

Signed-off-by: Gleb M Borisov <borisov.gleb@gmail.com>
return nil
}

indented := new(bytes.Buffer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use json.Unmarshal and a struct or map for this, please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's copy-paste from CmdInspect. I've marked both functions with TODO (will
extract common things and make their code less ugly).

Thanks for the tip, just a first week with Go :)
On Fri 31 Oct 2014 at 10:22 pm Erik Hollensbe notifications@github.com
wrote:

In api/client/commands.go:

@@ -812,6 +812,46 @@ func (cli *DockerCli) CmdInspect(args ...string) error {
return nil
}

+func (cli *DockerCli) CmdMetrics(args ...string) error {

  • cmd := cli.Subcmd("metrics", "CONTAINER", "Return runtime information on a container")
  • if err := cmd.Parse(args); err != nil {
  •   return nil
    
  • }
  • if cmd.NArg() < 1 {
  •   cmd.Usage()
    
  •   return nil
    
  • }
  • indented := new(bytes.Buffer)

use json.Unmarshal and a struct or map for this, please.


Reply to this email directly or view it on GitHub
https://github.com/docker/docker/pull/8886/files#r19688050.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool. thanks. just let us know when it's ready for review again.

@SvenDowideit
Copy link
Contributor

needs documentation in cli.md, a man page (in docs/man) and in runmetrics.md

@jeremyeder
Copy link

Hi @crosbymichael and @monsterzz this is along the lines of what we're doing with nsinit at the moment. Nice to see it potentially in docker itself. This patch currently does not work on RHEL-like systems, and I think @monsterzz eluded to the cgroup path config needing some work, which is fine. I didn't look into it deeply but hope that can be resolved.

    "Metrics": {
        "CpuUsage": "Error: cgroup subsystem 'cpuacct' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
        "MemoryLimit": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
        "MemoryMaxUsage": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
        "MemoryUsage": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'"
    },

I would expect a monitoring system will want a global "watch" on all containers, existing and new. Do you agree ? Basically docker would continuously pump all it's stats to a websocket at a certain interval, and the monitoring system would consume at it's own pace.

You could have i.e. " docker metrics --all" for the cli.

Agreed the stat calculation interval should be tunable; if I could suggest you default to 10s rather than 1s. 1s stat gathering on busy, dense nodes is a costly burden we should reserve for field-debug/troubleshooting. Even 10s might be too aggressive depending on the business.

The other thing is the patch itself decides on arbitrary names that closely resemble their cgroup counterparts. Why not get rid of all ambiguity and use the precise names used by the kernel ?

@vishh
Copy link
Contributor

vishh commented Nov 13, 2014

I wonder if it might be better to fork a separate process from the docker
daemon that handles stats acquisition and processing. This new process can
share code with docker daemon and can remain an internal component. This
might help scale docker daemon in the long run. We can continue to have a
single API and have the docker daemon be the source of metrics. WDYT
@crosbymichael?

On Thu, Nov 13, 2014 at 12:53 PM, Jeremy Eder notifications@github.com
wrote:

Hi @crosbymichael https://github.com/crosbymichael and @monsterzz
https://github.com/monsterzz this is along the lines of what we're
doing with nsinit at the moment. Nice to see it potentially in docker
itself. This patch currently does not work on RHEL-like systems, and I
think @monsterzz https://github.com/monsterzz eluded to the cgroup path
config needing some work, which is fine. I didn't look into it deeply but
hope that can be resolved.

"Metrics": {
    "CpuUsage": "Error: cgroup subsystem 'cpuacct' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
    "MemoryLimit": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
    "MemoryMaxUsage": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'",
    "MemoryUsage": "Error: cgroup subsystem 'memory' directory not found for container 'a7412e396560bcc886009472fdcd0433203598b0c4a11a9dfd4223e33c1c9173'"
},

I would expect a monitoring system will want a global "watch" on all
containers, existing and new. Do you agree ? Basically docker would
continuously pump all it's stats to a websocket at a certain interval, and
the monitoring system would consume at it's own pace.

You could have i.e. " docker metrics --all" for the cli.

Agreed the stat calculation interval should be tunable; if I could suggest
you default to 10s rather than 1s. 1s stat gathering on busy, dense nodes
is a costly burden we should reserve for field-debug/troubleshooting. Even
10s might be too aggressive depending on the business.

The other thing is the patch itself decides on arbitrary names that
closely resemble their cgroup counterparts. Why not get rid of all
ambiguity and use the precise names used by the kernel ?


Reply to this email directly or view it on GitHub
#8886 (comment).

@crosbymichael
Copy link
Contributor

@vishh do you not think the Go scheduler could handle this? As long as this is async I think go should be able to handle the load within one process. We can always break this out into another process later if we find that this is not the case and we are having performance issues.

@tobegit3hub
Copy link

We really need this. Can anyone help to review and merge it 😃

@monsterzz
Copy link
Contributor Author

Unfortunately I have no time to implement push approach at this time. I
think I will have more free time to work on this after Christmas.
On Wed 17 Dec 2014 at 1:02 pm tobe notifications@github.com wrote:

We really need this. Can anyone help to review and merge it [image:
😃]


Reply to this email directly or view it on GitHub
#8886 (comment).

@crosbymichael
Copy link
Contributor

@monsterzz no problem. I'll take care of finishing this feature.

Closing this in favor of #9984

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Get metrics from docker daemon
10 participants