docker stats live container resource metrics #9984

Merged
merged 9 commits into from Jan 21, 2015

@crosbymichael
Member

This PR lets you receive live resource metrics for your containers. You can use the docker stats <containers...> CLI command to get a live, top-like interface. The CLI displays only a few of the metrics available.

docker stats insurgency1 insurgency2 insurgency3 minecraft-family redis

CONTAINER           CPU %               MEM USAGE/LIMIT     MEM %               NET I/O
insurgency1         3.62%               244.4 MB/2.099 GB   11.64%              0 B/0 B
insurgency2         4.65%               135.6 MB/2.099 GB   6.46%               0 B/0 B
insurgency3         3.65%               79.18 MB/2.099 GB   3.77%               0 B/0 B
minecraft-family    14.13%              408.6 MB/2.099 GB   19.47%              0 B/0 B
redis               0.17%               6.558 MB/67.11 MB   9.77%               648 B/648 B

People who need more detail can subscribe to the container's stats stream, which carries additional data such as blkio information.

GET /v1/containers/redis/stats

{
   "read" : "2015-01-08T22:57:31.547920715Z",
   "network" : {
      "rx_dropped" : 0,
      "rx_bytes" : 648,
      "rx_errors" : 0,
      "tx_packets" : 8,
      "tx_dropped" : 0,
      "rx_packets" : 8,
      "tx_errors" : 0,
      "tx_bytes" : 648
   },
   "memory_stats" : {
      "stats" : {
         "total_pgmajfault" : 0,
         "cache" : 0,
         "mapped_file" : 0,
         "total_inactive_file" : 0,
         "pgpgout" : 414,
         "rss" : 6537216,
         "total_mapped_file" : 0,
         "writeback" : 0,
         "unevictable" : 0,
         "pgpgin" : 477,
         "total_unevictable" : 0,
         "pgmajfault" : 0,
         "total_rss" : 6537216,
         "total_rss_huge" : 6291456,
         "total_writeback" : 0,
         "total_inactive_anon" : 0,
         "rss_huge" : 6291456,
         "hierarchical_memory_limit" : 67108864,
         "total_pgfault" : 964,
         "total_active_file" : 0,
         "active_anon" : 6537216,
         "total_active_anon" : 6537216,
         "total_pgpgout" : 414,
         "total_cache" : 0,
         "inactive_anon" : 0,
         "active_file" : 0,
         "pgfault" : 964,
         "inactive_file" : 0,
         "total_pgpgin" : 477
      },
      "max_usage" : 6651904,
      "usage" : 6537216,
      "failcnt" : 0,
      "limit" : 67108864
   },
   "blkio_stats" : {},
   "cpu_stats" : {
      "cpu_usage" : {
         "percpu_usage" : [
            16970827,
            1839451,
            7107380,
            10571290
         ],
         "usage_in_usermode" : 10000000,
         "total_usage" : 36488948,
         "usage_in_kernelmode" : 20000000
      },
      "system_cpu_usage" : 20091722000000000,
      "throttling_data" : {}
   }
}
@vieux vieux commented on an outdated diff Jan 8, 2015
docker/flags.go
@@ -97,6 +97,7 @@ func init() {
{"save", "Save an image to a tar archive"},
{"search", "Search for an image on the Docker Hub"},
{"start", "Start a stopped container"},
+ {"stats", "Receive container stasts"},
@vieux
vieux Jan 8, 2015 Member

typo here

@vieux vieux and 2 others commented on an outdated diff Jan 8, 2015
daemon/execdriver/lxc/driver.go
@@ -524,3 +524,8 @@ func (t *TtyConsole) Close() error {
func (d *driver) Exec(c *execdriver.Command, processConfig *execdriver.ProcessConfig, pipes *execdriver.Pipes, startCallback execdriver.StartCallback) (int, error) {
return -1, ErrExec
}
+
+func (d *driver) Stats(id string) (*execdriver.ResourceStats, error) {
+ return nil, fmt.Errorf("container stats are not support with LXC")
@vieux
vieux Jan 8, 2015 Member

not supported ?

@crosbymichael
crosbymichael Jan 8, 2015 Member

I'll have to see if I can recreate the cgroup hierarchy then we will be able to read out the stats with libcontainer.

@icecrime
icecrime Jan 8, 2015 Member

I think he just meant "typo"

@vieux
vieux Jan 8, 2015 Member

yes, sorry if I wasn't clear

@bobrik
Contributor
bobrik commented Jan 8, 2015

What about showing stats for all running containers if <containers...> is not specified? You can run docker stats $(docker ps -q), but showing stats for everything that is running looks like a sane default.

@vieux
Member
vieux commented Jan 8, 2015

can you go up the exact number of lines and rewrite on top of it, instead of clearing the screen.

(that's how we do for pull)

@crosbymichael
Member

@bobrik I guess it would depend on how many containers you have. You can always do

docker stats `docker ps -q`
@vieux
Member
vieux commented Jan 8, 2015

exited containers aren't handled properly

@crosbymichael
Member

@vieux I think exited containers should get zeroed out; then, when they start again, you start getting metrics. What do you think of that functionality?

@vieux vieux commented on an outdated diff Jan 8, 2015
daemon/stats_collector.go
+ cd.subs = append(cd.subs[:i], cd.subs[i+1:]...)
+ close(ch)
+ }
+ }
+ // if there are no more subscribers then remove the entire container
+ // from collection.
+ if len(cd.subs) == 0 {
+ delete(s.containers, c.ID)
+ }
+ s.m.Unlock()
+}
+
+func (s *statsCollector) start() {
+ go func() {
+ for _ = range time.Tick(s.interval) {
+ log.Debugf("starting collection of container stats")
@vieux
vieux Jan 8, 2015 Member

this debug is printed way too often, I feel like the -D is useless with it.

@icecrime icecrime and 1 other commented on an outdated diff Jan 8, 2015
api/client/commands.go
+ )
+ for {
+ var v *stats.Stats
+ if err := dec.Decode(&v); err != nil {
+ log.Error("decode container stat: %v", err)
+ return
+ }
+ var (
+ memPercent = float64(v.MemoryStats.Usage) / float64(v.MemoryStats.Limit) * 100.0
+ cpuPercent = 0.0
+ )
+ if !start {
+ cpuPercent = calcuateCpuPercent(previousCpu, previousSystem, v)
+ }
+ start = false
+ d := data[name]
@icecrime
icecrime Jan 8, 2015 Member

I'm not familiar with map concurrency guarantees, but is that a safe thing to do without locking?
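For context on the question: Go maps are not safe for concurrent use when at least one goroutine writes, so shared access needs a lock or another form of synchronization. A minimal sketch of a mutex-guarded table follows; the `row` and `table` types are hypothetical stand-ins and are not the types used in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// row is a hypothetical stand-in for the per-container display data.
type row struct {
	Name       string
	CPUPercent float64
}

// table guards a map with a mutex so several goroutines can update it.
type table struct {
	mu   sync.Mutex
	rows map[string]row
}

func newTable() *table {
	return &table{rows: make(map[string]row)}
}

// update applies fn to the row for name while holding the lock.
func (t *table) update(name string, fn func(*row)) {
	t.mu.Lock()
	defer t.mu.Unlock()
	r := t.rows[name]
	r.Name = name
	fn(&r)
	t.rows[name] = r
}

// get returns a copy of the row for name and whether it exists.
func (t *table) get(name string) (row, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	r, ok := t.rows[name]
	return r, ok
}

func main() {
	tbl := newTable()
	var wg sync.WaitGroup
	// One goroutine per container, all writing to the same table.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			name := fmt.Sprintf("c%d", i)
			tbl.update(name, func(r *row) { r.CPUPercent = float64(i) })
		}(i)
	}
	wg.Wait()
	r, _ := tbl.get("c3")
	fmt.Println(r.Name, r.CPUPercent)
}
```

Keeping the read-modify-write inside a single locked method avoids the window where one goroutine reads a row, another replaces it, and the first writes back stale data.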

@icecrime icecrime commented on an outdated diff Jan 8, 2015
api/stats/stats.go
+ "time"
+
+ "github.com/docker/libcontainer"
+ "github.com/docker/libcontainer/cgroups"
+)
+
+type ThrottlingData struct {
+ // Number of periods with throttling active
+ Periods uint64 `json:"periods,omitempty"`
+ // Number of periods when the container hit its throttling limit.
+ ThrottledPeriods uint64 `json:"throttled_periods,omitempty"`
+ // Aggregate time the container was throttled for in nanoseconds.
+ ThrottledTime uint64 `json:"throttled_time,omitempty"`
+}
+
+// All CPU stats are aggregate since container inception.
@icecrime
icecrime Jan 8, 2015 Member

s/aggregate/aggregated

@vieux
Member
vieux commented Jan 8, 2015

@crosbymichael I mean the cli is hanging and waiting for

@icecrime icecrime commented on an outdated diff Jan 8, 2015
daemon/delete.go
@@ -49,6 +49,9 @@ func (daemon *Daemon) ContainerRm(job *engine.Job) engine.Status {
}
if container != nil {
+ // stop collection of stats for the container reguardless
@icecrime
icecrime Jan 8, 2015 Member

s/reguardless/regardless/

@icecrime icecrime and 1 other commented on an outdated diff Jan 8, 2015
daemon/execdriver/native/driver.go
sync.Mutex
}
-func NewDriver(root, initPath string) (*driver, error) {
+func NewDriver(root, initPath string, machineMemory int64) (*driver, error) {
@icecrime
icecrime Jan 8, 2015 Member

Any reason to make machineMemory an argument rather than retrieving it here?

@crosbymichael
crosbymichael Jan 8, 2015 Member

No good reason

@vieux
Member
vieux commented Jan 8, 2015

I got a panic when trying to delete a container:

DEBU[0010] Calling DELETE /containers/{name:.*}
INFO[0010] DELETE /v1.16/containers/b0808d936b19?force=1
INFO[0010] +job rm(b0808d936b19)
INFO[0010] -job rm(b0808d936b19)
2015/01/08 23:28:30 http: panic serving @: runtime error: invalid memory address or nil pointer dereference
goroutine 42 [running]:
net/http.func·011()
    /usr/local/go/src/net/http/server.go:1130 +0xbb
github.com/docker/docker/daemon.(*statsCollector).stopCollection(0xc20822b780, 0xc208035ba0)
    /go/src/github.com/docker/docker/daemon/stats_collector.go:66 +0x141
github.com/docker/docker/daemon.(*Daemon).ContainerRm(0xc2080dbad0, 0xc208031100, 0x2)
    /go/src/github.com/docker/docker/daemon/delete.go:54 +0x7a2
github.com/docker/docker/daemon.*Daemon.ContainerRm·fm(0xc208031100, 0x7ffbfc292270)
    /go/src/github.com/docker/docker/daemon/daemon.go:118 +0x31
github.com/docker/docker/engine.(*Job).Run(0xc208031100, 0x0, 0x0)
    /go/src/github.com/docker/docker/engine/job.go:83 +0x936
github.com/docker/docker/api/server.deleteContainers(0xc2080db4a0, 0xc2081e9f89, 0x4, 0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10, 0xc208144570, 0x0, 0x0)
    /go/src/github.com/docker/docker/api/server/server.go:768 +0x39e
github.com/docker/docker/api/server.func·002(0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10)
    /go/src/github.com/docker/docker/api/server/server.go:1243 +0x940
net/http.HandlerFunc.ServeHTTP(0xc2080563c0, 0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10)
    /usr/local/go/src/net/http/server.go:1265 +0x41
github.com/gorilla/mux.(*Router).ServeHTTP(0xc208095220, 0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10)
    /go/src/github.com/docker/docker/vendor/src/github.com/gorilla/mux/mux.go:98 +0x2b9
net/http.serverHandler.ServeHTTP(0xc208054fc0, 0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10)
    /usr/local/go/src/net/http/server.go:1703 +0x19a
net/http.(*conn).serve(0xc208202e60)
    /usr/local/go/src/net/http/server.go:1204 +0xb57
created by net/http.(*Server).Serve
    /usr/local/go/src/net/http/server.go:1751 +0x35e
@crosbymichael
Member

Fixed the panic

@crosbymichael
Member

@vieux Fixed the issue where you request stats for a non-running container: it will go ahead and add it, and when the container is started you see the stats start flowing in.

@bfirsh bfirsh added the UX label Jan 9, 2015
@tonistiigi tonistiigi commented on an outdated diff Jan 12, 2015
api/client/commands.go
+ d.MemoryLimit = float64(v.MemoryStats.Limit)
+ d.MemoryPercentage = memPercent
+ d.NetworkRx = float64(v.Network.RxBytes)
+ d.NetworkTx = float64(v.Network.TxBytes)
+ data[name] = d
+ m.Unlock()
+
+ previousCpu = v.CpuStats.CpuUsage.TotalUsage
+ previousSystem = v.CpuStats.SystemUsage
+ }
+ return nil
+}
+
+func calcuateCpuPercent(previousCpu, previousSystem uint64, v *stats.Stats) float64 {
+ cpuPercent := 0.0
+ cpuDelta := float64(v.CpuStats.CpuUsage.TotalUsage) - float64(previousCpu)
@tonistiigi
tonistiigi Jan 12, 2015 Contributor

Check previousCpu < TotalUsage, otherwise negative numbers appear on container restart.

@tonistiigi tonistiigi commented on an outdated diff Jan 12, 2015
daemon/stats_collector.go
+ }
+ delete(s.containers, id)
+ continue
+ }
+ stats.SystemUsage = systemUsage
+ for _, sub := range s.containers[id].subs {
+ sub <- stats
+ }
+ }
+ s.m.Unlock()
+ }
+ }()
+}
+
+// getSystemdCpuUSage returns the host system's cpu usage
+// in nanoseconds.
@tonistiigi
tonistiigi Jan 12, 2015 Contributor

Seems to return nanoseconds * CLK_TCK. Also, s/USage/Usage/

@tonistiigi tonistiigi commented on an outdated diff Jan 12, 2015
daemon/stats_collector.go
+ return 0, err
+ }
+ defer f.Close()
+ sc := bufio.NewScanner(f)
+ for sc.Scan() {
+ parts := strings.Fields(sc.Text())
+ switch parts[0] {
+ case "cpu":
+ if len(parts) < 8 {
+ return 0, fmt.Errorf("invalid number of cpu fields")
+ }
+ var total uint64
+ for _, i := range parts[1:8] {
+ v, err := strconv.ParseUint(i, 10, 64)
+ if err != nil {
+ return 0.0, fmt.Errorf("Unable to convert value %s to int: %s", i, err)
@tonistiigi
tonistiigi Jan 12, 2015 Contributor

0.0 ?

@tonistiigi tonistiigi commented on an outdated diff Jan 12, 2015
daemon/stats_collector.go
+ for sc.Scan() {
+ parts := strings.Fields(sc.Text())
+ switch parts[0] {
+ case "cpu":
+ if len(parts) < 8 {
+ return 0, fmt.Errorf("invalid number of cpu fields")
+ }
+ var total uint64
+ for _, i := range parts[1:8] {
+ v, err := strconv.ParseUint(i, 10, 64)
+ if err != nil {
+ return 0.0, fmt.Errorf("Unable to convert value %s to int: %s", i, err)
+ }
+ total += v
+ }
+ return total * 1000000000, nil
@tonistiigi
tonistiigi Jan 12, 2015 Contributor

1e9 is more readable

@tonistiigi tonistiigi commented on an outdated diff Jan 12, 2015
daemon/stats_collector.go
+func (s *statsCollector) getSystemCpuUsage() (uint64, error) {
+ f, err := os.Open("/proc/stat")
+ if err != nil {
+ return 0, err
+ }
+ defer f.Close()
+ sc := bufio.NewScanner(f)
+ for sc.Scan() {
+ parts := strings.Fields(sc.Text())
+ switch parts[0] {
+ case "cpu":
+ if len(parts) < 8 {
+ return 0, fmt.Errorf("invalid number of cpu fields")
+ }
+ var total uint64
+ for _, i := range parts[1:8] {
@tonistiigi
tonistiigi Jan 12, 2015 Contributor

Are you deliberately ignoring the rest of the values? Why?
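The review points on this function (the unit of the return value, the readable constant, the fields being summed) can be seen together in a small sketch. This is not the PR's final code. It mirrors the diff by summing only the first seven fields of the aggregate "cpu" line, leaving open the question of whether the remaining fields belong in the total. The function name, the `ticksPerSecond` parameter and the sample input are assumptions for illustration; a real caller would read /proc/stat and take the tick rate from the system.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

const nanosPerSecond uint64 = 1e9

// systemCPUNanos sums the first seven fields of the aggregate "cpu" line
// in procStat and converts the total from clock ticks to nanoseconds.
// ticksPerSecond is assumed to be supplied by the caller (100 is common).
func systemCPUNanos(procStat string, ticksPerSecond uint64) (uint64, error) {
	if ticksPerSecond == 0 {
		return 0, fmt.Errorf("ticksPerSecond must be positive")
	}
	sc := bufio.NewScanner(strings.NewReader(procStat))
	for sc.Scan() {
		parts := strings.Fields(sc.Text())
		if len(parts) == 0 || parts[0] != "cpu" {
			continue // skips per-CPU lines such as "cpu0" and everything else
		}
		if len(parts) < 8 {
			return 0, fmt.Errorf("invalid number of cpu fields: %d", len(parts)-1)
		}
		var ticks uint64
		for _, field := range parts[1:8] {
			v, err := strconv.ParseUint(field, 10, 64)
			if err != nil {
				return 0, fmt.Errorf("unable to convert value %q to uint: %v", field, err)
			}
			ticks += v
		}
		// Divide first so the multiplication is less likely to overflow.
		return ticks * (nanosPerSecond / ticksPerSecond), nil
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	return 0, fmt.Errorf("no aggregate cpu line found")
}

func main() {
	// Illustrative input only; not taken from a real host.
	const sampleStat = "cpu  10 20 30 40 50 60 70 80 90 100\ncpu0 5 10 15 20 25 30 35 40 45 50\n"
	ns, err := systemCPUNanos(sampleStat, 100)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ns)
}
```

Returning a plain integer 0 on every error path also sidesteps the `0.0` literal questioned above.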

@tonistiigi tonistiigi commented on an outdated diff Jan 12, 2015
api/client/commands.go
+ d.NetworkTx = float64(v.Network.TxBytes)
+ data[name] = d
+ m.Unlock()
+
+ previousCpu = v.CpuStats.CpuUsage.TotalUsage
+ previousSystem = v.CpuStats.SystemUsage
+ }
+ return nil
+}
+
+func calcuateCpuPercent(previousCpu, previousSystem uint64, v *stats.Stats) float64 {
+ cpuPercent := 0.0
+ cpuDelta := float64(v.CpuStats.CpuUsage.TotalUsage) - float64(previousCpu)
+ systemDelta := float64(int(v.CpuStats.SystemUsage)/v.ClockTicks) - float64(int(previousSystem)/v.ClockTicks)
+ if systemDelta > 0.0 {
+ cpuPercent = (cpuDelta / systemDelta) * float64(v.ClockTicks*len(v.CpuStats.CpuUsage.PercpuUsage))
@tonistiigi
tonistiigi Jan 12, 2015 Contributor

It seems to me that v.ClockTicks is used here only to turn the ratio into a percentage, so for clarity it should be the constant 100 instead.
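Taken together with the earlier note about negative numbers on container restart, the calculation could look roughly like the sketch below. It assumes the container and system counters are cumulative and expressed in the same unit (nanoseconds here); the function signature and parameter names are illustrative and are not the ones in the diff.

```go
package main

import "fmt"

// cpuPercent sketches the percentage calculation discussed above.
// All counter arguments are cumulative and assumed to share one unit.
func cpuPercent(prevCPU, curCPU, prevSystem, curSystem uint64, numCPUs int) float64 {
	// Counters start over when a container restarts. Treat a counter that
	// went backwards (or a system counter that did not advance) as "no data"
	// so the result never goes negative and never divides by zero.
	if curCPU < prevCPU || curSystem <= prevSystem {
		return 0
	}
	cpuDelta := float64(curCPU - prevCPU)
	systemDelta := float64(curSystem - prevSystem)
	// 100.0 converts the ratio to a percentage; numCPUs scales it so that
	// one fully busy core on a four-core host reads as 100%.
	return cpuDelta / systemDelta * float64(numCPUs) * 100.0
}

func main() {
	fmt.Printf("%.2f%%\n", cpuPercent(0, 25, 0, 400, 4))
	fmt.Printf("%.2f%%\n", cpuPercent(100, 10, 0, 400, 4)) // counter went backwards
}
```

Doing the subtraction on the unsigned values only after the guard avoids wrap-around, and it keeps the float conversion to a single place.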

@tonistiigi tonistiigi commented on an outdated diff Jan 12, 2015
api/client/commands.go
+ }
+ w.Flush()
+ }
+ return nil
+}
+
+func (cli *DockerCli) streamStats(name string, data map[string]containerStats, m *sync.Mutex) error {
+ m.Lock()
+ data[name] = containerStats{
+ Name: name,
+ }
+ m.Unlock()
+
+ stream, _, err := cli.call("GET", "/containers/"+name+"/stats", nil, false)
+ if err != nil {
+ return err
@tonistiigi
tonistiigi Jan 12, 2015 Contributor

We should probably at least do delete(data, name). There is no point in leaving an empty row in the output in case of an error. Ideally the whole command would error when there are no valid containers.

@tonistiigi
Contributor

When I run processes inside the container with docker exec the memory and network is accounted for in docker stats but CPU is not. Not sure if exec error instead.

edit: made a new issue for this #10046

@jessfraz jessfraz added this to the 1.5.0 milestone Jan 12, 2015
@SvenDowideit
Collaborator

Is this the beginning of #9130 or a replacement for #8886?

@crosbymichael
Member

Updated docs and everything like that

@jessfraz
Contributor

Here's a binary if you wanna try it out:

http://jesss.s3.amazonaws.com/docker/pr9984/docker
http://jesss.s3.amazonaws.com/docker/pr9984/docker-1.4.1-dev
http://jesss.s3.amazonaws.com/docker/pr9984/docker-1.4.1-dev.md5
http://jesss.s3.amazonaws.com/docker/pr9984/docker-1.4.1-dev.sha256


@jessfraz
Contributor

Those are linux amd64


@LK4D4 LK4D4 and 1 other commented on an outdated diff Jan 20, 2015
daemon/stats_collector.go
+func (s *statsCollector) stopCollection(c *Container) {
+ s.m.Lock()
+ if publisher, exists := s.publishers[c]; exists {
+ publisher.Close()
+ delete(s.publishers, c)
+ }
+ s.m.Unlock()
+}
+
+// unsubscribe removes a specific subscriber from receiving updates for a container's stats.
+func (s *statsCollector) unsubscribe(c *Container, ch chan interface{}) {
+ s.m.Lock()
+ publisher := s.publishers[c]
+ if publisher != nil {
+ publisher.Evict(ch)
+ }
@LK4D4
LK4D4 Jan 20, 2015 Contributor

I wonder if it makes sense to stop collection if we evict the last listener.

@crosbymichael
crosbymichael Jan 20, 2015 Member

Maybe, maybe not. I'm not sure. We could, or it should be pretty cheap to leave it in case that listener reconnects.

@LK4D4
LK4D4 Jan 20, 2015 Contributor

I don't know, it seems like Stats is not so cheap (read file, deserialize, create pretty big structures) to do without purpose. Creating a new publisher is much cheaper.
Up to you though.

@LK4D4 LK4D4 commented on the diff Jan 20, 2015
pkg/pubsub/publisher.go
+
+// NewPublisher creates a new pub/sub publisher to broadcast messages.
+// The duration is used as the send timeout as to not block the publisher publishing
+// messages to other clients if one client is slow or unresponsive.
+// The buffer is used when creating new channels for subscribers.
+func NewPublisher(publishTimeout time.Duration, buffer int) *Publisher {
+ return &Publisher{
+ buffer: buffer,
+ timeout: publishTimeout,
+ subscribers: make(map[subscriber]struct{}),
+ }
+}
+
+type subscriber chan interface{}
+
+type Publisher struct {
@LK4D4
LK4D4 Jan 20, 2015 Contributor

Now I think that Len() method can be useful for my comment above.

@LK4D4
Contributor
LK4D4 commented Jan 20, 2015

I still wanna see an integration-cli test for the CLI or the API. At least this code should be called in one way or another, just to be sure that it's not panicking :) We can write a simple API test where we just unmarshal the first message from the stream and check that there are some fields and that the status code is 200.

@crosbymichael
Member

I added an integration test to this.

@crosbymichael
Member

@LK4D4 I removed the publisher if publisher.Len() == 0

@icecrime
Member

LGTM

@jessfraz
Contributor

LGTM

@tobegit3hub

We're really looking forward to this feature 👍

@cpuguy83 cpuguy83 commented on an outdated diff Jan 21, 2015
daemon/stats_collector.go
+ stats, err := container.Stats()
+ if err != nil {
+ if err != execdriver.ErrNotRunning {
+ log.Errorf("collecting stats for %s: %v", container.ID, err)
+ }
+ continue
+ }
+ stats.SystemUsage = systemUsage
+ publisher.Publish(stats)
+ }
+ }
+}
+
+const nanoSeconds = 1e9
+
+// getSystemdCpuUSage returns the host system's cpu usage in nanoseconds
@cpuguy83
cpuguy83 Jan 21, 2015 Contributor

Systemd?

crosbymichael and others added some commits Jan 7, 2015
@crosbymichael crosbymichael Implement container stats collection in daemon
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
65f58e2
@crosbymichael crosbymichael Implement client side display for stats
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2640a10
@crosbymichael crosbymichael Evict stopped containers
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
4f174aa
@LK4D4 @crosbymichael LK4D4 Refactor cli for stats
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
cc65880
@crosbymichael crosbymichael Refactor usage calc for CPU and system usage
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2d4fc1d
@crosbymichael crosbymichael Add pubsub package to handle robust publisher
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2f46b76
@crosbymichael crosbymichael Add documentation for stats feature
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
76141a0
@crosbymichael crosbymichael Remove publisher if no one is listening
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
217a2bd
@crosbymichael
Member

Rebased again to make the drone gods happy

@jessfraz
Contributor

ping @SvenDowideit @fredlf for docs

@tiborvass
Contributor

LGTM

@SvenDowideit SvenDowideit commented on an outdated diff Jan 21, 2015
docs/man/docker-stats.1.md
@@ -0,0 +1,32 @@
+% DOCKER(1) Docker User Manuals
+% Docker Community
+% JUNE 2014
+# NAME
+docker-stats - Display live container stats based on resource usage.
+
+# SYNOPSIS
+**docker top**
@SvenDowideit
SvenDowideit Jan 21, 2015 Collaborator

I don't think you mean to spell docker top here?

@SvenDowideit SvenDowideit commented on an outdated diff Jan 21, 2015
docs/man/docker-stats.1.md
+
+# OPTIONS
+**--help**
+ Print usage statement
+
+# EXAMPLES
+
+Run **docker stats** with multiple containers.
+
+ $ sudo docker stats redis1 redis2
+ CONTAINER CPU % MEM USAGE/LIMIT MEM % NET I/O
+ redis1 0.07% 796 KiB/64 MiB 1.21% 788 B/648 B
+ redis2 0.07% 2.746 MiB/64 MiB 4.29% 1.266 KiB/648 B
+
+# HISTORY
+April 2014, Originally compiled by William Henry (whenry at redhat dot com)
@SvenDowideit
SvenDowideit Jan 21, 2015 Collaborator

oh go on, tell the truth - who wrote this? :)

@SvenDowideit
Collaborator

small niggles, once fixed, LGTM - @fredlf @jamtur01

@crosbymichael crosbymichael Exit cli when all containers when no more containers to monitor
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
4b17319
@icecrime icecrime merged commit d8a1cbe into docker:master Jan 21, 2015

1 check passed

default The build succeeded on drone.io
@crosbymichael crosbymichael deleted the crosbymichael:metrics branch Jan 21, 2015
@fredlf
Contributor
fredlf commented Jan 21, 2015

This was closed with only one docs LGTM.

@icecrime
Member

Oh god, my bad @fredlf... Please add your comments on this PR, I'll fix whatever needs fixing in a new PR.

@fredlf fredlf commented on the diff Jan 21, 2015
docker/flags.go
@@ -98,6 +98,7 @@ func init() {
{"save", "Save an image to a tar archive"},
{"search", "Search for an image on the Docker Hub"},
{"start", "Start a stopped container"},
+ {"stats", "Display live container stats based on resource usage"},
@fredlf
fredlf Jan 21, 2015 Contributor

Misplaced modifier made this a little ambiguous/unclear. Suggest: "Display a live stream of a container's resource usage statistics"

@icecrime
icecrime Jan 21, 2015 Member

Considering that stats accepts multiple container IDs as arguments, would the following work for you?

"Display a live stream of one or more containers resource usage statistics"

@fredlf
fredlf Jan 21, 2015 Contributor

Yes, perfect. Don't forget the possessive: containers'

@fredlf fredlf commented on the diff Jan 21, 2015
docs/sources/reference/api/docker_remote_api.md
@@ -68,6 +68,12 @@ New endpoint to rename a container `id` to a new name.
(`ReadonlyRootfs`) can be passed in the host config to mount the container's
root filesystem as read only.
+`GET /containers/(id)/stats`
+
+**New!**
+This endpoint returns a stream of container stats based on resource usage.
@fredlf
fredlf Jan 21, 2015 Contributor

clarify to "This endpoint returns a live stream of a container's resource usage statistics."

@fredlf fredlf commented on the diff Jan 21, 2015
docs/sources/reference/api/docker_remote_api_v1.17.md
@@ -514,6 +514,94 @@ Status Codes:
- **404** – no such container
- **500** – server error
+### Get container stats based on resource usage
+
+`GET /containers/(id)/stats`
+
+Returns a stream of json objects of the container's stats
@fredlf
fredlf Jan 21, 2015 Contributor

Returns a live stream of JSON objects describing a container's resource usage.

@fredlf fredlf commented on the diff Jan 21, 2015
docs/sources/reference/commandline/cli.md
@@ -2001,8 +2001,28 @@ more details on finding shared images from the command line.
-a, --attach=false Attach container's STDOUT and STDERR and forward all signals to the process
-i, --interactive=false Attach container's STDIN
-When run on a container that has already been started,
-takes no action and succeeds unconditionally.
+## stats
+
+ Usage: docker stats [CONTAINERS]
+
+ Display live container stats based on resource usage
@fredlf
fredlf Jan 21, 2015 Contributor

As above.

@fredlf fredlf commented on the diff Jan 21, 2015
docs/sources/reference/commandline/cli.md
@@ -2001,8 +2001,28 @@ more details on finding shared images from the command line.
-a, --attach=false Attach container's STDOUT and STDERR and forward all signals to the process
-i, --interactive=false Attach container's STDIN
-When run on a container that has already been started,
-takes no action and succeeds unconditionally.
+## stats
+
+ Usage: docker stats [CONTAINERS]
+
+ Display live container stats based on resource usage
+
+ --help=false Print usage
+
+Running `docker stats` on two redis containers
@fredlf
fredlf Jan 21, 2015 Contributor

Redis

@fredlf fredlf commented on the diff Jan 21, 2015
docs/sources/reference/commandline/cli.md
+
+ Display live container stats based on resource usage
+
+ --help=false Print usage
+
+Running `docker stats` on two redis containers
+
+ $ sudo docker stats redis1 redis2
+ CONTAINER CPU % MEM USAGE/LIMIT MEM % NET I/O
+ redis1 0.07% 796 KiB/64 MiB 1.21% 788 B/648 B
+ redis2 0.07% 2.746 MiB/64 MiB 4.29% 1.266 KiB/648 B
+
+
+When run on running containers live container stats will be streamed
+back and displayed to the client. Stopped containers will not
+receive any updates to their stats unless the container is started again.
@fredlf
fredlf Jan 21, 2015 Contributor

The docker stats command will only return a live stream of data for running containers. Stopped containers will not return any data.

@fredlf fredlf commented on the diff Jan 21, 2015
docs/sources/reference/commandline/cli.md
+ --help=false Print usage
+
+Running `docker stats` on two redis containers
+
+ $ sudo docker stats redis1 redis2
+ CONTAINER CPU % MEM USAGE/LIMIT MEM % NET I/O
+ redis1 0.07% 796 KiB/64 MiB 1.21% 788 B/648 B
+ redis2 0.07% 2.746 MiB/64 MiB 4.29% 1.266 KiB/648 B
+
+
+When run on running containers live container stats will be streamed
+back and displayed to the client. Stopped containers will not
+receive any updates to their stats unless the container is started again.
+
+> **Note:**
+> If you want more in depth resource usage for a container use the API endpoint
@fredlf
fredlf Jan 21, 2015 Contributor

If you want more detailed information about a container's resource usage, use the API endpoint.

@fredlf
Contributor
fredlf commented Jan 21, 2015

No worries, @icecrime . Needs a few copy edits, otherwise LGTM.

@HuKeping
Contributor

really love this feature!!! +1 +1 +1

@HuKeping
Contributor

Hi @crosbymichael, can we get just a single snapshot at the moment docker stats container_id is executed? For now it keeps refreshing the screen, or am I missing something?

@LK4D4
Contributor
LK4D4 commented Jan 23, 2015

@HuKeping Yes, you're right, it is like top.

@HuKeping
Contributor

People may use tools such as the Chrome extension Advanced REST client or the Firefox extension RESTClient to get real-time info about a container, but the continuous output may block processing.

Is it possible to add a flag that makes docker stats do its job only once?

@LK4D4
Contributor
LK4D4 commented Jan 23, 2015

Makes sense to me. Maybe we can adopt the behaviour of dstat with --count

@HuKeping
Contributor

I appreciate that, thanks.

@tobegit3hub

We're waiting for the feature like --count, too. The remote could provide the parameter to get the current stats.

@crosbymichael
Member

@tobegit3hub what would count do?

@tobegit3hub

@crosbymichael It's like what @LK4D4 said. We may want to get the one-time status of containers.

@ekuric
Contributor
ekuric commented Jan 29, 2015

I noticed that if I run
$ docker stats $(docker ps -q) and then start a new container in a second console, the view above is not automatically refreshed to include container(s) started after
$ docker stats $(docker ps -q) was started.
Tested with the binaries attached earlier by @jfrazelle

@cpuguy83
Contributor

@ekuric This is expected: $(docker ps -q) is only run once, so only those IDs are passed.

@ekuric
Contributor
ekuric commented Jan 29, 2015

@cpuguy83 ah, yes, thank you, sorry for the noise

@gbelur
gbelur commented Feb 11, 2015

Thanks! This is a great addition. Are there plans to let users control the frequency with which they can extract stats data from the api endpoint? 1 second may be too aggressive when there are a lot of containers running on a host.

@ghost
ghost commented Feb 11, 2015

The CPU usage % seems to be stuck at 0% for me. Being able to monitor the growing size of a running container and its disk I/O would also be very useful.

@cicoub13

Thanks for this new feature. Works like a charm

@crosbymichael
Member

@UserTaken if you want disk IO for monitoring, hit the API. The CLI is deliberately pared down to give a quick and simple view. If you are using this for monitoring, use the API.

@tehmaspc
Contributor

very nice stuff!

@coulix coulix referenced this pull request in DataDog/dd-agent Feb 12, 2015
Closed

[docker] TLS support in docker connection / docker-py #1299

@alicek106

I'm also waiting for an option like --count, or something that just captures the stats once. It would be very useful when fetching stats remotely.

@lars2893

+1 to the count option. I'd love to create a New Relic plugin for this data, but there is a lot of extra work I have to do if I don't have control over the polling interval (plus consuming a continuous stream is a much more involved process)

@tobegit3hub

For me, I would like to get the current stats of the containers and no need to return the whole stream. It would be great to support this feature 😃

@alicek106

I agree :)

@SamSaffron

+1 for a simple way of just getting current stats as opposed to subscribing and disconnecting

@SamSaffron

@SvenDowideit API docs need a bit more info on what the cpu numbers mean and how you go about converting them to % cpu

@bobrik
Contributor
bobrik commented Feb 14, 2015

+1 for just getting current numbers without subscription. @crosbymichael I can create a separate issue.

@kenzodeluxe

+1 :)

@noisy
noisy commented Feb 16, 2015

👍

@sprin
sprin commented Feb 25, 2015

Stats are nice. Thanks.

Here's a one-liner that does what you would expect docker stats to do:

docker stats `docker ps | tail -n+2 | awk '{ print $NF }'`

However, the flicker is distinctly unpleasant.

@jovandeginste

I love and appreciate the concept and current possibilities, however I have some thoughts:

I don't understand why it is repeating (or why it doesn't have the count flag). I could eg. use 'watch' and its features to show me a constant update if I want that.

I also find it hard to imagine writing a simple script to parse the json data if it has to be running forever. I'd rather query once a minute and send those stats to graphite (just an example), but I don't see how I could accomplish this in a simple way (someone has any thoughts?)

I would also appreciate the presence of an 'all' option, especially for the json-part (where you would have to run several queries to get all container data)

@km4rcus
km4rcus commented Mar 20, 2015

+1 for a count option.

@thaJeztah
Member

Please stop adding new feature requests to a closed/merged PR. If you have an enhancement/feature request, open a new issue.

For those looking for a "count" option: there is a PR that implements a --no-stream flag, which returns the stats only once: Allow pulling stats once and disconnecting. #10766
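Until such a flag is available, the "subscribe, take one sample, disconnect" approach mentioned earlier in the thread can be done on the client side. The sketch below is only an illustration of that workaround and says nothing about how --no-stream is implemented. The test server, its `seq` field and the function names are all made up so the example runs without a Docker daemon.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newTickingServer starts a stand-in for a streaming stats endpoint. It
// emits one small JSON object per tick until the client goes away.
func newTickingServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		enc := json.NewEncoder(w)
		for seq := 0; ; seq++ {
			if err := enc.Encode(map[string]int{"seq": seq}); err != nil {
				return
			}
			if f, ok := w.(http.Flusher); ok {
				f.Flush() // push each object to the client immediately
			}
			select {
			case <-r.Context().Done():
				return // client disconnected
			case <-time.After(10 * time.Millisecond):
			}
		}
	}))
}

// fetchOnce reads exactly one JSON object from the stream at url and then
// closes the response body, which drops the connection.
func fetchOnce(url string) (map[string]interface{}, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var sample map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&sample); err != nil {
		return nil, err
	}
	return sample, nil
}

func main() {
	srv := newTickingServer()
	defer srv.Close()

	sample, err := fetchOnce(srv.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sample["seq"])
}
```

A poller built this way controls its own interval by deciding how often to call fetchOnce, at the cost of a new connection per sample.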
