Docker stats returns "0" despite valid cgroup data. #43387

Open
Destarianon opened this issue Mar 17, 2022 · 3 comments
Labels
kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed.

Comments

Destarianon commented Mar 17, 2022

Description
Docker stats on my fresh Ubuntu 20.04.4 installation all report 0. I have reproduced this on 6 servers so far, all showing the same issue. I have confirmed the cgroup settings and manually edited my GRUB configuration based on recommendations found elsewhere, with no success. I don't have any other compute environment to test this on, but I have successfully deployed this on AMD64 previously without issues. Any help would be really appreciated; I am at an absolute loss.

Steps to reproduce the issue:

  1. Install Ubuntu Server 20.04.4 on VMware ESXi 7.0
  2. Install docker-ce using the "convenience script" (curl -sSL https://get.docker.com/ | CHANNEL=stable bash)
  3. Start a container
  4. Run "docker stats"

Describe the results you received:
All stats are 0.

CONTAINER ID   NAME                                   CPU %     MEM USAGE / LIMIT   MEM %     NET I/O   BLOCK I/O   PIDS
de09efeb2eb8   8a388389-a1fd-41b3-a98d-bd3c2dc90f3b   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
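
To spot this symptom across several hosts without reading the table by eye, the plain-text output above can be parsed. This is a minimal sketch: it assumes the default `docker stats` table layout shown here, with columns separated by two or more spaces. The helper names `parse_stats_table` and `is_all_zero` are hypothetical, not part of any Docker tooling.

```python
import re

# Values that the table shows when no metrics were collected for a column.
ZERO_VALUES = {"0.00%", "0B / 0B", "0"}


def parse_stats_table(text):
    """Parse the default `docker stats` table into a list of dicts.

    Assumes columns are separated by two or more spaces, as in the
    output pasted in this issue.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for ln in lines[1:]:
        cells = re.split(r"\s{2,}", ln.strip())
        rows.append(dict(zip(header, cells)))
    return rows


def is_all_zero(row):
    """Return True when every metric column of a row is a zero value."""
    metrics = [v for k, v in row.items() if k not in ("CONTAINER ID", "NAME")]
    return all(v in ZERO_VALUES for v in metrics)


if __name__ == "__main__":
    sample = (
        "CONTAINER ID   NAME   CPU %     MEM USAGE / LIMIT   MEM %     NET I/O   BLOCK I/O   PIDS\n"
        "de09efeb2eb8   demo   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0\n"
    )
    for row in parse_stats_table(sample):
        print(row["CONTAINER ID"], "all zero:", is_all_zero(row))
```

A container that is merely idle can legitimately show 0.00% CPU, so the check only flags rows where every metric, including PIDS, is zero.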

Describe the results you expected:
I expected to see valid results reported.

Additional information you deem important (e.g. issue happens only occasionally):
Occurs on all containers at all times.

Output of docker version:

Client: Docker Engine - Community
 Version:           20.10.13
 API version:       1.41
 Go version:        go1.16.15
 Git commit:        a224086
 Built:             Thu Mar 10 14:07:51 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.13
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.15
  Git commit:       906f57f
  Built:            Thu Mar 10 14:05:44 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.5.10
  GitCommit:        2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc:
  Version:          1.0.3
  GitCommit:        v1.0.3-0-gf46b6ba
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Output of docker info:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.0-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 5
  Running: 0
  Paused: 0
  Stopped: 5
 Images: 11
 Server Version: 20.10.13
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc version: v1.0.3-0-gf46b6ba
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-104-generic
 Operating System: Ubuntu 20.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 62.81GiB
 Name: wing1
 ID: 3ACN:M4HV:YB2E:MLCK:25JW:QVEN:LBTW:TAEG:VBTC:EM2A:7JPW:N6R3
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):
This is running on ESXi 7.0 on an AMD Threadripper Pro. The only customization made to the VM settings beyond VMware's 'ubuntu 64-bit' defaults was switching from BIOS to EFI.

Grub config:

GRUB_CMDLINE_LINUX_DEFAULT="cgroup_enable=cpuset cgroup_enable=memory swapaccount=1"

cat /proc/cgroups

#subsys_name	hierarchy	num_cgroups	enabled
cpuset	7	2	1
cpu	3	55	1
cpuacct	3	55	1
blkio	8	55	1
memory	4	102	1
devices	6	55	1
freezer	11	3	1
net_cls	10	2	1
perf_event	9	2	1
net_prio	10	2	1
hugetlb	12	2	1
pids	2	59	1
rdma	5	1	1
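
For anyone comparing hosts, the `/proc/cgroups` table above can be checked programmatically. The sketch below parses the four whitespace-separated columns and reports any controller that is not enabled. The `required` tuple is an assumption about which cgroup v1 controllers are relevant to the stats columns, and both function names are hypothetical.

```python
def parse_proc_cgroups(text):
    """Parse /proc/cgroups content into {controller: info} form."""
    controllers = {}
    for line in text.splitlines():
        # Skip blank lines and the '#subsys_name ...' header.
        if not line.strip() or line.startswith("#"):
            continue
        name, hierarchy, num_cgroups, enabled = line.split()
        controllers[name] = {
            "hierarchy": int(hierarchy),
            "num_cgroups": int(num_cgroups),
            "enabled": enabled == "1",
        }
    return controllers


def missing_controllers(controllers,
                        required=("cpu", "cpuacct", "memory", "blkio", "pids")):
    """Return the required controllers that are absent or disabled.

    The default `required` tuple is an assumption for illustration.
    """
    return [name for name in required
            if not controllers.get(name, {}).get("enabled", False)]


if __name__ == "__main__":
    with open("/proc/cgroups") as fh:
        print(missing_controllers(parse_proc_cgroups(fh.read())))
```

On the output pasted above this would report nothing missing, which matches the observation that the cgroup setup itself looks fine.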

Output from check-config.sh script:

warning: /proc/config.gz does not exist, searching other paths for kernel config ...
info: reading kernel config from /boot/config-5.4.0-104-generic ...

Generally Necessary:
- cgroup hierarchy: properly mounted [/sys/fs/cgroup]
- apparmor: enabled and tools installed
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: enabled
- CONFIG_KEYS: enabled
- CONFIG_VETH: enabled (as module)
- CONFIG_BRIDGE: enabled (as module)
- CONFIG_BRIDGE_NETFILTER: enabled (as module)
- CONFIG_IP_NF_FILTER: enabled (as module)
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_IPVS: enabled (as module)
- CONFIG_NETFILTER_XT_MARK: enabled (as module)
- CONFIG_IP_NF_NAT: enabled (as module)
- CONFIG_NF_NAT: enabled (as module)
- CONFIG_POSIX_MQUEUE: enabled
- CONFIG_CGROUP_BPF: enabled

Optional Features:
- CONFIG_USER_NS: enabled
- CONFIG_SECCOMP: enabled
- CONFIG_SECCOMP_FILTER: enabled
- CONFIG_CGROUP_PIDS: enabled
- CONFIG_MEMCG_SWAP: enabled
- CONFIG_MEMCG_SWAP_ENABLED: missing
    (cgroup swap accounting is currently enabled)
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: enabled
- CONFIG_NET_CLS_CGROUP: enabled (as module)
- CONFIG_CGROUP_NET_PRIO: enabled
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: missing
- CONFIG_IP_NF_TARGET_REDIRECT: enabled (as module)
- CONFIG_IP_VS: enabled (as module)
- CONFIG_IP_VS_NFCT: enabled
- CONFIG_IP_VS_PROTO_TCP: enabled
- CONFIG_IP_VS_PROTO_UDP: enabled
- CONFIG_IP_VS_RR: enabled (as module)
- CONFIG_SECURITY_SELINUX: enabled
- CONFIG_SECURITY_APPARMOR: enabled
- CONFIG_EXT4_FS: enabled
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
  - "overlay":
    - CONFIG_VXLAN: enabled (as module)
    - CONFIG_BRIDGE_VLAN_FILTERING: enabled
      Optional (for encrypted networks):
      - CONFIG_CRYPTO: enabled
      - CONFIG_CRYPTO_AEAD: enabled
      - CONFIG_CRYPTO_GCM: enabled
      - CONFIG_CRYPTO_SEQIV: enabled
      - CONFIG_CRYPTO_GHASH: enabled
      - CONFIG_XFRM: enabled
      - CONFIG_XFRM_USER: enabled (as module)
      - CONFIG_XFRM_ALGO: enabled (as module)
      - CONFIG_INET_ESP: enabled (as module)
  - "ipvlan":
    - CONFIG_IPVLAN: enabled (as module)
  - "macvlan":
    - CONFIG_MACVLAN: enabled (as module)
    - CONFIG_DUMMY: enabled (as module)
  - "ftp,tftp client in container":
    - CONFIG_NF_NAT_FTP: enabled (as module)
    - CONFIG_NF_CONNTRACK_FTP: enabled (as module)
    - CONFIG_NF_NAT_TFTP: enabled (as module)
    - CONFIG_NF_CONNTRACK_TFTP: enabled (as module)
- Storage Drivers:
  - "aufs":
    - CONFIG_AUFS_FS: enabled (as module)
  - "btrfs":
    - CONFIG_BTRFS_FS: enabled (as module)
    - CONFIG_BTRFS_FS_POSIX_ACL: enabled
  - "devicemapper":
    - CONFIG_BLK_DEV_DM: enabled
    - CONFIG_DM_THIN_PROVISIONING: enabled (as module)
  - "overlay":
    - CONFIG_OVERLAY_FS: enabled (as module)
  - "zfs":
    - /dev/zfs: present
    - zfs command: missing
    - zpool command: missing

Limits:
- /proc/sys/kernel/keys/root_maxkeys: 1000000

And most importantly, memory.stat shows valid information:

cat /sys/fs/cgroup/memory/docker/de09efeb2eb874bb875312ddcf4767683b8714f1ded71679d999f3c490c066d2/memory.stat 
cache 1074864128
rss 1676713984
rss_huge 0
shmem 0
mapped_file 64069632
dirty 135168
writeback 0
swap 0
pgpgin 869484
pgpgout 197557
pgfault 748407
pgmajfault 594
inactive_anon 0
active_anon 1676759040
inactive_file 990240768
active_file 84750336
unevictable 0
hierarchical_memory_limit 8399998976
hierarchical_memsw_limit 8399998976
total_cache 1074864128
total_rss 1676713984
total_rss_huge 0
total_shmem 0
total_mapped_file 64069632
total_dirty 135168
total_writeback 0
total_swap 0
total_pgpgin 869484
total_pgpgout 197557
total_pgfault 748407
total_pgmajfault 594
total_inactive_anon 0
total_active_anon 1676759040
total_inactive_file 990240768
total_active_file 84750336
total_unevictable 0
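
To make the contrast with the zeroed `docker stats` row concrete, the `memory.stat` dump above can be turned into numbers. This is a rough sketch that only parses the `key value` lines and converts a few counters to GiB. It does not reproduce the figure `docker stats` would display, and the helper names are hypothetical.

```python
def parse_memory_stat(text):
    """Parse cgroup v1 memory.stat content into a {name: int} dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def has_memory_activity(stats):
    """Return True if the cgroup reports any resident or cached memory."""
    return any(stats.get(key, 0) > 0
               for key in ("rss", "cache", "total_rss", "total_cache"))


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


if __name__ == "__main__":
    sample = "cache 1074864128\nrss 1676713984\nhierarchical_memory_limit 8399998976\n"
    stats = parse_memory_stat(sample)
    print("rss   GiB:", round(to_gib(stats["rss"]), 2))
    print("cache GiB:", round(to_gib(stats["cache"]), 2))
    print("limit GiB:", round(to_gib(stats["hierarchical_memory_limit"]), 2))
```

If `has_memory_activity` is true while `docker stats` shows `0B / 0B`, the kernel is accounting correctly and the values are being lost somewhere between the cgroup files and the CLI.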
AkihiroSuda added the kind/bug label Mar 17, 2022

aldun commented Mar 18, 2022

I'm seeing the same issue on dedicated hardware running Ubuntu 20.04.4.

CONTAINER ID   NAME                                   CPU %     MEM USAGE / LIMIT   MEM %     NET I/O   BLOCK I/O   PIDS
afdac9bc819c   44b288aa-caf9-4afa-a427-bc511b4a7fc3   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
fadc283349b9   920f3461-8f81-4472-ad8e-269e508e4d81   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
45d90c6fa6b2   acd068bb-c3a5-4363-82e8-0068765adfd3   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
32447a888e04   e40075dc-af72-46f7-88ff-62006b58a0ab   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
25c22cb82cf9   b35cfdfd-81c6-49e6-92cc-269fa0dcaa50   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
171c0e345a48   4300cac0-159f-4c60-a038-d6a295424c6a   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
1229c876e44f   a37991f5-1c03-4e41-a990-7949a70f8a6c   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
fdf727b1ca92   f215d3db-c540-4406-bd80-557df4469295   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
3757a0d75fc0   b77bc5f5-3422-41cb-a34f-35807ea2a495   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
f11e0ff29feb   e121f20c-3c78-446c-b4bc-5261ec566cdd   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
9df469a870e3   59b557cc-e8a2-4996-9e89-cd4504996aae   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
e9f85e40980e   7b6c86e2-9012-4259-b91b-c8db648f6f37   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
7834cb5322ad   6d0172dc-2aaf-46d7-b956-fc9f880f724c   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
4bde6fcee264   0b0e8f04-c900-4600-8e66-626d5375200e   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
16ea1d695597   86581df3-b49f-4b0a-826e-0f874e508e38   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0
66c00a703851   1028cc0d-e334-4a1f-bc91-46e3ae8e4743   0.00%     0B / 0B             0.00%     0B / 0B   0B / 0B     0

Docker Version output:

Client: Docker Engine - Community
 Version:           20.10.13
 API version:       1.41
 Go version:        go1.16.15
 Git commit:        a224086
 Built:             Thu Mar 10 14:07:51 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.13
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.15
  Git commit:       906f57f
  Built:            Thu Mar 10 14:05:44 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.5.10
  GitCommit:        2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc:
  Version:          1.0.3
  GitCommit:        v1.0.3-0-gf46b6ba
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker Info output:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.0-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 28
  Running: 16
  Paused: 0
  Stopped: 12
 Images: 127
 Server Version: 20.10.13
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc version: v1.0.3-0-gf46b6ba
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-104-generic
 Operating System: Ubuntu 20.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 32
 Total Memory: 251.9GiB
 Name: atlas
 ID: EN6S:APPI:4WLA:AIIE:W56U:R5V4:WQ2X:P7WO:BPQG:QLY3:U6GN:K3CG
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

memory.stat output:

cache 22437888
rss 388341760
rss_huge 0
shmem 811008
mapped_file 675840
dirty 675840
writeback 0
pgpgin 326403
pgpgout 225237
pgfault 3938979
pgmajfault 0
inactive_anon 0
active_anon 388743168
inactive_file 14462976
active_file 5812224
unevictable 0
hierarchical_memory_limit 1177600000
total_cache 22437888
total_rss 388341760
total_rss_huge 0
total_shmem 811008
total_mapped_file 675840
total_dirty 675840
total_writeback 0
total_pgpgin 326403
total_pgpgout 225237
total_pgfault 3938979
total_pgmajfault 0
total_inactive_anon 0
total_active_anon 388743168
total_inactive_file 14462976
total_active_file 5812224
total_unevictable 0

cat /proc/cgroups

#subsys_name	hierarchy	num_cgroups	enabled
cpuset	11	18	1
cpu	4	162	1
cpuacct	4	162	1
blkio	3	162	1
memory	6	848	1
devices	5	164	1
freezer	10	20	1
net_cls	2	18	1
perf_event	9	18	1
net_prio	2	18	1
hugetlb	7	18	1
pids	8	166	1
rdma	12	1	1


Destarianon commented Mar 18, 2022

By rolling back package versions, I was able to find the culprit.
It looks like containerd.io version 1.5.10-1 is the issue. Manually installing containerd.io version 1.4.13-1 or older brings all of my stats back.

For others' benefit:
apt-cache policy containerd.io shows the currently installed version as well as the available candidates.
apt-get install containerd.io=1.4.13-1 rolls back to a version known to work on Ubuntu 20.04.4 at the time of writing.
apt-mark hold containerd.io freezes the package version so that updates do not reintroduce the problem.

For future reference, sudo apt-mark unhold containerd.io unfreezes the package later.

I did not need to downgrade any other Docker-related packages.
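
For anyone applying the same workaround to several servers, the check-and-pin steps can be sketched in a script. The sketch assumes `apt-cache policy` prints its usual `Installed:` line. Both helper names are hypothetical, and the commands are only built as argument lists here, not executed.

```python
import re


def installed_version(policy_output):
    """Extract the installed version from `apt-cache policy <pkg>` output.

    Assumes the usual 'Installed: <version>' line; returns None if absent.
    """
    match = re.search(r"^\s*Installed:\s*(\S+)", policy_output, re.MULTILINE)
    return match.group(1) if match else None


def rollback_commands(package, version):
    """Build the install-and-hold commands from the comment above."""
    return [
        ["apt-get", "install", f"{package}={version}"],
        ["apt-mark", "hold", package],
    ]


if __name__ == "__main__":
    sample = "containerd.io:\n  Installed: 1.5.10-1\n  Candidate: 1.5.10-1\n"
    if installed_version(sample) == "1.5.10-1":
        for cmd in rollback_commands("containerd.io", "1.4.13-1"):
            # Print only; run with subprocess.run(cmd, check=True) as root.
            print(" ".join(cmd))
```

Keeping the commands as lists makes it straightforward to pass them to `subprocess.run` later without shell quoting issues.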

42wim added a commit to 42wim/moby that referenced this issue May 7, 2022
Fixes
- docker/for-linux#1284
- containerd/containerd#6700
- moby#43387

Update to cgroups v1.0.1 which has the current proto for cgroupsv1
Need to update cilium/ebpf dependency to v0.4.0
lmbarros pushed a commit to balena-os/balena-engine that referenced this issue Nov 22, 2022
Fixes
- docker/for-linux#1284
- containerd/containerd#6700
- moby/moby#43387

Update to cgroups v1.0.1 which has the current proto for cgroupsv1
Need to update cilium/ebpf dependency to v0.4.0

Signed-off-by: Wim <wim@42.be>
lmbarros pushed a commit to balena-os/balena-engine that referenced this issue Dec 8, 2022
lmbarros pushed a commit to balena-os/balena-engine that referenced this issue Dec 21, 2022
lmbarros pushed a commit to balena-os/balena-engine that referenced this issue Feb 7, 2023

tommysitehost commented Mar 23, 2023

We are seeing the same issue, but only for a subset of containers.

CONTAINER ID   NAME    CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O   PIDS
f79fee4b7fd3   proxy   0.10%     43.5MiB / 7.748GiB    0.55%     247GB / 251GB     0B / 0B     33
4ced89d0682e   web     0.00%     0B / 0B               0.00%     0B / 0B           0B / 0B     0
e763dffa8759   db      0.05%     689.5MiB / 7.748GiB   8.69%     8.19GB / 52.4GB   0B / 0B     40

We just upgraded to the latest docker:

  • docker-ce: 20.10.6 to 23.0.1
  • containerd.io: 1.4.4 to 1.6.18

The server is running Ubuntu 20.04.4 LTS.

It seems the web container stopped reporting metrics after processes inside it were killed by the OOM killer. We may have found the culprit:

Mar 23 20:11:08 SERVER kernel: containerd-shim invoked oom-killer: gfp_mask=0x1140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=-998

Is this expected? The metrics worked again after the container was restarted.
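
To correlate missing metrics with OOM events, kernel log lines like the one above can be scanned for the invoking process. This is a sketch that assumes the syslog-style format shown in this comment. The regular expression and the function name are hypothetical and may need adjusting for other log formats.

```python
import re

# Matches lines such as:
#   "... kernel: containerd-shim invoked oom-killer: ..., oom_score_adj=-998"
OOM_RE = re.compile(
    r"kernel: (?P<process>\S+) invoked oom-killer: .*"
    r"oom_score_adj=(?P<adj>-?\d+)"
)


def parse_oom_line(line):
    """Return the invoking process and oom_score_adj, or None if no match."""
    match = OOM_RE.search(line)
    if match is None:
        return None
    return {
        "process": match.group("process"),
        "oom_score_adj": int(match.group("adj")),
    }


if __name__ == "__main__":
    line = (
        "Mar 23 20:11:08 SERVER kernel: containerd-shim invoked oom-killer: "
        "gfp_mask=0x1140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, "
        "oom_score_adj=-998"
    )
    print(parse_oom_line(line))
```

Matching the timestamps of such lines against the moment a container's row drops to zero would help confirm or rule out the OOM kill as the trigger.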
