containerd fails to detect OOM events in kernel 4.19+ #74

Closed
thaJeztah opened this issue Dec 11, 2018 · 5 comments

@thaJeztah
Member

As reported in moby/moby#38352 by @Farrukh-Aftab (thanks!), containerd fails to detect OOM events on kernel 4.19, likely because of the kernel changes linked below. I'll copy the information from that report here:

Description
Hello,

Apologies if the reported bug is a duplicate of another issue. I tried searching through the issues but didn't find anything similar.

There were some improvements made to OOM killer in kernel 4.19 to make it more 'cgroup aware'. You can find the relevant commits under [1] in the additional information section. After the change went in, Docker isn't setting the .State.OOMKilled flag correctly despite OOM killer being invoked.

To reproduce this, I have created a sample image named fakhan/sl7:oom-test. The entrypoint of the image is a program that consumes around 1024MB of memory. Creating this container with anything lower should trigger OOM killer. I have provided more information below on the bug.
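For readers who cannot pull the image, a memory-consuming entrypoint of this kind can be approximated as follows. This is only a sketch of the idea, not the program shipped in fakhan/sl7:oom-test (whose source is not shown in this report); the function name `consume_memory`, the chunk size, and the 4096-byte page size are assumptions.

```python
PAGE_SIZE = 4096  # assumed page size; typical on x86_64


def consume_memory(total_bytes, chunk_bytes=1 << 20):
    """Allocate roughly total_bytes and touch every page so it becomes resident.

    Hypothetical stand-in for the test program; returns the buffers so the
    caller can keep them alive.
    """
    chunks = []
    allocated = 0
    while allocated < total_bytes:
        size = min(chunk_bytes, total_bytes - allocated)
        buf = bytearray(size)
        # Write one byte per page so the kernel has to back it with real memory.
        for offset in range(0, size, PAGE_SIZE):
            buf[offset] = 1
        chunks.append(buf)
        allocated += size
    return chunks
```

Calling `consume_memory(1024 * 1024 * 1024)` inside a container started with `--memory 512m --memory-swap 512m` should, under these assumptions, push the cgroup past its limit in the same way.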

BUG REPORT INFORMATION shown below:

Steps to reproduce the issue:

  1. First on an older kernel:
# docker image pull fakhan/sl7:oom-test
oom-test: Pulling from fakhan/sl7
Digest: sha256:6d0f60ad535ba61767073c8414a26416a1304651e3621689a4a8aac621eb39d2
Status: Image is up to date for fakhan/sl7:oom-test

# uname -r
3.10.0-862.11.6.el7.x86_64

# docker container run --network host --name fkhan-test1 --memory 512m --memory-swap 512m fakhan/sl7:oom-test

# docker container inspect fkhan-test1 --format='{{.State.OOMKilled}}'
true
  2. Then on a newer kernel:
# docker image pull fakhan/sl7:oom-test
oom-test: Pulling from fakhan/sl7
Digest: sha256:6d0f60ad535ba61767073c8414a26416a1304651e3621689a4a8aac621eb39d2
Status: Image is up to date for fakhan/sl7:oom-test

# uname -r
4.19.4-1.el7.elrepo.x86_64

# docker container run --network host --name fkhan-test2 --memory 512m --memory-swap 512m fakhan/sl7:oom-test

# docker container inspect fkhan-test2 --format='{{.State.OOMKilled}}'
false

I have posted the snippets from /var/log/messages for both these instances under the additional information section.

Describe the results you received:
As shown above, OOMKilled evaluates to 'False' on kernel 4.19

Describe the results you expected:
OOMKilled should have evaluated to 'True' on kernel 4.19, just like on the previous versions

Additional information you deem important (e.g. issue happens only occasionally):
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dc0b58643aff8b378086f25cce6789ccba68cbcb
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5989ad7b5ede38d605c588981f634c08252abfc3
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3d8b38eb81cac81395f6a823f6bf401b327268e6

[2]
/var/log/messages snippets are as follows. First for the older kernel

Dec 11 10:48:08 fkhan-test-kernelv3 dockerd: time="2018-12-11T10:48:08-06:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/91091d5382d54a44e697e224ebccba7f587c43184bfb51b123cf8c16920598d9/shim.sock" debug=false pid=11532
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: thrash invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: thrash cpuset=91091d5382d54a44e697e224ebccba7f587c43184bfb51b123cf8c16920598d9 mems_allowed=0-1
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: CPU: 2 PID: 11554 Comm: thrash Kdump: loaded Tainted: G          I    ------------ T 3.10.0-862.11.6.el7.x86_64 #1
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: Hardware name: System Manufacturer System Product Name/KFSN4-DRE, BIOS 1013 11/18/2008
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: Call Trace:
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa21135d4>] dump_stack+0x19/0x1b
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa210e79f>] dump_header+0x90/0x229
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa1b9a7b6>] ? find_lock_task_mm+0x56/0xc0
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa1c0f5b8>] ? try_get_mem_cgroup_from_mm+0x28/0x60
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa1b9ac64>] oom_kill_process+0x254/0x3d0
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa1c133c6>] mem_cgroup_oom_synchronize+0x546/0x570
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa1c12840>] ? mem_cgroup_charge_common+0xc0/0xc0
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa1b9b4f4>] pagefault_out_of_memory+0x14/0x90
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa210c941>] mm_fault_error+0x6a/0x157
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa2120846>] __do_page_fault+0x496/0x4f0
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa21208d5>] do_page_fault+0x35/0x90
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa211ca96>] ? error_swapgs+0xa7/0xbd
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [<ffffffffa211c758>] page_fault+0x28/0x30
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: Task in /docker/91091d5382d54a44e697e224ebccba7f587c43184bfb51b123cf8c16920598d9 killed as a result of limit of /docker/91091d5382d54a44e697e224ebccba7f587c43184bfb51b123cf8c16920598d9
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: memory: usage 524288kB, limit 524288kB, failcnt 17
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: memory+swap: usage 524288kB, limit 524288kB, failcnt 0
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: kmem: usage 820kB, limit 9007199254740988kB, failcnt 0
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: Memory cgroup stats for /docker/91091d5382d54a44e697e224ebccba7f587c43184bfb51b123cf8c16920598d9: cache:0KB rss:523468KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:46080KB active_anon:477388KB inactive_file:0KB active_file:0KB unevictable:0KB
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: [11554]     0 11554   263196   130965     263        0             0 thrash
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: Memory cgroup out of memory: Kill process 11554 (thrash) score 1001 or sacrifice child
Dec 11 10:48:09 fkhan-test-kernelv3 kernel: Killed process 11554 (thrash) total-vm:1052784kB, anon-rss:523464kB, file-rss:396kB, shmem-rss:0kB
Dec 11 10:48:10 fkhan-test-kernelv3 dockerd: time="2018-12-11T10:48:10-06:00" level=info msg="shim reaped" id=91091d5382d54a44e697e224ebccba7f587c43184bfb51b123cf8c16920598d9
Dec 11 10:48:10 fkhan-test-kernelv3 dockerd: time="2018-12-11T10:48:10.336767875-06:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"

Then for the newer kernel

Dec 11 10:49:37 fkhan-test-kernelv4 kernel: thrash invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: thrash cpuset=cc6b369add7d52448ae406800daa01f4836195e51190d9ee922fd685446f8368 mems_allowed=0-1
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: CPU: 40 PID: 32276 Comm: thrash Not tainted 4.19.4-1.el7.elrepo.x86_64 #1
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: Hardware name: Supermicro Super Server/X10DRL-iT, BIOS 2.0a 01/13/2017
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: Call Trace:
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: dump_stack+0x63/0x88
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: dump_header+0x78/0x2a4
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: ? mem_cgroup_scan_tasks+0x9c/0xf0
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: oom_kill_process+0x262/0x290
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: out_of_memory+0x140/0x4b0
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: mem_cgroup_out_of_memory+0x4b/0x80
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: try_charge+0x667/0x6f0
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: ? __alloc_pages_nodemask+0xf8/0x260
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: mem_cgroup_try_charge+0x8c/0x1e0
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: mem_cgroup_try_charge_delay+0x22/0x50
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: do_anonymous_page+0x11a/0x650
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: __handle_mm_fault+0xc5e/0xef0
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: handle_mm_fault+0x102/0x220
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: __do_page_fault+0x212/0x4e0
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: do_page_fault+0x37/0x140
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: ? page_fault+0x8/0x30
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: page_fault+0x1e/0x30
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: RIP: 0033:0x400728
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: Code: eb 4d 48 c7 45 e8 00 00 00 00 eb 34 48 c7 45 e0 00 00 00 00 eb 1b 48 8b 45 e0 48 c1 e0 0c 48 03 45 e8 48 03 45 d0 48 8b 55 e8 <88> 10 48 83 45 e0 01 48 8b 45 e0 48 3b 45 f0 7c db 48 83 45 e8 01
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: RSP: 002b:00007ffe86af6ca0 EFLAGS: 00010206
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: RAX: 0000000021b18000 RBX: 0000000000000000 RCX: 0000000000000012
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: RDX: 0000000000000000 RSI: 000000007fffffed RDI: 0000000000000000
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: RBP: 00007ffe86af6ce0 R08: 0000000000000000 R09: 00007fcf1403214d
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: R10: 0000000000000022 R11: 0000000000000000 R12: 0000000000400530
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: R13: 00007ffe86af6de0 R14: 0000000000000000 R15: 0000000000000000
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: Task in /docker/cc6b369add7d52448ae406800daa01f4836195e51190d9ee922fd685446f8368 killed as a result of limit of /docker/cc6b369add7d52448ae406800daa01f4836195e51190d9ee922fd685446f8368
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: memory: usage 524288kB, limit 524288kB, failcnt 0
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: memory+swap: usage 524288kB, limit 524288kB, failcnt 34
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: kmem: usage 1888kB, limit 9007199254740988kB, failcnt 0
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: Memory cgroup stats for /docker/cc6b369add7d52448ae406800daa01f4836195e51190d9ee922fd685446f8368: cache:0KB rss:522208KB rss_huge:520192KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:226260KB active_anon:296032KB inactive_file:0KB active_file:0KB unevictable:0KB
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: Tasks state (memory values in pages):
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: [  32276]     0 32276   263199   130814  1097728        0             0 thrash
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: Memory cgroup out of memory: Kill process 32276 (thrash) score 1000 or sacrifice child
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: Killed process 32276 (thrash) total-vm:1052796kB, anon-rss:522108kB, file-rss:1148kB, shmem-rss:0kB
Dec 11 10:49:37 fkhan-test-kernelv4 kernel: oom_reaper: reaped process 32276 (thrash), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Dec 11 10:49:37 fkhan-test-kernelv4 dockerd: time="2018-12-11T10:49:37-06:00" level=info msg="shim reaped" id=cc6b369add7d52448ae406800daa01f4836195e51190d9ee922fd685446f8368
Dec 11 10:49:37 fkhan-test-kernelv4 dockerd: time="2018-12-11T10:49:37.430951185-06:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
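Both log excerpts show the kernel killing the process even though Docker reports OOMKilled as false on 4.19. As a cross-check that does not depend on Docker's state, the kernel messages themselves can be scanned. The sketch below is an illustration only: the function name `find_cgroup_oom_kills` is hypothetical, and the patterns are copied from the messages quoted in this report, so other kernel versions may word them differently.

```python
import re

# Patterns taken from the kernel messages quoted in this report.
KILL_RE = re.compile(r"Memory cgroup out of memory: Kill process (\d+) \((.+?)\)")
CGROUP_RE = re.compile(r"Task in (/docker/[0-9a-f]{64}) killed as a result of limit of")


def find_cgroup_oom_kills(lines):
    """Return a list of (cgroup_path, pid, comm) for each memcg OOM kill found.

    cgroup_path is None when no preceding "Task in ..." line was seen.
    """
    kills = []
    current_cgroup = None
    for line in lines:
        m = CGROUP_RE.search(line)
        if m:
            # Remember which cgroup hit its limit; the kill line follows later.
            current_cgroup = m.group(1)
            continue
        m = KILL_RE.search(line)
        if m:
            kills.append((current_cgroup, int(m.group(1)), m.group(2)))
            current_cgroup = None
    return kills
```

Feeding it the lines of /var/log/messages (for example `find_cgroup_oom_kills(open("/var/log/messages"))`) yields one tuple per kill, with the container ID as the last path component of the cgroup.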

Output of docker version:
Docker version on both machines is the same (18.06)

# uname -r && docker version
3.10.0-862.11.6.el7.x86_64
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:23:03 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:25:29 2018
  OS/Arch:          linux/amd64
  Experimental:     false

# uname -r && docker version
4.19.4-1.el7.elrepo.x86_64
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:23:03 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:25:29 2018
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info:

# docker info
Containers: 1
 Running: 0
 Paused: 0
 Stopped: 1
Images: 1
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-862.11.6.el7.x86_64
Operating System: Scientific Linux 7.5 (Nitrogen)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 23.39GiB
Name: fkhan-test-kernelv3.fnal.gov
ID: RHNY:DX45:JKGJ:EDBY:IENF:E5DT:VX7R:MLGL:M765:43XU:3V2W:NTLP
Docker Root Dir: /storage/local/data1/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
# docker info
Containers: 1
 Running: 0
 Paused: 0
 Stopped: 1
Images: 1
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.19.4-1.el7.elrepo.x86_64
Operating System: Scientific Linux 7.5 (Nitrogen)
OSType: linux
Architecture: x86_64
CPUs: 56
Total Memory: 251.8GiB
Name: fkhan-test-kernelv4.fnal.gov
ID: ITUE:KSPJ:33PN:3NHA:XPJ6:C2K7:DSNY:6WKW:BSR5:5BAM:BQ25:QAFP
Docker Root Dir: /storage/local/data1/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Additional environment details (AWS, VirtualBox, physical, etc.):
Nothing special about the environment. I am running these commands on two bare metal boxes with different kernels.

@thaJeztah
Member Author

/cc @crosbymichael

@holzman

holzman commented Dec 21, 2018

Hi,

This is a kernel bug introduced in v4.19. It does not come from the commits Farrukh referenced, but from https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=29ef680a, which completely broke cgroups-v1 eventfd notification for OOM events.

I have notified the appropriate kernel maintainers.

@thaJeztah
Member Author

Thanks @holzman - appreciated!

@farrukh-aftab-khan

A patch to fix this was accepted last week: torvalds/linux@7056d3a

Looks like it'll make it into the next release. Thank you @holzman !

@farrukh-aftab-khan

This has been fixed in version 4.20.2-1. The issue can be closed. Thanks!
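To check whether a given host falls in the range discussed in this thread (bug introduced in v4.19, fix confirmed in 4.20.2-1), the output of `uname -r` can be compared against those bounds. This is a rough sketch under that assumption only: the function name `possibly_affected` is hypothetical, and the thread does not say whether later 4.19 stable releases or distribution kernels received a backport, so a True result means "worth testing", not "definitely broken".

```python
import re


def possibly_affected(release):
    """Return True if a `uname -r` string falls in the range this thread reports.

    Range assumed from the thread: 4.19.0 up to, but not including, 4.20.2.
    """
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version = (int(m.group(1)), int(m.group(2)), int(m.group(3) or 0))
    return (4, 19, 0) <= version < (4, 20, 2)
```

With the two hosts from this report, `possibly_affected("3.10.0-862.11.6.el7.x86_64")` is False and `possibly_affected("4.19.4-1.el7.elrepo.x86_64")` is True.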
