Docker OOM event handling inappropriate #670

Open

aleozlx opened this issue May 16, 2019 · 14 comments

@aleozlx

aleozlx commented May 16, 2019

  • This is a bug report
  • This is a feature request
  • I searched existing issues before opening this one

Expected behavior

Do not log events blindly; be more cautious about output volume.

Blindly logging every event poses a severe denial-of-service risk by exhausting system resources. I suggest aggregating similar log messages in some way so that OOM events, in particular, are handled more appropriately.
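
As an illustration only, here is a minimal standalone sketch (plain Go with made-up names, not actual moby/dockerd code; the interval is arbitrary) of the kind of per-container throttling/aggregation we are suggesting: log the first OOM event for a container, then at most one summary per interval.

package main

import (
	"fmt"
	"sync"
	"time"
)

// oomThrottle records, per container, when an OOM event was last logged and
// how many events have been suppressed since then.
type oomThrottle struct {
	mu       sync.Mutex
	interval time.Duration
	last     map[string]time.Time
	dropped  map[string]int
}

func newOOMThrottle(interval time.Duration) *oomThrottle {
	return &oomThrottle{
		interval: interval,
		last:     make(map[string]time.Time),
		dropped:  make(map[string]int),
	}
}

// Allow reports whether an OOM event for containerID should be logged now.
// When it returns true, it also returns how many events were suppressed since
// the previously logged one, so the caller can emit a single summary line.
func (t *oomThrottle) Allow(containerID string, now time.Time) (bool, int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if last, ok := t.last[containerID]; ok && now.Sub(last) < t.interval {
		t.dropped[containerID]++
		return false, 0
	}
	suppressed := t.dropped[containerID]
	t.dropped[containerID] = 0
	t.last[containerID] = now
	return true, suppressed
}

func main() {
	th := newOOMThrottle(10 * time.Second)
	// Simulate the flood: thousands of OOM notifications for one container.
	for i := 0; i < 5000; i++ {
		if ok, suppressed := th.Allow("29d72966e0be", time.Now()); ok {
			fmt.Printf("container oom 29d72966e0be (%d similar events suppressed)\n", suppressed)
		}
	}
}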

Actual behavior

containerd continues to bubble up OOM events, and dockerd blindly logs each such event.

Referring to:
https://github.com/docker/docker-ce/blob/98ddba151eebbff519cc86342443c3a09f8ce334/components/engine/daemon/monitor.go#L50

For us this is a huge issue: we have observed dockerd generating a continuous 10 Mbps of log output which, if not caught in time, eventually exhausts all 80 GB of system RAM per node plus all disk space, leading to a denial of service.

We limit our users' resources but have no control over what they run. Once a container hits its resource limit and an OOM occurs, it is dockerd that consumes the majority of the remaining system resources by logging the recurring OOM events.

Steps to reproduce the behavior

Make a container with resource limits (we did this through RKE Kubernetes) and continually allocate memory until the OOM killer triggers.

An actual example we caught: a container with a 4 GB memory limit loading a 5 GB CSV file via read.csv() in R. This caused dockerd to continually generate a massive amount of logs and eventually jammed our system.

Output of docker version:

Client:
Version: 18.09.0
API version: 1.39
Go version: go1.10.4
Git commit: 4d60db4
Built: Wed Nov 7 00:48:22 2018
OS/Arch: linux/amd64
Experimental: false

Server: Docker Engine - Community
Engine:
Version: 18.09.4
API version: 1.39 (minimum version 1.12)
Go version: go1.10.8
Git commit: d14af54
Built: Wed Mar 27 18:04:46 2019
OS/Arch: linux/amd64
Experimental: false

Output of docker info:

Containers: 42
Running: 23
Paused: 0
Stopped: 19
Images: 61
Server Version: 18.09.4
Storage Driver: devicemapper
Pool Name: docker-thinpool
Pool Blocksize: 524.3kB
Base Device Size: 10.74GB
Backing Filesystem: xfs
Udev Sync Supported: true
Data Space Used: 112.2GB
Data Space Total: 765GB
Data Space Available: 652.9GB
Metadata Space Used: 70.76MB
Metadata Space Total: 8.049GB
Metadata Space Available: 7.978GB
Thin Pool Minimum Free Space: 76.5GB
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.149-RHEL7 (2018-07-20)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 5.0.7-1.el7.elrepo.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 18
Total Memory: 78.66GiB
Name: k-prod-cpu-6.dsa.lan
ID: S65O:V6T3:PQ2L:KKKW:PSC6:DTDJ:6ITH:KICE:EFWX:DQT3:NBSP:XSND
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: bridge-nf-call-ip6tables is disabled
WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release.

Additional environment details (AWS, VirtualBox, physical, etc.)

RKE+oVirt.

@aleozlx
Author

aleozlx commented May 16, 2019

An example of the aforementioned logs that have insatiably devoured our system resources. We would like more appropriate handling, please.

2019-05-04T18:05:44.998251824-05:00 container oom 29d72966e0beb17d9b0b9fbcbfc734e2df2eb5428690115f9b24430357de08e5 (annotation.io.kubernetes.container.hash=cad60de0, annotation.io.kubernetes.container.ports=[{"name":"notebook-port","containerPort":8888,"protocol":"TCP"}], annotation.io.kubernetes.container.restartCount=0, annotation.io.kubernetes.container.terminationMessagePath=/dev/termination-log, annotation.io.kubernetes.container.terminationMessagePolicy=File, annotation.io.kubernetes.pod.terminationGracePeriod=30, image=sha256:abb449eb148762e1a8dbf45fb1aa8fce99d6f1ca92f91f5e409fea9a05055aca, io.kubernetes.container.logpath=/var/log/pods/efa75591-6e89-11e9-bf85-001a4a16016d/notebook/0.log, io.kubernetes.container.name=notebook, io.kubernetes.docker.type=container, io.kubernetes.pod.name=jupyter-******, io.kubernetes.pod.namespace=jhub-prod, io.kubernetes.pod.uid=efa75591-6e89-11e9-bf85-001a4a16016d, io.kubernetes.sandbox.id=67102bbdc496acefcbb81ee736cb25d4ee7c2753a8660ab54b776e0be321b4f6, maintainer=MUDSA, name=k8s_notebook_jupyter-******_jhub-prod_efa75591-6e89-11e9-bf85-001a4a16016d_0)
2019-05-04T18:05:45.000606300-05:00 container oom 29d72966e0beb17d9b0b9fbcbfc734e2df2eb5428690115f9b24430357de08e5 (annotation.io.kubernetes.container.hash=cad60de0, annotation.io.kubernetes.container.ports=[{"name":"notebook-port","containerPort":8888,"protocol":"TCP"}], annotation.io.kubernetes.container.restartCount=0, annotation.io.kubernetes.container.terminationMessagePath=/dev/termination-log, annotation.io.kubernetes.container.terminationMessagePolicy=File, annotation.io.kubernetes.pod.terminationGracePeriod=30, image=sha256:abb449eb148762e1a8dbf45fb1aa8fce99d6f1ca92f91f5e409fea9a05055aca, io.kubernetes.container.logpath=/var/log/pods/efa75591-6e89-11e9-bf85-001a4a16016d/notebook/0.log, io.kubernetes.container.name=notebook, io.kubernetes.docker.type=container, io.kubernetes.pod.name=jupyter-******, io.kubernetes.pod.namespace=jhub-prod, io.kubernetes.pod.uid=efa75591-6e89-11e9-bf85-001a4a16016d, io.kubernetes.sandbox.id=67102bbdc496acefcbb81ee736cb25d4ee7c2753a8660ab54b776e0be321b4f6, maintainer=MUDSA, name=k8s_notebook_jupyter-******_jhub-prod_efa75591-6e89-11e9-bf85-001a4a16016d_0)
2019-05-04T18:05:45.003101140-05:00 container oom 29d72966e0beb17d9b0b9fbcbfc734e2df2eb5428690115f9b24430357de08e5 (annotation.io.kubernetes.container.hash=cad60de0, annotation.io.kubernetes.container.ports=[{"name":"notebook-port","containerPort":8888,"protocol":"TCP"}], annotation.io.kubernetes.container.restartCount=0, annotation.io.kubernetes.container.terminationMessagePath=/dev/termination-log, annotation.io.kubernetes.container.terminationMessagePolicy=File, annotation.io.kubernetes.pod.terminationGracePeriod=30, image=sha256:abb449eb148762e1a8dbf45fb1aa8fce99d6f1ca92f91f5e409fea9a05055aca, io.kubernetes.container.logpath=/var/log/pods/efa75591-6e89-11e9-bf85-001a4a16016d/notebook/0.log, io.kubernetes.container.name=notebook, io.kubernetes.docker.type=container, io.kubernetes.pod.name=jupyter-******, io.kubernetes.pod.namespace=jhub-prod, io.kubernetes.pod.uid=efa75591-6e89-11e9-bf85-001a4a16016d, io.kubernetes.sandbox.id=67102bbdc496acefcbb81ee736cb25d4ee7c2753a8660ab54b776e0be321b4f6, maintainer=MUDSA, name=k8s_notebook_jupyter-******_jhub-prod_efa75591-6e89-11e9-bf85-001a4a16016d_0)
(the same "container oom 29d72966e0be…" event repeats with identical annotations roughly every 2–3 ms; remaining lines omitted)

Edit: Now imagine the following 58 GB filled with that stuff. How is this not a problem with dockerd?

# ps aux --sort -rss | head
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      5107  8.8 70.8 65050280 58401604 ?   Ssl  May08 4857:05 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

[image: dockeroom]

At 1:29 pm a user reported the service was down; what follows until 3:12 pm is what happens if we don't intervene.

@cpuguy83
Collaborator

Why not throttle the stdout/stderr of dockerd rather than have dockerd assume it needs to be throttled at a random rate?

@aleozlx
Author

aleozlx commented May 16, 2019

@cpuguy83 Let me rephrase: it's not about the output rate. Throttling would just be kicking the can down the road; given enough time the systems would still go down. IMO this should have been implemented as something like a multi-set in the first place.

Would anyone go so far as to suggest that dockerd can take 79 GB of our 80 GB of memory plus hundreds of GB of disk because of an unresolved event (which is itself still under cgroup control, merely a notification) and somehow bear no responsibility? From what I can tell, containerd just gave the FYI and dockerd simply mishandled that message.
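
To make the "multi-set" idea concrete, a toy sketch (standalone Go, invented names, nothing to do with the real dockerd event code): count identical events per container and emit one summarized line per flush interval instead of one line per event.

package main

import (
	"fmt"
	"time"
)

// eventKey identifies "the same" event for aggregation purposes.
type eventKey struct {
	container string
	topic     string
}

// aggregate drains raw events from in, counts identical ones (a multi-set),
// and emits one summarized entry per key at every flush interval.
func aggregate(in <-chan eventKey, flush time.Duration, emit func(eventKey, int)) {
	counts := make(map[eventKey]int)
	tick := time.NewTicker(flush)
	defer tick.Stop()
	for {
		select {
		case ev, ok := <-in:
			if !ok {
				return
			}
			counts[ev]++
		case <-tick.C:
			for k, n := range counts {
				emit(k, n)
				delete(counts, k)
			}
		}
	}
}

func main() {
	in := make(chan eventKey, 1024)
	go aggregate(in, time.Second, func(k eventKey, n int) {
		fmt.Printf("%s: %d %q event(s) in the last second\n", k.container, n, k.topic)
	})
	// Simulate containerd flooding dockerd with the same notification.
	for i := 0; i < 100000; i++ {
		in <- eventKey{container: "29d72966e0be", topic: "oom"}
	}
	time.Sleep(1500 * time.Millisecond) // let one flush happen before exiting
}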

@vilhelmen

vilhelmen commented May 17, 2019

The timeline of events in the journal suggests that something terrible is happening between containerd, the kernel, and docker during a container stop that triggers the logging incident.

At 4:15, an OOM event occurs in a student container, which is rather unsurprising. Everything seems quiet.
At 4:48, docker reports that the same container failed to exit within the time limit and it's going to use force, then it reports that failed and it's going to use SIGKILL. It's unclear if this container stop was requested by a student or initiated by the system.
Dockerd then reports: level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete" but the target of said deletion is unclear.
Docker repeats the "failed to exit in the time limit, using force, that failed using SIGKILL" loop until 6:08.
At 6:08 there is a warning about being unable to remove a mount point in /var/lib/docker because it is busy, but it's unclear if this is related to this container or an unrelated one.

The following message then repeats with a sub-millisecond delay until we are able to wrestle control from the system:

level=error msg="failed to process event" container=29d72966e0beb17d9b0b9fbcbfc734e2df2eb5428690115f9b24430357de08e5
error="no such container: 29d72966e0beb17d9b0b9fbcbfc734e2df2eb5428690115f9b24430357de08e5"
event=oom event-info="{29d72966e0beb17d9b0b9fbcbfc734e2df2eb5428690115f9b24430357de08e5  0 0 0001-01-01 00:00:00 +0000 UTC true <nil>}" module=libcontainerd namespace=moby

The docker events stream emits the corresponding message:

2019-05-04T18:05:45.013621187-05:00 container oom 29d72966e0beb17d9b0b9fbcbfc734e2df2eb5428690115f9b24430357de08e5 (annotation.io.kubernetes.container.hash=cad60de0, annotation.io.kubernetes.container.ports=[{"name":"notebook-port","containerPort":8888,"protocol":"TCP"}], annotation.io.kubernetes.container.restartCount=0, annotation.io.kubernetes.container.terminationMessagePath=/dev/termination-log, annotation.io.kubernetes.container.terminationMessagePolicy=File, annotation.io.kubernetes.pod.terminationGracePeriod=30, image=sha256:abb449eb148762e1a8dbf45fb1aa8fce99d6f1ca92f91f5e409fea9a05055aca, io.kubernetes.container.logpath=/var/log/pods/efa75591-6e89-11e9-bf85-001a4a16016d/notebook/0.log, io.kubernetes.container.name=notebook, io.kubernetes.docker.type=container, io.kubernetes.pod.name=jupyter-******, io.kubernetes.pod.namespace=jhub-prod, io.kubernetes.pod.uid=efa75591-6e89-11e9-bf85-001a4a16016d, io.kubernetes.sandbox.id=67102bbdc496acefcbb81ee736cb25d4ee7c2753a8660ab54b776e0be321b4f6, maintainer=MUDSA, name=k8s_notebook_jupyter-******_jhub-prod_efa75591-6e89-11e9-bf85-001a4a16016d_0)

The rate at which this event occurs is mind-boggling. It's not clear to me whether it is the event logging itself, or the strain of attempting this failed operation so rapidly, that causes the system to eat through all available memory and grind to a halt. Messages in the journal about suppressing 500,000 to 1,000,000 messages from docker are very common while this is happening.

Afterwards, it takes a surprisingly long time, in the realm of hours I believe, for the system to release the memory eaten up during the incident.

@kolyshkin

I can only see a single "oom" event, using this as a repro:

time docker run -m 500m --memory-swap=500m python python -c 'a = "asdfsadf".join(map(str, range(1, 1000000000)))'

@aleozlx
Author

aleozlx commented Jun 14, 2019

@kolyshkin Thank you for your attention & input.

So he is only seeing one event because the event producer (containerd) generated only one, and that is because the process received SIGKILL (the exit code is 137) and died in time, which is great.


But our processes did not die; I guess they are supposed to? (Is that so, @vilhelmen?) We tried to kill them manually through Kubernetes and through docker, and both failed. We assumed this was because dockerd (not the container) was busy printing logs, so did any kill signal get through at all? We don't know. That theory is strengthened by the fact that we were able to kill every process inside the container (with a literal kill -9) and still watch dockerd panic and log OOM events for the now-empty container with zero processes inside. Finally, we managed to kill it with docker rm --force, which at last silenced docker events, and Kubernetes showed the pod go from Terminating to Terminated.

So did the failure to kill play a part in this?
My understanding is that Kubernetes relies on docker to send the kill signal. Did Docker miss it, or was it too busy "panicking" to manage it? In that case dockerd needs the event producer to calm down so it has time to make sure the container gets killed, but that requires the kill to happen first so things can calm down... (I could be completely wrong here.) How did this start?


Regardless, here is why the failure to kill is not the key issue (and killing the container is another angle we are already working on): we run an education program, and user code that does not die upon OOM should be allowed too, because the user processes are already properly contained and pose no danger to our system. What concerns us is that a user-level OOM drives dockerd + containerd into a panic that causes a secondary, system-wide OOM by dockerd itself.
Like @vilhelmen said,

Messages in the journal about suppressing 500,000 to 1,000,000 messages from docker are very common while this is happening.

[image: journal-suppress-msgs]

Which is to say: many logging systems are aware of the risk posed by logging and are capable of suppressing messages, so we still consider this a dockerd vulnerability, and we hope someone can propose a fix (otherwise we will have to attempt a fix for our own case or explore docker replacements). It is not reasonable to rely on the OOM offender being killed. What if other runtimes are used whose behavior can also trigger an incessant stream of events? I think dockerd should fix the logging.

@aleozlx
Author

aleozlx commented Jun 19, 2019

Update: coming back to investigate this again today. I found that @kolyshkin was right: normally there is only one message and the process gets killed, as I verified the other day on my laptop. On our systems, however, the exact same command produces an endless stream of messages:

[root@k-prod-cpu-8 ~]# docker run -m 500m --memory-swap=500m python python -c 'a = "asdfsadf".join(map(str, range(1, 1000000000)))'
[root@k-prod-cpu-8 bin]# docker events | head
2019-06-19T15:24:02.855477531-05:00 container create 23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613 (image=python, name=vigilant_golick)
2019-06-19T15:24:02.857333163-05:00 container attach 23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613 (image=python, name=vigilant_golick)
2019-06-19T15:24:02.934636329-05:00 network connect 209bcc4c10469e226727ab99eedcd5dfe30508e4ed68d7d76bda5078242e7fed (container=23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613, name=bridge, type=bridge)
2019-06-19T15:24:03.221100516-05:00 container start 23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613 (image=python, name=vigilant_golick)
2019-06-19T15:24:05.138325157-05:00 container oom 23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613 (image=python, name=vigilant_golick)
2019-06-19T15:24:05.140344253-05:00 container oom 23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613 (image=python, name=vigilant_golick)
2019-06-19T15:24:05.142527728-05:00 container oom 23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613 (image=python, name=vigilant_golick)
2019-06-19T15:24:05.144687641-05:00 container oom 23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613 (image=python, name=vigilant_golick)
2019-06-19T15:24:05.146630906-05:00 container oom 23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613 (image=python, name=vigilant_golick)
2019-06-19T15:24:05.148670272-05:00 container oom 23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613 (image=python, name=vigilant_golick)

So what is wrong with our containers??

[root@k-prod-cpu-8 bin]# docker inspect vigilant_golick
[
    {
        "Id": "23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613",
        "Created": "2019-06-19T20:24:02.625552395Z",
        "Path": "python",
        "Args": [
            "-c",
            "a = \"asdfsadf\".join(map(str, range(1, 1000000000)))"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 27219,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2019-06-19T20:24:03.218644595Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:34a518642c76e77f0cace72bd993352ac99802c3295931f70a407b735ecb6e27",
        "ResolvConfPath": "/var/lib/docker/containers/23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613/hostname",
        "HostsPath": "/var/lib/docker/containers/23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613/hosts",
        "LogPath": "/var/lib/docker/containers/23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613/23bd12134164fed0a9f6f46968bd65f9b60815a847a30bd938696bbf147ac613-json.log",
        "Name": "/vigilant_golick",
        "RestartCount": 0,
        "Driver": "devicemapper",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": null,
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "default",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "shareable",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 524288000,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": [],
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DeviceCgroupRules": null,
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 524288000,
            "MemorySwappiness": null,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        },
        "GraphDriver": {
            "Data": {
                "DeviceId": "3220",
                "DeviceName": "docker-252:0-67206420-9c191d6afbf9b154ce89527bda7dca00afa0b3f4e5c70fad814162a46e01becc",
                "DeviceSize": "10737418240"
            },
            "Name": "devicemapper"
        },
        "Mounts": [],
        "Config": {
            "Hostname": "23bd12134164",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": true,
            "AttachStderr": true,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "LANG=C.UTF-8",
                "GPG_KEY=0D96DF4D4110E5C43FBFB17F2D347EA6AA65421D",
                "PYTHON_VERSION=3.7.3",
                "PYTHON_PIP_VERSION=19.1.1"
            ],
            "Cmd": [
                "python",
                "-c",
                "a = \"asdfsadf\".join(map(str, range(1, 1000000000)))"
            ],
            "Image": "python",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {}
        },
        "NetworkSettings": {
              *** FILTERED ***
        }
    }
]

Can anybody help?

@aleozlx
Author

aleozlx commented Jun 20, 2019

Actually, it doesn't matter what we run to hit the memory limit. Initially we thought I/O needed to be involved too, but now I'm using something as simple as this to replicate the issue as quickly as possible. For some reason, so far it only happens on our systems...

docker run --name mem_test --init -d --rm -m 50m --memory-swap 50m python python -c "import itertools; list(itertools.count())"

And the procedure to recover from this is:

docker kill mem_test  # which will knock out all the processes in the container, but fails to destroy the container as it's busy printing oom messages??
docker rm --force mem_test # which removes the empty container
kill -9 <pid for the above docker run> # as the docker run command will be still frozen

Why is this happening on our systems and not on others?
Why does docker events continue to print OOM messages even after we kill all processes in the container, i.e. when nothing should still be OOMing? It only stops when the (empty) container is forcefully removed.

@ariesdevil

ariesdevil commented Jun 21, 2019

What oom_score_adj value did you set for the docker daemon and containerd?

@aleozlx
Author

aleozlx commented Jun 23, 2019

@ariesdevil I believe it's listed in the inspect output above (0) if you do a Ctrl-F. So?

@ariesdevil

Sorry for missing it.

@tambetliiv

Cross-posting my issue #693 (comment) here as well.

Does anybody here have a fix or workaround for this problem?

@vilhelmen

@aleozlx What are we doing anyway? I know we discussed patching the docker binary to NOP the event function, but I think you whipped up something to murder containers before they OOM'd?

I mean, it's a terrible solution, but we don't have a lot of choices. Other systems respawn the containers on kill so it works for now.
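
For reference, the "kill it before it OOMs" watchdog amounts to something like the rough sketch below, using the Docker Go SDK (the 95% threshold, the 5-second poll, and everything else here are assumptions for illustration, not what actually runs on our nodes):

package main

import (
	"context"
	"encoding/json"
	"log"
	"time"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

// killThreshold is an assumed cut-off: kill a container once it uses more
// than 95% of its cgroup memory limit, before the kernel OOM killer fires
// and the OOM event flood starts.
const killThreshold = 0.95

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}
	ctx := context.Background()
	for range time.Tick(5 * time.Second) {
		containers, err := cli.ContainerList(ctx, types.ContainerListOptions{})
		if err != nil {
			log.Printf("list containers: %v", err)
			continue
		}
		for _, c := range containers {
			resp, err := cli.ContainerStats(ctx, c.ID, false) // one-shot stats, no streaming
			if err != nil {
				continue
			}
			var s types.StatsJSON
			err = json.NewDecoder(resp.Body).Decode(&s)
			resp.Body.Close()
			// Containers without an explicit memory limit report the host total; skip zero just in case.
			if err != nil || s.MemoryStats.Limit == 0 {
				continue
			}
			if float64(s.MemoryStats.Usage)/float64(s.MemoryStats.Limit) > killThreshold {
				log.Printf("killing %s: memory %d of %d bytes", c.ID[:12], s.MemoryStats.Usage, s.MemoryStats.Limit)
				_ = cli.ContainerKill(ctx, c.ID, "SIGKILL")
			}
		}
	}
}

Not pretty, but something along these lines keeps the kernel OOM killer, and therefore the event flood, from ever firing for containers with sane limits.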

@erichorwath

erichorwath commented Feb 9, 2021

Hi, I'm new to the game, but I have an easy setup to reproduce this issue:
nginx 1.18.0 with "worker_processes auto;" in nginx.conf (don't do this in production!)
no CPU request/limit, 36Mi/96Mi memory request/limit
a VM with at least 32 CPUs

Nginx immediately starts to consume a lot of CPU and soon reaches the memory limit. The continuous OOM events bubble up, causing dockerd to grow in memory endlessly.
[image]

More details: kubernetes/kubernetes#69364 (comment)
(Kubernetes 1.18.12, Istio 1.7.4, Container Runtime Version docker://19.3.13, GardenLinux 184.0 / Kernel 5.4.0-5-cloud-amd64)
