
containerd-shim hangs on reboot/shutdown (live restore + runc v2 runtime) #41831

Closed
zhangyoufu opened this issue Dec 21, 2020 · 12 comments

Labels: area/runtime, kind/bug, version/20.10

Comments

@zhangyoufu
Contributor

zhangyoufu commented Dec 21, 2020

Description
After upgrading docker-ce to 20.10.x, rebooting or shutting down the machine hangs for 90 seconds due to containerd-shim.

[  OK  ] Reached target Shutdown.
[  OK  ] Reached target Final Step.
[  OK  ] Finished Reboot.
[  OK  ] Reached target Reboot.
[  214.337805] systemd-shutdown[1]: Waiting for process: containerd-shim

Steps to reproduce the issue:

  1. fresh install docker-ce from download.docker.com on Ubuntu 20.04 LTS
  2. enable live-restore
  3. docker run -d k8s.gcr.io/pause
  4. sudo reboot
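For step 2, live-restore is enabled through the daemon configuration file; a minimal sketch (the `live-restore` key and `/etc/docker/daemon.json` path are the documented mechanism, but check for an existing file first so you don't clobber other settings):

```shell
# Enable live-restore so containers keep running across dockerd restarts.
# WARNING: this overwrites any existing daemon.json; merge manually if one exists.
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "live-restore": true
}
EOF
sudo systemctl restart docker   # reload dockerd with the new setting
```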

Describe the results you received:
The shutdown/reboot process is stuck for 90 seconds, waiting on containerd-shim.

Describe the results you expected:
containerd-shim should not interfere with the shutdown/reboot process.

Additional information you deem important (e.g. issue happens only occasionally):
Since #41210, the default runtime is runc v2. The old runc v1 runtime does not have this issue; tested with the following commands.

docker run -d --name runc-v1 --runtime=io.containerd.runtime.v1.linux k8s.gcr.io/pause
docker run -d --name runc-v2 --runtime=io.containerd.runc.v2 k8s.gcr.io/pause
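To confirm which shim is serving a container, you can inspect the shim processes' full command lines (the v1 shim binary is `containerd-shim`, the v2 one `containerd-shim-runc-v2`). Note that `ps -o comm` truncates to 15 characters and would show both as `containerd-shim`, so match on the full args instead; a rough check:

```shell
# List shim processes with full command lines; the [c] bracket trick
# keeps the grep process itself out of the results.
ps -eo pid,args | grep '[c]ontainerd-shim'
```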

Output of docker version:

Client: Docker Engine - Community
 Version:           20.10.1
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        831ebea
 Built:             Tue Dec 15 04:34:58 2020
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.1
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       f001486
  Built:            Tue Dec 15 04:32:52 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Output of docker info:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.0-docker)

Server:
 Containers: 2
  Running: 1
  Paused: 0
  Stopped: 1
 Images: 5
 Server Version: 20.10.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-58-generic
 Operating System: Ubuntu 20.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 981.2MiB
 Name: hk.zju.co
 ID: EZ3R:GFZG:PCIO:CEYW:Z3P5:MIND:EDKS:2VDJ:HVPF:SAWD:BKQY:OP2H
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: true

WARNING: No swap limit support
WARNING: No blkio weight support
WARNING: No blkio weight_device support

Additional environment details (AWS, VirtualBox, physical, etc.):
N/A

@thaJeztah thaJeztah added area/runtime kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. version/20.10 labels Jan 21, 2021
@thaJeztah thaJeztah added this to Needs triage in 20.10.x bugs/regressions via automation Jan 21, 2021
@thaJeztah
Member

@AkihiroSuda @cpuguy83 any ideas?

@lantica

lantica commented May 13, 2021

Hi, I am also facing this issue; changing the runtime doesn't work for me, however.

I finally found a workaround in this issue: containerd/containerd#386 (comment)

@zhangyoufu
Contributor Author

zhangyoufu commented May 14, 2021

I don't think KillMode=mixed is a proper workaround: systemctl stop containerd would then also stop containerd-shim, which can happen when upgrading packages.

Per https://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/, I think containerd-shim should move to a standalone cgroup node (a systemd scope, or integrate with systemd-machined) instead of reusing /system.slice/containerd.service.
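The cgroup placement can be verified on an affected host; assuming a v2 shim is running, something like:

```shell
# Show which cgroup the shim lives in; on affected hosts it sits under
# /system.slice/containerd.service rather than its own scope.
# pgrep -o picks the oldest match, -f matches the full command line.
cat /proc/"$(pgrep -o -f containerd-shim-runc-v2)"/cgroup
```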


My workaround (for the unless-stopped restart policy and a proper init process that handles SIGTERM):

# /etc/systemd/system/containerd-shim-v2-workaround.service
[Unit]
Description=containerd-shim v2 workaround
Before=docker.service
Requires=containerd.service
After=containerd.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStop=-/bin/sh -c '[ "$(systemctl is-system-running)" = "stopping" ] || exit 0; ctr -n moby tasks ls -q | xargs -r -L1 ctr -n moby tasks kill; ctr -n moby containers ls -q | xargs -r ctr -n moby containers rm'

[Install]
WantedBy=containerd.service
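After creating the unit file, it has to be loaded, enabled, and started once so that its ExecStop runs at shutdown (the unit name below assumes the file path shown above):

```shell
# Pick up the new unit file, then enable and start it.
# --now starts the oneshot immediately; with RemainAfterExit=yes the unit
# stays "active", so ExecStop fires when the system shuts down.
sudo systemctl daemon-reload
sudo systemctl enable --now containerd-shim-v2-workaround.service
```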

@tsafs

tsafs commented May 28, 2021

@zhangyoufu I just wanted to note that your workaround will not restart the containers upon reboot if the --restart option was previously given to docker run, because it removes/deletes all running containers. Has anyone had any luck with a solution similar to @zhangyoufu's that only stops the given containers?

Edit: Simply leaving out the ctr -n moby containers rm part does not solve the problem: the shutdown still hangs, and in addition the container does not get restarted on boot.

@zhangyoufu
Contributor Author

@sebastianFast I'm using the unless-stopped restart policy, and dockerd brings the containers back up after reboot. I didn't run into any problems after rebooting.

@tsafs

tsafs commented Jun 2, 2021

@zhangyoufu Sadly, I have to use on-failure[:max-retries] for security reasons. Additionally, please correct me if I'm wrong, but it seems that ctr -n moby containers rm removes my container permanently.

@zhangyoufu
Contributor Author

@sebastianFast AFAIK, ctr -n moby containers rm removes the container on the containerd side; it does not remove the container on the docker side.

I'm not familiar with the on-failure restart policy. According to the documentation, it relies on a non-zero exit code. I don't think that is sane for autostart purposes with live-restore enabled, because docker has no chance to record a non-zero exit code. Correct me if I'm wrong.

@git-developer

Thanks for sharing your workaround, @zhangyoufu! It didn't work for me out of the box, but it did with a small modification.

My containers are configured with the restart policy unless-stopped. Before adding the workaround, systemd-shutdown waited for 15 containers; with your workaround, still for 5 containers. I found messages from systemd-shutdown in the journal stating "cannot delete a non stopped container".

I added a delay between the termination and removal of the containers. Furthermore, I inspected the ctr man pages and found that tasks kill takes a switch -a to send the signal to all processes within the container. The following combination of the delay and -a prevents the delay at shutdown:

ExecStop=-/bin/sh -c '[ "$(systemctl is-system-running)" = "stopping" ] || exit 0; ctr -n moby tasks ls -q | xargs -r -L1 ctr -n moby tasks kill -a; sleep 5; ctr -n moby containers ls -q | xargs -r ctr -n moby containers rm'
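Unrolled for readability, the one-liner above does roughly the following (same commands, just split out with comments):

```shell
#!/bin/sh
# Only act when the whole system is shutting down, not on a plain
# "systemctl stop" of this unit (e.g. during a package upgrade).
[ "$(systemctl is-system-running)" = "stopping" ] || exit 0

# Signal every process in every moby task (-a), not just PID 1.
ctr -n moby tasks ls -q | xargs -r -L1 ctr -n moby tasks kill -a

# Give the containers a moment to exit before removal.
sleep 5

# Remove the (now stopped) containers on the containerd side.
ctr -n moby containers ls -q | xargs -r ctr -n moby containers rm
```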

I don't know the exact reason why it works this way, but maybe it is helpful for others. I really hope that containerd/containerd#5502 gets fixed to get rid of the workaround.

@zhangyoufu
Contributor Author

@git-developer Glad to see that helps.

It seems that the PID 1 process of your container did not handle SIGTERM properly. You can test its behavior with docker stop and see whether the container needs 10 seconds to be stopped (killed).

IMHO, the PID 1 process is responsible for the graceful shutdown of the whole container. That's why I didn't send the signal to all processes inside the container, nor send a SIGKILL.
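A quick way to run that check is to time docker stop (the container name below is a placeholder):

```shell
# If PID 1 ignores SIGTERM, docker waits its default 10-second grace
# period before sending SIGKILL, so "real" lands near 10 s; a
# well-behaved init exits almost immediately.
time docker stop my-container
```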

@git-developer

Thanks for your fast response.

I'm aware that some processes don't handle SIGTERM properly, but I don't think that is the case for my containers. I'm using docker compose with init: true to make sure the integrated process reaper (tini) takes care of orphaned children. The containers shut down fast when using docker stop. Another indication is that a sleep time of 5 s is enough to solve the problem; a process that does not respond to SIGTERM would need 10 s, because that's the delay before the system kills it.

I'm not sure in which situations the use of -a is required; maybe for spawned background processes within the container. If I understand the shim v1 code that you investigated correctly, it also sends the signal to all processes.

@zhangyoufu
Contributor Author

zhangyoufu commented Mar 8, 2022

FYI, containerd fixed this issue in v1.6.0 and backported the fix to v1.5.10, while docker-ce still ships containerd v1.4.x.

EDIT: the 20.10 branch is switching to containerd v1.5.x; looking forward to the next release. 180f3b9

20.10.x bugs/regressions automation moved this from Needs triage to Closed Mar 8, 2022
5 participants