shareProcessNamespace causes the container to stop responding, and restarting docker hangs at "Loading containers" #92214

Closed
xieyanker opened this issue Jun 17, 2020 · 9 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
sig/node Categorizes an issue or PR as relevant to SIG Node.
sig/usability Categorizes an issue or PR as relevant to SIG Usability.

Comments

xieyanker (Contributor) commented Jun 17, 2020

What happened:

With shareProcessNamespace enabled, the container stops responding, and restarting docker hangs waiting at "Loading containers".

What you expected to happen:

The container should keep responding, and docker should restart without problems.

How to reproduce it (as minimally and precisely as possible):

  1. Create an nginx pod with shareProcessNamespace enabled, as follows:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      name: nginx
    spec:
      shareProcessNamespace: true
      nodeName: master1
      containers:
      - image: registry.icp.com:5000/library/common/nginx-amd64:1.17.5
        name: nginx
  2. Find the nginx master process (nginx: master process nginx -g daemon off;) and kill it:
root@xuewei81-tgquxn9spy-master-0:~/xjs# kubectl get pod -owide
NAME                     READY   STATUS    RESTARTS   AGE   IP              NODE      NOMINATED NODE   READINESS GATES
nginx-699575b7df-2bxfd   1/1     Running   0          7s    10.151.161.19   master1   <none>           <none>
root@xuewei81-tgquxn9spy-master-0:~/xjs# docker ps | grep nginx-699575b7df-2bxfd
6e2231761cbc        540a289bab6c                                                             "nginx -g 'daemon of…"   14 seconds ago      Up 12 seconds                           k8s_nginx_nginx-699575b7df-2bxfd_default_24c7f7b4-b064-11ea-94b0-fa163e279982_0
849f76682e0a        registry.icp.com:5000/library/cke/kubernetes/pause-amd64:3.1             "/pause"                 17 seconds ago      Up 16 seconds                           k8s_POD_nginx-699575b7df-2bxfd_default_24c7f7b4-b064-11ea-94b0-fa163e279982_0
root@xuewei81-tgquxn9spy-master-0:~/xjs# 
root@xuewei81-tgquxn9spy-master-0:~/xjs# docker inspect 6e2231761cbc | grep -i pid
            "Pid": 2119,
            "PidMode": "container:849f76682e0a322d0109250d0d5eb3767248f18c2ee7457593729339fb400eef",
            "PidsLimit": 0,
root@xuewei81-tgquxn9spy-master-0:~/xjs# ps -ef | grep 2119 | grep -v color
root      2119  2093  0 14:31 ?        00:00:00 nginx: master process nginx -g daemon off;
systemd+  2152  2119  0 14:31 ?        00:00:00 nginx: worker process

Then, PID 2152's parent process becomes /pause (see the quick namespace check sketched after the reproduction steps):

root@xuewei81-tgquxn9spy-master-0:~/xjs# kill -9 2119
root@xuewei81-tgquxn9spy-master-0:~/xjs# 
root@xuewei81-tgquxn9spy-master-0:~/xjs# ps -f 2152
UID        PID  PPID  C STIME TTY      STAT   TIME CMD
systemd+  2152  1897  0 14:31 ?        S      0:00 nginx: worker process
root@xuewei81-tgquxn9spy-master-0:~/xjs# ps -f 1897
UID        PID  PPID  C STIME TTY      STAT   TIME CMD
root      1897  1863  0 14:31 ?        Ss     0:00 /pause
root@xuewei81-tgquxn9spy-master-0:~/xjs# 
  3. At this moment, docker inspect, docker exec, and docker logs no longer respond:
root@xuewei81-tgquxn9spy-master-0:~/xjs# docker ps | grep nginx-699575b7df-2bxfd
6e2231761cbc        540a289bab6c                                                             "nginx -g 'daemon of…"   About a minute ago   Up About a minute                       k8s_nginx_nginx-699575b7df-2bxfd_default_24c7f7b4-b064-11ea-94b0-fa163e279982_0
849f76682e0a        registry.icp.com:5000/library/cke/kubernetes/pause-amd64:3.1             "/pause"                 About a minute ago   Up About a minute                       k8s_POD_nginx-699575b7df-2bxfd_default_24c7f7b4-b064-11ea-94b0-fa163e279982_0
root@xuewei81-tgquxn9spy-master-0:~/xjs# docker inspect 6e2231761cbc

^C
root@xuewei81-tgquxn9spy-master-0:~/xjs# docker exec -it 6e2231761cbc sh

^C
root@xuewei81-tgquxn9spy-master-0:~/xjs# docker logs 6e2231761cbc

^C
  4. Restarting docker then hangs waiting at "Loading containers":
root@xuewei81-tgquxn9spy-master-0:~/xjs# systemctl restart docker
Job for docker.service failed because a timeout was exceeded.
See "systemctl status docker.service" and "journalctl -xe" for details.

Jun 17 14:33:55 xuewei81-tgquxn9spy-master-0 dockerd[2814]: time="2020-06-17T14:33:55.525517831+08:00" level=info msg="Loading containers: start."
Jun 17 14:34:55 xuewei81-tgquxn9spy-master-0 systemd[1]: docker.service: Start operation timed out. Terminating.
Jun 17 14:34:55 xuewei81-tgquxn9spy-master-0 dockerd[2814]: time="2020-06-17T14:34:55.503401172+08:00" level=info msg="Processing signal 'terminated'"
  5. docker restarts successfully only after I kill the containerd-shim process:
root@xuewei81-tgquxn9spy-master-0:~# ps -ef | grep 6e2231761cbc | grep containerd-shim
root      2093 14254  0 14:31 ?        00:00:00 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/6e2231761cbc4b6ceff36dbcc4cfae67530377616aed47ba34cf0d057f37301d -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc -systemd-cgroup
root@xuewei81-tgquxn9spy-master-0:~# kill -9 2093
root@xuewei81-tgquxn9spy-master-0:~# 
root@xuewei81-tgquxn9spy-master-0:~# journalctl -u docker -f
-- Logs begin at Wed 2020-01-01 11:46:34 CST. --
Jun 17 14:35:36 xuewei81-tgquxn9spy-master-0 dockerd[3409]: time="2020-06-17T14:35:36.054660644+08:00" level=warning msg="Your kernel does not support cgroup rt runtime"
Jun 17 14:35:36 xuewei81-tgquxn9spy-master-0 dockerd[3409]: time="2020-06-17T14:35:36.054673960+08:00" level=warning msg="Your kernel does not support cgroup blkio weight"
Jun 17 14:35:36 xuewei81-tgquxn9spy-master-0 dockerd[3409]: time="2020-06-17T14:35:36.054684371+08:00" level=warning msg="Your kernel does not support cgroup blkio weight_device"
Jun 17 14:35:36 xuewei81-tgquxn9spy-master-0 dockerd[3409]: time="2020-06-17T14:35:36.055271704+08:00" level=info msg="Loading containers: start."
Jun 17 14:35:37 xuewei81-tgquxn9spy-master-0 dockerd[3409]: time="2020-06-17T14:35:37.806790934+08:00" level=info msg="There are old running containers, the network config will not take affect"
Jun 17 14:35:39 xuewei81-tgquxn9spy-master-0 dockerd[3409]: time="2020-06-17T14:35:39.038308094+08:00" level=info msg="Loading containers: done."
Jun 17 14:35:39 xuewei81-tgquxn9spy-master-0 dockerd[3409]: time="2020-06-17T14:35:39.178666784+08:00" level=info msg="Docker daemon" commit=0dd43dd graphdriver(s)=overlay2 version=18.09.8
Jun 17 14:35:39 xuewei81-tgquxn9spy-master-0 dockerd[3409]: time="2020-06-17T14:35:39.178815965+08:00" level=info msg="Daemon has completed initialization"
Jun 17 14:35:39 xuewei81-tgquxn9spy-master-0 dockerd[3409]: time="2020-06-17T14:35:39.258160782+08:00" level=info msg="API listen on /var/run/docker.sock"
Jun 17 14:35:39 xuewei81-tgquxn9spy-master-0 systemd[1]: Started Docker Application Container Engine.
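
A quick way to confirm that the shared PID namespace is really in effect (and therefore that /pause, not nginx, is PID 1 inside the nginx container) is to read /proc/1/cmdline from inside the pod. This is only an illustrative check; the pod name is the one from the kubectl get pod output above, and ps -ef only works if procps happens to be installed in the image:

kubectl exec nginx-699575b7df-2bxfd -- cat /proc/1/cmdline    # prints /pause when the namespace is shared
kubectl exec nginx-699575b7df-2bxfd -- ps -ef                 # full process list, if ps exists in the image

This also explains why the orphaned worker is re-parented to /pause rather than to a process of the nginx container after the master is killed.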

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
    1.14.3
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
root@xuewei81-tgquxn9spy-master-0:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
  • Kernel (e.g. uname -a):
root@xuewei81-tgquxn9spy-master-0:~# uname -a
Linux xuewei81-tgquxn9spy-master-0 5.0.0-29-generic #31~18.04.1-Ubuntu SMP Thu Sep 12 18:29:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
    My docker version is:
root@xuewei81-tgquxn9spy-master-0:~# docker version
Client:
 Version:           18.09.8
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        0dd43dd87f
 Built:             Wed Jul 17 17:41:19 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.8
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       0dd43dd
  Built:            Wed Jul 17 17:07:25 2019
  OS/Arch:          linux/amd64
  Experimental:     false
@xieyanker xieyanker added the kind/bug Categorizes issue or PR as related to a bug. label Jun 17, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jun 17, 2020
xieyanker (Contributor, Author) commented:

/sig usability

@k8s-ci-robot k8s-ci-robot added sig/usability Categorizes an issue or PR as relevant to SIG Usability. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 17, 2020
zjj2wry (Contributor) commented Jun 18, 2020

/cc
/sig node

xieyanker (Contributor, Author) commented:

/sig node

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Jun 18, 2020
qingwave (Contributor) commented:

Same problem: after kill -9 on nginx, the nginx daemon exits but the nginx worker keeps running (it becomes an orphaned process whose ppid is containerd-shim). docker hangs when operating on the container, unless all orphaned processes are deleted, the pod is deleted (destroying the pid namespace), or the parent (containerd-shim) is deleted.

It seems like a docker bug, but I cannot reproduce it with docker alone using docker run -d --pid ...
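
For reference, a docker-only attempt along those lines might look like the sketch below; the image names and the killed PID are placeholders rather than the exact commands used:

docker run -d --name pause k8s.gcr.io/pause:3.1               # pause-style container that owns the PID namespace
docker run -d --name web --pid=container:pause nginx:1.17.5   # nginx joins that PID namespace, like a pod sandbox
docker top web                                                 # find the host PID of "nginx: master process"
kill -9 <master-pid>                                           # kill it from the host, as in the reproduction steps
docker inspect web                                             # then check whether inspect/exec/logs still respond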

fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 26, 2020
fejta-bot commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 25, 2020
fejta-bot commented:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot (Contributor) commented:

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

liupeng0518 (Member) commented:

1.18/1.19 do not have this problem.
