Cannot stop container XXXXXXXXXXXX: [2] Container does not exist: container destroyed #12738

Closed
ernetas opened this issue Apr 24, 2015 · 121 comments
Assignees
Labels
kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. priority/P1 Important: P1 issues are a top priority and a must-have for the next release. version/1.6

Comments

@ernetas

ernetas commented Apr 24, 2015

Docker fails to stop a container:

root@haswell:/home/supervisor# docker ps | grep e16
e16c5c80ab4e        ernetas/local:latest   "/sbin/my_init -- /b   10 days ago         Up 3 minutes                            romantic_kirch      
root@haswell:/home/supervisor# docker stop e16c5c80ab4e
Error response from daemon: Cannot stop container e16c5c80ab4e: [2] Container does not exist: container destroyed
FATA[0000] Error: failed to stop one or more containers 
root@haswell:/home/supervisor# 

Docker version and info:

supervisor@haswell:~$ sudo docker -D info
Containers: 8
Images: 40
Storage Driver: aufs
 Root Dir: /storage2/docker/aufs
 Backing Filesystem: extfs
 Dirs: 56
 Dirperm1 Supported: false
Execution Driver: native-0.2
Kernel Version: 3.13.0-45-generic
Operating System: Ubuntu 14.04.2 LTS
CPUs: 4
Total Memory: 31.35 GiB
Name: haswell
ID: T2LH:VCTV:CZGW:WBD6:7FQ6:TNLY:BEB4:ATGH:CBZZ:FEPU:JTRY:XNVQ
Debug mode (server): false
Debug mode (client): true
Fds: 35
Goroutines: 73
System Time: Fri Apr 24 11:45:27 EEST 2015
EventsListeners: 0
Init SHA1: 9145575052383dbf64cede3bac278606472e027c
Init Path: /usr/bin/docker
Docker Root Dir: /storage2/docker
supervisor@haswell:~$ sudo docker -D version
Client version: 1.6.0
Client API version: 1.18
Go version (client): go1.4.2
Git commit (client): 4749651
OS/Arch (client): linux/amd64
Server version: 1.6.0
Server API version: 1.18
Go version (server): go1.4.2
Git commit (server): 4749651
OS/Arch (server): linux/amd64
supervisor@haswell:~$ 
@alexeyfrank

I've got this error when I use docker remote api v1.17 and docker 1.6


alexander.v@ip-10-0-1-224:~$  docker ps -a | grep hexletjava_m3e1-6810
a73191529f41        hexletboy/hexletjava_m3e1:568                  "/usr/bin/supervisor   38 hours ago        Up 16 hours                      172.17.42.1:32926->8000/tcp, 172.17.42.1:32927->8080/tcp                    hexletjava_m3e1-6810
alexander.v@ip-10-0-1-224:~$ docker stop a73191529f41
Error response from daemon: Cannot stop container a73191529f41: [2] Container does not exist: container destroyed
FATA[0000] Error: failed to stop one or more containers

docker logs:


INFO[80220] POST /v1.18/containers/a73191529f41/stop?t=10
INFO[80220] +job stop(a73191529f41)
INFO[80220] Failed to send SIGTERM to the process, force killing
Cannot stop container a73191529f41: [2] Container does not exist: container destroyed
INFO[80220] -job stop(a73191529f41) = ERR (1)
ERRO[80220] Handler for POST /containers/{name:.*}/stop returned error: Cannot stop container a73191529f41: [2] Container does not exist: container destroyed
ERRO[80220] HTTP Error: statusCode=500 Cannot stop container a73191529f41: [2] Container does not exist: container destroyed

@PlugIN73

👍

@zzet

zzet commented Apr 24, 2015

👍

@PanfilovDenis

👍

@arindamchoudhury

😒

@coolljt0725
Contributor

can you provide some steps to reproduce this?

@aimxhaisse
Contributor

👍

@hugochinchilla

Happened to me today

@AduchiMergen

👍

@discordianfish
Contributor

@icecrime Can you get this prioritized?
Same error in our env, but with an unprivileged container, so it seems unrelated to that.

I also see that the old process is still around, so this looks like a new 'ghost container' issue.

@maspwr

maspwr commented Apr 30, 2015

We have also been hitting this issue. I'm trying to narrow down steps to reproduce.

@tiborvass
Contributor

If someone can reproduce this in a VM and snapshot the VM, that'd be really fantastic; these are bugs that are hard to reproduce :(

@LK4D4
Contributor

LK4D4 commented Apr 30, 2015

reproducible with

while ID=$(docker run -d busybox true) && docker stop $ID; do :; done

@icecrime
Contributor

icecrime commented May 1, 2015

The reproduction case seems to imply a race condition, but then this only happens if you issue docker stop at the same time that the container process dies by itself. This is quite unlikely, and probably doesn't explain all the occurrences reported here.

@discordianfish Could you tell us more about the way this happened for you? In particular: is there a chance that the container process died at the same time that you issued the docker stop command, or was it a long running process that had absolutely no reason to exit by itself?

@discordianfish
Contributor

@icecrime No sorry, I don't have more details. I found those systems in that state: some containers were running according to docker ps but couldn't be stopped. In addition to that, I saw older processes still running.
We upgraded Docker from 1.5.0 to 1.6.0 when @avinson first saw this issue, maybe he knows more. But I could imagine that this is how we got into the race condition: maybe stopping Docker tries to stop the container multiple times, and sometimes one of those attempts lands right in the moment the process exits and triggers the race?

Besides that, it looks like we have no 'stress tests' in the Docker test suite which would have caught this issue. As @LK4D4 shows, at least some race conditions are easily found by this simple test.

@ambroisemaupate

Happened to me since the 1.6.0 upgrade:

$ docker rm -f 46f4675d50c1 
Error response from daemon: Could not kill running container, cannot remove - [2] Container does not exist: container destroyed
FATA[0000] Error: failed to remove one or more containers 
Linux xxxxxxxxxxxx.com 3.13.0-51-generic #84-Ubuntu SMP Wed Apr 15 12:08:34 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
$ docker inspect 46f4675d50c1
[{
    "AppArmorProfile": "",
    "Args": [
        "-c",
        "/config/loop"
    ],
    "Config": {
        "AttachStderr": true,
        "AttachStdin": false,
        "AttachStdout": true,
        "Cmd": [
            "/bin/sh",
            "-c",
            "/config/loop"
        ],
        "CpuShares": 0,
        "Cpuset": "",
        "Domainname": "",
        "Entrypoint": null,
        "Env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "DEBIAN_FRONTEND=noninteractive",
            "ROADIZ_BRANCH=master"
        ],
        "ExposedPorts": {
            "80/tcp": {}
        },
        "Hostname": "46f4675d50c1",
        "Image": "roadiz/roadiz",
        "Labels": null,
        "MacAddress": "",
        "Memory": 0,
        "MemorySwap": 0,
        "NetworkDisabled": false,
        "OnBuild": null,
        "OpenStdin": false,
        "PortSpecs": null,
        "StdinOnce": false,
        "Tty": true,
        "User": "",
        "Volumes": null,
        "WorkingDir": ""
    },
    "Created": "2015-04-27T21:54:17.310271095Z",
    "Driver": "aufs",
    "ExecDriver": "native-0.2",
    "ExecIDs": [
        "55c474e453bff882f08f6074a3974d50321add4b039cdb99ac7462acc3f66b08"
    ],
    "HostConfig": {
        "Binds": null,
        "CapAdd": null,
        "CapDrop": null,
        "CgroupParent": "",
        "ContainerIDFile": "",
        "CpuShares": 0,
        "CpusetCpus": "",
        "Devices": null,
        "Dns": null,
        "DnsSearch": null,
        "ExtraHosts": null,
        "IpcMode": "",
        "Links": [
            "/mariadb:/roadiz/mariadb"
        ],
        "LogConfig": {
            "Config": null,
            "Type": "json-file"
        },
        "LxcConf": [],
        "Memory": 0,
        "MemorySwap": 0,
        "NetworkMode": "bridge",
        "PidMode": "",
        "PortBindings": {
            "80/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "49153"
                }
            ]
        },
        "Privileged": false,
        "PublishAllPorts": false,
        "ReadonlyRootfs": false,
        "RestartPolicy": {
            "MaximumRetryCount": 0,
            "Name": ""
        },
        "SecurityOpt": null,
        "Ulimits": null,
        "VolumesFrom": [
            "data-roadiz"
        ]
    },
    "HostnamePath": "/var/lib/docker/containers/46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d/hostname",
    "HostsPath": "/var/lib/docker/containers/46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d/hosts",
    "Id": "46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d",
    "Image": "bc28093328c52de86f698f654f0926ca897c2a4ff95ebc768f1337bec4a42965",
    "LogPath": "/var/lib/docker/containers/46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d/46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d-json.log",
    "MountLabel": "",
    "Name": "/roadiz",
    "NetworkSettings": {
        "Bridge": "docker0",
        "Gateway": "172.17.42.1",
        "GlobalIPv6Address": "",
        "GlobalIPv6PrefixLen": 0,
        "IPAddress": "172.17.0.4",
        "IPPrefixLen": 16,
        "IPv6Gateway": "",
        "LinkLocalIPv6Address": "fe80::42:acff:fe11:4",
        "LinkLocalIPv6PrefixLen": 64,
        "MacAddress": "02:42:ac:11:00:04",
        "PortMapping": null,
        "Ports": {
            "80/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "49153"
                }
            ]
        }
    },
    "Path": "/bin/sh",
    "ProcessLabel": "",
    "ResolvConfPath": "/etc/resolv.conf",
    "RestartCount": 0,
    "State": {
        "Dead": false,
        "Error": "",
        "ExitCode": 0,
        "FinishedAt": "2015-05-02T00:17:33.063361405Z",
        "OOMKilled": false,
        "Paused": false,
        "Pid": 16345,
        "Restarting": false,
        "Running": true,
        "StartedAt": "2015-05-02T00:23:43.684876553Z"
    },
    "Volumes": {
        "/data": "/var/lib/docker/vfs/dir/09b9b1dab4a8f6e4ad7e2d6651a794e94d154e3618acc8a43b339489911e1070"
    },
    "VolumesRW": {
        "/data": true
    }
}
]

@enguerran
Contributor

Does anyone have a solution to stop these kinds of ghost containers without restarting the daemon? (I don't even know if restarting the daemon will stop those ghost containers.)

@discordianfish
Contributor

@enguerran The workaround for me was to stop docker, kill the ghost containers manually, remove /var/lib/docker/containers/[container-id] and start docker again.
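For anyone who needs to script that, here is a rough sketch of the same workaround (assumptions: an upstart-based Ubuntu host and a placeholder container ID; adjust the service commands for your init system):

CID=<full-container-id>                                    # placeholder, e.g. from `docker ps --no-trunc`
PID=$(docker inspect -f '{{ .State.Pid }}' "$CID")         # grab the PID Docker recorded *before* stopping the daemon
sudo service docker stop                                   # stop the daemon
[ -n "$PID" ] && [ "$PID" -gt 0 ] && sudo kill -9 "$PID"   # kill the ghost container process if it still exists
sudo rm -rf "/var/lib/docker/containers/$CID"              # remove the stale container state
sudo service docker start                                  # start the daemon again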

@jessfraz jessfraz self-assigned this May 5, 2015
@jessfraz jessfraz added the kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. label May 5, 2015
@jessfraz
Contributor

jessfraz commented May 5, 2015

Is everyone here using aufs? Trying to reproduce.

@PlugIN73

PlugIN73 commented May 5, 2015

@jfrazelle I've got this error exactly on aufs! But my containers start and stop automatically - I can't come up with any steps to reproduce :(
Maybe you have some guess that I can try?

@jessfraz
Contributor

jessfraz commented May 5, 2015

So the next time anyone encounters this can you check:

  1. if the PID from inspect is still in ps -aux to see if the process is still running
  2. How long the process has been running for & how long the container has been running for
  3. if there were any errors when you started the container

all of this information would be very very helpful
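A rough sketch for collecting exactly that information (the container ID is a placeholder; the inspect fields match the output shown earlier in this thread):

CID=<container-id>                                         # the container you cannot stop
PID=$(docker inspect -f '{{ .State.Pid }}' "$CID")
echo "PID recorded by Docker: $PID"

# 1. is the process still running?
ps aux | awk -v pid="$PID" '$2 == pid'

# 2. how long have the process and the container been running?
ps -o pid,etime,cmd -p "$PID"
docker inspect -f '{{ .State.StartedAt }}' "$CID"

# 3. any errors when the container started?
docker inspect -f '{{ .State.Error }}' "$CID"
docker logs --tail 50 "$CID"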

@PlugIN73

PlugIN73 commented May 5, 2015

@jfrazelle In my case the process with the PID from the container's inspect does not exist.
P.S. if it's helpful:
I went back to Docker 1.5 (with aufs) and sometimes I see this error:

Error response from daemon: Cannot stop container 4a1afeeafd12: no such process
FATA[0000] Error: failed to stop one or more containers

There is no process with the PID from the container's inspect, yet docker ps shows me the container as running.

4a1afeeafd12        hexletboy/hexletjava_m3e1:568                "/usr/bin/supervisor   7 days ago          Up 7 days           172.17.42.1:49153->8000/tcp, 172.17.42.1:49154->8080/tcp   hexletjava_m3e1-7268

@LK4D4
Contributor

LK4D4 commented May 5, 2015

@PlugIN73 That is super-important information.
I see that you are using supervisor. Can you also try to find out whether the processes spawned by supervisor are still alive too?

@PlugIN73

PlugIN73 commented May 5, 2015

@LK4D4 I saw them less than a week ago. Now I can't find these processes because I killed them. I don't understand how it's possible (and I could not match these processes to the dead containers - not enough information).

@LK4D4
Contributor

LK4D4 commented May 5, 2015

@PlugIN73 Yeah, we don't understand how it's possible either :/

@LK4D4
Contributor

LK4D4 commented May 6, 2015

Also, for everyone who encounters this issue: how do you run your processes inside the container? Is it with supervisord, or maybe an sh script which starts something? I'm particularly interested in whether there are problems with a simple single process inside the container.
Also feel free to provide the scripts you use to run your containers, if they're not very secret.

@discordianfish
Contributor

@LK4D4 Here it happened to our collins image, which uses the official collins image as its FROM [1]. We run reefer [2] as the entrypoint, which then exec's java, so it's a single process and no bash is involved. I'm also using btrfs, so it's not limited to aufs.

  1. https://registry.hub.docker.com/u/tumblr/collins/
  2. https://github.com/docker-infra/reefer

@heyman

heyman commented May 6, 2015

It has happened twice to me using the image found here (it also uses supervisor): https://github.com/heyman/graphite_docker

@riyadparvez

+1

@jorgemarey

I think the same happened to me on docker 1.10.1:

Client:
 Version:      1.10.1
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   9e83765
 Built:        Thu Feb 11 19:09:42 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.1
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   9e83765
 Built:        Thu Feb 11 19:09:42 2016
 OS/Arch:      linux/amd64

docker logs:

Feb 26 08:46:01 host docker[22141]: time="2016-02-26T08:46:01.451368355Z" level=info msg="Failed to send SIGTERM to the process, force killing"
Feb 26 08:46:01 host docker[22141]: time="2016-02-26T08:46:01.451666523Z" level=error msg="Handler for POST /containers/6ce45e61d9dac86a2a52b54145abbb15cad17b398bf1d821e5eb68c30c5cd020/stop returned error: Cannot stop container 6ce45e61d9dac86a2a52b54145abbb15cad17b398bf1d821e5eb68c30c5cd020: [2] Container does not exist: container destroyed\n"

I sent a docker rm -f to a container that's up. Docker reports it ended OK, but the container is still there.

No way of making it disappear.

@mpalmer

mpalmer commented Feb 29, 2016

If you've got a container that just won't die after you docker rm -f it, you can force-nuke it with

lscgroup | grep <shortID> | xargs cgdelete

Where <shortID> is the blob of hex that identifies the container in docker ps.

It's not a fix, but it's an effective workaround.
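For convenience, a tiny (hypothetical) wrapper around that pipeline; it needs the cgroup tools installed (cgroup-bin / libcgroup-tools, depending on the distro):

nuke_cgroups() {
    short_id="$1"                             # e.g. 69ee03602684, as shown by `docker ps`
    groups=$(lscgroup | grep "$short_id")
    if [ -n "$groups" ]; then
        echo "$groups" | xargs cgdelete       # delete every cgroup that mentions the container
    else
        echo "no cgroups found for $short_id" >&2
    fi
}

# usage: nuke_cgroups <shortID>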

@Jarema

Jarema commented Mar 17, 2016

Happened to me too on OS X with the latest docker-machine.

The only working workaround was the one provided by @dkinzer.

@atemerev

Just happened to me. In production! Docker-machine deployed on AWS, OS X as dev environment.

I'm really starting to doubt my decision to embrace Docker as a solution for structuring our service.

@atemerev

(Repro: one of the containers was hanging at 100% CPU; I attempted to shut it down, that failed, I tried to restart the Docker container, and it got stuck.)

Now I'm having to restart docker-machine and regenerate certs, as the IP has changed. This takes time.

@thaJeztah
Member

@atemerev if you're using devicemapper, you may want to check whether the thin pool ran out of space, because that may lead to issues. Docker 1.11 will include an option to reserve space to prevent this from happening; #20786
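If you want to verify that on a devicemapper host, a quick (sketchy) check is to look at the pool usage that docker info reports:

docker info 2>/dev/null | grep -E '(Data|Metadata) Space (Used|Total|Available)'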

@trupty

trupty commented Apr 11, 2016

I recently faced this issue on docker 1.10.x

My error being:
sudo docker stop <container_name>
Failed to stop container (<container_name>): Error response from daemon: Cannot stop container <container_name>: [2] Container does not exist: container destroyed

Resolution:

  • restarted the docker service [Yes, not great, but works fine for my case]
  • redeployed the container

@cpuguy83
Member

This is difficult to reproduce since it relies on some weird area where libcontainer's state differs from what we expect.

We also no longer use libcontainer directly (in Docker 1.11) but instead execute containers via runc. If this is still an issue, it will definitely surface as a different error message, since the container destroyed error no longer appears in the runc codebase.

Issue is probably actually fixed in this commit: https://github.com/opencontainers/runc/commit/556f798a19ecf23b20c518db5fa5fcec4c5034b6#diff-7b8effb45402944e445a664e4d9c296dL1025

I think we can close this, but I'll leave it open for others to comment.

@ctrlhxj

ctrlhxj commented Apr 13, 2016

Is there any workaround without restarting the Docker daemon? This is quite annoying in production. We are using Docker 1.9.1. We also see that some cgroup settings are left over.

@SirUrban

We had the same problem here today. The comment from @mpalmer helped a lot, but not completely. We found out that you can do the following:

lscgroup | grep  <ID> | xargs cgdelete
docker restart <ID>
docker kill <ID>
docker rm <ID>

@rwrnet

rwrnet commented Apr 20, 2016

Did not work for us. The

lscgroup | grep <ID> | xargs cgdelete

command did delete the entries from lscgroup, but the container is still in docker ps, and all 3 steps from @SirUrban just failed as usual with Container does not exist: container destroyed.

Docker engine restart fixes the problem for us, but this is not an acceptable workaround.

Just as an addition to the cases before: we triggered it a couple of times by running "docker-compose up ..." and then killing it with a double CTRL-C.

for the record:

$ docker kill 69ee03602684
Failed to kill container (69ee03602684): Error response from daemon: Cannot kill container 69ee03602684: [2] Container does not exist: container destroyed

$ lscgroup | grep 69ee03602684
cpuset:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
cpu:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
cpuacct:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
memory:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
devices:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
freezer:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
blkio:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
perf_event:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
hugetlb:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564

# lscgroup | grep 69ee03602684 | xargs cgdelete

# lscgroup | grep 69ee03602684

# docker ps
CONTAINER ID        IMAGE                 COMMAND                   CREATED             STATUS              PORTS                         NAMES
69ee03602684        sessiondb_sink        "start-flume"             2 days ago          Up 2 days                                         sessiondb_sink_1

# docker restart 69ee03602684
Failed to kill container (69ee03602684): Error response from daemon: Cannot restart container 69ee03602684: [2] Container does not exist: container destroyed

@dmyerscough

I had the same issue; when I inspect the container, the PID that Docker knows about no longer exists:

root@XXXX:/var/lib/docker# docker inspect -f "{{ .State.Pid }}" 9a13e6af5414
52428
root@XXXX:/var/lib/docker# ps aux | grep -i 52428
root      39393  0.0  0.0  12728  2172 pts/4    S+   00:11   0:00 grep -i 52428

I am not sure how Docker got into this state, but shouldn't Docker check whether the PID exists before trying to stop the container?
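As a stop-gap, something like this sketch can at least surface such "ghost" containers, i.e. containers docker ps still lists whose recorded PID is gone:

for cid in $(docker ps -q); do
    pid=$(docker inspect -f '{{ .State.Pid }}' "$cid")
    if [ "$pid" -gt 0 ] && [ ! -d "/proc/$pid" ]; then
        echo "ghost container: $cid (recorded PID $pid no longer exists)"
    fi
done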

@gaumire

gaumire commented May 8, 2016

I am having the same issue @rwrnet mentioned: after applying the solution suggested by @mpalmer, the entries disappear from lscgroup but the containers are still listed by docker ps. Has anyone got around this?

@sandyskies
Contributor

sandyskies commented May 16, 2016

Containers: 5
Images: 292
Storage Driver: aufs
Root Dir: /data/docker/aufs
Backing Filesystem: extfs
Dirs: 330
Dirperm1 Supported: true
Execution Driver: native-0.2
Kernel Version: 3.10.83
Operating System:
CPUs: 8
Total Memory: 15.38 GiB
ID: BJGD:S5Y5:ONTS:N6LE:4QTC:GSZI:SMK5:JOD3:KJMX:LYQT:NEWG:VWGT

docker version
Client version: 1.6.2
Client API version: 1.18
Go version (client): go1.4.2
Git commit (client): 7c8fca2/1.6.2
OS/Arch (client): linux/amd64
Server version: 1.6.2
Server API version: 1.18
Go version (server): go1.4.2
Git commit (server): 7c8fca2/1.6.2
OS/Arch (server): linux/amd64

same issue

@cpuguy83
Member

Closing as this is no longer an issue for 1.11.

krallin added a commit to krallin/captain-comeback that referenced this issue Sep 8, 2016
When a container exits right around when Docker was attempting to
restart it, it might get into an awkward state that makes it impossible
to restart. Resolving the issue involves restarting the Docker daemon,
which isn't particularly desirable for us.

So, in order to be able to send SIGTERM to all processes in a cgroup,
without making Docker crazy, we'll just wait for the cgroup to exit
ourselves. If it doesn't, then we'll ask Docker to kill it with
prejudice immediately (i.e. `docker kill -t 0`).

This appears to be resolved in 1.11 (and to be most common on 1.9), but
not all our infrastructure is there yet.

See: moby/moby#12738
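For reference, a rough illustration of the approach described in that commit message (this is not the actual captain-comeback code; the cgroup path and the 10-second wait are assumptions):

CID=<full-container-id>                                      # placeholder
TASKS="/sys/fs/cgroup/memory/docker/$CID/tasks"              # assumes the default cgroup layout

for pid in $(cat "$TASKS"); do kill -TERM "$pid"; done       # send SIGTERM to everything in the cgroup ourselves

for _ in $(seq 1 10); do                                     # give the cgroup up to ~10s to drain
    [ -s "$TASKS" ] || break
    sleep 1
done

[ -s "$TASKS" ] && docker kill "$CID"                        # still alive: ask Docker to kill it immediately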
@suchakra012

Hi all,

Facing an issue while stopping a running container that uses an NFS client to communicate with storage mounted on a separate host through NFS.

When I run docker stop <container_id>, it hangs for quite some time without showing any change in the process state. Any fix for this??

@thaJeztah
Member

@suchakra012 please don't comment on closed issues with unrelated questions; if you suspect there's a bug, open a new issue; for questions about running docker, either https://forums.docker.com, the #docker IRC channel, or StackOverflow are probably better

@suchakra012

@thaJeztah Ok fine.

@truedat101

truedat101 commented Apr 5, 2019

I'm going to comment on a closed issue. Forgive me Father. It is most definitely my sin that I use docker 18.06.1-ce and I still experience this problem.

Since there was not a single solution posted for people who experience this problem in the wild (a lot of "try this and that"), here is a rock-solid approach.

  • None of the usual commands will work. docker stop, docker kill, and docker rm just exit without error, yet the process lives on in a zombie state.
  • Use this hint: https://stackoverflow.com/a/48922355/796514 . The command systemd-cgls nicely displays all your cgroups and processes in a tree view.
  • Pick out the docker container that is associated with the port number your service is using. Note the PID.
  • Look for the PID of the docker-containerd-shim whose container ID hash matches your docker container.
  • Kill them both.
  • Check docker ps (the process has finally exited).

I apologize for the snark. It is sin that causes me to do things that are bad, but this was particularly unfortunate to encounter in production, with the normal things not working and no way to really detect it without attaching a logger to syslog to catch the crashed docker instance.
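For completeness, roughly the same steps as a sketch (the container ID is a placeholder; the shim process name matches Docker 18.06, where it is docker-containerd-shim):

CID=<container-id>                                           # the container that will not die
PID=$(docker inspect -f '{{ .State.Pid }}' "$CID")           # PID Docker has recorded (may be a zombie)

# systemd-cgls groups the same processes by cgroup; the matching shim is the
# docker-containerd-shim process whose arguments contain the container ID.
SHIM_PID=$(pgrep -f "docker-containerd-shim.*$CID" | head -n1)

sudo kill -9 $PID $SHIM_PID                                  # kill both the container process and its shim
docker ps                                                    # the container should finally be gone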

@saeed0808

docker rm -f 3ed2
Error response from daemon: Could not kill running container 3ed2f28a240cab77c902656789649330b93a2db62fb65b552cdbf3c2a44e77bd, cannot remove - Cannot kill container 3ed2f28a240cab77c902656789649330b93a2db62fb65b552cdbf3c2a44e77bd: unknown error after kill: docker-runc did not terminate sucessfully: container_linux.go:393: signaling init process caused "permission denied"
: unknown

What is the problem here?
