Cannot stop container XXXXXXXXXXXX: [2] Container does not exist: container destroyed #12738

Closed
ernetas opened this Issue Apr 24, 2015 · 119 comments

@ernetas

ernetas commented Apr 24, 2015

Docker fails to stop a container:

root@haswell:/home/supervisor# docker ps | grep e16
e16c5c80ab4e        ernetas/local:latest   "/sbin/my_init -- /b   10 days ago         Up 3 minutes                            romantic_kirch      
root@haswell:/home/supervisor# docker stop e16c5c80ab4e
Error response from daemon: Cannot stop container e16c5c80ab4e: [2] Container does not exist: container destroyed
FATA[0000] Error: failed to stop one or more containers 
root@haswell:/home/supervisor# 

Docker version and info:

supervisor@haswell:~$ sudo docker -D info
Containers: 8
Images: 40
Storage Driver: aufs
 Root Dir: /storage2/docker/aufs
 Backing Filesystem: extfs
 Dirs: 56
 Dirperm1 Supported: false
Execution Driver: native-0.2
Kernel Version: 3.13.0-45-generic
Operating System: Ubuntu 14.04.2 LTS
CPUs: 4
Total Memory: 31.35 GiB
Name: haswell
ID: T2LH:VCTV:CZGW:WBD6:7FQ6:TNLY:BEB4:ATGH:CBZZ:FEPU:JTRY:XNVQ
Debug mode (server): false
Debug mode (client): true
Fds: 35
Goroutines: 73
System Time: Fri Apr 24 11:45:27 EEST 2015
EventsListeners: 0
Init SHA1: 9145575052383dbf64cede3bac278606472e027c
Init Path: /usr/bin/docker
Docker Root Dir: /storage2/docker
supervisor@haswell:~$ sudo docker -D version
Client version: 1.6.0
Client API version: 1.18
Go version (client): go1.4.2
Git commit (client): 4749651
OS/Arch (client): linux/amd64
Server version: 1.6.0
Server API version: 1.18
Go version (server): go1.4.2
Git commit (server): 4749651
OS/Arch (server): linux/amd64
supervisor@haswell:~$ 
@alexeyfrank

alexeyfrank commented Apr 24, 2015

I got this error when using the Docker remote API v1.17 with Docker 1.6:


alexander.v@ip-10-0-1-224:~$  docker ps -a | grep hexletjava_m3e1-6810
a73191529f41        hexletboy/hexletjava_m3e1:568                  "/usr/bin/supervisor   38 hours ago        Up 16 hours                      172.17.42.1:32926->8000/tcp, 172.17.42.1:32927->8080/tcp                    hexletjava_m3e1-6810
alexander.v@ip-10-0-1-224:~$ docker stop a73191529f41
Error response from daemon: Cannot stop container a73191529f41: [2] Container does not exist: container destroyed
FATA[0000] Error: failed to stop one or more containers

docker logs:


INFO[80220] POST /v1.18/containers/a73191529f41/stop?t=10
INFO[80220] +job stop(a73191529f41)
INFO[80220] Failed to send SIGTERM to the process, force killing
Cannot stop container a73191529f41: [2] Container does not exist: container destroyed
INFO[80220] -job stop(a73191529f41) = ERR (1)
ERRO[80220] Handler for POST /containers/{name:.*}/stop returned error: Cannot stop container a73191529f41: [2] Container does not exist: container destroyed
ERRO[80220] HTTP Error: statusCode=500 Cannot stop container a73191529f41: [2] Container does not exist: container destroyed

@PlugIN73

👍

@zzet

zzet commented Apr 24, 2015

👍

@PanfilovDenis

👍

@coolljt0725

coolljt0725 commented Apr 24, 2015

Contributor

Can you provide some steps to reproduce this?

@aimxhaisse

aimxhaisse commented Apr 28, 2015

Contributor

👍

@hugochinchilla

hugochinchilla commented Apr 29, 2015

Happened to me today

@AduchiMergen

👍

@discordianfish

discordianfish commented Apr 29, 2015

Contributor

@icecrime Can you get this prioritized?
We see the same error in our environment, but with an unprivileged container, so it seems unrelated to that.

I also see that the old process is still around, so this looks like a new 'ghost container' issue.

@maspwr

maspwr commented Apr 30, 2015

We have also been hitting this issue. I'm trying to narrow down steps to reproduce.

@tiborvass

tiborvass commented Apr 30, 2015

Collaborator

If someone can reproduce this in a VM and snapshot the VM, that would be really fantastic; these are bugs that are hard to reproduce :(

@LK4D4

LK4D4 commented Apr 30, 2015

Contributor

Reproducible with:

while ID=$(docker run -d busybox true) && docker stop $ID; do :; done

@icecrime

icecrime commented May 1, 2015

Contributor

The reproduction case seems to imply a race condition, but that only happens if you issue docker stop at the same moment that the container process dies by itself. This is quite unlikely, and probably doesn't explain all the occurrences reported here.

@discordianfish Could you tell us more about the way this happened for you? In particular: is there a chance that the container process died at the same time that you issued the docker stop command, or was it a long-running process that had absolutely no reason to exit by itself?

@discordianfish

discordianfish commented May 1, 2015

Contributor

@icecrime No, sorry, I don't have more details. I found those systems in that state: some containers were running according to docker ps but couldn't be stopped, and in addition I saw older processes still running.
We upgraded Docker from 1.5.0 to 1.6.0 when @avinson first saw this issue, maybe he knows more. But I could imagine that this is how we got into the race condition: maybe stopping Docker tries to stop the container multiple times, and sometimes does so right at the moment that triggers the race condition?

Besides that, it looks like we have no 'stress tests' in the Docker test suite that would have caught this issue. As @LK4D4 shows, at least some race condition is easily found by this simple test.

@ambroisemaupate

ambroisemaupate commented May 2, 2015

This has happened to me since the 1.6.0 upgrade:

$ docker rm -f 46f4675d50c1 
Error response from daemon: Could not kill running container, cannot remove - [2] Container does not exist: container destroyed
FATA[0000] Error: failed to remove one or more containers 
Linux xxxxxxxxxxxx.com 3.13.0-51-generic #84-Ubuntu SMP Wed Apr 15 12:08:34 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
$ docker inspect 46f4675d50c1
[{
    "AppArmorProfile": "",
    "Args": [
        "-c",
        "/config/loop"
    ],
    "Config": {
        "AttachStderr": true,
        "AttachStdin": false,
        "AttachStdout": true,
        "Cmd": [
            "/bin/sh",
            "-c",
            "/config/loop"
        ],
        "CpuShares": 0,
        "Cpuset": "",
        "Domainname": "",
        "Entrypoint": null,
        "Env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "DEBIAN_FRONTEND=noninteractive",
            "ROADIZ_BRANCH=master"
        ],
        "ExposedPorts": {
            "80/tcp": {}
        },
        "Hostname": "46f4675d50c1",
        "Image": "roadiz/roadiz",
        "Labels": null,
        "MacAddress": "",
        "Memory": 0,
        "MemorySwap": 0,
        "NetworkDisabled": false,
        "OnBuild": null,
        "OpenStdin": false,
        "PortSpecs": null,
        "StdinOnce": false,
        "Tty": true,
        "User": "",
        "Volumes": null,
        "WorkingDir": ""
    },
    "Created": "2015-04-27T21:54:17.310271095Z",
    "Driver": "aufs",
    "ExecDriver": "native-0.2",
    "ExecIDs": [
        "55c474e453bff882f08f6074a3974d50321add4b039cdb99ac7462acc3f66b08"
    ],
    "HostConfig": {
        "Binds": null,
        "CapAdd": null,
        "CapDrop": null,
        "CgroupParent": "",
        "ContainerIDFile": "",
        "CpuShares": 0,
        "CpusetCpus": "",
        "Devices": null,
        "Dns": null,
        "DnsSearch": null,
        "ExtraHosts": null,
        "IpcMode": "",
        "Links": [
            "/mariadb:/roadiz/mariadb"
        ],
        "LogConfig": {
            "Config": null,
            "Type": "json-file"
        },
        "LxcConf": [],
        "Memory": 0,
        "MemorySwap": 0,
        "NetworkMode": "bridge",
        "PidMode": "",
        "PortBindings": {
            "80/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "49153"
                }
            ]
        },
        "Privileged": false,
        "PublishAllPorts": false,
        "ReadonlyRootfs": false,
        "RestartPolicy": {
            "MaximumRetryCount": 0,
            "Name": ""
        },
        "SecurityOpt": null,
        "Ulimits": null,
        "VolumesFrom": [
            "data-roadiz"
        ]
    },
    "HostnamePath": "/var/lib/docker/containers/46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d/hostname",
    "HostsPath": "/var/lib/docker/containers/46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d/hosts",
    "Id": "46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d",
    "Image": "bc28093328c52de86f698f654f0926ca897c2a4ff95ebc768f1337bec4a42965",
    "LogPath": "/var/lib/docker/containers/46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d/46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d-json.log",
    "MountLabel": "",
    "Name": "/roadiz",
    "NetworkSettings": {
        "Bridge": "docker0",
        "Gateway": "172.17.42.1",
        "GlobalIPv6Address": "",
        "GlobalIPv6PrefixLen": 0,
        "IPAddress": "172.17.0.4",
        "IPPrefixLen": 16,
        "IPv6Gateway": "",
        "LinkLocalIPv6Address": "fe80::42:acff:fe11:4",
        "LinkLocalIPv6PrefixLen": 64,
        "MacAddress": "02:42:ac:11:00:04",
        "PortMapping": null,
        "Ports": {
            "80/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "49153"
                }
            ]
        }
    },
    "Path": "/bin/sh",
    "ProcessLabel": "",
    "ResolvConfPath": "/etc/resolv.conf",
    "RestartCount": 0,
    "State": {
        "Dead": false,
        "Error": "",
        "ExitCode": 0,
        "FinishedAt": "2015-05-02T00:17:33.063361405Z",
        "OOMKilled": false,
        "Paused": false,
        "Pid": 16345,
        "Restarting": false,
        "Running": true,
        "StartedAt": "2015-05-02T00:23:43.684876553Z"
    },
    "Volumes": {
        "/data": "/var/lib/docker/vfs/dir/09b9b1dab4a8f6e4ad7e2d6651a794e94d154e3618acc8a43b339489911e1070"
    },
    "VolumesRW": {
        "/data": true
    }
}
]

@enguerran

enguerran commented May 5, 2015

Contributor

Does anyone have a solution to stop these kinds of ghost containers without restarting the daemon? (I don't even know whether restarting the daemon will stop these ghost containers.)

@discordianfish

discordianfish commented May 5, 2015

Contributor

@enguerran The workaround for me was to stop Docker, kill the ghost containers' processes manually, remove /var/lib/docker/containers/[container-id], and start Docker again, as sketched below.
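
A minimal sketch of that manual cleanup, assuming an Ubuntu 14.04-style host managed with service; the container ID below is just an example, and the leftover process may already be gone:

CID=46f4675d50c13ac965fcd420cc15d63a2338e64820d62d474f468ac1ecd6ac0d   # example full container ID
GHOST_PID=$(docker inspect -f '{{.State.Pid}}' "$CID")                 # record the PID before stopping the daemon

sudo service docker stop                           # stop the daemon (systemctl stop docker on systemd hosts)
sudo kill -TERM "$GHOST_PID" 2>/dev/null || true   # kill the leftover container process, if it still exists
sudo rm -rf "/var/lib/docker/containers/$CID"      # remove the stale container metadata
sudo service docker start                          # bring the daemon back up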

@jessfraz jessfraz self-assigned this May 5, 2015

@jessfraz jessfraz added the kind/bug label May 5, 2015

@jessfraz

jessfraz commented May 5, 2015

Contributor

Is everyone here using aufs? (Trying to reproduce.)

@PlugIN73

PlugIN73 commented May 5, 2015

@jfrazelle I got this error exactly on aufs! But my containers start and stop automatically, so I can't come up with steps to reproduce :(
Maybe you have a guess at something I can try?

@jessfraz

jessfraz commented May 5, 2015

Contributor

So the next time anyone encounters this, can you check:

  1. whether the PID from inspect still shows up in ps aux, to see if the process is still running
  2. how long the process has been running and how long the container has been running
  3. whether there were any errors when you started the container

All of this information would be very, very helpful. (A rough checklist of commands is sketched below.)
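
A rough checklist of commands that would gather the information above; the container name is a placeholder example, and the exact inspect fields may vary between Docker versions:

CID=my_container                                  # placeholder: substitute the stuck container's ID or name
PID=$(docker inspect -f '{{.State.Pid}}' "$CID")

ps aux | grep -w "$PID"                           # 1. is the PID still present on the host?
ps -o pid,etime,cmd -p "$PID"                     # 2. how long has the process been running?
docker inspect -f '{{.State.StartedAt}}' "$CID"   # 2. how long has the container been running?
docker logs "$CID" 2>&1 | tail -n 50              # 3. any errors around container start?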

@PlugIN73

PlugIN73 commented May 5, 2015

@jfrazelle In my case the process with the PID from the container's inspect does not exist.
P.S. in case it's helpful:
I went back to Docker 1.5 (with aufs) and sometimes I see this error:

Error response from daemon: Cannot stop container 4a1afeeafd12: no such process
FATA[0000] Error: failed to stop one or more containers

There is no process with the PID from the container's inspect, yet docker ps shows the container as running.

4a1afeeafd12        hexletboy/hexletjava_m3e1:568                "/usr/bin/supervisor   7 days ago          Up 7 days           172.17.42.1:49153->8000/tcp, 172.17.42.1:49154->8080/tcp   hexletjava_m3e1-7268

@LK4D4

LK4D4 commented May 5, 2015

Contributor

@PlugIN73 That is super important information.
I see that you're using supervisor. Can you also check whether the processes spawned by supervisor are still alive?

@PlugIN73

PlugIN73 commented May 5, 2015

@LK4D4 I saw them less than a week ago. Now I can't find those processes because I killed them. I don't understand how this is possible (and I couldn't match those processes with these dead containers; not enough information).

@LK4D4

LK4D4 commented May 5, 2015

Contributor

@PlugIN73 Yeah, we don't understand how it's possible either :/

@LK4D4

LK4D4 commented May 6, 2015

Contributor

Also, to everyone who encounters this issue: how do you run your processes inside the container? Is it with supervisord, or maybe an sh script that starts something? I'm particularly interested in whether there are problems with simple single-process containers.
Also feel free to share the scripts you use to run your containers, if they're not too secret.

@discordianfish

discordianfish commented May 6, 2015

Contributor

@LK4D4 Here it happened to our collins image, which uses the official collins image as FROM [1]. We run reefer [2] as the entrypoint, which then exec's java, so it's a single process and no bash is involved. I'm also using btrfs, so it's not limited to aufs.

  1. https://registry.hub.docker.com/u/tumblr/collins/
  2. https://github.com/docker-infra/reefer

@heyman

heyman commented May 6, 2015

It has happened twice to me using the image found here (it also uses supervisor): https://github.com/heyman/graphite_docker

@ltunc

ltunc commented May 6, 2015

About processes inside a container:
(Note: I use the clue/adminer:latest image, which runs supervisor.)
On the host machine there is no process with the PID from the "inspect" output:

> ps -p `docker inspect --format="{{.State.Pid}}" psup-adminer` -o pid,ppid,cmd
PID  PPID CMD

(docker inspect returns PID 2978)
I can run the docker top command and see that the processes are there:

> docker top psup-adminer
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
root                4828                1                   0                   Apr30               ?                   00:00:00            /bin/sh -c supervisord -c /etc/supervisor/conf.d/supervisord.conf
root                4848                4828                0                   Apr30               ?                   00:01:14            /usr/bin/python /usr/bin/supervisord -c /etc/supervisor/conf.d/supervisord.conf
...

If I remember correctly, the PPID of the first process should be the PID from docker inspect (2978), not "1".
I checked; on the host system a process with PID 4828 exists and it's supervisor:

>  ps -p 4828 -o pid,ppid,stime,command
PID  PPID STIME COMMAND
4828     1 Apr30 /bin/sh -c supervisord -c /etc/supervisor/conf.d/supervisord.conf

But if I exec into the ghost container (which is somehow still possible) and list the processes inside the container:

> docker exec -t psup-adminer ps uax
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  2.0  0.0   7128   640 ?        Rs+  09:53   0:00 ps uax

There are no other processes.

UPD: about when the container was created and started.
Container psup-adminer was created at "2015-04-30T15:16:31.948991521Z".
Full dump of State:

"State": {
        "Dead": false,
        "Error": "",
        "ExitCode": 0,
        "FinishedAt": "2015-05-05T14:45:13.269479433Z",
        "OOMKilled": false,
        "Paused": false,
        "Pid": 2978,
        "Restarting": false,
        "Running": true,
        "StartedAt": "2015-05-05T14:46:53.490814705Z"
    },

At 2015-05-05T14:45:13.269479433Z I restarted the container.

@discordianfish

discordianfish commented May 6, 2015

Contributor

FWIW, I saw the same behaviour: the 'ghost' process got reparented to the host's init, but I haven't checked whether the inspect PID matches the actual PID.

@ltunc

ltunc commented May 6, 2015

After analyzing the atop logs I found that:
process 4828 was started at about 2015-04-30T15:10Z
process 2978 was started at about 2015-05-05T14:50Z
both processes existed until about 2015-05-06T09:10Z (when I tried to stop the ghost container and received the error "Error response from daemon: Cannot stop container psup-adminer: [2] Container does not exist: container destroyed")
after that, only process 4828 remained running.

I also tried to kill that process (4828); now there are no processes in this "ghost" container, even in docker top, but the container cannot be removed and remains active.

@heyman heyman referenced this issue in SamSaffron/graphite_docker May 6, 2015

Closed

Missing counter data #4

@coffenbacher

coffenbacher commented May 8, 2015

I've been running into this when running docker-machine upgrade (from 1.5 to 1.6). Out of three machines upgraded, I have seen this on containers on each. My containers are mostly one-off processes. I can see them running under docker top and the host's top, but I cannot control them through the Docker daemon due to the "container destroyed" issue.

Is there anything additional we can do to help here?

@tiborvass

tiborvass commented May 8, 2015

Collaborator

@coffenbacher yes. If you could provide a VM that has a docker daemon in that state, it would be extremely helpful!

@tiborvass

tiborvass commented May 8, 2015

Collaborator

@coffenbacher If not, then at least this: #12738 (comment)

@bobrik

bobrik commented May 9, 2015

Contributor

{
    "State": {
        "Dead": false,
        "Error": "",
        "ExitCode": 0,
        "FinishedAt": "0001-01-01T00:00:00Z",
        "OOMKilled": false,
        "Paused": false,
        "Pid": 18509,
        "Restarting": false,
        "Running": true,
        "StartedAt": "2015-05-08T17:31:12.703335608Z"
    }
}

devicemapper, docker 1.6, process 18509 is missing.

web581 ~ # ps aux | fgrep 18509
root     94101  0.0  0.0   7904   688 pts/0    S+   13:23   0:00 fgrep 18509
web581 ~ # docker rm -f 10349b27a5d2
Error response from daemon: Could not kill running container, cannot remove - [2] Container does not exist: container destroyed
FATA[0000] Error: failed to remove one or more containers

So far this is the only host out of 200 with this symptom.

@maspwr

maspwr commented May 9, 2015

I had the issue again last night.

if the PID from inspect is still in ps -aux to see if the process is still running

It was not still running. Also, this container was running supervisord, which is the only container type I've seen with this problem.

@bobrik

bobrik commented May 9, 2015

Contributor

Ran into this issue again, this time with a mesos slave container. The previous one was collectd, which forked to another process and was reading from stdout. Both containers were running in the host PID namespace.

@thaJeztah thaJeztah modified the milestones: 1.10.0, 1.7.0 Jan 7, 2016

@fermayo

fermayo commented Jan 7, 2016

Contributor

This usually happens when upgrading from an older version of the engine. I don't have specific steps to reproduce it, though.

@thaJeztah

thaJeztah commented Jan 7, 2016

Member

@fermayo is that on a clean shutdown? Looking at earlier comments, it looks to be related to the daemon not being shut down cleanly, possibly causing processes to not have been reaped; docker#12738 (comment)

@dqminh

dqminh commented Jan 7, 2016

Contributor

@thaJeztah In our case, the containers were managed by mesos executors, and it was not related to an upgrade/downgrade of the daemon. I'm not sure how it ended up in that state; this is the first time I've seen it in the cluster.

@tiborvass tiborvass removed this from the 1.10.0 milestone Jan 20, 2016

@Asp3ctus

Asp3ctus commented Jan 26, 2016

Hi, I just ran into this on Docker version 1.9.1, build a34a1d5.

I had started a new docker container with --restart=always, but the container kept crashing from the first start. I had to run docker exec {name} /bin/bash -c "git pull" to pull a new config so it would stop crashing.

When I ran this command, sometimes it would just exit; I guess that's the moment Docker restarts the container. I kept repeating it, and the pull succeeded on the 4th or 5th try.

I had 4 workers; 3 of them pulled and started fine, but the 4th pulled the new config and got stuck in this state.

docker ps shows it as up, but I can't restart, stop, or rm it.

I haven't restarted the host or the Docker daemon, so I still have it in that state.

I hope this helps to catch it.

@dkinzer

dkinzer commented Feb 16, 2016

The only thing I could do to fix this was:

docker-machine stop default
docker-machine start default
docker rm -f 2a6d4c9b688a

@jphollanti

jphollanti commented Feb 17, 2016

I had to rm -rf /var/lib/docker/ and update Docker :0

@riyadparvez

+1

@kevinconaway kevinconaway referenced this issue in fabric8io/docker-maven-plugin Feb 24, 2016

Open

Build Fails When Docker Container Cannot be Stopped #391

@jorgemarey

jorgemarey commented Feb 26, 2016

I think the same happened to me on docker 1.10.1:

Client:
 Version:      1.10.1
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   9e83765
 Built:        Thu Feb 11 19:09:42 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.1
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   9e83765
 Built:        Thu Feb 11 19:09:42 2016
 OS/Arch:      linux/amd64

docker logs:

Feb 26 08:46:01 host docker[22141]: time="2016-02-26T08:46:01.451368355Z" level=info msg="Failed to send SIGTERM to the process, force killing"
Feb 26 08:46:01 host docker[22141]: time="2016-02-26T08:46:01.451666523Z" level=error msg="Handler for POST /containers/6ce45e61d9dac86a2a52b54145abbb15cad17b398bf1d821e5eb68c30c5cd020/stop returned error: Cannot stop container 6ce45e61d9dac86a2a52b54145abbb15cad17b398bf1d821e5eb68c30c5cd020: [2] Container does not exist: container destroyed\n"

I sent a docker rm -f to a container that's up. Docker reports that it finished OK, but the container is still there.

No way of making it disappear.

@mpalmer

mpalmer commented Feb 29, 2016

If you've got a container that just won't die after you docker rm -f it, you can force-nuke it with:

lscgroup | grep <shortID> | xargs cgdelete

where <shortID> is the blob of hex that identifies the container in docker ps.

It's not a fix, but it's an effective workaround.

@kevinconaway kevinconaway referenced this issue in jmxtrans/jmxtrans Mar 2, 2016

Merged

Verify RPM/Deb Behavior with Docker Tests #423

5 of 7 tasks complete
@Jarema

Jarema commented Mar 17, 2016

Happened to me too, on OS X with the latest docker-machine.

The only working workaround was the one provided by @dkinzer.

@atemerev

atemerev commented Mar 24, 2016

Just happened to me. In production! Docker-machine deployed on AWS, OS X as the dev environment.

I'm really starting to doubt my decision to embrace Docker as a solution for structuring our service.

@atemerev

atemerev commented Mar 24, 2016

(Repro: one of the containers was hanging at 100% CPU, I attempted to shut it down, that failed, I tried to restart the container, and it got stuck.)

Now I have to restart docker-machine and regenerate certs, as the IP has changed. This takes time.

@thaJeztah

thaJeztah commented Mar 24, 2016

Member

@atemerev if you're using devicemapper, you may want to check whether the thin pool ran out of space, because that may lead to issues (see below for a way to check). Docker 1.11 will include an option to reserve space to prevent this from happening; #20786
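
A rough way to check thin pool usage on a devicemapper host; the lvs invocation assumes an LVM-backed thin pool, which may not match your setup:

docker info 2>/dev/null | grep -E 'Storage Driver|Data Space|Metadata Space'   # daemon-reported pool usage
sudo lvs -o lv_name,data_percent,metadata_percent                              # LVM view of the thin pool, if applicable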

@trupty

trupty commented Apr 11, 2016

I recently faced this issue on Docker 1.10.x.

My error being:

sudo docker stop <container_name>
Failed to stop container (<container_name>): Error response from daemon: Cannot stop container <container_name>: [2] Container does not exist: container destroyed

Resolution (sketched below):

  • restarted the docker service [yes, not great, but works fine for my case]
  • redeployed the container
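
A minimal sketch of those two steps; the service command and the container/image names are placeholders that depend on your init system and deployment:

NAME=my_service                             # placeholder container name
IMAGE=myorg/my_service:latest               # placeholder image

sudo service docker restart                 # or: sudo systemctl restart docker on systemd hosts
docker rm -f "$NAME" 2>/dev/null || true    # clear the stale container record once the daemon is back
docker run -d --name "$NAME" "$IMAGE"       # redeploy the container
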
@cpuguy83

cpuguy83 commented Apr 11, 2016

Contributor

This is difficult to reproduce since it relies on some weird area where libcontainer's state differs from what we expect.

We also no longer use libcontainer directly (in Docker 1.11) but instead execute containers via runc. If this is still an issue, it will definitely produce a different error message, since the "container destroyed" error no longer appears in the runc codebase.

The issue is probably actually fixed in this commit: https://github.com/opencontainers/runc/commit/556f798a19ecf23b20c518db5fa5fcec4c5034b6#diff-7b8effb45402944e445a664e4d9c296dL1025

I think we can close this, but will leave it open for others to comment.

@ctrlhxj

ctrlhxj commented Apr 13, 2016

Is there any workaround without restarting the Docker daemon? This is quite annoying in production. We are using Docker 1.9.1. We also see some cgroup settings left over.

@SirUrban

SirUrban commented Apr 13, 2016

We had the same problem here today. The comment by @mpalmer helped a lot, but not completely. We found out that you can do the following:

lscgroup | grep <ID> | xargs cgdelete
docker restart <ID>
docker kill <ID>
docker rm <ID>

@rwrnet

rwrnet commented Apr 20, 2016

That did not work for us. The

lscgroup | grep <ID> | xargs cgdelete

command did delete the entries from lscgroup, but the container is still in docker ps, and all three steps from @SirUrban just failed with the usual "Container does not exist: container destroyed".

Restarting the Docker engine fixes the problem for us, but this is not an acceptable workaround.

Just as an addition to the cases before: we triggered it a couple of times by running docker-compose up ... and then killing it with a double CTRL-C.

for the record:

$ docker kill 69ee03602684
Failed to kill container (69ee03602684): Error response from daemon: Cannot kill container 69ee03602684: [2] Container does not exist: container destroyed

$ lscgroup | grep 69ee03602684
cpuset:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
cpu:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
cpuacct:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
memory:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
devices:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
freezer:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
blkio:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
perf_event:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564
hugetlb:/docker/69ee036026840507c51897cb29d95367cdd3701db68d2338aa1f1d8a2ee67564

# lscgroup | grep 69ee03602684 | xargs cgdelete

# lscgroup | grep 69ee03602684

# docker ps
CONTAINER ID        IMAGE                 COMMAND                   CREATED             STATUS              PORTS                         NAMES
69ee03602684        sessiondb_sink        "start-flume"             2 days ago          Up 2 days                                         sessiondb_sink_1

# docker restart 69ee03602684
Failed to kill container (69ee03602684): Error response from daemon: Cannot restart container 69ee03602684: [2] Container does not exist: container destroyed

@dmyerscough

dmyerscough commented Apr 28, 2016

I had the same issue; when I inspect the container, the PID that Docker knows about no longer exists:

root@XXXX:/var/lib/docker# docker inspect -f "{{ .State.Pid }}" 9a13e6af5414
52428
root@XXXX:/var/lib/docker# ps aux | grep -i 52428
root      39393  0.0  0.0  12728  2172 pts/4    S+   00:11   0:00 grep -i 52428

I am not sure how Docker got into this state, but shouldn't Docker check whether the PID exists before trying to stop the container?

@gaumire

gaumire commented May 8, 2016

I am having the same issue @rwrnet mentioned: after applying the workaround suggested by @mpalmer, the entries disappear from lscgroup but the containers are still listed by docker ps. Has anyone gotten around this?

@yujuhong yujuhong referenced this issue in kubernetes/kubernetes May 11, 2016

Open

docker kill hangs, pod stuck in terminating #25456

@sandyskies

sandyskies commented May 16, 2016

Contributor

Containers: 5
Images: 292
Storage Driver: aufs
Root Dir: /data/docker/aufs
Backing Filesystem: extfs
Dirs: 330
Dirperm1 Supported: true
Execution Driver: native-0.2
Kernel Version: 3.10.83
Operating System:
CPUs: 8
Total Memory: 15.38 GiB
ID: BJGD:S5Y5:ONTS:N6LE:4QTC:GSZI:SMK5:JOD3:KJMX:LYQT:NEWG:VWGT

docker version
Client version: 1.6.2
Client API version: 1.18
Go version (client): go1.4.2
Git commit (client): 7c8fca2/1.6.2
OS/Arch (client): linux/amd64
Server version: 1.6.2
Server API version: 1.18
Go version (server): go1.4.2
Git commit (server): 7c8fca2/1.6.2
OS/Arch (server): linux/amd64

Same issue.

@cpuguy83

cpuguy83 commented May 16, 2016

Contributor

Closing, as this is no longer an issue in 1.11.

@cpuguy83 cpuguy83 closed this May 16, 2016

krallin added a commit to krallin/captain-comeback that referenced this issue Sep 8, 2016

Workaround Docker's "container destroyed"
When a container exits right around when Docker was attempting to
restart it, it might get into an awkward state that makes it impossible
to restart. Resolving the issue involves restarting the Docker daemon,
which isn't particularly desirable for us.

So, in order to be able to send SIGTERM to all processes in a cgroup,
without making Docker crazy, we'll just wait for the cgroup to exit
ourselves. If it doesn't, then we'll ask Docker to kill it with
prejudice immediately (i.e. `docker kill -t 0`).

This appears to be resolved in 1.11 (and to be most common on 1.9), but
not all our infrastructure is there yet.

See: moby/moby#12738

@suchakra012

suchakra012 commented Oct 6, 2016

Hi all,

I'm facing an issue while stopping a running container that uses an NFS client to communicate with storage mounted on a separate host through NFS.

When I run docker stop <container_id>, it hangs for quite some time without showing any change in state. Any fix for this?

@thaJeztah

thaJeztah commented Oct 6, 2016

Member

@suchakra012 please don't comment on closed issues with unrelated questions; if you suspect there's a bug, open a new issue; for questions about running docker, either https://forums.docker.com, the #docker IRC channel, or StackOverflow are probably better

@suchakra012

@thaJeztah Ok fine.
