Hang when exiting a "docker exec" shell #24974

Closed
paralin opened this issue Jul 24, 2016 · 48 comments
Labels
area/runtime, kind/bug, status/more-info-needed, version/1.12

Comments

@paralin

paralin commented Jul 24, 2016

Docker info:

Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 3
Server Version: v1.12.0-rc4
Storage Driver: aufs
 Root Dir: /mnt/persist/skiff/docker/aufs
 Backing Filesystem: extfs
 Dirs: 18
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: host null bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.14.65
Operating System: Buildroot 2016.08-git
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 1.928 GiB
Name: c2
ID: 6KOP:P45D:OCVY:QYVJ:PD6C:ZMOR:MSV5:2NM3:XP27:2IYI:2XI3:ETB6
Docker Root Dir: /mnt/persist/skiff/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8

Version:

Client:
 Version:      v1.12.0-rc4
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   unknown
 Built:        Sat Jul 23 16:29:55 PDT 2016
 OS/Arch:      linux/arm64

Server:
 Version:      v1.12.0-rc4
 API version:  1.24
 Go version:   go1.6.2
 Git commit:   unknown
 Built:        Sat Jul 23 16:29:55 PDT 2016
 OS/Arch:      linux/arm64

Steps to reproduce:

docker exec -it mycontainer sh
$ exit
.... hang

Any ideas?

@paralin
Author

paralin commented Jul 24, 2016

Potentially relevant log line:

Jul 24 00:15:22 c2 systemd[1]: Started Docker Application Container Engine.
Jul 24 00:20:57 c2 dockerd[3866]: time="2016-07-24T00:20:57.580405000Z" level=error msg="Handler for POST /v1.24/exec/c856a7a907b0dd09992ba99a10915f07d0e6993c2f7b95207994f922433edf4d/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"

Docker-containerd version:

# docker-containerd -v
containerd version 0.2.0 commit: 0ac3cd1be170d180b2baed755e8f0da547ceb267

Runc version:

# runc -v
runc version commit: v1.0.0-rc1
spec: 1.0.0-rc1

@djarosz

djarosz commented Jul 24, 2016

I have the exact same problem; only a few items differ from the configuration above. I actually had the same problem with 1.12_rc3 as well. My configuration:

# docker info
Containers: 4
 Running: 0
 Paused: 0
 Stopped: 4
Images: 46
Server Version: 1.12.0-rc4
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.6.4-gentoo
Operating System: Gentoo/Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 11.59 GiB
Name: myname
ID: JMEA:TOQD:4SUI:XKBJ:IURL:5UOR:RJOW:XGDO:C5IW:DWB2:C3XE:K6FQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8
# docker version
Client:
 Version:      1.12.0-rc4
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   e4a0dbc
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.0-rc4
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   e4a0dbc
 Built:        
 OS/Arch:      linux/amd64

@thaJeztah
Member

process not found for container

Is the container still running, or did it exit? You can only docker exec into a running container.

@paralin
Author

paralin commented Jul 24, 2016

@thaJeztah I exec'd into a running container; the exec'd process exited and the client hangs, while the container is still running.

@paralin
Author

paralin commented Jul 24, 2016

Reverting to docker-containerd 1b3a81545ca79456086dc2aa424357be98b962ee and docker-engine v1.12.0-rc3 fixes these issues.

@thaJeztah
Member

ping @mlaventure PTAL ^^

thaJeztah added the area/runtime and kind/bug labels Jul 24, 2016
thaJeztah added this to the 1.12.0 milestone Jul 24, 2016
paralin added a commit to skiffos/buildroot that referenced this issue Jul 24, 2016
Temporary revert until these issues are fixed:

 - moby/moby#24974
 - moby/moby#24985

Signed-off-by: Christian Stewart <christian@paral.in>
@paralin
Author

paralin commented Jul 24, 2016

I have a feeling it was the containerd revert that did it, since @djarosz says he has the same problem under rc3. @djarosz, can you please confirm what output docker-containerd -v gives you?

@mlaventure
Contributor

@paralin I can't reproduce this issue.

How did you start the initial container?

Also, your docker version doesn't provide a git commit in its output.

Here are the versions I used to try to reproduce the issue:

$ docker version
Client:
 Version:      1.12.0-dev
 API version:  1.25
 Go version:   go1.6.3
 Git commit:   9c1be54
 Built:        Sun Jul 24 20:21:51 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.0-dev
 API version:  1.25
 Go version:   go1.6.3
 Git commit:   9c1be54
 Built:        Sun Jul 24 20:21:51 2016
 OS/Arch:      linux/amd64

$ docker-containerd --version
containerd version 0.2.0 commit: 0ac3cd1be170d180b2baed755e8f0da547ceb267

$ docker-runc --version
runc version 1.0.0-rc1
commit: cc29e3dded8e27ba8f65738f40d251c885030a28
spec: 1.0.0-rc1

$ docker run -dit --name top alpine top
6ba0bd15ceef0e5a2d2fba61b7c7272bd367bfc8d196ee7fba4fec18a475fe32
$ docker exec -ti top sh
/ # exit
$ 

@paralin
Author

paralin commented Jul 24, 2016

@mlaventure Versions that work:

# docker-containerd -v
containerd version 0.2.0 commit: 1b3a81545ca79456086dc2aa424357be98b962ee

docker info:

# docker info
Containers: 4
 Running: 3
 Paused: 0
 Stopped: 1
Images: 18
Server Version: v1.12.0-rc3
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null bridge host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 3.14.65
Operating System: Buildroot 2016.08-git
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 1.928 GiB
Name: c2
ID: GB6M:AU3P:R57K:E7QL:XOP5:VSCL:RHLR:P5YX:AYZ6:E3L7:P6D7:3Y4M
Docker Root Dir: /mnt/persist/skiff/docker
Debug Mode (client): false
Debug Mode (server): false
Username: paralin
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8

The git commit for docker-engine is v1.12.0-rc3, the Git tag.

@mlaventure
Contributor

@paralin it may be, but if I can't reproduce it, I can't fix it :)

Could you share which base image you used, with what options you started it, and the daemon logs of the failure?

atm, it works just fine on RC4 and Master for me.

@djarosz

djarosz commented Jul 24, 2016

# docker-containerd -v
containerd version 0.2.0

I can reproduce it with docker run -it debian bash: when I type exit inside the container, the process hangs and does not return.

The docker daemon log shows:

DEBU[0037] container mounted via layerStore: /var/lib/docker/overlay/40df433462b4d4d97ef72066b8ffcc77b48f85555208428a8182317096687166/merged 
DEBU[0037] Calling POST /v1.24/containers/3c5eb6494245e41a94f1f0e7752cf90d1649fb7982e09a0d953f419b1f4d5122/attach?stderr=1&stdin=1&stdout=1&stream=1 
DEBU[0037] attach: stdin: begin                         
DEBU[0037] attach: stdout: begin                        
DEBU[0037] attach: stderr: begin                        
DEBU[0037] Calling POST /v1.24/containers/3c5eb6494245e41a94f1f0e7752cf90d1649fb7982e09a0d953f419b1f4d5122/start 
DEBU[0037] container mounted via layerStore: /var/lib/docker/overlay/40df433462b4d4d97ef72066b8ffcc77b48f85555208428a8182317096687166/merged 
DEBU[0037] Assigning addresses for endpoint sleepy_hawking's interface on network bridge 
DEBU[0037] RequestAddress(LocalDefault/172.17.0.0/16, <nil>, map[]) 
DEBU[0037] Assigning addresses for endpoint sleepy_hawking's interface on network bridge 
INFO[0037] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4] 
INFO[0037] IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844] 
DEBU[0037] Programming external connectivity on endpoint sleepy_hawking (c16e7292a444558eeea7e6035ef36e40b379e3d010168e33aca1226baa308e48) 
DEBU[0037] sandbox set key processing took 24.204668ms for container 3c5eb6494245e41a94f1f0e7752cf90d1649fb7982e09a0d953f419b1f4d5122 
DEBU[0038] received past containerd event: &types.Event{Type:"start-container", Id:"3c5eb6494245e41a94f1f0e7752cf90d1649fb7982e09a0d953f419b1f4d5122", Status:0x0, Pid:"", Timestamp:0x0} 
DEBU[0038] Calling POST /v1.24/containers/3c5eb6494245e41a94f1f0e7752cf90d1649fb7982e09a0d953f419b1f4d5122/resize?h=70&w=116 
DEBU[0044] containerd: process exited                    id=3c5eb6494245e41a94f1f0e7752cf90d1649fb7982e09a0d953f419b1f4d5122 pid=init status=0 systemPid=18005
DEBU[0044] received past containerd event: &types.Event{Type:"exit", Id:"3c5eb6494245e41a94f1f0e7752cf90d1649fb7982e09a0d953f419b1f4d5122", Status:0x0, Pid:"init", Timestamp:0x0} 

and ps xa | grep docker gives me

17841 pts/1    S+     0:00 sudo docker daemon --debug
17842 pts/1    Sl+    0:00 dockerd --debug
17852 ?        Ssl    0:00 docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc --debug
17970 pts/4    Sl+    0:00 docker run -it debian bash
18080 pts/3    S+     0:00 grep --colour=auto docker
What I can do from this point is to kill -9 17970 manually.

@mlaventure
Contributor

@djarosz Unfortunately, I can't reproduce this either. The output of docker-containerd -v should have given you a git commit hash; can you confirm what it is?

Also, if you're using RC4, can you confirm that the three git hashes for docker, docker-containerd and docker-runc are the same as the ones I got in #24974 (comment)?

Also, just in case, could you try shutting down your daemon, renaming /var/lib/docker to keep a backup, and trying again?

From your log, the exit event is correctly received by the daemon, which should have caused the connection to be released and the client to exit properly.
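
To check whether the daemon logged an exit event for a given container, the debug log can be filtered for containerd exit events. The following is only a minimal sketch: it assumes the 1.12-era log line format quoted in this thread, and the function name exit_event_ids is made up for illustration.

```shell
# exit_event_ids: read a dockerd debug log on stdin and print the container
# ID of every containerd event of type "exit" found in it.
# Assumes lines shaped like the ones quoted in this thread, e.g.
#   ... containerd event: &types.Event{Type:"exit", Id:"<id>", ...}
exit_event_ids() {
  sed -n 's/.*Event{Type:"exit", Id:"\([0-9a-f]*\)".*/\1/p'
}
```

For example, with the daemon output saved to a file (the name daemon.log is hypothetical), `exit_event_ids < daemon.log` lists the IDs whose exit was seen by the daemon. A container that is missing from that list never had its exit reported, which would point at a different failure than the one in the log above.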

@paralin
Author

paralin commented Jul 24, 2016

@mlaventure Using:

docker-containerd: 0ac3cd1be170d180b2baed755e8f0da547ceb267
docker-engine: v1.12.0-rc4
runc: v1.0.0-rc1

With a clean system. Still have the same error.

# docker info                                                                                                                                                                        
Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 3
Server Version: v1.12.0-rc4
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge overlay null host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.14.65
Operating System: Buildroot 2016.08-git
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 1.928 GiB
Name: c2
ID: 65DB:OBWI:L4GI:TN5T:SRVQ:RLTD:TT5G:FEMI:3P3H:3ATK:WWLX:VTQI
Docker Root Dir: /mnt/persist/skiff/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8

@paralin
Author

paralin commented Jul 24, 2016

Errors from that test:

May 21 22:31:33 c2 systemd[1]: Started Docker Application Container Engine.
Jul 24 21:35:18 c2 dockerd[274]: time="2016-07-24T21:35:18.081782000Z" level=error msg="Handler for POST /v1.24/exec/f3d521bad77f560b3b3d47bc07aa06647509882544d48129a24509ac0659991b/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"
Jul 24 21:35:23 c2 dockerd[274]: time="2016-07-24T21:35:23.060185000Z" level=error msg="Handler for POST /v1.24/exec/f3d521bad77f560b3b3d47bc07aa06647509882544d48129a24509ac0659991b/resize returned error: rpc error: code = 2 desc = containerd: process not found for container"

@mlaventure
Contributor

@paralin please post the exact steps you used to reproduce the error. Or are they the same as the ones I've shown in #24974 (comment)?

Also, the full daemon logs would be useful.

@djarosz

djarosz commented Jul 24, 2016

Unfortunately, running docker after removing /var/lib/docker did not help. I cannot provide exact commit IDs for runc and docker, but it seems that the Gentoo packages download sources from the following locations:

@paralin
Author

paralin commented Jul 24, 2016

Exact steps to reproduce the error.

  1. Buy an odroid C2.
  2. git clone https://github.com/paralin/SkiffOS.git
  3. SKIFF_CONFIG=odroid/c2,skiff/standard make compile
  4. Install to SD card, boot
  5. docker run --name=test -d container4armhf/armhf-alpine:edge sleep infinity
  6. docker exec -it test sh
  7. exit -> hang

Obviously the odroid part is a tiny bit sarcastic, but that is exactly what you would have to do to reproduce this.

As you can see here, this commit, which reverts the commit IDs for docker-containerd and docker-engine, fixes the problem.

I'm glad @djarosz can also reproduce because he is operating in a bit of a more standard environment.

@paralin
Author

paralin commented Jul 24, 2016

I think a scorched-earth solution to this might be doing a git bisect on docker-containerd

@djarosz

djarosz commented Jul 25, 2016

To reproduce, I just run docker run -it --name test alpine sh; after issuing exit inside the container, the process hangs and never returns to the bash prompt (unless I kill it with -9). docker ps still reports the container as running, and docker kill test does nothing even though it returns exit status 0.

I have just run the test with docker, containerd and runc compiled from exactly the same git commits as pointed out by @mlaventure in this comment.

So I ran a test and issued these commands in the following order. Numbers in brackets represent terminal windows:

[0] # docker daemon --debug
[1] # docker run -it --name test alpine sh
[2] # docker ps # this reports 'test' container as up and running
[1] # exit # inside 'test' container - this one hangs
[2] # docker ps # this reports  'test' container as up and running
[2] # docker kill test # on another terminal this returns 0
[2] # docker ps # this reports 'test' container as up and running
[2] # kill -9 <PID> # PID is the system PID of the docker run command in terminal [1]
[0] send  Ctrl-C to docker daemon

And this is the debug output of the docker daemon during this time:

DEBU[0019] Calling POST /v1.24/containers/create?name=test 
DEBU[0019] form data: {"AttachStderr":true,"AttachStdin":true,"AttachStdout":true,"Cmd":["sh"],"Domainname":"","Entrypoint":null,"Env":[],"HostConfig":{"AutoRemove":false,"Binds":null,"BlkioDeviceReadBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceWriteIOps":null,"BlkioWeight":0,"BlkioWeightDevice":null,"CapAdd":null,"CapDrop":null,"Cgroup":"","CgroupParent":"","ConsoleSize":[0,0],"ContainerIDFile":"","CpuCount":0,"CpuPercent":0,"CpuPeriod":0,"CpuQuota":0,"CpuShares":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"DiskQuota":0,"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IOMaximumBandwidth":0,"IOMaximumIOps":0,"IpcMode":"","Isolation":"","KernelMemory":0,"Links":null,"LogConfig":{"Config":{},"Type":""},"Memory":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":-1,"NetworkMode":"default","OomKillDisable":false,"OomScoreAdj":0,"PidMode":"","PidsLimit":0,"PortBindings":{},"Privileged":false,"PublishAllPorts":false,"ReadonlyRootfs":false,"RestartPolicy":{"MaximumRetryCount":0,"Name":"no"},"SecurityOpt":null,"ShmSize":0,"UTSMode":"","Ulimits":null,"UsernsMode":"","VolumeDriver":"","VolumesFrom":null},"Hostname":"","Image":"alpine","Labels":{},"NetworkingConfig":{"EndpointsConfig":{}},"OnBuild":null,"OpenStdin":true,"StdinOnce":true,"Tty":true,"User":"","Volumes":{},"WorkingDir":""} 
DEBU[0019] container mounted via layerStore: /var/lib/docker/overlay/f4fd7bdc8925b9d166dbc2ca27edd9297f4baa9dc9a19297959bc2edd414a61e/merged 
DEBU[0019] Calling POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/attach?stderr=1&stdin=1&stdout=1&stream=1 
DEBU[0019] attach: stdin: begin                         
DEBU[0019] attach: stderr: begin                        
DEBU[0019] attach: stdout: begin                        
DEBU[0019] Calling POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/start 
DEBU[0019] container mounted via layerStore: /var/lib/docker/overlay/f4fd7bdc8925b9d166dbc2ca27edd9297f4baa9dc9a19297959bc2edd414a61e/merged 
DEBU[0019] Assigning addresses for endpoint test's interface on network bridge 
DEBU[0019] RequestAddress(LocalDefault/172.17.0.0/16, <nil>, map[]) 
DEBU[0019] Assigning addresses for endpoint test's interface on network bridge 
INFO[0019] No non-localhost DNS nameservers are left in resolv.conf. Using default external servers : [nameserver 8.8.8.8 nameserver 8.8.4.4] 
INFO[0019] IPv6 enabled; Adding default IPv6 external servers : [nameserver 2001:4860:4860::8888 nameserver 2001:4860:4860::8844] 
DEBU[0019] Programming external connectivity on endpoint test (9c90b79f34fa312858cae78b7838a3b4b70381fa30e38eafb80d513912143802) 
DEBU[0019] sandbox set key processing took 25.246116ms for container c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53 
DEBU[0019] received past containerd event: &types.Event{Type:"start-container", Id:"c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53", Status:0x0, Pid:"", Timestamp:0x0} 
DEBU[0019] Calling POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize?h=25&w=118 
DEBU[0024] Calling GET /v1.24/containers/json           
DEBU[0028] containerd: process exited                    id=c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53 pid=init status=0 systemPid=5353
DEBU[0028] received past containerd event: &types.Event{Type:"exit", Id:"c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53", Status:0x0, Pid:"init", Timestamp:0x0} 
DEBU[0033] Calling GET /v1.24/containers/json           
DEBU[0039] Calling POST /v1.24/containers/test/kill?signal=KILL 
DEBU[0039] Sending 9 to c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53 
WARN[0039] container kill failed because of 'container not found' or 'no such process': Cannot kill container c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53: rpc error: code = 2 desc = containerd: container not found 
INFO[0049] Container c507a67c9817 failed to exit within 10 seconds of kill - trying direct SIGKILL 
DEBU[0049] Cannot kill process (pid=5353) with signal 9: no such process. 
DEBU[0182] Calling POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize?h=25&w=117 
ERRO[0182] Handler for POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize returned error: rpc error: code = 2 desc = containerd: container not found 
DEBU[0182] Calling POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize?h=2&w=111 
ERRO[0182] Handler for POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize returned error: rpc error: code = 2 desc = containerd: container not found 
DEBU[0185] Calling POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize?h=25&w=118 
ERRO[0185] Handler for POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize returned error: rpc error: code = 2 desc = containerd: container not found 
DEBU[0217] Calling GET /v1.24/containers/json?size=1    
DEBU[0217] container mounted via layerStore: /var/lib/docker/overlay/f4fd7bdc8925b9d166dbc2ca27edd9297f4baa9dc9a19297959bc2edd414a61e/merged 
DEBU[0253] Calling GET /v1.24/containers/json?all=1     
DEBU[0501] Calling POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize?h=25&w=117 
ERRO[0501] Handler for POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize returned error: rpc error: code = 2 desc = containerd: container not found 
DEBU[0501] Calling POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize?h=2&w=111 
ERRO[0501] Handler for POST /v1.24/containers/c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53/resize returned error: rpc error: code = 2 desc = containerd: container not found 
DEBU[0529] Closing buffered stdin pipe                  
DEBU[0529] attach: stdin: end                           
DEBU[0529] attach: stderr: end                          
DEBU[0529] attach: stdout: end                 
INFO[0691] Processing signal 'interrupt'                
DEBU[0691] starting clean shutdown of all containers... 
DEBU[0691] stopping c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53 
DEBU[0691] Sending 15 to c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53 
WARN[0691] container kill failed because of 'container not found' or 'no such process': Cannot kill container c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53: rpc error: code = 2 desc = containerd: container not found 
INFO[0701] Container c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53 failed to exit within 10 seconds of signal 15 - using the force 
DEBU[0701] Sending 9 to c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53 
WARN[0701] container kill failed because of 'container not found' or 'no such process': Cannot kill container c507a67c98171367db2941c3a3c90902fb7c4d30b476ba72fce721e7d0d5eb53: rpc error: code = 2 desc = containerd: container not found 
ERRO[0706] Force shutdown daemon                        
DEBU[0706] containerd connection state change: SHUTDOWN 
INFO[0706] stopping containerd after receiving terminated 

@djarosz

djarosz commented Jul 25, 2016

Forgot to add that

DEBU[0529] attach: stdin: end                           
DEBU[0529] attach: stderr: end                          
DEBU[0529] attach: stdout: end   

was after kill -9.

Also, I have tried the commits which fix the issue for @paralin, but in my case they didn't work either. So maybe we are facing a different issue after all.

@cpuguy83
Member

I got this error as well while using out of sync containerd/docker builds.

Not just on exec, docker rm -f would hang because we'd tell containerd to kill the process, but containerd couldn't find the process, so docker waited the 10s timeout and then sigkilled.

@justincormack
Contributor

@paralin is this fixed for you with the commits @djarosz uses? Or do you still have an issue?

thaJeztah removed this from the 1.12.0 milestone Jul 28, 2016
@paralin
Author

paralin commented Aug 4, 2016

@djarosz Was the runc version important? Or just containerd / docker?

@mlaventure
Contributor

runc --version will show 1.0.0-rc1 for quite a few commits, as the VERSION file of the repository hasn't been changed since June 3rd. Hence all revisions afterwards still identify themselves as 1.0.0-rc1 even though they may be incompatible with each other.
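
Since the version string alone does not identify a build, comparing commit hashes is more reliable. A small sketch of pulling the hash out of the version output quoted in this thread; the helper name commit_of is hypothetical, and it assumes the "commit: <hash>" format shown in the outputs above.

```shell
# commit_of: read "-v" / "--version" output on stdin and print the git
# commit hash it reports. Prints nothing when the output carries no hash
# (for example a build that reports a tag name in place of the commit).
commit_of() {
  sed -n 's/.*commit: *\([0-9a-f]\{7,40\}\).*/\1/p'
}
```

Used as `docker-runc --version | commit_of` or `docker-containerd --version | commit_of`, an empty result is itself a hint that the binary was not built from a tracked commit, so its compatibility with the daemon cannot be checked this way.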

@djarosz

djarosz commented Aug 5, 2016

No, runc did not matter. For some of my tests I just switched between different commits of docker and containerd but stayed on the same commit of runc.

Now I'm on 1.12.0_rc5 and it works OK for me.

@thaJeztah
Member

I'll close this issue, because it looks resolved (and it's not a bug), but feel free to continue the discussion

@huangcuiyang

I have steps to reproduce the issue:

docker version:
Client:
Version: 1.12.3
API version: 1.24
Go version: go1.6.3
Git commit: bda11d5
Built: Thu May 4 14:04:43 2017
OS/Arch: linux/amd64

Server:
Version: 1.12.3
API version: 1.24
Go version: go1.6.3
Git commit: bda11d5
Built: Thu May 4 14:04:43 2017
OS/Arch: linux/amd64

Step 1: Choose an image and run: docker run -d -it image:tag /bin/bash
Step 2: Enter the created container with: docker exec -it ${container-id} bash. Then run sleep 3000 & and exit.
Step 3: Now you will see that exit hangs and 'docker ps' hangs.

Can this issue be reopened? @mlaventure

@cpuguy83
Member

cpuguy83 commented Jun 5, 2017

@huangcuiyang This is fixed on newer versions of Docker... maybe even in 1.12.6... but likely you'll need to upgrade to 17.03

@SerenaFeng

SerenaFeng commented Jun 30, 2017

Hi, I got the same problem in version 17.03.1-ce.
It looks like the container is running, but it actually isn't, and when I try to restart it, it responds with:

[serena ~]$ docker ps
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS                                     NAMES
ed0c4e7d4ea1        opnfv/testapi:latest   "bash docker/start..."   36 hours ago        Up 36 hours         0.0.0.0:8082->8000/tcp                    brave_varahamihira
[serena ~]$ docker exec -ti ed0c4e7d4ea1 bash
rpc error: code = 2 desc = containerd: container not found

I also tried restart/stop; nothing worked.

Other information I gathered:

[serena ~]$ ps -ef | grep docker
root      6025     1  0 Jun21 ?        00:00:03 docker-containerd-shim 1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 /var/run/docker/libcontainerd/1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 docker-runc
root      8681 21273  0 Jun28 ?        00:00:02 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8082 -container-ip 172.17.0.2 -container-port 8000
root      9183     1  0 Jun21 ?        00:00:01 docker-containerd-shim 1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 /var/run/docker/libcontainerd/1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 docker-runc
root     10433     1  0 Jun21 ?        00:00:01 docker-containerd-shim 1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 /var/run/docker/libcontainerd/1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 docker-runc
serena   12828 11357  0 06:36 pts/0    00:00:00 grep --color=auto docker
root     19995 21273  0 May18 ?        00:00:45 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8084 -container-ip 172.17.0.3 -container-port 8000
root     20000     1  0 May18 ?        00:01:05 docker-containerd-shim 1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 /var/run/docker/libcontainerd/1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 docker-runc
root     20648 21273  0 Jun29 ?        00:01:23 docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc
root     21273     1  0 May16 ?        01:54:08 /usr/bin/dockerd
root     27209     1  0 Jun27 ?        00:00:00 docker-containerd-shim 1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 /var/run/docker/libcontainerd/1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 docker-runc
root     27357     1  0 Jun19 ?        00:00:01 docker-containerd-shim 1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 /var/run/docker/libcontainerd/1ff0790e1b816973612a9f2b5fa0726571ebd4c87a103c3b891106853a7e8341 docker-runc
[serena ~]$ docker version
Client:
 Version:      17.03.1-ce
 API version:  1.27
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Mon Mar 27 17:05:44 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.03.1-ce
 API version:  1.27 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   c6d412e
 Built:        Mon Mar 27 17:05:44 2017
 OS/Arch:      linux/amd64
 Experimental: false

[serena ~]$ docker-containerd -v
containerd version 0.2.3 commit: 4ab9917febca54791c5f071a9d1f404867857fcc

@mlaventure
Contributor

@SerenaFeng this is a different issue altogether (something more like #32134)

Hopefully 17.06 should resolve most if not all of those issues.

@Lax77

Lax77 commented Aug 30, 2017

Still seeing the same hang issue with exiting docker exec shell.

My Docker version: 17.06.0-ce

@mlaventure
Contributor

@Lax77 that's expected if your exec processes spawned children that are holding the IO streams open.
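
The effect of a child holding the IO open can be seen without Docker at all. The sketch below is only an illustration of the general pipe behaviour, not of Docker's internals: a reader on a pipe sees EOF only once every process holding the write end has closed it. The helper name time_read is made up for this example.

```shell
# time_read CMD: run CMD via sh with its stdout captured through a pipe and
# print how many whole seconds passed before the pipe reached EOF.
# The captured output itself is discarded; only the timing matters here.
time_read() {
  start=$(date +%s)
  out=$(sh -c "$1")
  end=$(date +%s)
  echo $(( end - start ))
}

# The background sleep inherits the shell's stdout, so the read lasts about
# 3 seconds even though the shell exits right after the echo.
held=$(time_read 'sleep 3 & echo hi')

# With the child's output redirected away, the read ends almost immediately.
detached=$(time_read 'sleep 3 >/dev/null 2>&1 & echo hi')

echo "held=${held}s detached=${detached}s"
```

This matches the reproduction earlier in the thread (sleep 3000 & followed by exit): the shell is gone, but the sleep still holds the streams. Redirecting a background job's output, as in the second call, is one way to avoid keeping them open.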

@Lax77

Lax77 commented Aug 30, 2017

@mlaventure

In my case, on the problem system, all I did was:
docker ps - to find the container id
docker exec -it sh
exit - this operation hangs

I didn't trigger any special calls within the container when I exec'd into it. All I did was exec into the container and exit.

@mlaventure
Contributor

@Lax77 in that case open a new issue with the relevant data so we can debug this. If you can reproduce it, having the daemon in debug mode would be extremely helpful to try and have an idea of what's happening.

@d0c-s4vage

Was a new issue created for this? I'm experiencing this as well.

@Lax77

Lax77 commented Oct 6, 2017

No, I didn't get a chance to collect debug logs for this issue, so I didn't open a new bug for it. But the problem still keeps happening with 17.06; I haven't tried upgrading to 17.09 yet.

@d0c-s4vage

Ah, thanks for the update. I'm also seeing it with 17.06. If I have some spare cycles I'll try to get some debug logs. Probably not till next week though

@JettJones

JettJones commented Oct 10, 2017

I'm also seeing hangs on exit after running docker exec ... bash. This is in docker-for-windows running 17.09-ce. I recently updated from 17.03-ce, which did not show this issue. The behavior of my code (closing the console and continuing to run) might be new too, so I'm not yet sure the version upgrade was the only culprit.

I think child processes are probably part of the problem, as I can exec to the same container a second time, and kill some previous processes to unhang my original shell.

@AndriiNikitin

AndriiNikitin commented Nov 9, 2017

Maybe it's worth trying the -d flag for exec as a workaround.

@d0c-s4vage

d0c-s4vage commented Nov 9, 2017 via email

@haridsv

haridsv commented May 29, 2019

I am still seeing hangs using 18.06.0-ce, but docker ps works fine for me. If I run pstree on the pid corresponding to docker exec, I see only the usual docker threads on the host, but nothing on the container side (never mind.. seems like the container side children are not visible). In the daemon logs, I see the below, which matches a couple of other reports above:

dockerd[60866]: time="2019-05-21T01:31:34.126473617Z" level=warning msg="Ignoring Exit Event, no such exec command found" container=b6cbd66c37f7c11b0a1fa1822de5623d2560fcd205fff7244f7cdf2f516011ef exec-id=f76710d56104ed0efee9d99114a231b941acf19f14cca32a38a1af6055eafc8b exec-pid=22063
dockerd[60866]: time="2019-05-21T01:31:34.126518688Z" level=error msg="exit event" container=b6cbd66c37f7c11b0a1fa1822de5623d2560fcd205fff7244f7cdf2f516011ef error="no such process" module=libcontainerd namespace=moby process=f76710d56104ed0efee9d99114a231b941acf19f14cca32a38a1af6055eafc8b

This doesn't happen as soon as the container is spawned, but some time later, after a few docker exec commands have run successfully. For our automation, we worked around it by killing the docker exec process once the command has finished executing. To detect that the command is done, we generate a shell script that embeds the actual command and prints a message at the end, copy the generated script to the container, and run it. However, when we need to troubleshoot an issue, we have to start an interactive bash session, and that gets stuck on exit. Extending the approach we use for automation, I can look up the docker exec process and kill it, but this is a big pain because I need a second ssh session to that host. When the exit hangs, I can't even suspend the process (using ^Z), so a second login session is the only way to find and kill it.
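
The automation workaround described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual script: `MARKER`, `make_wrapper`, and `run_until_marker` are hypothetical names, and the marker format (sentinel plus exit status) is made up. The idea is to stop reading at the marker instead of waiting for end-of-stream, then kill the client if it is still blocked.

```shell
#!/bin/sh
# Hypothetical sentinel; any string unlikely to appear in real output works.
MARKER="__EXEC_DONE__"

# Write a wrapper script that runs the given command line, then prints the
# marker followed by the command's exit status.
make_wrapper() {
    wrapper=$1
    shift
    {
        printf '#!/bin/sh\n'
        printf '%s\n' "$*"
        printf 'echo "%s $?"\n' "$MARKER"
    } > "$wrapper"
    chmod +x "$wrapper"
}

# Run "$@" in the background, relay its output, and stop at the marker rather
# than at end-of-stream. Returns the wrapped command's exit status, or 124 if
# the marker never showed up.
run_until_marker() {
    dir=$(mktemp -d) || return 125
    mkfifo "$dir/out" || return 125
    "$@" > "$dir/out" 2>&1 &
    pid=$!
    status=124
    while IFS= read -r line; do
        case $line in
            "$MARKER "*) status=${line#"$MARKER "}; break ;;
            *) printf '%s\n' "$line" ;;
        esac
    done < "$dir/out"
    # The command is done; terminate the client in case it is still hanging.
    kill "$pid" 2>/dev/null || true
    wait "$pid" 2>/dev/null || true
    rm -rf "$dir"
    return "$status"
}
```

With Docker, the wrapper would first be copied into the container, for example `docker cp run.sh mycontainer:/tmp/run.sh`, and then run as `run_until_marker docker exec mycontainer sh /tmp/run.sh` (container name and paths are placeholders). This only helps non-interactive runs; it does nothing for an interactive session that is stuck on exit.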

@OleksiiStepanov

I have the same issue. My Docker version is:

Client: Docker Engine - Community
 Version:           19.03.5
 API version:       1.40
 Go version:        go1.12.12
 Git commit:        633a0ea838
 Built:             Wed Nov 13 07:25:38 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.5
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.12
  Git commit:       633a0ea838
  Built:            Wed Nov 13 07:24:09 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.10
  GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version:          1.0.0-rc8+dev
  GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

@OleksiiStepanov

I think I understand what is causing the hang. When a command executed by docker exec creates a child process, and that process keeps running after the command exits, the next run of the command hangs on exit.
For example:
docker exec test sh -c 'exec sleep infinity &'
The first time, the command exits as expected, but when the same command is executed again, it exits only once the child process has finished.

@eugeneievlev

I think I understand what is causing the hang. When a command executed by docker exec creates a child process, and that process keeps running after the command exits, the next run of the command hangs on exit.
For example:
docker exec test sh -c 'exec sleep infinity &'
The first time, the command exits as expected, but when the same command is executed again, it exits only once the child process has finished.

Hey @OleksiiStepanov, I had the same issue and fixed it a few minutes ago :) by replacing the host OS. It looks like the cause is an old kernel version combined with Docker 18. I was using Amazon Linux v1 with a Docker 18+ version. Migration was my last hope, since I suspected the issue was in how exit signals are passed between the host OS and the Docker engine. As soon as I migrated to Amazon Linux v2 with Docker 19, the issue was gone. I hope this helps you as well.
Thanks, Eugene

@OleksiiStepanov

@eugeneievlev I'm not sure the problem is the host OS; I have the same problem on different host OSes. I tested with:

  1. SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux
  2. Ubuntu 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
