error removing container (1.10, 1.11/master) with AUFS #21704

Open
vikstrous opened this Issue Mar 31, 2016 · 54 comments

@vikstrous
Member

I've been seeing this error in our integration tests a lot recently:

Error response from daemon: 500 Internal Server Error: Driver aufs failed to remove root filesystem 36382c720964b0560df5fb858af8197169ee4eb399906c0e65c4ca85d795941e: rename /var/lib/docker/aufs/mnt/e7d36cc07ee4aad50f61259bea24876cc925f3c417b6d5ea9c2c1b055d243c82 /var/lib/docker/aufs/mnt/e7d36cc07ee4aad50f61259bea24876cc925f3c417b6d5ea9c2c1b055d243c82-removing: device or resource busy

This happens when a container is being removed and causes our tests to fail. I've seen it only on aufs so far.

Output of docker version:

$ docker version
Client:
 Version:      1.10.3
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   20f81dd
 Built:        Thu Mar 10 15:54:52 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.3-cs2
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   f02424d
 Built:        Thu Mar 17 21:52:14 2016
 OS/Arch:      linux/amd64
$ docker version
Client:
 Version:      1.10.3
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   20f81dd
 Built:        Thu Mar 10 15:54:52 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.0-dev
 API version:  1.24
 Go version:   go1.5.3
 Git commit:   dd94c88
 Built:        Thu Mar 31 21:32:39 2016
 OS/Arch:      linux/amd64

Additional environment details (AWS, VirtualBox, physical, etc.):
This is happening on AWS with AUFS

Steps to reproduce the issue:
unknown

Describe the results you received:
500 error from the daemon

Describe the results you expected:
the container should be removed without an error

Additional information you deem important (e.g. issue happens only occasionally):
It happens less than half of the time

@thaJeztah
Member

Can you give the output of docker info as well?

@vikstrous
Member
$ docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 1.11.0-dev
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 0
 Dirperm1 Supported: false
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 3.13.0-53-generic
Operating System: Ubuntu 14.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.305 GiB
Name: jenkins-dtr-integration-2023
ID: A64M:T365:F3GT:OPMD:H3YO:AQFH:65Y6:H2YZ:PHGN:4KZI:2BC5:ISLE
Username: dockerbuildbot
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
 provider=amazonec2
@cpuguy83
Contributor
cpuguy83 commented Apr 1, 2016

How recent is your master? I think we fixed this.

@thaJeztah
Member

@cpuguy83 looks like he's on dd94c88

@thaJeztah
Member

Is this a duplicate of #21111 and #21101 ?

@thaJeztah
Member

Oh, and #17902

@thaJeztah
Member

ping @anusha-ragunathan would you be able to look into this? I linked various related / similar issues above

@vikstrous
Member

I think the daemon logs from one successful run of our integration tests and one run that caused this error will be helpful, but I'm not sure if I can share them publicly here. If you have access, check these out:

success: https://ci.qa.aws.dckr.io/job/dtr-deploy/2749/artifact/integration/results/docker.log
failure: https://ci.qa.aws.dckr.io/job/dtr-deploy/2753/artifact/integration/results/docker.log

They are not exactly from the same PR, but they are very similar.

There is a potentially relevant error earlier in the logs:


INFO[0058] Failed to send signal 15 to the process, force killing
ERRO[0058] Handler for POST /v1.15/containers/2b0a7117aff26868e3f0cfaa29c60146199bc435a388ff79025a7ef951479410/stop returned error: Cannot stop container 2b0a7117aff26868e3f0cfaa29c60146199bc435a388ff79025a7ef951479410: Cannot kill container 2b0a7117aff26868e3f0cfaa29c60146199bc435a388ff79025a7ef951479410: rpc error: code = 2 desc = "no such process"

This is the complete error at the time of the failed container delete:


ERRO[0148] Error removing mounted layer e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72: rename /var/lib/docker/aufs/diff/e46c976939ee6366109ffb8bb95b09ed0ddd5f0c08f100040ed1abc656317c82 /var/lib/docker/aufs/diff/e46c976939ee6366109ffb8bb95b09ed0ddd5f0c08f100040ed1abc656317c82-removing: device or resource busy
ERRO[0148] Handler for DELETE /v1.15/containers/e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72 returned error: Driver aufs failed to remove root filesystem e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72: rename /var/lib/docker/aufs/diff/e46c976939ee6366109ffb8bb95b09ed0ddd5f0c08f100040ed1abc656317c82 /var/lib/docker/aufs/diff/e46c976939ee6366109ffb8bb95b09ed0ddd5f0c08f100040ed1abc656317c82-removing: device or resource busy
ERRO[0148] Handler for GET /v1.15/containers/e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72/json returned error: No such container: e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72
ERRO[0148] Handler for DELETE /v1.15/containers/155af974b06e6c09e0f59594812e4e0139e2a5f63a2fe22ca6e9693dccb4491f returned error: Unable to remove filesystem for 155af974b06e6c09e0f59594812e4e0139e2a5f63a2fe22ca6e9693dccb4491f: remove /var/lib/docker/containers/155af974b06e6c09e0f59594812e4e0139e2a5f63a2fe22ca6e9693dccb4491f/shm: device or resource busy

It's interesting that the same error log appears when we restart the daemon earlier in the test.

@vikstrous
Member

If I had to guess, I'd say there are leftover processes still referencing the same layers from when the daemon tried to restart and failed to kill them properly.

@dnephin dnephin referenced this issue in docker/compose Apr 5, 2016
Open

Unable to remove container #3097

@anusha-ragunathan
Contributor

@vikstrous : I cannot access the jenkins logs. Can you create a gist of the logs? I tried a quick test of creating and removing containers in a loop of 15 (not concurrent) on AUFS and didn't observe this issue. Is there a deterministic way to repro the issue?

Can you confirm that the containers start successfully? If yes, then a couple of things to proceed on:

  • Was there another concurrent request to stop the container? That would result in a race, and the rename in the context of the second request would error out. You can check whether the corresponding diff directory exists. If it doesn't, then it's most likely a race.
  • In 1.11, we recently changed the way reference counts work in aufs (and other graph drivers). If you can run some instrumented builds, then I can send over a docker binary to debug this more.
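
A minimal sketch of the existence check from the first bullet, assuming the AUFS layout visible in the error messages in this thread (a `diff/<id>` directory that gets renamed to `diff/<id>-removing`). `layer_state` is a hypothetical helper for illustration, not part of Docker:

```shell
# layer_state AUFS_ROOT LAYER_ID
# Prints one of: present, removing, gone.
# AUFS_ROOT is normally /var/lib/docker/aufs (the "Root Dir" in docker info).
layer_state() (
    diff_dir="$1/diff/$2"
    if [ -d "$diff_dir" ]; then
        echo present      # the rename never happened, or it failed
    elif [ -d "$diff_dir-removing" ]; then
        echo removing     # the rename succeeded, cleanup has not finished
    else
        echo gone         # already removed: consistent with the race described above
    fi
)

# Example, using the layer ID from the failing rename quoted earlier:
# layer_state /var/lib/docker/aufs e46c976939ee6366109ffb8bb95b09ed0ddd5f0c08f100040ed1abc656317c82
```

A result of `gone` right after a failed `docker rm` would point towards the race; `present` would suggest something is still holding the directory busy.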
@thaJeztah
Member

@vikstrous @anusha-ragunathan please post it on slack if the jenkins log contains information that should not be shared publicly 👍

@vikstrous
Member

I haven't seen this bug since last time I posted in this thread. It's possible that it was fixed. I'll update you if I see it again.

@FelikZ
FelikZ commented Apr 7, 2016

I have a similar issue; please have a look.

test.yml:

version: "2"
services:
    browser:
        image: elgalu/selenium:2.53.0e
        ports:
            - "5920:25900"
# - "4444:24444"
# volumes:
#     - "/dev/shm:/dev/shm"
        environment:
          - "VNC_PASSWORD=test"
          - "FIREFOX=false"
          - "CHROME=true"
        networks:
            my-net:
                aliases:
                  - browser
networks:
  my-net:
    driver: bridge

Stdout:

$ docker-compose --version
docker-compose version 1.6.2, build 4d72027
$ docker --version
Docker version 1.10.3, build 20f81dd
$ docker info
Containers: 3
 Running: 1
 Paused: 0
 Stopped: 2
Images: 142
Server Version: 1.10.3
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 245
 Dirperm1 Supported: false
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 3.13.0-85-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.58 GiB
Name: tk9
ID: CYDT:VFSD:2M77:W5P7:OQUD:J6G7:EWQR:KJWR:SOUX:JCLZ:2SBG:J7GX
WARNING: No swap limit support
$ docker-compose -f test.yml up -d
Creating network "homelocal_my-net" with driver "bridge"
Creating homelocal_browser_1
$ docker-compose -f test.yml stop
Stopping homelocal_browser_1 ... 

ERROR: for homelocal_browser_1  ('Connection aborted.', BadStatusLine("''",)) 
ERROR: Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?

If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
$ docker-compose -f test.yml rm -f
Going to remove homelocal_browser_1
Removing homelocal_browser_1 ... error

ERROR: for homelocal_browser_1  Driver aufs failed to remove root filesystem 0e6e88bcc931eb13e141ac871b4ba965d01aae880a20255a5e974f15dff40b0e: rename /var/lib/docker/aufs/mnt/d4e6ee5ebd3ac40e256afa4492451e25cbea87f5041a1dce0bec7a302f41cc45 /var/lib/docker/aufs/mnt/d4e6ee5ebd3ac40e256afa4492451e25cbea87f5041a1dce0bec7a302f41cc45-removing: device or resource busy 
$ docker-compose -f test.yml rm -f
Going to remove homelocal_browser_1
Removing homelocal_browser_1 ... error
@FelikZ
FelikZ commented Apr 7, 2016

And this is probably related as well: #21845

@FelikZ
FelikZ commented Apr 7, 2016

@cpuguy83 looks like it is not related to aufs...

Trying to solve this, I switched to overlayfs and see the same picture:

Error response from daemon: Driver overlay failed to remove root filesystem 8b21bec99eccde191ca98e944003274c5b45bbf6f1e4cc08560c0e454e5d3719: readdirent: no such file or directory
$ docker info
Containers: 4
 Running: 0
 Paused: 0
 Stopped: 4
Images: 44
Server Version: 1.10.3
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: null host bridge
Kernel Version: 3.19.0-58-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.58 GiB
Name: tk9
ID: CYDT:VFSD:2M77:W5P7:OQUD:J6G7:EWQR:KJWR:SOUX:JCLZ:2SBG:J7GX
WARNING: No swap limit support
@anusha-ragunathan
Contributor

@FelikZ : Can you upgrade to docker-engine 1.11 rc4 and try again?

@thaJeztah
Member

ping @FelikZ do you still see this on 1.11.0?

@ryane ryane referenced this issue in mantl/mantl Apr 28, 2016
Merged

Add new partitioner script, which can do job on first boot #1239

3 of 4 tasks complete
@ncadou
ncadou commented May 24, 2016

Seeing this frequently on different machines, all on 1.11.1 plus aufs. Most are on 14.04 LTS (3.13 kernel).

@jbeda
Contributor
jbeda commented May 29, 2016

I just saw this when trying to start a container using the gcplogs log driver that wasn't able to launch successfully.

docker run -d --name my-container --log-driver=gcplogs --log-opt gcp-log-cmd=true [...]
38e6a733b02a825dc97208ee2436d31353480bfe31cfa4799cd48203d746fe6e
docker: Error response from daemon: Failed to initialize logging driver: unable to connect or authenticate with Google Cloud Logging: googleapi: Error 403: Google Cloud Logging API has not been used in project 784782548624 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/logging/overview?project=784782548624 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry., forbidden.
$ docker rm my-container
Error response from daemon: Driver aufs failed to remove root filesystem fd5e668e8cd14c1a7a2405b26ea3d75bdfd9f25525447b20f6b400eff02a7a23: rename /var/lib/docker/aufs/diff/ae3e5b240821fe702b68544188663ef092bf173b33e495678800c6c3ea498f92 /var/lib/docker/aufs/diff/ae3e5b240821fe702b68544188663ef092bf173b33e495678800c6c3ea498f92-removing: device or resource busy
# docker info
Containers: 9
 Running: 5
 Paused: 0
 Stopped: 4
Images: 382
Server Version: 1.11.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 609
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 3.16.0-0.bpo.4-amd64
Operating System: Debian GNU/Linux 7 (wheezy)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 6.338 GiB
Name: web
ID: IXJA:3H47:WFQC:GZPE:3WTJ:5W4P:LEKY:OCCF:CDXQ:IDV7:O23X:K7HQ
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support

I couldn't find reference to that filesystem or any of those directories when groveling around in /proc or via lsof.

@ensilon
ensilon commented Jun 24, 2016

I've seen this several times recently also.

# docker info
Containers: 5
 Running: 3
 Paused: 0
 Stopped: 2
Images: 12
Server Version: 1.11.2
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 106
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: null host bridge
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 7.849 GiB
Name: docker-internal-01
ID: Z3B7:D5KD:QYKT:YMH3:V57J:MLIK:6HWR:XG3Q:3WR6:RWJV:YOZW:6LNG
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support

@servomac
servomac commented Jun 28, 2016 edited

Same here with Ubuntu 14.04

tpiza@neptune:~$ docker info
Containers: 31
 Running: 27
 Paused: 0
 Stopped: 4
Images: 28
Server Version: 1.11.2
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 207
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: host bridge null
Kernel Version: 3.13.0-24-generic
Operating System: Ubuntu 14.04 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 31.12 GiB
Name: neptune.placeholder.lan
ID: RVOT:4V5S:Q7KF:DK7A:OKO7:EFAM:RAO4:6ZLF:4OJE:33GM:TQNY:YRYS
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
WARNING: No kernel memory limit support
@hknochi
hknochi commented Jul 13, 2016 edited

We have the same problem on all our Docker hosts.

Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 2
Server Version: 1.11.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 55
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null host bridge
Kernel Version: 3.19.0-25-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.852 GiB
Name: jira
ID: 6FQN:7YWG:OIM6:4RFY:66MC:6DFV:3FYJ:6MT4:XPD2:MCHW:HEI6:UOLM
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
@gamesbook

Likewise in our VM. Is there any known workaround or patch?

$ docker info
Containers: 11
 Running: 5
 Paused: 0
 Stopped: 6
Images: 163
Server Version: 1.11.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 182
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null host bridge
Kernel Version: 3.13.0-24-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 2.894 GiB
Name: mercury
ID: 5N6J:YJUJ:PB5C:HN5Y:7TVH:K5EE:AEB3:QWGU:F3G3:KT7B:XBBV:NJI3
Docker Root Dir: /var/lib/docker

@paralin
paralin commented Jul 24, 2016

Happening for me on overlayfs

# docker info
Containers: 4
 Running: 3
 Paused: 0
 Stopped: 1
Images: 18
Server Version: v1.12.0-rc3
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null bridge overlay host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 3.14.65
Operating System: Buildroot 2016.08-git
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 1.928 GiB
Name: c2
ID: GB6M:AU3P:R57K:E7QL:XOP5:VSCL:RHLR:P5YX:AYZ6:E3L7:P6D7:3Y4M
Docker Root Dir: /mnt/persist/skiff/docker
Debug Mode (client): false
Debug Mode (server): false
Username: paralin
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8

Error:

Jul 24 20:32:31 c2 dockerd[285]: time="2016-07-24T20:32:31.584405000Z" level=error msg="Handler for DELETE /v1.24/containers/09b77fdaa5d7 returned error: Unable to remove filesystem for 09b77fdaa5d7d35f53e3e00f45d06a16f16e4c464311380e5c4f65c5010c5a41: remove /mnt/persist/skiff/docker/containers/09b77fdaa5d7d35f53e3e00f45d06a16f16e4c464311380e5c4f65c5010c5a41/shm: device or resource busy"
@paralin
paralin commented Jul 24, 2016

Can we add a 1.12 milestone on this?

@ssbarnea
ssbarnea commented Jul 26, 2016 edited

We hit the same bug on the latest stable release, 1.11.2

Containers: 12
 Running: 11
 Paused: 0
 Stopped: 1
Images: 348
Server Version: 1.11.2
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 269
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 4.2.0-42-generic
Operating System: Ubuntu 15.10
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.67 GiB
Name: sodium
ID: 2EL5:JC7O:43TB:5ZXW:VZ4W:6UJL:FSWH:ZFST:N5L7:3ONP:5RUH:HJSJ
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Username: xxxx
Registry: https://index.docker.io/v1/
WARNING: No swap limit support

Are there any workarounds until this is fixed?

@anusha-ragunathan
Contributor

There's a known issue running docker on kernels older than 3.19.
#21969 (comment)
We fixed it on docker's side in 1.11.1 with #22256.

  • Folks with docker version prior to 1.11.1 should upgrade to this and retest.
  • Folks with docker version greater than 1.11.1:
    Is there a consistent repro of the issue? This issue has several reports with docker info
    and the error message, but that's not sufficient. If someone can provide a reproduction
    scenario that fails 100% of the time, then it would help root-cause the issue.

CC-ing folks that reported this issue on kernel >= 3.19 and docker >= 1.11.1 as a call for repro setup. @ssbarnea @hknochi
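
For anyone unsure which side of the 3.19 boundary a host is on, here is a minimal sketch that compares a `uname -r` style release string against a major.minor pair. `kernel_older_than` is a hypothetical helper, and the parsing assumes the release string starts with `MAJOR.MINOR`, as every kernel version quoted in this thread does:

```shell
# kernel_older_than RELEASE MAJOR MINOR
# Succeeds (exit status 0) when RELEASE is older than MAJOR.MINOR.
kernel_older_than() (
    release=$1
    major=${release%%.*}          # "3.13.0-53-generic" -> "3"
    rest=${release#*.}            # "3.13.0-53-generic" -> "13.0-53-generic"
    minor=${rest%%[!0-9]*}        # "13.0-53-generic"   -> "13"
    [ "$major" -lt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -lt "$3" ]; }
)

if kernel_older_than "$(uname -r)" 3 19; then
    echo "kernel $(uname -r) is older than 3.19"
fi
```

This only tells you whether the known kernel issue could apply; it says nothing about whether a given failure is actually caused by it.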

@agunnerson-ibm
agunnerson-ibm commented Jul 26, 2016 edited

@anusha-ragunathan This is still reproducible with docker 1.11.2 (built from source) and kernel 3.10 (CentOS 7.2).

It's not 100% reproducible, but simply running bash in a python:3.5.0 container will trigger the bug a large majority of the time on our servers:

[andrew.gunnerson@[REDACTED] ~]$ sudo docker run -it --rm --privileged -v "$(pwd):/mnt" python:3.5.0 bash
root@a939881042d0:/# exit
Error response from daemon: Driver devicemapper failed to remove root filesystem a939881042d0962391d7cbe1ffc5541a06dc06a101fda05e2f14003b18a24038: remove /var/lib/docker/devicemapper/mnt/81a361e7b1485cdec54f3b8a08274b1946c7d0b2407cb85b6536459339eef39b: device or resource busy

docker version:

Client:
 Version:         1.11.2
 API version:     1.23
 Package version: docker-upstream-1.11.2-12.git4ddbd3d.el7.centos.x86_64
 Go version:      go1.4.2
 Git commit:      4ddbd3d/1.11.2
 Built:
 OS/Arch:         linux/amd64

Server:
 Version:         1.11.2
 API version:     1.23
 Package version: docker-upstream-1.11.2-12.git4ddbd3d.el7.centos.x86_64
 Go version:      go1.4.2
 Git commit:      4ddbd3d/1.11.2
 Built:
 OS/Arch:         linux/amd64

docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 1.11.2
Storage Driver: devicemapper
 Pool Name: vg00-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 1.284 GB
 Data Space Total: 46.17 GB
 Data Space Available: 44.88 GB
 Metadata Space Used: 397.3 kB
 Metadata Space Total: 138.4 MB
 Metadata Space Available: 138 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.107-RHEL7 (2016-06-09)
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 3.10.0-327.22.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 4
Total Memory: 3.703 GiB
Name: cl-pd-docker-1
ID: ZAJ2:RMYX:IJLI:3L3R:HNRA:34IF:FIGU:ALIG:BTBI:SDQK:Z3G7:AFDL
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Registries: docker.io (secure)
@anusha-ragunathan
Contributor
anusha-ragunathan commented Jul 26, 2016 edited

@agunnerson-explorys: Thanks for the info. I tried your repro on RHEL 7.2 with the same kernel. I ran docker run in a loop (50 times) with no luck. The thing with this issue is that we have not had a reliable repro scenario. We have Jenkins jobs on Docker master and Docker PRs that run on a good matrix of hosts vs. storage drivers and haven't caught this even once. So we definitely need a simple, consistent repro.

@cpuguy83
Contributor

I suspect it has to do with shared mount propagation.
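
One way to gather evidence for or against this suspicion is to list which mounts on the host are marked shared. The sketch below reads a mountinfo file (format as documented in proc(5): field 5 is the mount point, and optional fields such as `shared:N` sit between field 6 and a lone `-` separator). `shared_mounts` is a hypothetical helper name:

```shell
# shared_mounts [MOUNTINFO_FILE]
# Prints "MOUNT_POINT shared:N" for every mount that belongs to a shared peer group.
# Defaults to the calling process's own mount table.
shared_mounts() {
    awk '{
        for (i = 7; i <= NF && $i != "-"; i++)
            if ($i ~ /^shared:/) print $5, $i
    }' "${1:-/proc/self/mountinfo}"
}

# Example: look at another process's view of the mount table (PID is a placeholder)
# shared_mounts /proc/1234/mountinfo
```

If container mounts under the Docker root show up as shared in other processes' mount tables, that would be consistent with propagation keeping them busy, though it does not prove it.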

@jmkgreen

Saw it a few times yesterday with Docker 1.11.1. I was using a docker-compose.yaml with numerous services defined.

I do not remember ever seeing this with just a single container being spun up - it has always been in the context of a bunch of services being shut down via compose.

@ensilon
ensilon commented Jul 27, 2016 edited

In my experience, any Debian Jessie host with docker 1.10+ and AUFS does this most of the time when stopping and removing a container. I've downgraded to 1.9.1 and have no trouble at all.

This should reproduce the issue:

Dockerfile:
FROM debian:jessie
RUN apt-get install apache2
CMD ["/usr/sbin/apache2ctl", "-D", "FOREGROUND"]

docker build -t example:1.0 .

docker-compose.yml
app:
  image: example:1.0

docker-compose up -d; docker-compose stop; docker-compose rm;

Note: the problem is identical with and without using compose.

@Akaito
Akaito commented Jul 31, 2016 edited

I've just experienced-- only once-- what I think to be this same bug on Debian 8.5, kernel 3.16.0, docker 1.12.0. I tried ensilon's repro above with a tweaked Dockerfile (had to add 'RUN apt-get update' and '-y' to the apt-get install), but didn't repro the issue. Out of a few more stop/rm/build/start commands, I haven't reproed my original case again.
To work around the issue, I did a service docker restart.

The error received, no matter how many times docker rm mpd was issued or if I waited for a while and tried again:

chris@rekt:~/docker/mpd$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
099175dbe20a        0fd34470d7ba        "/usr/local/bin/mpd.s"   8 hours ago         Dead                                    mpd
chris@rekt:~/docker/mpd$ docker rm mpd
Error response from daemon: Driver aufs failed to remove root filesystem 099175dbe20ad230e660ada92a87cf638d08b527479aa4419c92bd7095bec233: rename /var/lib/docker/aufs/mnt/53ed335b0c195a4ddd4c3e6b5970b5154471177f7881f6ed2cc2219b2cc98426 /var/lib/docker/aufs/mnt/53ed335b0c195a4ddd4c3e6b5970b5154471177f7881f6ed2cc2219b2cc98426-removing: device or resource busy

Leading up to the error, this is roughly what I was doing:

chris@rekt:~/docker/mpd$ docker build -t codesaru/mpd .  # (see below for Dockerfile)
chris@rekt:~/docker/mpd$ ./run.sh  # (see below)
chris@Chris-Surface-U:~$ ncmpcpp  # and use VLC, etc. to test MPD (music player daemon) container from another machine, including an HTTP stream it's outputting to
chris@rekt:~/docker/mpd$ vim mpd.conf  # edit a file that gets copied into the codesaru/mpd image
chris@rekt:~/docker/mpd$ docker stop mpd
chris@rekt:~/docker/mpd$ docker rm mpd  # lather, rinse, repeat.  Except the one time, as seen above

I'm also mounting three host directories in the Docker container, as seen in run.sh (see below). The latter two are on an encrypted volume (which gets unencrypted almost immediately on system boot by manually entering the password), and the host is also sharing them in the following ways:

  • /home/chris/private/music to 192.168.8.0/24 as read-only via NFS. The host is on this network.
  • /home/chris/private had previously been mounted via sshfs, but I don't believe it was at the time of the error.

There are no symlinks heading out of /home/chris/private/music.

Relevant files below and attached. Most are slightly tweaked versions of files from jess/mpd.

Dockerfile (note FROM debian:jessie instead of FROM debian:sid):

# Music player daemon
#
# docker run -d \
#   --device /dev/snd \
#   -v /etc/localtime:/etc/localtime:ro \
#   -v $HOME/.mpd:/var/lib/mpd \
#   -p 6600:6600 \
#   --name mpd \
#   jess/mpd     
#
FROM debian:jessie
#MAINTAINER Jessica Frazelle <jess@docker.com>
MAINTAINER Chris Barrett <chris@codesaru.com>

RUN apt-get update && apt-get install -y \
    mpc \
    mpd \
    nfs-common \
    sudo \
    --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /var/lib/mpd/playlists \
    && mkdir -p /var/lib/mpd/music \
    && touch /var/lib/mpd/state \
    && touch /var/lib/mpd/tag_cache \
    && chmod 0777 -R /var/lib/mpd \
    && chown -R mpd /var/lib/mpd

# my user needs the ability to mount
# because all my music is in a nfs mount
RUN echo "mpd ALL=NOPASSWD: /usr/bin/mount, /sbin/mount.nfs, /usr/bin/umount" >> /etc/sudoers

ENV HOME /home/mpd
COPY mpd.conf /etc/mpd.conf
COPY mpd.sh /usr/local/bin/mpd.sh
RUN chown mpd:audio /etc/mpd.conf /usr/local/bin/mpd.sh

WORKDIR $HOME
USER mpd

ENTRYPOINT [ "/usr/local/bin/mpd.sh" ]

run.sh:

#!/bin/bash

# -p format: host:container

docker run -d \
    --device /dev/snd \
    -v /etc/localtime:/etc/localtime:ro \
    -v /home/chris/share/private/music:/var/lib/mpd/music/chris-private:ro \
    -v /home/chris/share/private/music-sharing:/var/lib/mpd/music/chris-sharing:ro \
    -p 51313:6600 \
    -p 46730:8000 \
    --name mpd \
    codesaru/mpd

Below is an archive with the Dockerfile, everything I'm COPYing into the Docker image, and run.sh.
docker-issue-21704-akaito.zip

@parkr
parkr commented Aug 9, 2016

Hello, I'm still seeing this in v1.12 as well, on DELETE /containers/{sha}?v=1. Using aufs as well, on Ubuntu 12.04 Precise.

@Puneeth-n
Puneeth-n commented Aug 9, 2016 edited

I'm facing the same issue. I am running Jenkins on AWS with a 100 GB EBS volume mounted on /var/lib/docker.

ubuntu 14.04 LTS - 3.13.0-92-generic
Docker version 1.12.0, build 8eab29e
time="2016-08-09T20:24:00.284387399Z" level=error msg="Error removing mounted layer d268f85645bbd0466e2a4f4a42f242065bf83de56e0c1b017754b07aa654bb5e: rename /var/lib/docker/aufs/mnt/4eb4e8a0cf1a59da4ef5022fc4a23fe6223ebe33f4ad8965c8b40db35599e496 /var/lib/docker/aufs/mnt/4eb4e8a0cf1a59da4ef5022fc4a23fe6223ebe33f4ad8965c8b40db35599e496-removing: device or resource busy"
time="2016-08-09T20:24:00.284570701Z" level=error msg="Handler for DELETE /v1.21/containers/d268f85645bbd0466e2a4f4a42f242065bf83de56e0c1b017754b07aa654bb5e returned error: Driver aufs failed to remove root filesystem d268f85645bbd0466e2a4f4a42f242065bf83de56e0c1b017754b07aa654bb5e: rename /var/lib/docker/aufs/mnt/4eb4e8a0cf1a59da4ef5022fc4a23fe6223ebe33f4ad8965c8b40db35599e496 /var/lib/docker/aufs/mnt/4eb4e8a0cf1a59da4ef5022fc4a23fe6223ebe33f4ad8965c8b40db35599e496-removing: device or resource busy"
@Puneeth-n
Puneeth-n commented Aug 10, 2016 edited

OK, I found a workaround. I first upgraded the AWS EBS volume to a higher capacity to get higher baseline IOPS. That didn't solve the issue, but it improved performance a bit. Upgrading the Linux kernel to version 3.19 seems to have fixed the issue: I haven't seen any aufs-related errors in 2 days.

@antony
antony commented Aug 28, 2016

Seeing this on CircleCI where we can't upgrade past 1.10.1. Not sure what to do.

@rindek
rindek commented Aug 28, 2016

I had the same experience, and what @Puneeth-n suggested, upgrading the Linux kernel to 3.19.x, seems to resolve the issue. No aufs errors since the upgrade.

@anusha-ragunathan
Contributor

@rindek : As I previously mentioned in #21704 (comment), there's a known issue with kernel versions older than 3.19. Upgrading to 3.19+ should help.

@antony
antony commented Aug 29, 2016

In case it helps anyone else, because I can't upgrade Docker, the Kernel, or anything else on CircleCI, I've resorted to a simple daemon restart:

sudo service docker restart

@ragnarkurm
ragnarkurm commented Sep 6, 2016 edited

uname
3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.7-ckt25-2~bpo70+1 (2016-04-12) x86_64 GNU/Linux

Docker info

Containers: 51
 Running: 25
 Paused: 0
 Stopped: 26     <---- almost all containers failed on linux boot
Images: 170
Server Version: 1.11.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 397
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 3.16.0-0.bpo.4-amd64
Operating System: Debian GNU/Linux 7 (wheezy)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.873 GiB
Name: xxx
ID: ICC7:JUY2:U5BD:JA7Y:VJKI:42AU:5CSF:5HKY:MQS6:H4V6:OGAW:KXTS
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support

Docker version

Client:
 Version:      1.11.0
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   4dc5990
 Built:        Wed Apr 13 18:26:49 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.0
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   4dc5990
 Built:        Wed Apr 13 18:26:49 2016
 OS/Arch:      linux/amd64

From kernel logs, possibly related:

[   62.025942] aufs au_opts_verify:1570:docker[3542]: dirperm1 breaks the protection by the permission bits on the lower branch
[   62.476638] aufs au_opts_verify:1570:docker[3542]: dirperm1 breaks the protection by the permission bits on the lower branch
[   62.477259] aufs au_opts_verify:1570:docker[3542]: dirperm1 breaks the protection by the permission bits on the lower branch
[   62.486309] aufs au_opts_verify:1570:docker[3542]: dirperm1 breaks the protection by the permission bits on the lower branch

Errors seen from shell during starting/stopping containers:

Error response from daemon: Unable to remove filesystem for e7312396d0d7943fb69bf61dad477148436ca21d9525324942ef06b2275c8106: remove /var/lib/docker/containers/e7312396d0d7943fb69bf61dad477148436ca21d9525324942ef06b2275c8106/shm: device or resource busy
Error response from daemon: Driver aufs failed to remove root filesystem fef95d4a52a4fe37b0f32c8a5db23970400ee3fc388a9125dbc4ada807fb6c52: rename /var/lib/docker/aufs/mnt/26a38ea90a4778d8c9bc6fb0238e4f88036f85f269adb649e92c557319076356 /var/lib/docker/aufs/mnt/26a38ea90a4778d8c9bc6fb0238e4f88036f85f269adb649e92c557319076356-removing: device or resource busy

Checking open files there:

root@xxx:~# ls -la /var/lib/docker/containers/e7312396d0d7943fb69bf61dad477148436ca21d9525324942ef06b2275c8106/shm
total 8
drwx------ 2 root root 4096 May  5 09:24 .
drwx------ 3 root root 4096 Sep  6 11:22 ..
root@xxx:~# lsof +D /var/lib/docker/containers/e7312396d0d7943fb69bf61dad477148436ca21d9525324942ef06b2275c8106/shm
root@xxx:~# 

Some hacking

root@xxx:/var/lib/docker/containers/61abe1d7820feb64f7ab76f1b1e4a17fc5321aa01012387263d83e752849936e# umount shm/
root@xxx:/var/lib/docker/containers/61abe1d7820feb64f7ab76f1b1e4a17fc5321aa01012387263d83e752849936e# lsof +D shm/
root@xxx:/var/lib/docker/containers/61abe1d7820feb64f7ab76f1b1e4a17fc5321aa01012387263d83e752849936e# rmdir shm/
rmdir: failed to remove `shm/': Device or resource busy
root@xxx:/var/lib/docker/containers/61abe1d7820feb64f7ab76f1b1e4a17fc5321aa01012387263d83e752849936e# 

In addition, since we started using Docker, the server hangs regularly after a few months, sometimes even without errors, and has to be rebooted manually.

@ragnarkurm

Some light at the end of the tunnel!

More experimentation yields some clues:

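#!/bin/bash

# For each dead container, list the PIDs whose mount table still references it.
for container in $(docker ps -a --filter status=dead --format '{{.ID}}' --no-trunc)
do
        echo "Container: $container"
        # Processes can exit while grep runs, so ignore "No such file" errors.
        echo "Mounted PIDs:" $(grep -l "$container" /proc/[0-9]*/mountinfo 2>/dev/null | cut -f 3 -d / | sort -n)
        echo
done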

The script above finds the PIDs whose mount table still references a dead container's path.
Sometimes only a couple of processes show up (escaped from the container?), and after killing them the container can be removed.

Other times, most processes (even init, PID 1) reference the container path. For example:

Container: 964f191ee2d16621c790985149faae38a6a86800bec3127b037df138bf46f26f
Mounted PIDs: 1 2 3 5 7 8 9 10 11 12 13 15 16 18 19 20 21 22 23 24 25 26 29 30 31 37 38 39 107 110 123 382 390 411 412 413 414 431 432 452 453 593 1000 1025 2592 2687 2689 2690 2791 2828 2853 2940 3066 3085 3193 3218 3236 3299 3318 3354 3370 3374 3388 3389 3390 3391 3392 3393 3394 3395 3396 3397 3399 3484 3489 3517 3583 3946 3947 4477 4508 4518 4623 4642 5591 5623 5624 5625 5713 5716 5719 5897 5898 5899 5900 5901 5902 6505 6540 8735 8794 8812 8817 8831 8850 8851 9257 10159 10889 11209 11480 12795 14605 14732 14738 14973 15054 15059 15630 16363 16445 16452 16917 17007 17012 17232 17233 17238 17491 17574 17579 17735 18005 18062 18139 18146 18775 18781 19244 19322 19328 19356 19546 19551 19663 19668 19699 20048 20147 20199 20226 20242 20805 20845 20926 20931 21102 21182 21183 21197 21234 21235 21236 21237 21253 21335 21342 21769 21848 21854 26346 26349 27986 30940

What do we see in mountinfo?

# grep 964f191ee2d16621c790985149faae38a6a86800bec3127b037df138bf46f26f /proc/1/mountinfo 
1724 18 0:67 / /run/docker/libcontainerd/964f191ee2d16621c790985149faae38a6a86800bec3127b037df138bf46f26f/rootfs rw,relatime - aufs none rw,si=3fbedc05b5b4a1e1,dio,dirperm1
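
A minimal sketch for reading such mountinfo lines, assuming the standard /proc/<pid>/mountinfo layout (field 5 is the mount point, and the filesystem type follows the "-" separator). The function name container_mounts is hypothetical and not part of Docker.

```shell
# container_mounts ID: read mountinfo lines on stdin and print
# "<mount point> <fs type>" for every mount whose mount point contains ID.
# Hypothetical helper for illustration; not part of Docker.
container_mounts() {
    awk -v id="$1" '
        index($5, id) {
            fstype = "?"
            # Optional fields may precede the "-" separator, so search for it.
            for (i = 7; i < NF; i++) {
                if ($i == "-") { fstype = $(i + 1); break }
            }
            print $5, fstype
        }'
}

# Example: container_mounts 964f191ee2d1 < /proc/1/mountinfo
```

Running it against /proc/1/mountinfo shows whether init itself still holds the container's rootfs mount, which is the situation in the output above.
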
@cpuguy83
Contributor
cpuguy83 commented Sep 6, 2016

@ragnarkurm Looks like you are running 1.11.0, this is fixed in 1.11.1, and 1.11.2 is the latest from the 1.11 tree.
This is happening because you are mounting /run into the container, which in 1.11.0 had mounts from docker in it under /run/docker/libcontainerd/<id>/rootfs. This doesn't happen anymore.
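
A rough sketch for checking whether a container's bind-mount sources cover that directory, assuming the 1.11.0 path /run/docker/libcontainerd mentioned above. The function name covers_libcontainerd is hypothetical, and the docker inspect template in the comment assumes the .Mounts/.Source fields are available in your Docker version.

```shell
# covers_libcontainerd: read host paths on stdin (one per line) and print
# those that are /run/docker/libcontainerd itself or one of its ancestors,
# i.e. bind-mount sources that would carry Docker's own mounts with them.
# Hypothetical helper for illustration; not part of Docker.
covers_libcontainerd() {
    local target="/run/docker/libcontainerd/"
    local path trimmed
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        trimmed="${path%/}"          # "/run/" -> "/run", "/" -> ""
        case "$target" in
            "$trimmed"/*) printf '%s\n' "$path" ;;
        esac
    done
}

# Example (assumes docker inspect exposes .Mounts with a .Source field):
#   docker inspect --format '{{range .Mounts}}{{println .Source}}{{end}}' CONTAINER \
#       | covers_libcontainerd
```
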

@ragnarkurm

Upgraded docker, but still getting the behavior.

# docker version
Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:13:43 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:13:43 2016
 OS/Arch:      linux/amd64
@cpuguy83
Contributor
cpuguy83 commented Sep 6, 2016

@ragnarkurm Docker does not mount to /run/* (only ever did it in 1.11.0) so you should not be seeing this exact behavior you mentioned.

@knight42

I came across the same problem on Debian jessie.

docker info:

Containers: 3
 Running: 3
 Paused: 0
 Stopped: 0
Images: 6
Server Version: 1.12.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 41
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: host bridge null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options:
Kernel Version: 3.16.7-ckt11+deb8u6-ustclugsigned
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 3.872 GiB
Name: gitlab
ID: ZFN7:HV3V:GFUN:GV6H:YB63:Z5TQ:AU7S:HDO7:XNJM:RD7U:NB4A:VEUQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
Insecure Registries:
 127.0.0.0/8

I solved this problem by running docker-runc delete <resource-id> (the status of the resource was "created", not "running"); after that I could delete the container.
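
A small sketch for spotting such leftovers, assuming docker-runc list prints a header row followed by columns in the order ID, PID, STATUS. That column order is an assumption to verify against your runc version, and created_ids is a hypothetical helper name.

```shell
# created_ids: read `docker-runc list` output on stdin and print the IDs of
# entries whose STATUS column is "created".
# Assumes a header row and the column order ID PID STATUS ...; check this
# against your runc version before relying on it.
created_ids() {
    awk 'NR > 1 && $3 == "created" { print $1 }'
}

# Example (inspect the list before deleting anything):
#   docker-runc list | created_ids
```
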

@psychok7

Same issue, sudo service docker restart fixed it

$ docker info
Containers: 8
 Running: 0
 Paused: 0
 Stopped: 8
Images: 89
Server Version: 1.12.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 187
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-86-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.798 GiB
Name: ahr
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8
@anusha-ragunathan anusha-ragunathan added a commit to anusha-ragunathan/docker that referenced this issue Sep 12, 2016
@anusha-ragunathan anusha-ragunathan Mark mountpoint as unavailable to reduce EBUSY errors on removal.
May fix aufs errors seen in #21704

Signed-off-by: Anusha Ragunathan <anusha@docker.com>
f05e978
@anusha-ragunathan anusha-ragunathan added a commit to anusha-ragunathan/docker that referenced this issue Sep 12, 2016
@anusha-ragunathan anusha-ragunathan Mark mountpoint as unavailable to reduce EBUSY errors on removal.
May fix aufs errors seen in #21704

Signed-off-by: Anusha Ragunathan <anusha@docker.com>
46da85d
@scher200

thank you ragnarkurm, this script really helped me out!

@monikakatiyar16
monikakatiyar16 commented Dec 20, 2016 edited

Tested with docker v1.12.5 and v1.13.0-rc4 on AWS t2.medium, ubuntu 14.04, Kernel version 3.13.0-106-generic, x86_64 system, aufs storage system

Steps to reproduce in my case: please refer to #22207 (comment)

Case 1: Interrupt docker build while it's copying the layers (Ctrl-C), then try to run docker rm -v -f $(docker ps -a -q)

This gives the error:
docker rm -v -f $(docker ps -a -q)
Error response from daemon: Driver aufs failed to remove root filesystem 97931bf059a0ec219efd3f762dbb173cf9372761ff95746358c08e2b61f7ce79: rename /home/lib/docker/aufs/diff/359d27c5b608c9dda1170d1e34e5d6c5d90aa2e94826257f210b1442317fad70 /home/lib/docker/aufs/diff/359d27c5b608c9dda1170d1e34e5d6c5d90aa2e94826257f210b1442317fad70-removing: device or resource busy

I guess it's because it's in use by docker-untar, which keeps running in the background in my case.
ps -ef | grep docker
root 12590 12589 7 11:41 pts/1 00:00:02 docker build --rm --no-cache -t 5gblayer-1 .
root 12595 10781 8 11:41 ? 00:00:03 docker-untar /home/lib/docker/tmp/docker-builder874246941

Case 2: Interrupt docker build while it's copying the layers (Ctrl-C), kill the docker-untar process (kill 12595), then run docker rm -v -f $(docker ps -a -q) or docker system prune -a. This works cleanly in my case: no errors, and the data is cleaned up as well.
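
A minimal sketch of finding the leftover docker-untar process, assuming ps -ef output in the layout shown above (PID in column 2, command name in column 8). The function name untar_pids is hypothetical.

```shell
# untar_pids: read `ps -ef` output on stdin and print the PID of every
# process whose command name is exactly "docker-untar".
# Assumes the usual ps -ef columns: UID PID PPID C STIME TTY TIME CMD.
untar_pids() {
    awk '$8 == "docker-untar" { print $2 }'
}

# Example: list the PIDs first, then kill them by hand once confirmed:
#   ps -ef | untar_pids
```

Printing the PIDs instead of killing them directly leaves room to confirm that the process really belongs to the interrupted build.
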

A side effect of this error seems to be that the blob data left in the /mnt and /aufs/diff folders doesn't get cleaned up by docker system prune or docker rm -v. docker rm -v gives this error and removes the container, but the data persists in the backend.

I am not sure if this happens only on my system. When the docker build is interrupted, the docker build process gets killed but docker-untar doesn't. I am not aware of the side effects it could have, but would it be possible to stop docker-untar as part of handling the build interrupt?

@kleptog
kleptog commented Jan 9, 2017

As an additional data point: a colleague of mine can reproduce this very reliably just by suspending the VM Docker is running on and then resuming it. He uses VMware; I do this regularly with my machines on KVM and don't have the same issue.

Perhaps notable: not only does the unmount fail, the rename fails as well. I verified the latter by trying the rename myself.

Docker 1.12.5

Jan  9 15:00:47 debdev dockerd[7523]: time="2017-01-09T15:00:47.789831517+01:00" level=debug msg="Calling DELETE /v1.21/containers/2fd0e35b4e63a35a0945c3fae790e43a4bd462c38f99bc042ffeb3f4d9c4a23a?force=True&link=False&v=False"
Jan  9 15:00:47 debdev dockerd[7523]: time="2017-01-09T15:00:47.789924591+01:00" level=debug msg="Sending 9 to 2fd0e35b4e63a35a0945c3fae790e43a4bd462c38f99bc042ffeb3f4d9c4a23a"
Jan  9 15:00:47 debdev dockerd[7523]: time="2017-01-09T15:00:47.900779418+01:00" level=debug msg="containerd: process exited" id=2fd0e35b4e63a35a0945c3fae790e43a4bd462c38f99bc042ffeb3f4d9c4a23a pid=init status=137 systemPid=7769
Jan  9 15:00:47 debdev dockerd[7523]: time="2017-01-09T15:00:47.923769821+01:00" level=debug msg="libcontainerd: received containerd event: &types.Event{Type:\"exit\", Id:\"2fd0e35b4e63a35a0945c3fae790e43a4bd462c38f99bc042ffeb3f4d9c4a23a\", Status:0x89, Pid:\"init\", Timestamp:(*timestamp.Timestamp)(0xc8218c99f0)}"
Jan  9 15:00:47 debdev dockerd[7523]: time="2017-01-09T15:00:47.923946052+01:00" level=warning msg="libcontainerd: container 2fd0e35b4e63a35a0945c3fae790e43a4bd462c38f99bc042ffeb3f4d9c4a23a restart canceled"
Jan  9 15:00:47 debdev dockerd[7523]: time="2017-01-09T15:00:47.924683285+01:00" level=debug msg="Revoking external connectivity on endpoint aaaaaa (a7035e343ca692c511d653002e6ea366c4783a639ed1e07e07583a7a5a04e8d6)"
Jan  9 15:00:48 debdev kernel: [  885.809828] docker0: port 2(vethff77bc8) entered disabled state
Jan  9 15:00:48 debdev kernel: [  885.810581] docker0: port 2(vethff77bc8) entered disabled state
Jan  9 15:00:48 debdev dockerd[7523]: time="2017-01-09T15:00:48.118427658+01:00" level=debug msg="Releasing addresses for endpoint aaaaaa's interface on network bridge"
Jan  9 15:00:48 debdev dockerd[7523]: time="2017-01-09T15:00:48.118484342+01:00" level=debug msg="ReleaseAddress(LocalDefault/172.17.0.0/16, 172.17.0.3)"
Jan  9 15:00:48 debdev dockerd[7523]: time="2017-01-09T15:00:48.262104562+01:00" level=error msg="Error removing mounted layer 2fd0e35b4e63a35a0945c3fae790e43a4bd462c38f99bc042ffeb3f4d9c4a23a: rename /var/lib/docker/aufs/mnt/e95c0253c56cc3ce332af68f03265c8743fdbd16f3801b75b6ca3273410e3453 /var/lib/docker/aufs/mnt/e95c0253c56cc3ce332af68f03265c8743fdbd16f3801b75b6ca3273410e3453-removing: device or resource busy"
Jan  9 15:00:48 debdev dockerd[7523]: time="2017-01-09T15:00:48.262571542+01:00" level=error msg="Handler for DELETE /v1.21/containers/2fd0e35b4e63a35a0945c3fae790e43a4bd462c38f99bc042ffeb3f4d9c4a23a returned error: Driver aufs failed to remove root filesystem 2fd0e35b4e63a35a0945c3fae790e43a4bd462c38f99bc042ffeb3f4d9c4a23a: rename /var/lib/docker/aufs/mnt/e95c0253c56cc3ce332af68f03265c8743fdbd16f3801b75b6ca3273410e3453 /var/lib/docker/aufs/mnt/e95c0253c56cc3ce332af68f03265c8743fdbd16f3801b75b6ca3273410e3453-removing: device or resource busy"
Jan  9 15:00:48 debdev dockerd[7523]: time="2017-01-09T15:00:48.405534108+01:00" level=debug msg="containerd: process exited" id=0cac5034a42e9c2adb8c14dfd2ed9a1534a2bd71a74096714d73442fb3671736 pid=init status=1 systemPid=8400

@vikstrous
Member

I'm still seeing this on docker 1.13. I have the daemon data and all the logs captured. We retry the delete until the daemon tells us that the container doesn't exist, and that used to work, but in this case it errors out instead. https://gist.github.com/vikstrous/e370d7b210e605fa23d7198541056d9e
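
A rough sketch of that retry approach, not the actual test harness code. It assumes the daemon's error text contains "No such container" once the container is gone; retry_until_gone and the RETRY_DELAY variable are hypothetical names.

```shell
# retry_until_gone ATTEMPTS CMD...: run CMD until it succeeds or reports that
# the container no longer exists, at most ATTEMPTS times.
# Assumption: the "gone" case is recognised by "No such container" in the output.
retry_until_gone() {
    local attempts="$1" i=1 output=""
    shift
    while [ "$i" -le "$attempts" ]; do
        if output=$("$@" 2>&1); then
            return 0
        fi
        case "$output" in
            *"No such container"*) return 0 ;;
        esac
        # Pause between attempts; RETRY_DELAY (seconds) defaults to 1.
        if [ "$i" -lt "$attempts" ]; then
            sleep "${RETRY_DELAY:-1}"
        fi
        i=$((i + 1))
    done
    # Out of attempts: surface the last error and fail.
    printf '%s\n' "$output" >&2
    return 1
}

# Example: retry_until_gone 10 docker rm -f -v "$container_id"
```

With the failure described here, a loop like this would exhaust its attempts and return the last "device or resource busy" error rather than ever seeing the container disappear.
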

@sblackstone

I managed to recreate this issue on 1.13 by doing the following:

  1. Updated image string in docker-compose
  2. docker-compose pull
  3. Ran docker-compose create --force-recreate with images still running

At this point dockerd stopped responding to HTTP requests until I restarted the service.

With dockerd restarted, I now get the following:

ERROR: Driver aufs failed to remove root filesystem

when trying docker-compose create

@Glideh
Glideh commented Feb 1, 2017

I'm also having this error since I updated to Docker 1.13 (docker4mac).

$ docker-compose up
ERROR: for my_pretty_service open /var/lib/docker/containers/44d508b497efe6336e43e195e943945a061b55392b9809739cadaab35e7b7e20/.tmp-config.v2.json372435876: no such file or directory
ERROR: Encountered errors while bringing up the project.

$ docker-compose rm -f my_pretty_service
ERROR: for my_project_my_pretty_service_1  Driver aufs failed to remove root filesystem 8650d5ccadbda926fa16232a55e45853806194511868ef6e519ceea46e363544: rename /var/lib/docker/aufs/mnt/e560c77b4aa932a88c94ef20015939eed4b894bfc944ad80e3a4db179d4d1d1a /var/lib/docker/aufs/mnt/e560c77b4aa932a88c94ef20015939eed4b894bfc944ad80e3a4db179d4d1d1a-removing: device or resource busy

Restarting the docker service doesn't fix it anymore for me.

@scher200
scher200 commented Feb 1, 2017

In the meantime I switched over to overlay2, and this made me smile again :)
