error removing container (1.10, 1.11/master) with AUFS #21704

Closed
vikstrous opened this Issue Mar 31, 2016 · 77 comments

Contributor

vikstrous commented Mar 31, 2016

I've been seeing this error in our integration tests a lot recently:

Error response from daemon: 500 Internal Server Error: Driver aufs failed to remove root filesystem 36382c720964b0560df5fb858af8197169ee4eb399906c0e65c4ca85d795941e: rename /var/lib/docker/aufs/mnt/e7d36cc07ee4aad50f61259bea24876cc925f3c417b6d5ea9c2c1b055d243c82 /var/lib/docker/aufs/mnt/e7d36cc07ee4aad50f61259bea24876cc925f3c417b6d5ea9c2c1b055d243c82-removing: device or resource busy

This happens when a container is being removed and causes our tests to fail. I've seen it only on aufs so far.

Output of docker version:

$ docker version
Client:
 Version:      1.10.3
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   20f81dd
 Built:        Thu Mar 10 15:54:52 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.3-cs2
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   f02424d
 Built:        Thu Mar 17 21:52:14 2016
 OS/Arch:      linux/amd64

$ docker version
Client:
 Version:      1.10.3
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   20f81dd
 Built:        Thu Mar 10 15:54:52 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.0-dev
 API version:  1.24
 Go version:   go1.5.3
 Git commit:   dd94c88
 Built:        Thu Mar 31 21:32:39 2016
 OS/Arch:      linux/amd64

Additional environment details (AWS, VirtualBox, physical, etc.):
This is happening on AWS with AUFS

Steps to reproduce the issue:
unknown

Describe the results you received:
500 error from the daemon

Describe the results you expected:
the container should be removed without an error

Additional information you deem important (e.g. issue happens only occasionally):
It happens less than half of the time
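
Since the failure is intermittent, it can help to recognise it mechanically in test output. The following is a minimal sketch, assuming the message keeps the shape quoted at the top of this report; the names `BUSY_RENAME` and `parse_busy_error` are made up for illustration and are not part of Docker. Other daemon versions word the message differently, in which case the function simply returns `None`.

```python
import re

# Pattern for the message shape quoted in this report (assumption: other
# daemon versions may phrase it differently and will not match).
BUSY_RENAME = re.compile(
    r"Driver (?P<driver>\w+) failed to remove root filesystem "
    r"(?P<container>[0-9a-f]+): "
    r"rename (?P<src>\S+) (?P<dst>\S+): device or resource busy"
)


def parse_busy_error(message):
    """Return driver, container ID and layer path, or None if there is no match."""
    match = BUSY_RENAME.search(message)
    if match is None:
        return None
    src = match.group("src")
    return {
        "driver": match.group("driver"),
        "container": match.group("container"),
        "path": src,
        # The last path component is the layer directory name.
        "layer": src.rstrip("/").rsplit("/", 1)[-1],
    }
```

A test harness could call `parse_busy_error` on the daemon's response and record the layer path for later inspection instead of only failing the run.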

Member

thaJeztah commented Mar 31, 2016

Can you give the output of docker info as well?

Contributor

vikstrous commented Apr 1, 2016

$ docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 1.11.0-dev
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 0
 Dirperm1 Supported: false
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 3.13.0-53-generic
Operating System: Ubuntu 14.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.305 GiB
Name: jenkins-dtr-integration-2023
ID: A64M:T365:F3GT:OPMD:H3YO:AQFH:65Y6:H2YZ:PHGN:4KZI:2BC5:ISLE
Username: dockerbuildbot
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
 provider=amazonec2

Contributor

cpuguy83 commented Apr 1, 2016

How recent is your master? I think we fixed this.

Member

thaJeztah commented Apr 1, 2016

@cpuguy83 looks like he's on dd94c88

Member

thaJeztah commented Apr 1, 2016

Is this a duplicate of #21111 and #21101?

Member

thaJeztah commented Apr 1, 2016

Oh, and #17902

Member

thaJeztah commented Apr 1, 2016

ping @anusha-ragunathan, would you be able to look into this? I linked several related or similar issues above.

Contributor

vikstrous commented Apr 2, 2016

I think the daemon logs from one successful run of our integration tests and one run that caused this error will be helpful, but I'm not sure if I can share them publicly here. If you have access, check these out:

success: https://ci.qa.aws.dckr.io/job/dtr-deploy/2749/artifact/integration/results/docker.log
failure: https://ci.qa.aws.dckr.io/job/dtr-deploy/2753/artifact/integration/results/docker.log

They are not exactly from the same PR, but they are very similar.

There is a potentially relevant error earlier in the logs:

INFO[0058] Failed to send signal 15 to the process, force killing
ERRO[0058] Handler for POST /v1.15/containers/2b0a7117aff26868e3f0cfaa29c60146199bc435a388ff79025a7ef951479410/stop returned error: Cannot stop container 2b0a7117aff26868e3f0cfaa29c60146199bc435a388ff79025a7ef951479410: Cannot kill container 2b0a7117aff26868e3f0cfaa29c60146199bc435a388ff79025a7ef951479410: rpc error: code = 2 desc = "no such process"

This is the complete error at the time of the failed container delete:

ERRO[0148] Error removing mounted layer e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72: rename /var/lib/docker/aufs/diff/e46c976939ee6366109ffb8bb95b09ed0ddd5f0c08f100040ed1abc656317c82 /var/lib/docker/aufs/diff/e46c976939ee6366109ffb8bb95b09ed0ddd5f0c08f100040ed1abc656317c82-removing: device or resource busy
ERRO[0148] Handler for DELETE /v1.15/containers/e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72 returned error: Driver aufs failed to remove root filesystem e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72: rename /var/lib/docker/aufs/diff/e46c976939ee6366109ffb8bb95b09ed0ddd5f0c08f100040ed1abc656317c82 /var/lib/docker/aufs/diff/e46c976939ee6366109ffb8bb95b09ed0ddd5f0c08f100040ed1abc656317c82-removing: device or resource busy
ERRO[0148] Handler for GET /v1.15/containers/e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72/json returned error: No such container: e8b084c4ad5b491c20d610842ac96d22c457440418ebfc6a6c941d837ecdce72
ERRO[0148] Handler for DELETE /v1.15/containers/155af974b06e6c09e0f59594812e4e0139e2a5f63a2fe22ca6e9693dccb4491f returned error: Unable to remove filesystem for 155af974b06e6c09e0f59594812e4e0139e2a5f63a2fe22ca6e9693dccb4491f: remove /var/lib/docker/containers/155af974b06e6c09e0f59594812e4e0139e2a5f63a2fe22ca6e9693dccb4491f/shm: device or resource busy

It's interesting that the same error log appears when we restart the daemon earlier in the test.

Contributor

vikstrous commented Apr 2, 2016

If I had to guess, I'd say there are leftover processes still referencing the same layers from when the daemon tried to restart and failed to kill them properly.

Contributor

anusha-ragunathan commented Apr 6, 2016

@vikstrous: I cannot access the Jenkins logs. Can you create a gist of them? I ran a quick test that created and removed containers in a loop of 15 iterations (not concurrent) on AUFS and didn't observe this issue. Is there a deterministic way to reproduce it?

Can you confirm that the containers start successfully? If yes, here are a couple of things to follow up on:

  • Was there another concurrent request to stop the container? That would result in a race, and the rename in the context of the second request would error out. You can check whether the corresponding diff directory exists. If it doesn't, then it's most likely a race.
  • In 1.11, we recently changed the way reference counts work in aufs (and other graph drivers). If you can run some instrumented builds, I can send over a docker binary to debug this further.
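
The existence check suggested in the first bullet could be scripted roughly as follows. This is only a sketch: the `diff` and `mnt` directory names and the `-removing` suffix are taken from the error messages in this thread, while the function names `layer_state` and `looks_like_race` are hypothetical, and the "missing diff directory means a race" rule is just the heuristic from the comment above, not something the daemon reports.

```python
import os


def layer_state(aufs_root, layer_id):
    """Report which aufs directories exist for a layer after a failed removal."""
    state = {}
    for kind in ("diff", "mnt"):
        base = os.path.join(aufs_root, kind, layer_id)
        state[kind] = os.path.isdir(base)
        # The driver renames to "<id>-removing" before deleting, per the error text.
        state[kind + "-removing"] = os.path.isdir(base + "-removing")
    return state


def looks_like_race(state):
    # Heuristic from the comment above: if the diff directory is already gone,
    # another request probably removed it first.
    return not state["diff"]
```

On an affected host one would call `layer_state("/var/lib/docker/aufs", layer_id)` with the layer ID from the error message, most likely as root, since the directories under `/var/lib/docker` are not normally readable by other users.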

Member

thaJeztah commented Apr 6, 2016

@vikstrous @anusha-ragunathan please post it on Slack if the Jenkins log contains information that should not be shared publicly 👍

Contributor

vikstrous commented Apr 6, 2016

I haven't seen this bug since the last time I posted in this thread. It's possible that it was fixed. I'll update you if I see it again.

FelikZ commented Apr 7, 2016

I have a similar issue; please have a look.

test.yml:

version: "2"
services:
    browser:
        image: elgalu/selenium:2.53.0e
        ports:
            - "5920:25900"
            # - "4444:24444"
        # volumes:
        #     - "/dev/shm:/dev/shm"
        environment:
          - "VNC_PASSWORD=test"
          - "FIREFOX=false"
          - "CHROME=true"
        networks:
            my-net:
                aliases:
                  - browser
networks:
  my-net:
    driver: bridge

Stdout:

$ docker-compose --version
docker-compose version 1.6.2, build 4d72027
$ docker --version
Docker version 1.10.3, build 20f81dd
$ docker info
Containers: 3
 Running: 1
 Paused: 0
 Stopped: 2
Images: 142
Server Version: 1.10.3
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 245
 Dirperm1 Supported: false
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: bridge null host
Kernel Version: 3.13.0-85-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.58 GiB
Name: tk9
ID: CYDT:VFSD:2M77:W5P7:OQUD:J6G7:EWQR:KJWR:SOUX:JCLZ:2SBG:J7GX
WARNING: No swap limit support
$ docker-compose -f test.yml up -d
Creating network "homelocal_my-net" with driver "bridge"
Creating homelocal_browser_1
$ docker-compose -f test.yml stop
Stopping homelocal_browser_1 ... 

ERROR: for homelocal_browser_1  ('Connection aborted.', BadStatusLine("''",)) 
ERROR: Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?

If it's at a non-standard location, specify the URL with the DOCKER_HOST environment variable.
$ docker-compose -f test.yml rm -f
Going to remove homelocal_browser_1
Removing homelocal_browser_1 ... error

ERROR: for homelocal_browser_1  Driver aufs failed to remove root filesystem 0e6e88bcc931eb13e141ac871b4ba965d01aae880a20255a5e974f15dff40b0e: rename /var/lib/docker/aufs/mnt/d4e6ee5ebd3ac40e256afa4492451e25cbea87f5041a1dce0bec7a302f41cc45 /var/lib/docker/aufs/mnt/d4e6ee5ebd3ac40e256afa4492451e25cbea87f5041a1dce0bec7a302f41cc45-removing: device or resource busy 
$ docker-compose -f test.yml rm -f
Going to remove homelocal_browser_1
Removing homelocal_browser_1 ... error

FelikZ commented Apr 7, 2016

And this is probably related as well: #21845

FelikZ commented Apr 7, 2016

@cpuguy83 it looks like this is not related to aufs...

Trying to work around this, I switched to overlayfs and see the same picture:

Error response from daemon: Driver overlay failed to remove root filesystem 8b21bec99eccde191ca98e944003274c5b45bbf6f1e4cc08560c0e454e5d3719: readdirent: no such file or directory
$ docker info
Containers: 4
 Running: 0
 Paused: 0
 Stopped: 4
Images: 44
Server Version: 1.10.3
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: null host bridge
Kernel Version: 3.19.0-58-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.58 GiB
Name: tk9
ID: CYDT:VFSD:2M77:W5P7:OQUD:J6G7:EWQR:KJWR:SOUX:JCLZ:2SBG:J7GX
WARNING: No swap limit support

Contributor

anusha-ragunathan commented Apr 7, 2016

@FelikZ : Can you upgrade to docker-engine 1.11 rc4 and try again?

Member

thaJeztah commented Apr 18, 2016

ping @FelikZ do you still see this on 1.11.0?

ncadou commented May 24, 2016

Seeing this frequently on different machines, all on 1.11.1 plus aufs. Most are on 14.04 LTS (3.13 kernel).

Contributor

jbeda commented May 29, 2016

I just saw this when trying to start a container using the gcplogs log driver; the container wasn't able to launch successfully.

docker run -d --name my-container --log-driver=gcplogs --log-opt gcp-log-cmd=true [...]
38e6a733b02a825dc97208ee2436d31353480bfe31cfa4799cd48203d746fe6e
docker: Error response from daemon: Failed to initialize logging driver: unable to connect or authenticate with Google Cloud Logging: googleapi: Error 403: Google Cloud Logging API has not been used in project 784782548624 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/logging/overview?project=784782548624 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry., forbidden.
$ docker rm my-container
Error response from daemon: Driver aufs failed to remove root filesystem fd5e668e8cd14c1a7a2405b26ea3d75bdfd9f25525447b20f6b400eff02a7a23: rename /var/lib/docker/aufs/diff/ae3e5b240821fe702b68544188663ef092bf173b33e495678800c6c3ea498f92 /var/lib/docker/aufs/diff/ae3e5b240821fe702b68544188663ef092bf173b33e495678800c6c3ea498f92-removing: device or resource busy
# docker info
Containers: 9
 Running: 5
 Paused: 0
 Stopped: 4
Images: 382
Server Version: 1.11.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 609
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 3.16.0-0.bpo.4-amd64
Operating System: Debian GNU/Linux 7 (wheezy)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 6.338 GiB
Name: web
ID: IXJA:3H47:WFQC:GZPE:3WTJ:5W4P:LEKY:OCCF:CDXQ:IDV7:O23X:K7HQ
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support

I couldn't find any reference to that filesystem or any of those directories when groveling around in /proc or via lsof.
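
One place worth checking in addition to lsof is each process's mount table, because lsof lists open files rather than mounts. The sketch below assumes a Linux `/proc` with per-process `mountinfo` files and only finds entries whose text mentions the layer ID (for example an `aufs/mnt/<id>` mount point); `find_mount_holders` is a hypothetical helper name, and whether a leftover mount is actually the cause here is not established in this thread.

```python
import glob
import os


def find_mount_holders(layer_id, proc_root="/proc"):
    """Return {pid: [mountinfo lines]} for processes whose mount table mentions layer_id."""
    holders = {}
    # Only numeric directory names under /proc are processes.
    for path in glob.glob(os.path.join(proc_root, "[0-9]*", "mountinfo")):
        pid = os.path.basename(os.path.dirname(path))
        try:
            with open(path, "r", errors="replace") as handle:
                hits = [line.rstrip("\n") for line in handle if layer_id in line]
        except OSError:
            # The process exited or we lack permission; skip it.
            continue
        if hits:
            holders[int(pid)] = hits
    return holders
```

Calling `find_mount_holders(layer_id)` with the ID from the `rename` error returns an empty dict when no visible process has such a mount. Running it as root is probably needed to read every process's mount table.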

ensilon commented Jun 24, 2016

I've seen this several times recently also.

# docker info
Containers: 5
 Running: 3
 Paused: 0
 Stopped: 2
Images: 12
Server Version: 1.11.2
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 106
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: null host bridge
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 7.849 GiB
Name: docker-internal-01
ID: Z3B7:D5KD:QYKT:YMH3:V57J:MLIK:6HWR:XG3Q:3WR6:RWJV:YOZW:6LNG
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support

servomac commented Jun 28, 2016

Same here with Ubuntu 14.04

tpiza@neptune:~$ docker info
Containers: 31
 Running: 27
 Paused: 0
 Stopped: 4
Images: 28
Server Version: 1.11.2
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 207
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: host bridge null
Kernel Version: 3.13.0-24-generic
Operating System: Ubuntu 14.04 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 31.12 GiB
Name: neptune.placeholder.lan
ID: RVOT:4V5S:Q7KF:DK7A:OKO7:EFAM:RAO4:6ZLF:4OJE:33GM:TQNY:YRYS
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
WARNING: No kernel memory limit support

Glideh commented Feb 1, 2017

I've also been getting this error since I updated to Docker 1.13 (docker4mac).

$ docker-compose up
ERROR: for my_pretty_service open /var/lib/docker/containers/44d508b497efe6336e43e195e943945a061b55392b9809739cadaab35e7b7e20/.tmp-config.v2.json372435876: no such file or directory
ERROR: Encountered errors while bringing up the project.

$ docker-compose rm -f my_pretty_service
ERROR: for my_project_my_pretty_service_1  Driver aufs failed to remove root filesystem 8650d5ccadbda926fa16232a55e45853806194511868ef6e519ceea46e363544: rename /var/lib/docker/aufs/mnt/e560c77b4aa932a88c94ef20015939eed4b894bfc944ad80e3a4db179d4d1d1a /var/lib/docker/aufs/mnt/e560c77b4aa932a88c94ef20015939eed4b894bfc944ad80e3a4db179d4d1d1a-removing: device or resource busy

Restarting the docker service no longer fixes it for me.

scher200 commented Feb 1, 2017

In the meantime I switched to overlay2, which made me smile again :)

@lazize

This comment has been minimized.

Show comment
Hide comment
@lazize

lazize Jul 4, 2017

I am facing some similar issue, I don't know if it is related or not.

I have on container, that was created from a service, that is dead, but I can't remove it. I tried to remove using name and id. I already try to stop and remove.

That service doesn't exist anymore, I already removed it. I already restarted the service and even the computer, none helps to remove it.

Is there any manual steps to remove it from the syste?

:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
78b0dcaffa89        ubuntu:latest       "bash -c 'while tr..."   31 hours ago        Dead                                    leo.1.bkbjt6w08vgeo39rt1nmi7ock

:~$ docker stop 78b0dcaffa89
78b0dcaffa89

:~$ docker rm --force 78b0dcaffa89
Error response from daemon: driver "aufs" failed to remove root filesystem for 78b0dcaffa89ac1e532748d44c9b2f57b940def0e34f1f0d26bf7ea1a10c222b: no such file or directory

:~$ docker stop leo.1.bkbjt6w08vgeo39rt1nmi7ock
leo.1.bkbjt6w08vgeo39rt1nmi7ock

:~$ docker rm --force leo.1.bkbjt6w08vgeo39rt1nmi7ock
Error response from daemon: driver "aufs" failed to remove root filesystem for 78b0dcaffa89ac1e532748d44c9b2f57b940def0e34f1f0d26bf7ea1a10c222b: no such file or directory

:~$ sudo find /var/lib/docker -name "78b0dcaffa89ac1e532748d44c9b2f57b940def0e34f1f0d26bf7ea1a10c222b"

:~$ docker service ls
ID                  NAME                MODE                REPLICAS            IMAGE               PORTS

Puneeth-n Jul 4, 2017

@lazize restart the Docker daemon to remove the dead container.

Also, if you can, change the storage driver to overlay2; it is way better than aufs.

lazize Jul 4, 2017

@Puneeth-n I already did that, and I even restarted my computer, but the container is still there.

Do you have a link with instructions on how to change to overlay2?

EDIT: I changed to overlay2, and after restarting the service the dead container is gone. So are all my images, but I can pull them again.
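
For readers asking the same question, here is a minimal sketch of one way to switch, assuming a Linux host where the daemon reads /etc/docker/daemon.json and a kernel recent enough for overlay2. The helper name write_overlay2_config is made up for illustration. As noted above, containers and images created under aufs are no longer visible after the switch.

```shell
# write_overlay2_config [PATH]   (hypothetical helper, not part of Docker)
# Write a minimal daemon.json that selects the overlay2 storage driver.
# It refuses to touch an existing file so other daemon settings are not lost.
write_overlay2_config() {
    path=${1:-/etc/docker/daemon.json}
    if [ -e "$path" ]; then
        echo "refusing to overwrite $path; add \"storage-driver\" by hand" >&2
        return 1
    fi
    printf '{\n  "storage-driver": "overlay2"\n}\n' > "$path"
}

# Typical sequence as root, left as comments because it is disruptive:
#   docker info --format '{{.Driver}}'   # shows the current driver
#   service docker stop
#   write_overlay2_config
#   service docker start
#   docker info --format '{{.Driver}}'   # should now report overlay2
```

If daemon.json already exists, merging the "storage-driver" key into it by hand is the safer route.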

Puneeth-n Jul 4, 2017

@lazize yes, you lose everything because the storage driver is different.

thaJeztah Jul 4, 2017

Member

@lazize can you open a new issue with details? Docker 17.06 has some changes related to removal of containers; if you're running Docker 17.06, we may have to look into that (e.g. ignoring errors where the container's file system was already removed).

lazize Jul 5, 2017

@thaJeztah As I changed to overlay2, everything was gone, so I don't have the "dead" container anymore.
I can definitely open a new issue with my Docker installation information and details, but I will not be able to test solutions, at least not until I face the problem again. Do you believe it would help anyway? I am using 17.06-ce.

thaJeztah Jul 5, 2017

Member

@lazize if you still have logs from around the time it happened, that could be welcome (be sure to check them for confidential information)

lazize Jul 5, 2017

@thaJeztah Unfortunately systemd wasn't configured to persist logs, which means I only have logs from this morning's boot. I have changed it to persist logs now, so if it happens again I will be able to help much more. Sorry about that!
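
For anyone else who wants logs to survive a reboot, here is a rough sketch of one common approach. It assumes journald runs with its default Storage=auto setting, in which case creating the journal directory is enough. The helper name enable_persistent_journal and the dates in the comments are illustrative only.

```shell
# enable_persistent_journal [DIR]   (hypothetical helper)
# With Storage=auto, journald keeps logs across reboots once the journal
# directory exists. DIR defaults to /var/log/journal; it is a parameter
# only so the function can be tried against a scratch directory.
enable_persistent_journal() {
    dir=${1:-/var/log/journal}
    mkdir -p "$dir"
}

# Afterwards, as root:
#   systemctl restart systemd-journald
# Later, to pull daemon logs from around an incident (example dates):
#   journalctl -u docker.service --since "2017-07-04 00:00" --until "2017-07-05 00:00"
```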

thaJeztah Jul 5, 2017

Member

No worries, thanks!

padeyoung Oct 10, 2017

I am not a Linux expert by any means. I have installed docker-ce in order to use a container called elabftw. It does not start, and the error is the one continually referenced in this thread. I am running as root (su) throughout. I am using 17.09.0-ce, there are in fact dead containers, and I tried service docker restart, but the problem persists. A reboot did not fix the problem either. This is Debian 8 (jessie) amd64. Some captures are included below. Thanks for any help anyone can give me.

FROM THE END OF THE DOCKER INSTALLATION

pauldeyoung@local-root-analysis-3:~$ sudo docker run hello-world
[sudo] password for pauldeyoung:

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:

  1. The Docker client contacted the Docker daemon.
  2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
  3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
  4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
https://cloud.docker.com/

For more examples and ideas, visit:
https://docs.docker.com/engine/userguide/

TRYING TO START THE CONTAINER
root@local-root-analysis-3:/home/pauldeyoung# elabctl start
elabctl © 2017 Nicolas CARPi - https://www.elabftw.net
Version: 0.6.2
Using configuration file: /etc/elabftw.yml

Removing mysql
ERROR: driver "aufs" failed to remove root filesystem for 4528a3598c4d6b846099a9e00e413cb7f29b1778d1dc73b434caeb8407f1b41f: could not remove diff path for id 1db9ace3915e17dc1a4753b7368598a4d0aacbde37e2f01c5589d3a3ea52f851: error preparing atomic delete: rename /var/lib/docker/aufs/diff/1db9ace3915e17dc1a4753b7368598a4d0aacbde37e2f01c5589d3a3ea52f851 /var/lib/docker/aufs/diff/1db9ace3915e17dc1a4753b7368598a4d0aacbde37e2f01c5589d3a3ea52f851-removing: device or resource busy

INFO ABOUT INSTALLED VERSION OF DOCKER

root@local-root-analysis-3:/home/pauldeyoung# docker version
Client:
Version: 17.09.0-ce
API version: 1.32
Go version: go1.8.3
Git commit: afdb6d4
Built: Tue Sep 26 22:40:46 2017
OS/Arch: linux/amd64

Server:
Version: 17.09.0-ce
API version: 1.32 (minimum version 1.12)
Go version: go1.8.3
Git commit: afdb6d4
Built: Tue Sep 26 22:39:27 2017
OS/Arch: linux/amd64
Experimental: false

TRYING A DOCKER RESTART AND GETTING SAME ERROR

root@local-root-analysis-3:/home/pauldeyoung# service docker restart
root@local-root-analysis-3:/home/pauldeyoung# elabctl start
elabctl © 2017 Nicolas CARPi - https://www.elabftw.net
Version: 0.6.2
Using configuration file: /etc/elabftw.yml

Removing mysql
ERROR: driver "aufs" failed to remove root filesystem for 4528a3598c4d6b846099a9e00e413cb7f29b1778d1dc73b434caeb8407f1b41f: could not remove diff path for id 1db9ace3915e17dc1a4753b7368598a4d0aacbde37e2f01c5589d3a3ea52f851: error preparing atomic delete: rename /var/lib/docker/aufs/diff/1db9ace3915e17dc1a4753b7368598a4d0aacbde37e2f01c5589d3a3ea52f851 /var/lib/docker/aufs/diff/1db9ace3915e17dc1a4753b7368598a4d0aacbde37e2f01c5589d3a3ea52f851-removing: device or resource busy

RUNNING A COMMAND FROM THE THREAD TO DIAGNOSE DOCKER ISSUE

root@local-root-analysis-3:/home/pauldeyoung# docker ps -a --filter status="dead"
CONTAINER ID   IMAGE             COMMAND                  CREATED      STATUS   PORTS   NAMES
4528a3598c4d   mysql:5.7         "docker-entrypoint..."   3 days ago   Dead             mysql
388e2c3a5b0a   elabftw/elabimg   "/run.sh"                4 days ago   Dead             elabftw
a424097a27fb   mysql:5.7         "docker-entrypoint..."   4 days ago   Dead             a424097a27fb_mysql
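
The same filter can drive a cleanup attempt. The sketch below is an assumption about how one might script it, not something posted in this thread: it tries docker rm on every dead container and reports the ones that still fail, which are the ones worth investigating further. The function name remove_dead_containers is made up.

```shell
# remove_dead_containers   (hypothetical helper)
# Try to remove every container in the "dead" state. Containers whose
# removal fails are reported instead of aborting the loop.
remove_dead_containers() {
    failed=0
    for id in $(docker ps -aq --filter status=dead); do
        if docker rm "$id" >/dev/null 2>&1; then
            echo "removed $id"
        else
            echo "failed: $id"
            failed=1
        fi
    done
    return "$failed"
}

# Usage, as a user allowed to talk to the daemon:
#   remove_dead_containers || echo "some dead containers could not be removed"
```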

mjmunger Feb 15, 2018

This is supremely annoying. I have the same issue confirmed on:

Docker version 17.06.2-ce, build cec0b72
Distributor ID: Debian
Description: Debian GNU/Linux 8.9 (jessie)
Release: 8.9
Codename: jessie

Completely stopped all containers and the docker service trying to release this directory. No joy.

The only odd thing is that the directory in question (/var/lib/docker/aufs/diff/1db9ace3915e17dc1a4753b7368598a4d0aacbde37e2f01c5589d3a3ea52f851) contains mostly empty directories. The mysqld directory that was in it had a user:group of 999:docker.
As root, I am able to change to this directory and then delete its entire contents, but I cannot delete or rename the directory itself.

It appears that the aufs driver is what has a hold of it. Can't rmmod aufs because it's in use by PID #2. Tried rmmod -f (at my own peril) and got a segfault.

My only recourse at the moment is to reboot when it happens.

Prior to using docker, I rebooted my Linux workstation once a year. It now has to be rebooted with the frequency of a Windows XP machine.

Please fix this or post a workaround.

Something, somewhere is not properly closing a file handle to this directory.
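
One way to check whether a diff directory is still in use as a branch of a live aufs mount is to look at the aufs sysfs entries. This is a sketch under assumptions: it presumes the aufs module exposes /sys/fs/aufs/si_*/br* files listing branch paths (this depends on how aufs was built), and the helper name find_aufs_branch_users is invented here.

```shell
# find_aufs_branch_users LAYER_ID [SYSFS_ROOT]   (hypothetical helper)
# Print the aufs sysfs branch files that still reference LAYER_ID.
# SYSFS_ROOT defaults to /sys/fs/aufs and is a parameter only so the
# function can be tried against a fake directory tree.
find_aufs_branch_users() {
    id=$1
    root=${2:-/sys/fs/aufs}
    for f in "$root"/si_*/br[0-9]*; do
        [ -r "$f" ] || continue
        if grep -q -- "$id" "$f" 2>/dev/null; then
            echo "$f"
        fi
    done
}

# Usage, as root:
#   find_aufs_branch_users 1db9ace3915e17dc1a4753b7368598a4d0aacbde37e2f01c5589d3a3ea52f851
# The si_<value> directory name in any hit should correspond to the
# si=<value> option of an aufs entry in a process's mount table.
```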

mjmunger Feb 15, 2018

...if changing to a different file driver would help, I'm open to that.

cpuguy83 Feb 15, 2018

Contributor

This should be fixed on master, and I am working on backports for Docker 17.12.
But without knowing exactly what is running that's holding onto the reference it's hard to tell for sure.

Typically what happens here is a mount has leaked into another namespace (could be a container, or even another service started by systemd).
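
A leaked mount of this kind shows up in the mount table of whichever process holds it. The following is a minimal sketch, not the linked script, that scans every process's mountinfo for a layer or container ID; the function name find_mount_holders is an assumption made for illustration.

```shell
# find_mount_holders ID [PROC_ROOT]   (hypothetical helper)
# Print the PIDs whose mount table still mentions ID, for example the
# hash from a /var/lib/docker/aufs/mnt/<hash> path in the error message.
# PROC_ROOT defaults to /proc and is a parameter only for dry runs.
find_mount_holders() {
    id=$1
    proc_root=${2:-/proc}
    for f in "$proc_root"/[0-9]*/mountinfo; do
        [ -r "$f" ] || continue
        if grep -q -- "$id" "$f" 2>/dev/null; then
            pid=${f%/mountinfo}   # strip the file name
            pid=${pid##*/}        # keep only the PID component
            echo "$pid"
        fi
    done
}

# Usage, as root so other processes' mount tables are readable:
#   for pid in $(find_mount_holders e560c77b4aa9); do ps -p "$pid" -o pid=,comm=; done
```

Any PID reported outside the Docker daemon itself points at the namespace the mount leaked into.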

cpuguy83 Feb 15, 2018

Contributor

btw, you can use this to sniff out what's holding onto the mount reference: https://github.com/rhvgoyal/misc/blob/master/find-busy-mnt.sh

cpuguy83 Feb 15, 2018

Contributor

That one is for devmapper, but can be modified for aufs.

mjmunger Feb 15, 2018

> This should be fixed on master and am working on backports for docker 17.12.

What do I need to do to get to master? This is how we install docker currently.

Re: find-busy-mnt.sh
I'll use that next time this fails, and report back if there is anything useful. lsof was not helpful.

cpuguy83 Feb 15, 2018

Contributor

Actually, looking closer, it should work for any graphdriver that does mounts.

cpuguy83 Feb 15, 2018

Contributor

> What do I need to do to get to master?

You can grab a nightly static binary from https://master.dockerproject.org/ and replace dockerd with it... but I wouldn't do this in anything but a test environment.

thaJeztah Feb 15, 2018

Member

Not really announced yet, but we also now have nightly builds in our apt/yum repository, e.g.: https://download.docker.com/linux/ubuntu/dists/xenial/pool/nightly/

kleptog Feb 21, 2018

FWIW, we switched to overlay2 everywhere and never had any (graphdriver) issues since. aufs just doesn't appear to work very well (also solves the core-dumps-in-images issue).

cpuguy83 Jul 7, 2018

Contributor

Closing because this is fixed in 17.12.1 and 18.03+.

cpuguy83 closed this Jul 7, 2018

moritz Jul 23, 2018

I still get this error occasionally, on Debian 8.11 with kernel 3.16.0-6-amd64 and docker-ce 18.06.0ce3 obtained from https://download.docker.com/linux/debian/.

@cpuguy83 can you please reopen?

cpuguy83 Jul 31, 2018

Contributor

@moritz What's the exact error you received and how did you get it? Do you have daemon logs for this time period?
