Unable to remove filesystem for xxx: remove /var/lib/docker/containers/xxx/shm: device or resource busy #17902

Closed · bboreham opened this issue Nov 11, 2015 · 23 comments

bboreham commented Nov 11, 2015

Docker is unable to remove a container if another container has bind-mounted /var/lib/docker/containers. This only happens on older kernels, e.g. 3.13; it does not happen on newer kernels, e.g. 3.19, and it does not happen with Docker 1.8.

This is a repro carefully crafted to use only the simple alpine image, nothing fancy.

docker version:

Client:
 Version:      1.9.0
 API version:  1.21
 Go version:   go1.4.3
 Git commit:   76d6bc9
 Built:        Tue Nov  3 19:20:09 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.9.0
 API version:  1.21
 Go version:   go1.4.3
 Git commit:   76d6bc9
 Built:        Tue Nov  3 19:20:09 UTC 2015
 OS/Arch:      linux/amd64

docker info:

Containers: 1
Images: 93
Server Version: 1.9.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 95
 Dirperm1 Supported: false
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.13.0-55-generic
Operating System: Ubuntu 14.04.2 LTS
CPUs: 1
Total Memory: 490 MiB
Name: weave-gs-01
ID: LLWG:CZWC:AHZP:EA5W:5MG5:5IOC:C6GD:JKBJ:ZI2V:FMFT:ZNXG:G2OW
WARNING: No swap limit support

uname -a:

Linux weave-gs-01 3.13.0-55-generic #92-Ubuntu SMP Sun Jun 14 18:32:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Environment details (AWS, VirtualBox, physical, etc.):
VirtualBox VM in the above example; get identical symptoms from stock GCE RHEL 7.1.

How reproducible:
100% on older kernel; does not happen on newer kernel.

Steps to Reproduce:

$ docker run --name c1 -tdi alpine /bin/sh
3b1a8ae179c72c2b411e65a8e4099b61fbfd91f1843f86594bea51f56ca40766
$ docker run --name c2 -tdi -v /var/lib/docker/containers:/var/lib/docker/containers alpine /bin/sh
1938a9588bfd4708f82294ab9b2be509b6053a9802482565916ab403ae1fe729
$ docker rm -f c1

Actual Results:

Error response from daemon: Unable to remove filesystem for 3b1a8ae179c72c2b411e65a8e4099b61fbfd91f1843f86594bea51f56ca40766: remove /var/lib/docker/containers/3b1a8ae179c72c2b411e65a8e4099b61fbfd91f1843f86594bea51f56ca40766/shm: device or resource busy
Error: failed to remove containers: [c1]

Expected Results:

c1

Additional info:
This is similar to #17823, although it only hits on older kernels.
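
To see which process is still holding the mount, one rough diagnostic (a sketch; the grep pattern just reuses the container ID from the repro above) is to search every process's mount table for the container's shm path:

$ sudo grep -l '3b1a8ae179c7.*shm' /proc/[0-9]*/mountinfo
# every file listed belongs to a process whose mount namespace still references that shm mount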

jeinwag commented Mar 7, 2016

Just wanted to say thank you for identifying this issue. It has been driving me nuts for the last couple of days!

GordonTheTurtle commented Mar 28, 2016

USER POLL

The best way to get notified of updates is to use the Subscribe button on this page.

Please don't use "+1" or "I have this too" comments on issues. We automatically
collect those comments to keep the thread short.

The people listed below have upvoted this issue by leaving a +1 comment:

@allencloud
@yank1

dsteinkopf commented May 9, 2016

I think I had the same problem (as far as I can tell). I had it on different kernels, different Docker versions, and on both aufs and devicemapper.
Since switching to overlay, it seems to work fine now.

siddharthist commented May 14, 2016

I've seen this with overlayfs and Linux 3.10, so we might be able to isolate it to older kernels, but probably not to storage drivers.

siddharthist added a commit to mantl/mantl that referenced this issue May 14, 2016

martin-helmich added a commit to martin-helmich/salt-microservices that referenced this issue Jun 27, 2016

Run cAdvisor directly on host
See Docker issue #17902 [1] for more information.

  [1] moby/moby#17902

mtiadm commented Aug 16, 2016

We decided to update this summer from 1.6 to 1.12 and get the same error with the btrfs backend. We use an older kernel because other software requires it:
Linux node1 3.10.0-123.20.1.el7.x86_64 #1 SMP Thu Jan 29 18:05:33 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Is this going to be fixed?
Thank you

cpuguy83 commented Aug 16, 2016

@mtiadm It's not clear why this would happen; I cannot reproduce it locally.
I know some mount namespace behavior changed between 3.13 and 3.16 that stops the kernel from returning EBUSY. We'll have to look into it a bit more.

mtiadm commented Aug 17, 2016

Is it safe to remove the containers/{container_id} folder afterwards?
Do you know if there are any other folders that remain after the container is destroyed?

sakserv commented Jan 30, 2017

I appear to be encountering the same issue, resulting in quite a few "Dead" containers. Would deferred deletion/removal help allow rm to succeed instead of leaving "Dead" containers around?

Jan 28 07:33:29 foo.exmple.com dockerd[285110]: time="2017-01-28T07:33:28.693425032Z" level=error msg="Handler for DELETE /v1.24/containers/container_e124_1484953284061_0206_01_000002 returned error: Unable to remove filesystem for f9f78ff13093f423ba0f44ee5564275ac38f714d7e04554043c792fca5724031: remove /grid/0/docker/containers/f9f78ff13093f423ba0f44ee5564275ac38f714d7e04554043c792fca5724031/shm: device or resource busy"
Containers: 54
 Running: 26
 Paused: 0
 Stopped: 28
Images: 272
Server Version: 1.12.2
Storage Driver: devicemapper
 Pool Name: vg01-docker--pool
 Pool Blocksize: 524.3 kB
 Base Device Size: 274.9 GB
 Backing Filesystem: xfs
 Data file:
 Metadata file:
 Data Space Used: 1.642 TB
 Data Space Total: 5.63 TB
 Data Space Available: 3.988 TB
 Metadata Space Used: 217.8 MB
 Metadata Space Total: 16.98 GB
 Metadata Space Available: 16.76 GB
 Thin Pool Minimum Free Space: 563 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.107-RHEL7 (2015-12-01)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay bridge null host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 3.10.0-327.13.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 32
Total Memory: 251.6 GiB
Name: foo.example.com
ID: 6P36:RULK:MPTH:ZHJW:KYJE:J3FD:F77L:62AF:KZWE:BTIK:ASWT:AMWS
Docker Root Dir: /grid/0/docker
Debug Mode (client): false
Debug Mode (server): false
Username: hwxycloudro
Registry: https://index.docker.io/v1/
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Insecure Registries:
 127.0.0.0/8

mtiadm commented Jan 30, 2017

Hi,

I have a new job now and no longer deal with this, but as far as I can tell, removing the dead folders with a script seemed to be enough to keep the partition's inode usage from filling up.

I know that we used to launch each container with docker -v sharedspace:sharedspace.
We might have wanted to do what the Docker documentation suggests: mount a shared volume on each node and then use the --volumes-from option rather than -v directly (I don't know if it works any better, but I would have tried it if I had had more time there).

@sakserv: I guess that from the Docker team's point of view it's a waste of time to fix old-kernel behavior, especially for the open source version.

cpuguy83 commented Feb 2, 2017

@mtiadm this is a tricky one to solve; it just takes time to get the right solution in place.
For now, I would recommend making sure you don't use shared subtree mounts and that you do not mount "/var/lib/docker" into a container.
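
A quick way to check whether any running container does this (a hedged sketch using docker inspect; adjust the grep path if your Docker root dir is not the default):

$ docker ps -q | xargs docker inspect --format '{{ .Name }}: {{ range .Mounts }}{{ .Source }} {{ end }}' | grep /var/lib/docker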

mikesimons commented Feb 2, 2017

The best solution is not to use an old kernel (which means either don't use CentOS, or use elrepo kernels).

Personally I think it is a stretch to say that 3.10 is a supported kernel, as there are so many caveats like this. Even if you work around this one you will probably encounter more issues. There are some around networking that I don't recall, and we're pretty sure some random crashes we've seen are also due to the kernel.

If you really have to use CentOS and you can't use elrepo (IT policy or such), then as @cpuguy83 said the next best thing is to avoid mounting folders that already have bind mounts in them, as this is caused by a kernel bug that incorrectly tracks references to mounts within mounts; it's fixed in 3.19+ kernels IIRC.

Our biggest culprits were cadvisor in a container and logspout. We switched to the journald log driver so that we could get rid of logspout, and we moved cadvisor out of a container and started it as a systemd service instead. Please note that if any container has nested bind mounts, this will affect all containers on the system until the one holding the mounts is stopped.
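
For reference, switching the daemon default to the journald log driver can be done in /etc/docker/daemon.json followed by a daemon restart (a sketch; older packages may instead pass --log-driver on the daemon command line):

{
  "log-driver": "journald"
}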

sakserv commented Feb 2, 2017

@cpuguy83 @mikesimons - thanks for the responses, very helpful. Moving away from CentOS / using a different kernel has been a topic of discussion this week. We've had many issues that we believe are due to CentOS and its kernel.

A couple of follow-ups to make sure I understand the potential workaround.

  • Could mounting, for example, /app/logs and /app/logs/app_1_logs into the container cause this? Is that what is meant by shared subtree mounts, i.e. mounting either of those is fine, but not both? Given that the error we see is for the shm mount, is that the culprit? Any details on why the shm mount is needed in the container, or on how to get more control to work around this issue?
  • Any comment on DM deferred deletion/removal in this situation?
  • Finally, MountFlags=slave is what we set in the systemd unit file (a sketch of that setting follows below). Is there a better option regarding this issue? Does this even come into play?
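
For reference, this is roughly what that setting looks like as a systemd drop-in (a sketch; the drop-in path is just an example):

# /etc/systemd/system/docker.service.d/mount-flags.conf
[Service]
MountFlags=slave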

cpuguy83 commented Feb 2, 2017

@sakserv I do not think dm deferred deletion will help here.

Mounting /app/logs and /app/logs/app_1_logs should not do this.
Shared-subtree mounts are about propagation of mounts across mount namespaces.
For instance, /var/lib/docker/containers contains at least one mount per container (its /dev/shm). If you mount /var/lib/docker/containers into a container and then start a new container (even with no extra mounts), it may not be possible to remove that new container until the former container is stopped.
This is because the former container picks up the newly mounted shm of the new container and essentially holds onto that mount, even though it has been unmounted in another namespace.

This is a bit of a simplification; check out https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt for more details.
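
To see which propagation mode a given path has (a sketch; "shared" means mount and unmount events under it propagate to peer namespaces, "private" means they do not):

$ findmnt -T /var/lib/docker -o TARGET,PROPAGATION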

sakserv commented Feb 2, 2017

So perhaps adding my "me too" to this thread was premature.

We don't mount /grid/0/docker/containers/ (/var/lib/docker/containers) or any Docker daemon related directories into the container. All of our "device busy" errors on docker rm are for the shm device. Does this maybe point to a different issue?

Here are the mounts for an example container that is currently in a "Dead" state (paths slightly scrubbed, sorry for typos). Propagation is private for all of them.

    "Mounts": [
            {
                "Source": "/grid/0/usercache/root/appcache/application_1485243696039_4603/container_e126_1485243696039_4603_01_000003",
                "Destination": "/grid/0/usercache/root/appcache/application_1485243696039_4603/container_e126_1485243696039_4603_01_000003",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Source": "/grid/0/log/application_1485243696039_4603/container_e126_1485243696039_4603_01_000003",
                "Destination": "/grid/0/log/application_1485243696039_4603/container_e126_1485243696039_4603_01_000003",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Source": "/grid/0/local/usercache/root",
                "Destination": "/grid/0/local/usercache/root",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Source": "/sys/fs/cgroup/blkio/containers/container_e126_1485243696039_4603_01_000003",
                "Destination": "/private/cgroups/blkio",
                "Mode": "ro",
                "RW": false,
                "Propagation": "rprivate"
            },
            {
                "Source": "/sys/fs/cgroup/memory/containers/container_e126_1485243696039_4603_01_000003",
                "Destination": "/private/cgroups/memory",
                "Mode": "ro",
                "RW": false,
                "Propagation": "rprivate"
            },
            {
                "Source": "/sys/fs/cgroup/cpu,cpuacct/containers/container_e126_1485243696039_4603_01_000003",
                "Destination": "/private/cgroups/cpu",
                "Mode": "ro",
                "RW": false,
                "Propagation": "rprivate"
            },
            {
                "Source": "/sys/fs/cgroup",
                "Destination": "/sys/fs/cgroup",
                "Mode": "ro",
                "RW": false,
                "Propagation": "rprivate"
            },
            {
                "Source": "/grid/0/usercache/root/appcache/application_1485243696039_4603",
                "Destination": "/grid/0/usercache/root/appcache/application_1485243696039_4603",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],

Any additional pointers to help me better understand ways to work around this? I'll read the above docs after my next cup of coffee. :)

mikesimons commented Feb 2, 2017

@sakserv I can't tell if any of those directories contain mounts but nothing jumps out. I'd just reiterate that you need to look at all containers on that machine though.

If anything, the dead ones are the least likely to be the culprits because in order for them to be in the dead state, the container that is holding the nested mount would still be running.

mikesimons commented Feb 2, 2017

@sakserv No promises, but this script tries to count references to the shm mount per container. 1 is normal; any more than that and you may be nesting mounts. For instance, here e5365f791a8d has no mounts and e9fd794979a0 mounts /var/lib/docker:

$ bash shmcounter.sh 
e9fd794979a0: 3
e5365f791a8d: 1

False positives are plausible given that this is a twenty-minute hack, but if it works for you maybe it will give you a pointer.

#!/bin/bash
# Count how many shm mounts are visible in each running container's mount namespace.

containers=($(docker ps --format="{{ .ID }}"))

for id in "${containers[@]}"; do
  # PID of the container's first process (second column of `docker top`)
  pid="$(docker top "$id" | tail -n+2 | head -n1 | awk '{ print $2 }')"
  # Number of /shm mounts seen from that process; 1 is normal
  count="$(findmnt --task "$pid" | grep -c /shm)"
  echo "$id: $count"
done

sakserv commented Feb 2, 2017

@mikesimons - thx again, that was a huge help! Sure enough, one of our guys recently deployed cadvisor, and I was unaware it was on all nodes. That script called out the cadvisor container as having 25 shm mounts on this node. We'll move away from running that in a container and see if the Dead containers go away. Again, thank you very much for your time!

mikesimons commented Feb 2, 2017

@sakserv Awesome. Glad it was time well spent :)

jeanpralo commented Apr 11, 2017

Hey there,

I have been spending a bit of time trying to debug this issue and can't really find any workaround in the info above. I have no container mounting /var/lib/docker or any sub-directory, but the problem still happens; the kernel version is also quite recent, and I am using overlay as the storage driver.

When this problem happens in production, the only fix I have for now is to restart either the Docker daemon or the server.

docker version:

Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   60ccb22
 Built:        Thu Feb 23 11:02:43 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   60ccb22
 Built:        Thu Feb 23 11:02:43 2017
 OS/Arch:      linux/amd64
 Experimental: false

docker info

Containers: 95
 Running: 95
 Paused: 0
 Stopped: 0
Images: 20
Server Version: 17.03.0-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 977c511eda0925a723debdc94d09459af49d082a
runc version: a01dafd48bc1c7cc12bdb01206f9fea7dd6feb70
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-45-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 7.826 GiB
Name: server1
ID: RIIY:VZ2B:W5PL:HDLV:HVID:TRXF:NBBH:3MJD:FFFN:REMY:OAFE:ZGNV
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

uname -a:

Linux server1 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

ertanden commented May 12, 2017

I experience this issue too with the following configuration:

Docker Version: 17.03.1-ce
Kernel Version: 3.10.0-514.16.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)

We run "cadvisor" as well as some other services (eg: prometheus node-exporter) as Docker service. I would really like this bug to be resolved because I don't want to install all those services to the machines. Managing them with Docker is much easier...

cpuguy83 commented May 12, 2017

@ertanden the issue is that mounting the Docker root dir into the container running cadvisor ends up leaking all the mounts. There's not much we can do about this.

That said, on newer kernels this shouldn't be an issue.
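
If you do need to bind-mount the Docker root dir for something like cadvisor, one mitigation that is sometimes suggested (a hedged sketch; whether it helps depends on the kernel) is to request slave propagation on that mount so that unmounts on the host propagate into the container:

$ docker run ... -v /var/lib/docker:/var/lib/docker:ro,rslave ...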

JeremyBesson commented May 2, 2018

I have this issue with:

  • Docker version: 17.12-ce
  • Kernel version: 4.9.87-xxxx-std-ipv6-64
  • Operating System: Ubuntu 16.04
  • Storage driver: overlay2
  • Container "cadvisor"

Workaround:

Reboot the server and rm -Rf /var/lib/docker again.

cpuguy83 commented May 3, 2018

This is fixed in 17.12.1 and up. Thanks!

cpuguy83 closed this May 3, 2018
