Cannot delete dead containers with overlay2 on RHEL 7.4 #34538

Closed
kinghuang opened this issue Aug 16, 2017 · 12 comments

Comments

@kinghuang
commented Aug 16, 2017

Description

Red Hat Enterprise Linux 7.4 with kernel 3.10.0-693.el7.x86_64 adds support for overlay2. I've switched over to overlay2 in a non-production cluster and it seems to be working OK most of the time. However, dead containers cannot be deleted: the daemon reports that the underlying device or resource for the filesystem is busy.

Steps to reproduce the issue:

  1. Run Docker CE 17.06.0 on RHEL 7.4 with the storage-driver set to overlay2. Kernel check override is required.
    "storage-driver": "overlay2",
    "storage-opts": [
        "overlay2.override_kernel_check=true"
    ]
  2. Run a container.
  3. Cause the container to die somehow.
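
For reference, the fragment in step 1 belongs in /etc/docker/daemon.json. Below is a minimal sketch of a complete file, assuming no other daemon options are set. The helper name `render_daemon_json` is hypothetical and only prints the JSON; it does not touch the system.

```shell
# Hypothetical helper: print a complete daemon.json containing only the
# two settings from step 1. Assumes no other daemon options are needed.
render_daemon_json() {
    cat <<'EOF'
{
    "storage-driver": "overlay2",
    "storage-opts": [
        "overlay2.override_kernel_check=true"
    ]
}
EOF
}

# Usage (requires root; restart the daemon afterwards):
#   render_daemon_json | sudo tee /etc/docker/daemon.json
#   sudo systemctl restart docker
```

If the file already holds other settings, merge these keys into it by hand rather than overwriting it.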

Describe the results you received:

Dead containers (as opposed to containers that exited normally) cannot be removed.

[kchuang@itrmsdev04 ~]$ docker container rm exporters_cadvisor.r00sbmtripe03cv0q5cf2up9m.dibyla90r0bwfwyozvvcjd4ap
Error response from daemon: unable to remove filesystem for 716a13d7565a8a0231b9c12e99c3672cae5affc761382736dc4a30fa956fef35: remove /var/lib/docker/containers/716a13d7565a8a0231b9c12e99c3672cae5affc761382736dc4a30fa956fef35/shm: device or resource busy
[kchuang@itrmsdev04 ~]$ docker container rm wfs-testing_converis-db.1.c66jd182x2m9wns366xyewzi2
Error response from daemon: unable to remove filesystem for cb45e254c730fd4b8f42d74266edb255806c45b343e51e3ef3838dcc840c4a73: remove /var/lib/docker/containers/cb45e254c730fd4b8f42d74266edb255806c45b343e51e3ef3838dcc840c4a73/shm: device or resource busy

Describe the results you expected:

When I was on the overlay driver (not overlay2), I didn't have any problems removing dead containers, or volumes associated with them.

Additional information you deem important (e.g. issue happens only occasionally):

My impression is that this behaviour is new since switching to the overlay2 driver.

This may be relevant to #34368.

Restarting the node allows the dead containers to be deleted.

Output of docker version:

Client:
 Version:      17.06.0-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:20:36 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.0-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:21:56 2017
 OS/Arch:      linux/amd64
 Experimental: true

Output of docker info:

Containers: 43
 Running: 19
 Paused: 0
 Stopped: 24
Images: 92
Server Version: 17.06.0-ce
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: gelf
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: r00sbmtripe03cv0q5cf2up9m
 Is Manager: false
 Node Address: 10.41.149.148
 Manager Addresses:
  10.41.149.137:2377
  10.41.149.138:2377
  10.41.149.139:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-693.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.4 (Maipo)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 19.44GiB
Name: itrmsdev04.ucalgary.ca
ID: MY6L:YHUP:NPUL:MOJ6:CU5D:KTLI:PALW:E7U4:5XKY:6HYE:5W6K:2TQX
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

10 node Docker swarm on RHEL 7.4.

@kinghuang changed the title from "Cannot delete dead containers with volumes with overlay2 on RHEL 7.4" to "Cannot delete dead containers with overlay2 on RHEL 7.4" on Aug 16, 2017
@kinghuang

Author

commented Aug 16, 2017

Here's the info for one of the dead containers.

[
    {
        "Id": "716a13d7565a8a0231b9c12e99c3672cae5affc761382736dc4a30fa956fef35",
        "Created": "2017-08-11T21:33:42.396003294Z",
        "Path": "/usr/bin/cadvisor",
        "Args": [
            "-logtostderr",
            "--housekeeping_interval=30s",
            "--global_housekeeping_interval=30s"
        ],
        "State": {
            "Status": "dead",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": true,
            "Pid": 0,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2017-08-11T21:33:43.303004151Z",
            "FinishedAt": "2017-08-15T20:59:11.262714095Z"
        },
        "Image": "sha256:73eceaae464de9390bdb9837544f151bf8babd5e008ecd15d7f2353afbed5220",
        "ResolvConfPath": "/var/lib/docker/containers/716a13d7565a8a0231b9c12e99c3672cae5affc761382736dc4a30fa956fef35/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/716a13d7565a8a0231b9c12e99c3672cae5affc761382736dc4a30fa956fef35/hostname",
        "HostsPath": "/var/lib/docker/containers/716a13d7565a8a0231b9c12e99c3672cae5affc761382736dc4a30fa956fef35/hosts",
        "LogPath": "/var/lib/docker/containers/716a13d7565a8a0231b9c12e99c3672cae5affc761382736dc4a30fa956fef35/716a13d7565a8a0231b9c12e99c3672cae5affc761382736dc4a30fa956fef35-json.log",
        "Name": "/exporters_cadvisor.r00sbmtripe03cv0q5cf2up9m.dibyla90r0bwfwyozvvcjd4ap",
        "RestartCount": 0,
        "Driver": "overlay2",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": null,
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {
                    "max-file": "10",
                    "max-size": "200k"
                }
            },
            "NetworkMode": "default",
            "PortBindings": {
                "8080/tcp": [
                    {
                        "HostIp": "",
                        "HostPort": "8081"
                    }
                ]
            },
            "RestartPolicy": {
                "Name": "",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": null,
            "DnsOptions": null,
            "DnsSearch": null,
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": null,
            "DeviceCgroupRules": null,
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": -1,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "Mounts": [
                {
                    "Type": "bind",
                    "Source": "/",
                    "Target": "/rootfs",
                    "ReadOnly": true
                },
                {
                    "Type": "bind",
                    "Source": "/var/run",
                    "Target": "/var/run"
                },
                {
                    "Type": "bind",
                    "Source": "/sys",
                    "Target": "/sys",
                    "ReadOnly": true
                },
                {
                    "Type": "bind",
                    "Source": "/var/lib/docker",
                    "Target": "/var/lib/docker",
                    "ReadOnly": true
                }
            ]
        },
        "GraphDriver": {
            "Data": null,
            "Name": "overlay2"
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/",
                "Destination": "/rootfs",
                "Mode": "",
                "RW": false,
                "Propagation": "rprivate"
            },
            {
                "Type": "bind",
                "Source": "/var/run",
                "Destination": "/var/run",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "bind",
                "Source": "/sys",
                "Destination": "/sys",
                "Mode": "",
                "RW": false,
                "Propagation": "rprivate"
            },
            {
                "Type": "bind",
                "Source": "/var/lib/docker",
                "Destination": "/var/lib/docker",
                "Mode": "",
                "RW": false,
                "Propagation": "rprivate"
            }
        ],
        "Config": {
            "Hostname": "716a13d7565a",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "8080/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "GLIBC_VERSION=2.23-r3"
            ],
            "Cmd": [
                "--housekeeping_interval=30s",
                "--global_housekeeping_interval=30s"
            ],
            "Image": "google/cadvisor:v0.26.1@sha256:e26667cecd359ef2b5f5aa86b279146438e4f427eea5eeffc7de8d190cf7b770",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": [
                "/usr/bin/cadvisor",
                "-logtostderr"
            ],
            "OnBuild": null,
            "Labels": {
                "com.docker.stack.namespace": "exporters",
                "com.docker.swarm.node.id": "r00sbmtripe03cv0q5cf2up9m",
                "com.docker.swarm.service.id": "a9abl8l000pl1q6g2rx79x1q6",
                "com.docker.swarm.service.name": "exporters_cadvisor",
                "com.docker.swarm.task": "",
                "com.docker.swarm.task.id": "dibyla90r0bwfwyozvvcjd4ap",
                "com.docker.swarm.task.name": "exporters_cadvisor.r00sbmtripe03cv0q5cf2up9m.dibyla90r0bwfwyozvvcjd4ap"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "e39932c276599a948f145337315f9e3539c75a330b0fe38deb5d221b57ea2694",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/e39932c27659",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "exporters_default": {
                    "IPAMConfig": {
                        "IPv4Address": "10.0.1.4"
                    },
                    "Links": null,
                    "Aliases": [
                        "716a13d7565a"
                    ],
                    "NetworkID": "aj4a4ca85505wubo5hua22z8i",
                    "EndpointID": "",
                    "Gateway": "",
                    "IPAddress": "",
                    "IPPrefixLen": 0,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "",
                    "DriverOpts": null
                }
            }
        }
    }
]
@yunghoy


commented Aug 18, 2017

9b5c2b171adf 9ce41916ff3f "/bin/sh -c 'npm i..." 2 days ago Dead peaceful_gates
38e7d3db20ba 6b1ae39c303d "/bin/sh -c '#(nop..." 2 days ago Dead adoring_meninsky
77ae54844a71 da5ba506028d "/bin/sh -c '#(nop..." 2 days ago Dead keen_archimedes
630c95977d0f 1b294cdd4e48 "/bin/sh -c 'npm i..." 2 days ago Dead sleepy_bell
09eba4a99339 b91604cad8d1 "/bin/sh -c 'npm i..." 2 days ago Dead blissful_cori
f4603cff2bd9 9277951c9382 "/bin/sh -c 'npm i..." 4 days ago Dead zen_bartik
8591c198cc91 87fa9f7cb078 "/bin/sh -c 'npm i..." 4 days ago Dead determined_swirles
7cc8e199aa53 376027621766 "/bin/sh -c '#(nop..." 4 days ago Dead wizardly_payne
53f19126309d df95179ae3f5 "/bin/sh -c '#(nop..." 4 days ago Dead sad_keller
9d63830bc7ed dd6d4dd80d19 "/bin/sh -c 'npm i..." 7 days ago Dead naughty_beaver
3be0ff31d187 37389d43598f "/bin/sh -c 'npm i..." 7 days ago Dead heuristic_euler
fa7460f2c5ef 8e77a885fd37 "/bin/sh -c 'npm i..." 7 days ago Dead sad_ptolemy
ca9d0341f72b 1af9160b8ecf "/bin/sh -c '#(nop..." 9 days ago Dead stupefied_lamport
26aaee6fb5f1 daa0bad67251 "/bin/sh -c 'npm i..." 9 days ago Dead dreamy_bose
a4cd7a8202fa 747a7d437869 "/bin/sh -c '#(nop..." 10 days ago Dead modest_wiles
7a84ebfde2a9 b26a04367a03 "/bin/sh -c '#(nop..." 10 days ago Dead ecstatic_mccarthy
e604bb347ac4 89dd1f5af033 "/bin/sh -c '#(nop..." 10 days ago Dead gracious_bardeen
f5989dd0b37c 532692f1e606 "/bin/sh -c 'npm i..." 10 days ago Dead dazzling_noyce
f7d28644d94c b41816db1a44 "/bin/sh -c '#(nop..." 2 weeks ago Dead vigilant_raman
95ed20399493 5a272ff9a3fa "/bin/sh -c 'npm i..." 2 weeks ago Dead cocky_williams
1493df949c60 b280251739d4 "/bin/sh -c 'npm i..." 2 weeks ago Dead gifted_lumiere
d029ea95bd9d 064bf6dbed53 "/bin/sh -c 'npm i..." 2 weeks ago Dead practical_poitras
5b53164c72e5 36b7619bd13f "/bin/sh -c 'npm i..." 2 weeks ago Dead priceless_neumann
c7c9f33c7cbf e5b6792a1775 "/bin/sh -c '#(nop..." 2 weeks ago Dead clever_dubinsky
34da6b1c1e3a d.sphd.io/mongo:latest "/entrypoint.sh mo..." 2 weeks ago Up 2 weeks 27017/tcp common_mongo_1
2efd986a9138 a5bdcfa7fb83 "/bin/sh -c 'npm i..." 2 weeks ago Dead focused_kepler
d50c864214ca e8db0d1b00cc "/bin/sh -c 'npm i..." 2 weeks ago Dead priceless_jones
fc001f4e4c7a b5edfb14bdc3 "/bin/sh -c 'npm i..." 2 weeks ago Dead focused_lewin
3b9f9f1b9ded 79a7e9c63799 "/bin/sh -c '#(nop..." 2 weeks ago Dead dreamy_johnson
a079622c766e 47122876deb2 "/bin/sh -c 'npm i..." 2 weeks ago Dead elastic_lamport
6d051bbd69b1 a52a4323abf1 "/bin/sh -c '#(nop..." 2 weeks ago Dead happy_hodgkin
273f853910e0 ffd62b597cf7 "/bin/sh -c '#(nop..." 2 weeks ago Dead jovial_goldberg
4b65da7cebcc c0cda60a64d3 "/bin/sh -c '#(nop..." 2 weeks ago Dead amazing_fermi
02817a172fd7 581d2e5bfd5b "/bin/sh -c 'npm i..." 3 weeks ago Dead

@yunghoy


commented Aug 18, 2017

I cannot delete those even with "docker system prune".
The only way is to delete everything in the "/var/lib/docker/containers" directory.
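
A narrower alternative to wiping the directory is to retry removal of only the containers in the Dead state. This is a rough sketch, and it only helps once the busy mount has been released (for example after a node restart, as reported earlier in this thread). The function name `remove_dead_containers` is hypothetical; `docker ps --filter status=dead` and `docker rm` are standard CLI commands.

```shell
# Hypothetical helper: try to remove every container whose status is "dead".
# Failures are reported on stderr and do not stop the loop.
remove_dead_containers() {
    docker ps -a --filter status=dead --format '{{.ID}}' |
        while IFS= read -r id; do
            # docker rm prints the ID on success; on failure note it and go on.
            docker rm "$id" || echo "still busy: $id" >&2
        done
}

# Usage:
#   remove_dead_containers
```

Containers that are still reported as busy after this most likely have a mount held somewhere else.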

@cpuguy83

Contributor

commented Aug 25, 2017

This is new behavior in 17.06: where 17.03 just ignored the error and leaked data, 17.06 no longer ignores it. (Note that 17.03 only ignored the error if you used docker rm -f.)

Fundamentally what's happened is a mount has leaked into another mount namespace and it's preventing removal.
RHEL 7.4 is supposed to have a fix for this, but I believe it is gated by a kernel setting in /proc, namely /proc/sys/fs/may_detach_mounts. This needs to be set to 1, which may clear up the issue. I have not tested it and only recently found out about this new knob (note that on proper upstream kernels there is no such option and it works as expected).
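
One way to see where a mount has leaked is to search each process's mountinfo for the container ID. The following is a minimal sketch, not a definitive diagnostic. The names `find_mount_holders` and `PROC_ROOT` are hypothetical; `PROC_ROOT` exists only so the function can be pointed at a fake tree and defaults to the real /proc.

```shell
# Hypothetical helper: list PIDs whose mount namespace still references a
# container's mounts, by scanning <proc_root>/<pid>/mountinfo for the ID.
find_mount_holders() {
    container_id="$1"
    proc_root="${PROC_ROOT:-/proc}"
    # grep -l prints the names of mountinfo files that mention the ID;
    # errors from processes that exit mid-scan are discarded.
    grep -l -- "$container_id" "$proc_root"/[0-9]*/mountinfo 2>/dev/null |
        while IFS= read -r path; do
            # Path looks like <proc_root>/<pid>/mountinfo; keep only the pid.
            pid_dir="${path%/mountinfo}"
            echo "${pid_dir##*/}"
        done
}

# Usage (typically as root so every process is readable):
#   find_mount_holders 716a13d7565a8a0231b9c12e99c3672cae5affc761382736dc4a30fa956fef35
```

Any PID printed belongs to a process whose mount table still lists a path containing that ID, which is a candidate for what keeps the shm directory busy.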

@cpuguy83

Contributor

commented Aug 25, 2017

Also, if you add MountFlags=slave to docker's systemd unit (you should do this via drop-in, not editing directly), it should help clear up these issues as well.
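
A drop-in for this could look roughly like the sketch below. It assumes the unit is named docker.service; the file name mount-flags.conf is arbitrary, and the function name `write_mountflags_dropin` and the `DROPIN_DIR` override are hypothetical, included only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper: write a systemd drop-in that sets MountFlags=slave.
# DROPIN_DIR defaults to the standard drop-in directory for docker.service.
write_mountflags_dropin() {
    dropin_dir="${DROPIN_DIR:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nMountFlags=slave\n' > "$dropin_dir/mount-flags.conf"
}

# Usage (as root), then reload systemd and restart the daemon:
#   write_mountflags_dropin
#   systemctl daemon-reload
#   systemctl restart docker
```

Using a drop-in keeps the change separate from the packaged unit file, so a package upgrade does not overwrite it.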

@cpuguy83 self-assigned this on Aug 25, 2017
@kinghuang

Author

commented Aug 25, 2017

Ah, interesting. I wasn't aware of the /proc/sys/fs/may_detach_mounts option. Will the docker-ce package add a sysctl conf to set fs.may_detach_mounts?

@cpuguy83

Contributor

commented Aug 25, 2017

@kinghuang I haven't had a chance to test it yet, but I expect we'll add it here in moby/moby and it'll trickle down into docker-ce.

@nikolai-derzhak-distillery


commented Aug 30, 2017

fs.may_detach_mounts=1 fixed the issue for me on Red Hat 7.4.

@mgrybyk


commented Sep 14, 2017

I'm sorry, can you please provide me with the exact steps to fix this issue?

  1. Is it required to modify /etc/docker/daemon.json?
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
  2. Should I append this line to /etc/sysctl.conf?
  fs.may_detach_mounts=1
  3. Should I install runc?

P.S. Is this going to be fixed in the next CE release?

Thank you!

@trapier


commented Sep 22, 2017

@mgrybyk fs.may_detach_mounts should go into a file under /etc/sysctl.d/. For example:

# set
echo fs.may_detach_mounts=1 | sudo tee /etc/sysctl.d/may_detach_mounts.conf
sudo sysctl -p  /etc/sysctl.d/may_detach_mounts.conf
# confirm set. expected output: `fs.may_detach_mounts = 1`
sysctl fs.may_detach_mounts
@xiaods

Contributor

commented Mar 26, 2018

Can anyone confirm the result?

@cpuguy83

Contributor

commented Mar 26, 2018

The root cause of this issue is fixed by a succession of patches, starting with Docker 17.12.1 for any supported kernel:

Thanks all!

@cpuguy83 closed this on Mar 26, 2018
ltalirz added a commit to marvel-nccr/ansible-role-simulationbase that referenced this issue May 10, 2018
upgrade docker to fix issue with removal of docker container
at the very end
moby/moby#34538 (comment)