Unable to remove filesystem - device or resource busy #20560

Closed
ceecko opened this Issue Feb 21, 2016 · 12 comments

ceecko commented Feb 21, 2016

Hi,

I started to experience the following error:

Unable to remove filesystem for bda61d0446fa7f54681b677641334407e04f8bb40d9b76540368987fbfe3937e: remove /var/lib/docker/containers/bda61d0446fa7f54681b677641334407e04f8bb40d9b76540368987fbfe3937e/shm: device or resource busy

The error is thrown by docker rm container, run after docker stop container. Both the stop and the rm are performed via the remote API.

This started after I added a custom bridge network with docker network create network-name, but I'm not sure whether the two are connected.

The container stays in the dead state until I run docker rm -f container.
What other information can I provide?

Docker info

Containers: 91
 Running: 85
 Paused: 0
 Stopped: 6
Images: 9
Server Version: 1.10.1
Storage Driver: devicemapper
 Pool Name: docker_pool-docker
 Pool Blocksize: 65.54 kB
 Base Device Size: 107.4 GB
 Backing Filesystem: ext4
 Data file:
 Metadata file:
 Data Space Used: 9.644 GB
 Data Space Total: 26.7 GB
 Data Space Available: 17.06 GB
 Metadata Space Used: 17.73 MB
 Metadata Space Total: 121.6 MB
 Metadata Space Available: 103.9 MB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Library Version: 1.02.107-RHEL7 (2015-12-01)
Execution Driver: native-0.2
Logging Driver: journald
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 3.10.0-327.10.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.389 GiB
Name: ip-172-31-25-29.eu-west-1.compute.internal
ID: ZUTN:S7TL:6JRZ:HG52:LDLZ:VR5Q:RWVV:IP7E:HOQ4:R55X:Z7AI:P63R
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Docker version

docker version
Client:
 Version:      1.10.1
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   9e83765
 Built:        Thu Feb 11 19:18:46 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.10.1
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   9e83765
 Built:        Thu Feb 11 19:18:46 2016
 OS/Arch:      linux/amd64

ceecko Feb 21, 2016

I tried a couple of config changes which may have caused this and it seems I have narrowed it down to cadvisor.

When cadvisor runs with the following config, it seems to trigger the aforementioned errors:

docker run \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  --detach=true \
  --name=cadvisor \
  --net prometheus-net \
  google/cadvisor:latest \
  -nosystemd \
  -docker_env_metadata_whitelist app_id \
  -docker_only \
  -housekeeping_interval 30s

However, when cadvisor is run with fewer volumes, the errors do not seem to occur:

docker run \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --name=cadvisor \
  --detach=true \
  --net prometheus-net \
  google/cadvisor:latest \
  -nosystemd \
  -docker_env_metadata_whitelist app_id \
  -docker_only \
  -housekeeping_interval 30s

thaJeztah Feb 22, 2016

Member

I think this has been reported in other issues, and could indeed be related to cAdvisor interfering with docker, by keeping certain mounts busy.
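
One way to check which processes still hold a container's mounts is to search each process's mountinfo file for the container ID. The following is a minimal shell sketch under that assumption; the function name find_mount_holders and its optional proc-root argument are illustrative and not part of Docker.

```shell
# find_mount_holders CONTAINER_ID [PROC_ROOT]
# Prints the PID of every process whose mount table mentions CONTAINER_ID.
# PROC_ROOT defaults to /proc; it is a parameter only to make the function
# easy to try against a copy of the files (hypothetical convenience).
find_mount_holders() {
  cid="$1"
  proc_root="${2:-/proc}"
  for f in "$proc_root"/[0-9]*/mountinfo; do
    # Skip processes that exited or whose mountinfo we may not read.
    [ -r "$f" ] || continue
    if grep -q -- "$cid" "$f" 2>/dev/null; then
      pid="${f%/mountinfo}"   # strip the trailing /mountinfo
      pid="${pid##*/}"        # keep only the last path component (the PID)
      echo "$pid"
    fi
  done
}

# Example (run as root so every process's mountinfo is readable):
#   find_mount_holders bda61d0446fa7f54681b677641334407e04f8bb40d9b76540368987fbfe3937e
```

Any PID printed for a process other than the container's own would be a candidate for whatever is keeping the shm mount busy, for example a monitoring container that bind-mounts /var/lib/docker.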

ceecko Feb 22, 2016

It was probably cadvisor, because the problem hasn't occurred for 16+ hours now, and it could easily be reproduced before.

I guess it was the following volume, --volume=/var/lib/docker/:/var/lib/docker:ro, even though I haven't tried adding it back.

thaJeztah Feb 22, 2016

Member

@ceecko let me close this issue for now, but ping me if you're still having this, then I'll reopen

@thaJeztah thaJeztah closed this Feb 22, 2016

arnoldbechtoldt Jun 30, 2016

I'm seeing this error sometimes when using docker rm -f on containers where (almost) no I/O happens at all (container command: cat).

buckett Oct 18, 2016

@thaJeztah I've just seen this today:

$ docker rm 299f70df1c318164b256a8f6d2b84c46a67f3b44fc0dc329db8457b96ffe6550
Error response from daemon: Unable to remove filesystem for 299f70df1c318164b256a8f6d2b84c46a67f3b44fc0dc329db8457b96ffe6550: remove /var/lib/docker/containers/299f70df1c318164b256a8f6d2b84c46a67f3b44fc0dc329db8457b96ffe6550/shm: device or resource busy

$ docker version
Client:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 23:54:00 2016
 OS/Arch:      darwin/amd64

Server:
 Version:      1.12.2
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   bb80604
 Built:        Tue Oct 11 18:19:35 2016
 OS/Arch:      linux/amd64

ceecko Oct 18, 2016

@buckett see #27381. It should help if you're using ext4 as backing filesystem
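
To see whether the ext4 condition applies, the backing filesystem can be read from the docker info output, which prints a "Backing Filesystem:" line like the one in the original report. A small sketch; the helper name backing_filesystem is hypothetical, and it assumes the storage driver reports that line at all.

```shell
# backing_filesystem: reads `docker info` output on stdin and prints the
# value of the "Backing Filesystem:" line (nothing if the line is absent).
backing_filesystem() {
  sed -n 's/^[[:space:]]*Backing Filesystem:[[:space:]]*//p'
}

# Example:
#   docker info 2>/dev/null | backing_filesystem
```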

thaJeztah Nov 14, 2016

Member

@harobed looks like there are a couple of open issues, e.g. #22877, #21704, and https://github.com/docker/docker/issues?utf8=✓&q=is%3Aissue%20is%3Aopen%20aufs%20busy. Although the symptoms are similar, there are actually various things that can influence this.

52385385 Apr 20, 2017

Hi all, this error occurs when the host is running under rather heavy load. When I stop the container and then remove it, instead of using 'docker rm -f', there are no such errors. I guess 'docker rm -f' makes the delay between 'docker stop' and 'docker rm' a little too short for some old machines, which cannot respond in time.
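
The stop-then-remove approach can be sketched as a small shell function that stops the container first and retries docker rm a few times with a pause in between. This is only an illustration of the workaround described above, not a fix for the underlying problem; the function name remove_with_retry and its default attempt count and delay are assumptions.

```shell
# remove_with_retry CONTAINER [ATTEMPTS] [DELAY_SECONDS]
# Stops CONTAINER, then tries `docker rm` up to ATTEMPTS times,
# sleeping DELAY_SECONDS after each failed attempt.
# Returns 0 once the removal succeeds, 1 otherwise.
remove_with_retry() {
  container="$1"
  attempts="${2:-5}"   # assumed default, tune for your host
  delay="${3:-2}"      # assumed default, in seconds

  docker stop "$container" >/dev/null || return 1

  i=1
  while [ "$i" -le "$attempts" ]; do
    if docker rm "$container" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example:
#   remove_with_retry my-container 5 2
```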

bartmeuris Sep 12, 2017

I have this issue with 17.05.0-ce on Ubuntu using aufs when I'm running a Telegraf container with access to the docker socket.

This is the docker-compose file I use for starting Telegraf (+Influx):

version: "2"
services:
  influxdb:
    image: influxdb:1.3-alpine
    restart: unless-stopped
    volumes:
    - ./influxdata:/var/lib/influxdb
    - ./influxdb.conf:/etc/influxdb/influxdb.conf:ro
    ports:
    - "8086:8086"
    command: -config /etc/influxdb/influxdb.conf
  telegraf:
    image: telegraf:1.3.5-alpine
    restart: unless-stopped
    hostname: my-host
    volumes:
    - ./telegraf.conf:/etc/telegraf/telegraf.conf:ro
    - /data/docker/haproxy/run:/haproxyrun
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /:/rootfs:ro
    environment:
      HOST_PROC: /rootfs/proc
      HOST_SYS: /rootfs/sys
      HOST_ETC: /rootfs/etc
      HOST_ROOT_FS: /rootfs
      HOST_MOUNT_PREFIX: /rootfs
    links:
    - influxdb:influxdb
    logging:
      options:
        max-size: "10m"
        max-file: "5"

When I stop telegraf, I can remove the container without issue.

vrothberg Sep 14, 2017

There is a PR fixing this issue: #34573
