0.7.0 fails to remove containers #2714

Closed
ndarilek opened this Issue Nov 15, 2013 · 133 comments

@ndarilek
Contributor

ndarilek commented Nov 15, 2013

Script started on Fri 15 Nov 2013 04:28:56 PM UTC
root@thewordnerd:~# uname -a
Linux thewordnerd.info 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
root@thewordnerd:~# docker version
Client version: 0.7.0-rc5
Go version (client): go1.2rc4
Git commit (client): 0c38f86-dirty
Server version: 0.7.0-rc5
Git commit (server): 0c38f86-dirty
Go version (server): go1.2rc4
Last stable version: 0.6.6, please update docker
root@thewordnerd:~# docker rm `docker ps -a -q`
Error: Cannot destroy container ba8a9ec006c8: Driver devicemapper failed to remove root filesystem ba8a9ec006c8e38154bd697b3ab4810ddb5fe477ed1cfb48ac3bd604a5a59495: Error running removeDevice
Error: Cannot destroy container d2f56763e65a: Driver devicemapper failed to remove root filesystem d2f56763e65a66ffccb3137017dddad745e921f4bdaa084f6b4a0d6407ec030a: Error running removeDevice
Error: Cannot destroy container c22980febe50: Driver devicemapper failed to remove root filesystem
...

@crosbymichael

Contributor

crosbymichael commented Nov 15, 2013

Did you switch drivers from aufs to devicemapper manually without removing /var/lib/docker?

@ndarilek

Contributor

ndarilek commented Nov 15, 2013

Not that I'm aware of. How would I find out?
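
One way to check (a minimal sketch, assuming the default /var/lib/docker root):

docker info | grep -i driver      # shows which storage driver the daemon is currently using
ls /var/lib/docker                # leftover aufs/ and devicemapper/ directories side by side would suggest a past switch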

@PierreR

PierreR commented Dec 3, 2013

As a note I have had the exact same problem.

docker version
Client version: 0.7.0
Go version (client): go1.2rc5
Git commit (client): 0d078b6
Server version: 0.7.0
Git commit (server): 0d078b6
Go version (server): go1.2rc5
Last stable version: 0.7.0

I rebooted the host OS and the problem disappeared. It happened after a docker kill or docker stop (I don't remember which) on the container.

@ghristov

ghristov commented Dec 9, 2013

I have the same problem, and it appears after both docker kill and docker stop. In my view the problem is that the container's filesystem is still mounted, and when deleting, the driver does not unmount it. Whose responsibility that is (rm or kill/stop) is debatable.

Indeed, the problem is fixed after a restart because everything is unmounted and nothing is left in a locked state.

@philips

Contributor

philips commented Dec 12, 2013

I am encountering this with 0.7.1 also

@philips

Contributor

philips commented Dec 13, 2013

Hrm, and switching to the device mapper backend doesn't really help either. Got this just now:

Error: Cannot destroy container keystone-1: Driver devicemapper failed to remove root filesystem 1d42834e2e806e0fd0ab0351ae504ec9a98e0a74be337fc2158a516ec8d6f36b: Error running removeDevice

@philips

Contributor

philips commented Dec 13, 2013

@crosbymichael It seems like this isn't just about aufs. devicemapper is getting similar errors. #2714 (comment)

@zhemao

zhemao commented Jan 4, 2014

I'm getting this still on 0.7.3 using devicemapper

Client version: 0.7.3
Go version (client): go1.2
Git commit (client): 8502ad4
Server version: 0.7.3
Git commit (server): 8502ad4
Go version (server): go1.2
Last stable version: 0.7.3

However, the problem seems to resolve itself if you restart the docker server. If it happens again, I'll try running lsof on the mount to see what process is causing it to be busy.
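
A minimal sketch of that check, assuming the default devicemapper mount path (the container ID is a placeholder):

CID=<full-container-id>                      # the 64-character ID from the error message
MNT=/var/lib/docker/devicemapper/mnt/$CID
grep "$CID" /proc/mounts                     # is the container rootfs still mounted?
lsof +D "$MNT"                               # which processes hold files open under it?
fuser -vm "$MNT"                             # alternative view of processes using the mount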

@Chris00

Contributor

Chris00 commented Jan 4, 2014

I have the same problem.

$ docker version
Client version: 0.7.3
Go version (client): go1.2
Git commit (client): 8502ad4
Server version: 0.7.3
Git commit (server): 8502ad4
Go version (server): go1.2
Last stable version: 0.7.3
$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS               NAMES
538ab4938d5d        3c23bb541f74        /bin/sh -c apt-get -   12 minutes ago      Exit 100                                agitated_einstein   
bdfbff084c4d        3c23bb541f74        /bin/sh -c apt-get u   14 minutes ago      Exit 0                                  sharp_torvalds      
95cea6012869        6c5a63de23d9        /bin/sh -c echo 'for   14 minutes ago      Exit 0                                  romantic_lovelace 
$  mount|grep 538ab4938d5d
/dev/mapper/docker-8:3-2569260-538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278 on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278 type ext4 (rw,relatime,discard,stripe=16,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/.dockerinit type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/.dockerenv type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/etc/resolv.conf type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/etc/hostname type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/etc/hosts type ext4 (rw,relatime,errors=remount-ro,data=ordered)
# lsof /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278
lsof: WARNING: can't stat() ext4 file system /opt/docker/devicemapper/mnt/95cea6012869809320920019f2a2732165915281b79538a84f3ee3adddcbc783/rootfs/.dockerinit (deleted)
      Output information may be incomplete.
lsof: WARNING: can't stat() ext4 file system /opt/docker/devicemapper/mnt/bdfbff084c4d96b6817eb7ccb812a608e4a6a45cb4c06d423e26364b45b59c97/rootfs/.dockerinit (deleted)
      Output information may be incomplete.
lsof: WARNING: can't stat() ext4 file system /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/.dockerinit (deleted)
      Output information may be incomplete.
# ls -l /opt/docker/devicemapper/mnt/95cea6012869809320920019f2a2732165915281b79538a84f3ee3adddcbc783/rootfs/.dockerinit
-rwx------ 0 root root 14406593 Jan  4 21:05 /opt/docker/devicemapper/mnt/95cea6012869809320920019f2a2732165915281b79538a84f3ee3adddcbc783/rootfs/.dockerinit*

@Chris00

Contributor

Chris00 commented Jan 4, 2014

Restarting the daemon does not solve the problem.

@lzyy

lzyy commented Jan 5, 2014

same problem:

limboy@gintama:~$ docker ps -a 
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
a7760911ecac        ubuntu:12.04        bash                About an hour ago   Exit 137                                backstabbing_mccarthy   

limboy@gintama:~$ docker rm a77
Error: Cannot destroy container a77: Driver devicemapper failed to remove root filesystem a7760911ecacb93b1c530d6a0bde4deeb79ef0cbf901488cb55df2f2ca02207a: device or resource busy
2014/01/05 16:04:21 Error: failed to remove one or more containers

limboy@gintama:~$ docker info
Containers: 1
Images: 5
Driver: devicemapper
 Pool Name: docker-202:0-93718-pool
 Data file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata file: /var/lib/docker/devicemapper/devicemapper/metadata
 Data Space Used: 1079.8 Mb
 Data Space Total: 102400.0 Mb
 Metadata Space Used: 1.3 Mb
 Metadata Space Total: 2048.0 Mb
WARNING: No memory limit support
WARNING: No swap limit support

Restarting the host doesn't solve the problem.

Then when I run docker run -i ubuntu bash it doesn't enter interactive mode, just a blank screen.

@ptmt

ptmt commented Jan 7, 2014

+1.

$ docker version
Client version: 0.7.3
Go version (client): go1.2
Git commit (client): 8502ad4
Server version: 0.7.3
Git commit (server): 8502ad4
Go version (server): go1.2
Last stable version: 0.7.3

$ docker rm d33
2014/01/07 05:55:57 DELETE /v1.8/containers/d33
[error] mount.go:11 [warning]: couldn't run auplink before unmount: exit status 116
[error] api.go:1062 Error: Cannot destroy container d33: Driver aufs failed to remove root filesystem d3312bcdeb7dc241d4
870100beadfe94d6884904229cc50d66aacd66ab16e064: stale NFS file handle
[error] api.go:87 HTTP Error: statusCode=500 Cannot destroy container d33: Driver aufs failed to remove root filesystem
d3312bcdeb7dc241d4870100beadfe94d6884904229cc50d66aacd66ab16e064: stale NFS file handle
Error: Cannot destroy container d33: Driver aufs failed to remove root filesystem d3312bcdeb7dc241d4870100beadfe94d68849
04229cc50d66aacd66ab16e064: stale NFS file handle
2014/01/07 05:55:57 Error: failed to remove one or more containers

@vjeantet

vjeantet commented Jan 11, 2014

same here

Client version: 0.7.5
Go version (client): go1.2
Git commit (client): c348c04
Server version: 0.7.5
Git commit (server): c348c04
Go version (server): go1.2
Last stable version: 0.7.5

$docker rm 9f017e610f24
2014/01/11 23:03:11 DELETE /v1.8/containers/9f017e610f24
[error] api.go:1064 Error: Cannot destroy container 9f017e610f24: Driver devicemapper failed to remove root filesystem 9f017e610f2401541558a93b5c3beafc2e20586c766dfe49e521bcdf878ebe3a: device or resource busy
[error] api.go:87 HTTP Error: statusCode=500 Cannot destroy container 9f017e610f24: Driver devicemapper failed to remove root filesystem 9f017e610f2401541558a93b5c3beafc2e20586c766dfe49e521bcdf878ebe3a: device or resource busy
Error: Cannot destroy container 9f017e610f24: Driver devicemapper failed to remove root filesystem 9f017e610f2401541558a93b5c3beafc2e20586c766dfe49e521bcdf878ebe3a: device or resource busy
2014/01/11 23:03:11 Error: failed to remove one or more containers

@LordFPL

LordFPL commented Jan 15, 2014

Same problem here with 0.7.5.
"Resolved" with a lazy umount:
for fs in $(cat /proc/mounts | grep '.dockerinit\040(deleted)' | awk '{print $2}' | sed 's/\/rootfs\/.dockerinit\040(deleted)//g'); do umount -l $fs; done

(or just umount -l on the affected filesystem)

The whole question is why some filesystems end up in the "/rootfs/.dockerinit\040(deleted)" state.
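
Spelled out step by step, the same workaround looks roughly like this (a sketch only; it assumes the devicemapper mount layout shown earlier in the thread, so adjust the paths if your Docker root lives elsewhere):

# Lazy-unmount every container mount whose .dockerinit shows up as deleted in /proc/mounts.
awk '$2 ~ /\.dockerinit.*deleted/ { print $2 }' /proc/mounts | while read -r m; do
    umount -l "${m%/rootfs/.dockerinit*}"    # strip the /rootfs/.dockerinit suffix and lazy-unmount the base mount
done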

@joelmoss

joelmoss commented Jan 15, 2014

I can confirm that this is an issue on 0.7.5

@vjeantet

vjeantet commented Jan 15, 2014

I don't know if it is related, but my Docker data was in /var/lib/docker, which was a symlink to /home/docker.

/home is a mount point.

Could containers' mount points sitting under a symlink to a mount point be the cause?
Since I told Docker to use /home/docker directly instead of /var/lib/docker, I haven't had this issue anymore.

@LordFPL

LordFPL commented Jan 15, 2014

I'm already using a different base directory. The problems may arise when the docker daemon is restarted without properly stopping containers... something goes wrong somewhere in the stop/start cycle when docker starts up again...

@tianon

Member

tianon commented Jan 16, 2014

+1 I've got three containers on my devicemapper machine now that I can't remove because their devices fail to be removed in devicemapper (and none of them are even mounted in /proc/mounts)

Also, nothing in dmesg, and the only useful daemon output is highly cryptic and not very helpful:

[debug] deviceset.go:358 libdevmapper(3): ioctl/libdm-iface.c:1768 (-1) device-mapper: remove ioctl on docker-8:3-43647873-f4985ed89768280bb537b88d9d779699c6858c45217742ea5a598d6db95abb31 failed: Device or resource busy
[debug] devmapper.go:495 [devmapper] removeDevice END
[debug] deviceset.go:574 Error removing device: Error running removeDevice
[error] api.go:1064 Error: Cannot destroy container f4985ed89768: Driver devicemapper failed to remove root filesystem f4985ed89768280bb537b88d9d779699c6858c45217742ea5a598d6db95abb31: Error running removeDevice
[error] api.go:87 HTTP Error: statusCode=500 Cannot destroy container f4985ed89768: Driver devicemapper failed to remove root filesystem f4985ed89768280bb537b88d9d779699c6858c45217742ea5a598d6db95abb31: Error running removeDevice
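
For what it's worth, the stuck device can at least be inspected from the device-mapper side (a sketch; the device name is the one from the error message above):

DEV=docker-8:3-43647873-f4985ed89768280bb537b88d9d779699c6858c45217742ea5a598d6db95abb31
dmsetup ls | grep docker     # list docker's device-mapper devices
dmsetup info "$DEV"          # an "Open count" greater than 0 means something still holds the device
dmsetup status "$DEV"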

@mriehl

mriehl commented Jan 16, 2014

+1 @vjeantet setting the docker base directory in /etc/default/docker instead of using a symlinked /var/lib/docker fixed these problems for me.
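
For reference, on Ubuntu that amounts to something like the following (a sketch, assuming the packaging that reads DOCKER_OPTS from /etc/default/docker; /home/docker is just an example path):

# /etc/default/docker
DOCKER_OPTS="-g /home/docker"    # point the daemon at a real directory instead of a symlinked /var/lib/docker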

@SamSaffron

SamSaffron commented Jan 17, 2014

+1, I've seen this as well; it's quite easy to repro. I'm recommending people only use aufs for now.

@mikesimons

mikesimons commented Jan 19, 2014

As a workaround I managed to successfully remove a container stuck in this fashion by renaming the offending DM device (using dmsetup rename), executing dmsetup wipe_table <stuck_id>, restarting docker and re-running docker rm.

You need to use the full DM id of the device, which is at the end of the error (e.g. docker-8:9-7880790-bc945261c1f97e7145604a4248e2c84535fb204c8e214fa394448e0b2dcd064a).

The stuck device also disappeared on reboot.

This was achieved after much messing about with dmsetup so it's plausible something I did in between was also required. YMMV but it worked for me.

Edit: Needed to restart docker and run wipe_table too
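
Put together, the workaround described above looks roughly like this (a sketch only; the device name is an example, and the exact restart command depends on your distro):

DEV=docker-8:9-7880790-bc945261c1f97e7145604a4248e2c84535fb204c8e214fa394448e0b2dcd064a   # full DM id from the error
dmsetup rename "$DEV" "${DEV}-stuck"    # move the stuck device out of the way
dmsetup wipe_table "${DEV}-stuck"       # replace its table with an error target so nothing maps to the thin device
service docker restart                  # restart the daemon
docker rm <container>                   # should now succeed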

@lgs

lgs commented Jan 19, 2014

... same problem with Docker version 0.7.6, build bc3b2ec

lsoave@basenode:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
53a9a8c4e29c        8dbd9e392a96        bash                17 minutes ago      Exit 0                                  thirsty_davinci     
lsoave@basenode:~$ docker rm 53a9a8c4e29c
Error: Cannot destroy container 53a9a8c4e29c: Driver aufs failed to remove root filesystem 53a9a8c4e29c2c99fdd8d5355833f07eca69cbfbefcd02915e267517111fbde8: device or resource busy
2014/01/19 20:38:50 Error: failed to remove one or more containers
lsoave@basenode:~$ 

After rebooting the host and running docker rm 53a9a8c4e29c again, it works. My env:

lsoave@basenode:~$ uname -a
Linux basenode 3.11.0-15-generic #23-Ubuntu SMP Mon Dec 9 18:17:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
lsoave@basenode:~$ docker -v
Docker version 0.7.6, build bc3b2ec

@mikesimons

mikesimons commented Jan 21, 2014

Happened again today; the machine went into suspend with docker containers running but did not come out of suspend cleanly. Needed a reboot.

Upon reboot the DM device for one of the containers that was running was stuck.

> uname -a
Linux mv 3.9.9-1-ARCH #1 SMP PREEMPT Wed Jul 3 22:45:16 CEST 2013 x86_64 GNU/Linux

Running docker 0.7.4 build 010d74e

@lgs

lgs commented Jan 21, 2014

@mikesimons ... do you remember the sequence of operations that brought you to the failure?

@kklepper

kklepper commented Jan 21, 2014

Solved

At least in my case -- look for yourself if you don't have the same cause.

I had created a MySQL server container myself, which worked fine. As I was puzzled about the size of the containers, I decided to create new MySQL containers based on the work of somebody else.

This was indeed very interesting, as I found that size may differ substantially even when the Dockerfile looks similar or even identical. For example, my first has nearly 700 MB:

kklepper/Ms           latest              33280c9a70a7        5 days ago          695.7 MB

The container based on dhrp/mysql is nearly half the size of mine, and it works equally well:

kklepper/mysqld       latest              49223549bf47        24 hours ago        359.8 MB

The 2nd example produced the above-mentioned error; I'll get to that in just a second.

When I tried to repeat my findings today, the image came out considerably larger with exactly the same Dockerfile, seemingly for no reason:

kklepper/mysqlda      latest              6162b0c95e8c        2 hours ago         374.4 MB

It was no problem to remove this container as well.

The next example introduced the problem, based on the 2nd result of my search https://index.docker.io/search?q=mysql: brice/mysql

As I had enhanced his approach, I couldn't see right away where the problem was, but diligent tracking down finally showed that in this case the offender was the command

VOLUME ["/var/lib/mysql", "/var/log/mysql"]

in the Dockerfile, which I had given no thought to.

Both directories exist in the container:

root@mysql:/# ls /var/lib/mysql
debian-5.5.flag  ib_logfile0  ib_logfile1  ibdata1  mysql  performance_schema  test  voxx_biz_db1
root@mysql:/# ls /var/log/mysql
error.log

But not in the host:

vagrant@precise64:~$ ls /var/lib/mysql
ls: cannot access /var/lib/mysql: No such file or directory
vagrant@precise64:~$ ls /var/log/mysql
ls: cannot access /var/log/mysql: No such file or directory

The VOLUME directive ties the volume of the container to the corresponding volume of the host (or rather the other way around).

Docker should throw an error if the directory does not exist in the host; by design it will "create" the directory in the container if it does not exist.

Unfortunately I'm not able to write a patch, but I'm sure many of you can.

@pwaller

Contributor

pwaller commented Jan 21, 2014

@kklepper, I don't follow the relationship between this issue and your post. From what I read, your "issue" was that you overlooked the behaviour of the VOLUME directive, but the issue at hand is that docker rm won't actually remove a stopped container in some circumstances, so I don't see in what sense this issue is solved?

@kklepper

kklepper commented Jan 21, 2014

Sorry for the confusion, I should have clarified that the VOLUME error caused the docker rm error, exactly as reported above. I found this thread because I searched for exactly this error message. Obviously nobody was able to track the conditions down yet.

@lgs

lgs commented Jan 21, 2014

@kklepper thanks for the detailed report.

Can you post here the Dockerfile that produces the fault in question, please?

I looked for you on the public index but found no kklepper user there, so I have no way to reproduce your containers:

kklepper/Ms           latest              33280c9a70a7        5 days ago          695.7 MB
kklepper/mysqld       latest              49223549bf47        24 hours ago        359.8 MB
kklepper/mysqlda      latest              6162b0c95e8c        2 hours ago         374.4 MB

I'd like to test what you're saying myself, because in my understanding of Docker's VOLUME, it shouldn't need the same path on the host side. They should just be mount points.

Moreover, unfortunately I cannot remember what I did to get to the point where I received the Cannot destroy container ... error I mentioned before.

That's why I was asking @mikesimons for the steps he went through.

@kklepper

kklepper commented Jan 21, 2014

@lgs Well, here it goes -- you will see some experimentation along the lines; actually none of this has anything to do with the problem the thread started with except the aside I included at the end.

Ms relies on scripts which manipulate MySQL and is derived from Supervisor which in turn is derived from quintenk/supervisor which installs Supervisor -- I just manipulate time information.

I first invoked Supervisor because at the time it seemed to be the only way for me to start Apache as a background container. Later I found that I could do without it, so it is no longer needed, but for completeness' sake I show it here anyway.

Supervisor can restart services automatically; after having had some experiences with this feature, I am not all that sure that this is a good idea.

Supervisor

FROM quintenk/supervisor
# 333 (original)/401 MB (mine)

MAINTAINER kklepper <gmcgmx@googlemail.com>
# Update the APT cache
RUN sed -i.bak 's/main$/main universe/' /etc/apt/sources.list
RUN apt-get update
RUN apt-get -y upgrade

# set timezone
RUN echo "Europe/Berlin" |  tee /etc/timezone
RUN dpkg-reconfigure --frontend noninteractive tzdata

WORKDIR /var/www

Ms

Here you see pwgen for automatic password generation; I certainly learned something, but don't know if this is a good idea as such either.

Disabling ENTRYPOINT shows that this same Dockerfile can be used to produce a container running in the background as well as a container with a shell for detailed inspection.

FROM kklepper/Supervisor
# 695 MB

MAINTAINER kklepper <gmcgmx@googlemail.com>

RUN apt-get update
RUN apt-get -y upgrade

RUN dpkg-divert --local --rename --add /sbin/initctl
RUN ln -s /bin/true /sbin/initctl

RUN DEBIAN_FRONTEND=noninteractive apt-get -y install mysql-server pwgen

ADD ./start.sh /start.sh
ADD ./supervisord.conf /etc/supervisor/conf.d/supervisord.conf

RUN chmod 755 /start.sh

EXPOSE 3306

CMD ["/bin/bash", "/start.sh", "&"]

#ENTRYPOINT ["/start.sh", "&"]

mysqld

Here the network setting in my.cnf is manipulated more intelligently than in my start.sh script. Also, the command used to invoke the MySQL server is different. I use the mysqld_safe script out of habit, without having looked at it. My application runs just as well with this version, saving about 300 MB.

FROM dhrp/mysql
# 360 MB
MAINTAINER kklepper <gmcgmx@googlemail.com>

RUN DEBIAN_FRONTEND=noninteractive apt-get -y install pwgen

RUN sed -i -e 's/127.0.0.1/0.0.0.0/' /etc/mysql/my.cnf
# to make sure we can connect from the network

ADD ./start.sh /start.sh
RUN chmod 755 /start.sh
RUN /start.sh
# will set privileges

EXPOSE 3306

CMD ["sh", "-c", "mysqld"]

mysqlda

This should be exactly the same, but the image comes out larger nevertheless.

FROM dhrp/mysql
# 374 MB
MAINTAINER kklepper <gmcgmx@googlemail.com>

RUN DEBIAN_FRONTEND=noninteractive apt-get -y install pwgen

RUN sed -i -e 's/127.0.0.1/0.0.0.0/' /etc/mysql/my.cnf
# to make sure we can connect from the network

ADD ./start.sh /start.sh
RUN chmod 755 /start.sh
RUN /start.sh
# will set privileges

EXPOSE 3306

CMD ["sh", "-c", "mysqld"]

start.sh

Mostly taken from somewhere else, advanced from there, for example with debugging information.

As you can see, I make sure that I see immediately what is going on when I enter the container with a shell; otherwise I can consult the log.

In my application, I connect via reader to do some tests.

#!/bin/bash

if [ ! -f /mysql-del.sql ]; then
    sed -i 's/bind-address/#bind-address/'  /etc/mysql/my.cnf
    # to be able to connect from the network

    /usr/bin/mysqld_safe &
    # start the server

    sleep 10s
    # give it some time

    MYSQL_PASSWORD=`pwgen -c -n -1 12`
    # generate random password

    echo -------------------
    echo mysql root password: $MYSQL_PASSWORD
    echo -------------------
    echo $MYSQL_PASSWORD > /mysql-root-pw.txt
    mysqladmin -uroot password $MYSQL_PASSWORD
    # use mysqladmin to set the password for root
    echo "14------------------- mysqladmin -uroot password $MYSQL_PASSWORD"

    PRIV="GRANT ALL PRIVILEGES ON *.* TO  'root'@'172.17.%' IDENTIFIED BY '$MYSQL_PASSWORD';"
    echo $PRIV > /mysql-grant.sql
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-grant.sql
    # make sure you can connect from within our network
    echo 21------------------- $PRIV

    PRIV="DELETE FROM mysql.user WHERE password = '';FLUSH PRIVILEGES;"
    echo $PRIV > /mysql-del.sql
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-del.sql
    # get rid of users without password (this is not really necessary, as we are in a safe box anyway)
    echo 26------------------- $PRIV

    PRIV="SELECT user, host, password FROM mysql.user;"
    echo $PRIV > /mysql-test.sql
    echo 30------------------- $PRIV
    echo ===============================================================================
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-test.sql
    # let's see if everything worked fine so far, just to test our approach
    echo ===============================================================================

    VX_PASSWORD=thisisnotnice
    PRIV="GRANT SELECT ON db1.* TO  'reader'@'172.17.%' IDENTIFIED BY '$VX_PASSWORD';"
    echo $PRIV > /mysql-reader.sql
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-reader.sql
    # we will use this user for reading
    echo 39------------------- $PRIV

    CRUD_PASSWORD=somethingelse
    PRIV="GRANT INSERT, SELECT, UPDATE, DELETE ON voxx_biz_db1.* TO  'crud'@'172.17.%' IDENTIFIED BY '$CRUD_PASSWORD';"
    echo $PRIV > /mysql-crud.sql
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-crud.sql
    # we might need to use it for manipulation
    echo 45------------------- $PRIV

    killall mysqld
    sleep 10s

fi

mysqld_safe &
# reload privileges

different approach -- as an aside

While searching for the offending Dockerfile, I stumbled across somebody who used that same container, but was wise enough to delete that line with the VOLUME instruction, although not giving an explanation for this decision -- we can safely assume that he stumbled into the same problem as all of us and found the solution for himself:
https://www.google.de/search?q=%22MAINTAINER+Brandon+Rice%22+mysql

He had a different problem, which I am inspecting at the moment. For this kind of testing, I use his approach, which has a lot of appeal to me (I wasn't aware of the -e option):

mysql -e "\
UPDATE mysql.user SET password = password('thisismypassword') WHERE user = 'root';\
FLUSH PRIVILEGES;\
DELETE FROM mysql.user WHERE password = '';\
FLUSH PRIVILEGES;\
GRANT ALL ON *.* to 'root'@'172.17.%' IDENTIFIED BY 'thisismypassword'; \
GRANT SELECT ON voxx_biz_db1.* TO  'reader'@'172.17.%' IDENTIFIED BY 'thisisnotnice';\
GRANT INSERT, SELECT, UPDATE, DELETE ON voxx_biz_db1.* TO  'crud'@'172.17.%' IDENTIFIED BY 'somethingelse';\
FLUSH PRIVILEGES;\
"
# a lot of instructions at once, nice to read

The problem he had came with a DROP DATABASE command.

ERROR 6 (HY000) at line 1: Error on delete of './my_db//db.opt' (Errcode: 1)

I can confirm that I get a similar error both on the Dockerfile and manually in the MySQL client:

ERROR 1010 (HY000): Error dropping database (can't rmdir './test', errno: 1)

Now what does that mean? Our MySQL-container tells us:

root@mysql:/# perror 6
OS error code   6:  No such device or address

root@mysql:/# perror 1010
MySQL error code 1010 (ER_DB_DROP_RMDIR): Error dropping database (can't rmdir '%-.192s', errno: %d)

Unfortunately, he doesn't tell us anything about the database he is trying to drop. My problem, upon which I stumbled by chance here, obviously is different.

I just tested the hypothesis that my problem might stem from using a volume from another container, so I dropped this link to my fully populated database, but the error is the same.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

mysql> drop database test;
ERROR 1010 (HY000): Error dropping database (can't rmdir './test', errno: 1)
mysql> create database test2;
Query OK, 1 row affected (0.02 sec)

mysql> drop database test2;
Query OK, 0 rows affected (0.00 sec)

Now this is funny. Obviously I have no problem dropping freshly created databases. I remember from earlier times that it was no problem to drop the standard empty database test that comes with the installation -- this was one of the first measures to take.

It would be interesting to test whether I could drop a linked-in database, but I will postpone that for later -- I should make sure that I have a backup of my living database in case that test succeeds.

MySQL versus memcached

For my little web-application test, I query a simple key-value table with a MySQL-compressed value of about 70 kB (680 kB uncompressed) in my app, called from XP/Firefox, and compare this with a memcached database in another container, also linked to the Apache/PHP container, all residing in a Vagrant/VirtualBox. The additional compression for memcached is done with PHP. I don't know yet whether memcached can do that on its own (yes, it can: Memcached::OPT_COMPRESSION).

Interestingly, when I built this Apache/PHP container about a week ago, I could apt-get php5-memcached, but when I wanted to do another test with my Apache/PHP setup yesterday, Ubuntu couldn't find that package anymore -- in fact, neither I nor Google could find it anywhere. How come?

Certainly I wasn't dreaming; the line

RUN DEBIAN_FRONTEND=noninteractive apt-get -y install apache2 libapache2-mod-php5 python-setuptools nano php5-mysql php5-memcached

worked out with no problems at all last week.
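
A quick, hedged way to check whether the package really disappeared from the configured repositories (not something the original setup ran) would be:

apt-cache policy php5-memcached        # shows the candidate version and source, or nothing at all
apt-cache search memcached | grep php  # lists whatever php/memcached packages the sources still offer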

~~MySQL have_query_cache: YES~~
~~Did mysql_query in 36.43 milliseconds~~
~~Did memcached in 1.25 milliseconds~~
~~Result: memcached appr. 30 times faster: 35.18 milliseconds saved: 36.43 :: 1.25~~

~~Did Memcached zipped in 0.76 milliseconds~~
~~Result: memcached zipped / mysql appr. 48 times faster: 35.67 milliseconds saved: 36.43 :: 0.76~~
~~Result: memcached / memcached zipped appr. 2 times faster: 0.49 milliseconds saved: 1.25:: 0.76~~

Sorry, had an error in my code. This is correct:

have_query_cache: YES
Did mysql_query in 40.69 milliseconds
Did memcached in 13.95 milliseconds
Result: memcached 2.92 times faster: 26.74 milliseconds saved: 40.69 :: 13.95

Did Memcached zipped in 10.27 milliseconds
Result: memcached zipped mysql 3.96 times faster: 30.42 milliseconds saved: 40.69 :: 10.27
Result: memcached/zipped 1.36 times faster: 3.68 milliseconds saved: 13.95 :: 10.27

I never did a test in this direction before and was surprised by the result.

My setup with 3 running and 2 data containers looks very promising.

I plan to look at other databases, set up a MySQL replication scheme, a load-balancing scheme with several Apache containers, and after that look into the problem of contacting containers residing in other machines, virtual or not.

Docker definitely rocks.

@mikesimons

mikesimons commented Jan 22, 2014

@lgs I can 100% reproduce on my machine simply by starting a container (all containers I've tried have exhibited this behaviour thus far) and forcing a shutdown:

  • Start container
  • Hard power down machine (i.e. don't let shutdown scripts run)
  • Power up machine
  • Container you started before shutdown is now "stuck"
@joelmoss

joelmoss commented Jan 22, 2014

FYI, I get no such issues with <= 0.7.2

@kklepper

kklepper commented Jan 22, 2014

Ok, it boils down to this:

Dockerfile

#FROM busybox 
# ok
FROM ubuntu
# Error: Cannot destroy container test: Driver aufs failed to remove init filesystem f27ab92e572681b81aecd30b6b03a67613c092055a8ef973f9e5450e941afbba-init: invalid argument

VOLUME ["/var/lib/not_exit"]

busybox

Shell 1:

vagrant@precise64:/vagrant/docker-lampstack-master/test$ docker build -t kklepper/test .
vagrant@precise64:/vagrant/docker-lampstack-master/test$ docker run -i -t -rm -h test -name test kklepper/test ash

Shell 2:

vagrant@precise64:~$ id=test
vagrant@precise64:~$ docker stop $id && docker rm $id
test
test

ubuntu

Shell 1:

vagrant@precise64:/vagrant/docker-lampstack-master/test$ docker build -t kklepper/test .
vagrant@precise64:/vagrant/docker-lampstack-master/test$ docker run -i -t -rm -h test -name test kklepper/test ash

Shell 2:

vagrant@precise64:~$ docker stop $id && docker rm $id
test
Error: Cannot destroy container test: Driver aufs failed to remove init filesystem f27ab92e572681b81aecd30b6b03a67613c092055a8ef973f9e5450e941afbba-init: invalid argument
2014/01/22 10:10:37 Error: failed to remove one or more containers

The trick with the boilerplate from Brandon was probably that he had a MySQL database sitting in the volume he used, whereas other people, like me, might not. As you usually want to use a volume you already have, this problem shouldn't occur too often.
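
One way to probe that hypothesis -- a sketch only, with made-up image name and volume content -- is to build a variant whose VOLUME path already exists and contains data, then repeat the stop/rm cycle:

# hypothetical test image; pre-populate the volume path before declaring it
printf 'FROM ubuntu\nRUN mkdir -p /var/lib/not_exit && touch /var/lib/not_exit/seed\nVOLUME ["/var/lib/not_exit"]\n' > Dockerfile
docker build -t kklepper/test2 .
docker run -i -t -rm -h test2 -name test2 kklepper/test2 bash
# in a second shell:
docker stop test2 && docker rm test2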

@lgs

lgs commented Jan 22, 2014

@inthecloud247

inthecloud247 commented Jan 24, 2014

seeing same issue with 0.7.6 on ubuntu 13.10.

i'm running this on a disk with full-disk encryption (ecryptfs) enabled under ubuntu. maybe that's the issue here?

@sameersbn

sameersbn commented Jan 24, 2014

@ydavid365 I don't think it has anything to do with ecryptfs. I have been seeing this issue since 0.7.3 on Ubuntu 13.10 as well as 12.04.

@lgs

lgs commented Jan 24, 2014

@ydavid365 the majority of reported cases are not on encrypted environments, so I would rule out ecryptfs as the cause here ...

@joelmoss

joelmoss commented May 19, 2014

FYI: on Ubuntu 14.04 and docker 0.11.1, we have no such problems. Successful removal 100% of the time.

@crosbymichael crosbymichael added the bug label May 19, 2014

@vieux

Collaborator

vieux commented May 22, 2014

can somebody reproduce with 0.11.1 ?

@srid

Contributor

srid commented May 22, 2014

Yes, we are seeing this every day with 0.11.1 (Ubuntu 12.04.4).

@vieux

Collaborator

vieux commented May 22, 2014

@srid are you running lxc ?

@srid

Contributor

srid commented May 22, 2014

@vieux – I’m not sure what you mean to ask.

This is the default docker install (apt-get install lxc-docker-0.11.1)[1], and everything generally works fine (creating and running containers) … except a certain number of ‘docker rm’ operations fail.

If you are looking to reproduce this, I’d recommend running the script[2] provided by @lcarstensen above on an Ubuntu 12.04 VM.


[1] docker daemon is run as: /usr/bin/docker -d -D -s aufs -H tcp://127.0.0.1:4243 -H unix:///var/run/docker.sock and here’s the output of docker version:

Client version: 0.11.1
Client API version: 1.11
Go version (client): go1.2.1
Git commit (client): fb99f99
Server version: 0.11.1
Server API version: 1.11
Git commit (server): fb99f99
Go version (server): go1.2.1
Last stable version: 0.11.1

with more system info:

$ uname -a
Linux stackato-jgz6 3.11.0-20-generic #35~precise1-Ubuntu SMP Fri May 2 21:32:55 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="12.04.4 LTS, Precise Pangolin"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu precise (12.04.4 LTS)"
VERSION_ID="12.04" 

[2] https://gist.github.com/lcarstensen/10513578

@lcarstensen

lcarstensen commented May 22, 2014

With 0.11.1-4 from koji on RHEL 6.5 with native (not LXC) and selinux enabled I haven't been able to reproduce docker rm issues, either with my script or without, over the last week.  Using the native execution driver seems like the key on RHEL.

@srid

Contributor

srid commented May 22, 2014

fwiw, we are using the native execution driver:

$ docker info
Containers: 9
Images: 132
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Dirs: 152
Execution Driver: native-0.2
Kernel Version: 3.11.0-20-generic
Debug mode (server): true
Debug mode (client): false
Fds: 109
Goroutines: 163
EventsListeners: 2
Init Path: /usr/bin/docker
$
@srid

Contributor

srid commented May 22, 2014

@alexlarsson -

Maybe we can drop the private on /var/lib/docker once we start mounting container roots inside the container namespace, as we then won't have any long-running mounts visible in the root namespace (only short-lived ones).

could you explain what you mean by "drop the private on /var/lib/docker" and "make /var/lib/docker --private in the daemon"?

just as you've observed, I'm seeing aufs mnt directories appearing, but in the mount namespaces of several processes.

@alexlarsson

Contributor

alexlarsson commented May 22, 2014

@srid At startup docker effectively does mount --make-rprivate /var/lib/docker to work around some efficiency problems. Without this, every mount (from e.g. a container start) is broadcast to all sub-namespaces (including all other containers), which leads to O(n^2) slowness.

The problem with it being private is that unmounts in the global namespace are not sent to the containers either, which causes EBUSY issues like the above if something creates a new mount namespace and doesn't unmount uninteresting parts of the hierarchy.
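
For anyone who wants to see this on their own host, a small sketch (not from this thread; paths and output differ per distro):

grep docker /proc/self/mountinfo        # the optional "shared:N" field shows which mounts propagate to sub-namespaces
mount --make-rprivate /var/lib/docker   # what docker effectively does at startup, as described above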

@cywjackson

cywjackson commented May 22, 2014

ran into this in our prod today. docker was 0.9.1
this solution allowed me to resolve the problem without restarting the host :)
so for those who are interested:
https://coderwall.com/p/h24pgw
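
A hedged sketch of a manual clean-up in the same spirit (not necessarily what the linked post does; the container id is a placeholder and the mnt path depends on the storage driver):

grep docker /proc/mounts | awk '{ print $2 }'     # list docker-related mount points
umount /var/lib/docker/aufs/mnt/<container-id>    # unmount the stale entry for the stuck container
docker rm <container-id>                          # then retry the remove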

@vieux

Collaborator

vieux commented May 27, 2014

@cywjackson can you tell us which mountpoints are still there before running your umount-all command?

@cywjackson

cywjackson commented May 28, 2014

hey @vieux, unfortunately I can't now, since we've resolved our issue. But I think it's pretty much all the containers and graphs (and maybe the aufs?) ... If it helps, our problem was probably triggered by running an older docker version to begin with (0.76_?); then chef-client was run and updated the version to > 0.9_. During the process it restarted the daemon but probably not the containers, with the result that the filesystem looks as if it is still being used and the containers are still in the "running" state. We did examine the config.json in those containers' paths and it has the Running flag set to true. (Simply updating that to false didn't resolve the problem though.)

i guess for us, we should have stopped the containers first before running the chef-client.

@vieux

Collaborator

vieux commented May 29, 2014

Does anybody have an easy way to reproduce?

@vieux vieux removed this from the 1.0 milestone Jun 3, 2014

@nickleefly

nickleefly commented Jun 12, 2014

@vieux I can reproduce with latest docker

$ boot2docker status
running

$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
ubuntu              latest              ad892dd21d60        7 days ago          275.5 MB
busybox             ubuntu-14.04        37fca75d01ff        7 days ago          5.609 MB
busybox             ubuntu-12.04        fd5373b3d938        7 days ago          5.455 MB
busybox             latest              a9eb17255234        7 days ago          2.433 MB

# run a few times
$ docker run  a9eb17255234 echo Hello world

$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                      PORTS               NAMES
32a3a2a21df0        busybox:latest      echo hello world    35 minutes ago      Exited (0) 35 minutes ago                       evil_mayer
bec3a46051f0        busybox:latest      echo hello world    35 minutes ago      Exited (0) 35 minutes ago                       hopeful_mclean
274ac126eb1f        busybox:latest      echo hello world    35 minutes ago      Exited (0) 35 minutes ago                       sharp_ptolemy

$ docker info
Containers: 3
Images: 12
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
 Dirs: 18
Execution Driver: native-0.2
Kernel Version: 3.14.1-tinycore64
Debug mode (server): true
Debug mode (client): false
Fds: 11
Goroutines: 10
EventsListeners: 0
Init Path: /usr/local/bin/docker
Username: nickleefly
Registry: [https://index.docker.io/v1/]

$ docker version
Client version: 1.0.0
Client API version: 1.12
Go version (client): go1.2.1
Git commit (client): 63fe64c
Server version: 1.0.0
Server API version: 1.12
Go version (server): go1.2.1
Git commit (server): 63fe64c

$ docker ps -a | grep Exit | awk '{print $1}' | sudo xargs docker rm
dial unix /var/run/docker.sock: no such file or directory
dial unix /var/run/docker.sock: no such file or directory
dial unix /var/run/docker.sock: no such file or directory
2014/06/11 22:34:25 Error: failed to remove one or more containers

But if I do

docker rm containerID

It could remove stopped containers
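
As an aside, the dial unix /var/run/docker.sock errors above most likely come from the sudo'd client rather than from the removal itself: under boot2docker the client talks to the daemon via DOCKER_HOST, which sudo typically drops. A hedged variant that keeps the environment (or skips sudo, which the boot2docker client doesn't need):

docker ps -a | grep Exit | awk '{print $1}' | xargs docker rm
# or, if sudo really is required, preserve the environment:
docker ps -a | grep Exit | awk '{print $1}' | sudo -E xargs docker rm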

@geku

geku commented Jul 12, 2014

@vieux I still have this problem with Docker 1.1.1 and can reproduce it. It only happens when one or more ports are published to the host. If no ports are published, a forced remove works.

How to reproduce

Docker install is fresh: no image pulled and no other container run previously.

$ docker pull tutum/redis
$ docker run -d -p 6379 tutum/redis
$ docker ps
CONTAINER ID        IMAGE                COMMAND             CREATED             STATUS              PORTS                     NAMES
5ffa59ef0879        tutum/redis:latest   /run.sh             2 seconds ago       Up 1 seconds        0.0.0.0:49153->6379/tcp   compassionate_ardinghelli
$ docker rm -f 5ffa59ef0879
Error response from daemon: Cannot destroy container 5ffa59ef0879: Driver devicemapper failed to remove root filesystem 5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5: Device is Busy
2014/07/12 12:40:12 Error: failed to remove one or more containers

I simply chose tutum/redis as it is a simple daemon, but had the problem with the ubuntu:14.04 image as well. As far as I remember I didn't have the problem with Docker 0.9.

Could somebody please try to reproduce the problem with the same setup as mine, thanks.

Environment:

Ubuntu 14.04 running with Vagrant/VirtualBox on OSX; the exact image is ubuntu/trusty64. Docker v1.1.1 is installed through the official Docker repository:

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9
sudo sh -c "echo deb http://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
sudo apt-get update
sudo apt-get install -y lxc-docker
Docker Info
$ docker info
Containers: 0
Images: 16
Storage Driver: devicemapper
 Pool Name: docker-8:1-140092-pool
 Data file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata file: /var/lib/docker/devicemapper/devicemapper/metadata
 Data Space Used: 676.6 Mb
 Data Space Total: 102400.0 Mb
 Metadata Space Used: 1.3 Mb
 Metadata Space Total: 2048.0 Mb
Execution Driver: native-0.2
Kernel Version: 3.13.0-24-generic
WARNING: No swap limit support
Docker log /var/log/upstart/docker.log
[bd9f5304.initserver()] Creating pidfile
[bd9f5304.initserver()] Setting up signal traps
[bd9f5304] -job initserver() = OK (0)
[bd9f5304] +job acceptconnections()
[bd9f5304] -job acceptconnections() = OK (0)
2014/07/12 12:34:28 GET /v1.13/containers/json
[bd9f5304] +job containers()
[bd9f5304] -job containers() = OK (0)
2014/07/12 12:34:36 POST /images/create?fromImage=tutum%2Fredis&tag=
[bd9f5304] +job pull(tutum/redis, )
[bd9f5304] -job pull(tutum/redis, ) = OK (0)
2014/07/12 12:39:29 POST /v1.13/containers/create
[bd9f5304] +job create()
[bd9f5304] -job create() = OK (0)
2014/07/12 12:39:30 POST /v1.13/containers/5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5/start
[bd9f5304] +job start(5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5)
[bd9f5304] +job allocate_interface(5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5)
[bd9f5304] -job allocate_interface(5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5) = OK (0)
[bd9f5304] +job allocate_port(5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5)
[bd9f5304] -job allocate_port(5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5) = OK (0)
[bd9f5304] -job start(5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5) = OK (0)
2014/07/12 12:39:31 GET /v1.13/containers/json
[bd9f5304] +job containers()
[bd9f5304] -job containers() = OK (0)
2014/07/12 12:39:54 GET /v1.13/containers/json
[bd9f5304] +job containers()
[bd9f5304] -job containers() = OK (0)
2014/07/12 12:40:01 DELETE /v1.13/containers/5ffa59ef0879?force=1
[bd9f5304] +job container_delete(5ffa59ef0879)
[bd9f5304] +job release_interface(5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5)
2014/07/12 12:40:01 Stopping proxy on tcp/[::]:49153 for tcp/172.17.0.2:6379 (accept tcp [::]:49153: use of closed network connection)
[bd9f5304] -job release_interface(5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5) = OK (0)
Cannot destroy container 5ffa59ef0879: Driver devicemapper failed to remove root filesystem 5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5: Device is Busy
[bd9f5304] -job container_delete(5ffa59ef0879) = ERR (1)
[error] server.go:1048 Error making handler: Cannot destroy container 5ffa59ef0879: Driver devicemapper failed to remove root filesystem 5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5: Device is Busy
[error] server.go:90 HTTP Error: statusCode=500 Cannot destroy container 5ffa59ef0879: Driver devicemapper failed to remove root filesystem 5ffa59ef08796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5: Device is Busy
@tiborvass

Collaborator

tiborvass commented Jul 15, 2014

I could reproduce this on devicemapper too with latest docker on CentOS 6.5

@stuartpb

stuartpb commented Aug 16, 2014

I'm seeing this issue (removing a container with ports bound giving a "Device is Busy" error) in my own tests.

The weirdest thing is, it seems that it actually does remove the container, just after it prints this error and crashes the script running the docker rm.

@thaJeztah

Member

thaJeztah commented Aug 16, 2014

Think I've seen similar behavior sometimes, with the same Device is busy on "/some/path/[container-id]-removed", and was unable to find any directory with a -removed suffix (might have been -deleted), will have to look if I have saved those messages somewhere.

@howitzers

howitzers commented Aug 19, 2014

Same device is busy error. Ubuntu trusty, docker 1.1.2, defaults.

docker version

Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070

docker info

Containers: 91
Images: 181
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Dirs: 3851
Execution Driver: native-0.2
Kernel Version: 3.13.0-34-generic

/var/log/upstart/docker.log

2014/08/19 15:30:41 DELETE /v1.13/containers/859796f54423
[069e87c2] +job container_delete(859796f54423)
Cannot destroy container 859796f54423: Driver aufs failed to remove root filesystem 859796f544232c45fc7c086f8e20fa38ed79689be5256235696c091bc88f8b11: rename /var/lib/docker/aufs/mnt/859796f544232c45fc7c086f8e20fa38ed79689be5256235
696c091bc88f8b11 /var/lib/docker/aufs/mnt/859796f544232c45fc7c086f8e20fa38ed79689be5256235696c091bc88f8b11-removing: device or resource busy
[069e87c2] -job container_delete(859796f54423) = ERR (1)
[error] server.go:1048 Error making handler: Cannot destroy container 859796f54423: Driver aufs failed to remove root filesystem 859796f544232c45fc7c086f8e20fa38ed79689be5256235696c091bc88f8b11: rename /var/lib/docker/aufs/mnt/859
796f544232c45fc7c086f8e20fa38ed79689be5256235696c091bc88f8b11 /var/lib/docker/aufs/mnt/859796f544232c45fc7c086f8e20fa38ed79689be5256235696c091bc88f8b11-    removing: device or resource busy
[error] server.go:90 HTTP Error: statusCode=500 Cannot destroy container 859796f54423: Driver aufs failed to remove root filesystem 859796f544232c45fc7c086f8e20fa38ed79689be5256235696c091bc88f8b11: rename /var/lib/docker/aufs/mnt/
859796f544232c45fc7c086f8e20fa38ed79689be5256235696c091bc88f8b11 /var/lib/docker/aufs/mnt/859796f544232c45fc7c086f8e20fa38ed79689be5256235696c091bc88f8b11-removing: device or resource busy
2014/08/19 15:30:45 DELETE /v1.13/containers/859796f54423
[069e87c2] +job container_delete(859796f54423)

The server 500s, breaking scripts, but it looks like in these cases the container actually is removed.

This container does have ports exposed to the host as well as bind-mounted volumes (it's from fig up); similar cases were mentioned upthread and it might be relevant.

These errors aren't new; I've seen them since the early days at varying rates, but it's kind of silly to have to hack around them all the time.

@thaJeztah is this what you were seeing?

@thaJeztah

Member

thaJeztah commented Aug 19, 2014

@howitzers yes! Exactly those kind of messages; and your example contains the -removing suffix (which I incorrectly remembered as -removed)

Also, I think (at least some of) my containers were started/created using fig as well. Not sure if this is cause of this problem, but just to add that info for my situation as well. @bfirsh ?

Thanks!

@howitzers

howitzers commented Aug 19, 2014

Fig just uses the plain docker remote API with no funny stuff, so the naughty behavior is definitely docker's business, not fig's. What fig does do, though, is a lot of container recreate and delete, making hitting this more likely (it's got a very race condition flavor to it).

Poked around a little more and this only happens after creating/removing a lot of containers (I usually hit this error under my own stress tests). @cywjackson's unmount suggestion does not fix it, nor does reboot.

What does fix it for me is a total wipe of /var/lib/docker, so it looks like some resource or timing issue with a lot of containers.

In any case, a convenient way to repro this is to just loop fig up -d with a few host-bound services in the fig.yaml. You'll eventually error out when the removes start 500'ing in the aufs driver as above.
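
A hedged sketch of that loop (a fig.yml with a couple of host-bound services is assumed, and the iteration count is arbitrary):

for i in $(seq 1 200); do
    fig up -d || break   # each up recreates the containers; removal of the old ones eventually starts 500'ing as described
done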

@crosbymichael

Contributor

crosbymichael commented Aug 19, 2014

I'm thinking about closing this issue because it has been open for too long and has become some sort of catch all for any type of error remotely related to a failed rm.

I think we will be better able to debug and fix new issues, reported from new docker versions, if we have separate issues opened that are current and easier to read. Any reason why we should keep this open right now?

@stuartpb

stuartpb commented Aug 19, 2014

While it's definitely true that this has been open for too long and become a catch-all for lots of different bugs (holy cow, 60 participants), I think it should be left open (possibly with a name change) to address this specific current issue (where docker rm specifically fails with Cannot destroy container abcd123456: Driver devicemapper failed to remove root filesystem abcd123456796a2526e2d7b7c2a980f30f37f2216112cc764725d2c99a9aa6d5: Device is Busy), and then close it once that specific race is resolved.

@howitzers

howitzers commented Aug 19, 2014

I split out the specific 1.1.2 "failed to remove root filesystem" case to a new issue with a backref, which might be easier to track.

Can close either that one or this one, but there's definitely a specific open issue here, between myself, stuartpb, and thaJeztah.

@thaJeztah

Member

thaJeztah commented Aug 20, 2014

@crosbymichael (and indeed, wow, 59 others) I agree on closing this; the title has become outdated and it is collecting similar, but unrelated, issues (guilty myself I think).

If the original devicemapper failed to remove root filesystem issue still exists (can anybody confirm?) I think a new issue should be created for that with a clear title and a link to the other issue so that people are guided to the right one.

Please if this issue is closed, add a clear comment to explain why and point people to the right issues as well.

@unclejack

Contributor

unclejack commented Aug 20, 2014

I'm closing and locking this issue right now. I agree that it's become a catch all for any failure to remove a container in any conditions.

Please try to find existing issues related to the problem you're running into with the specific backend (btrfs, devicemapper, aufs) and comment there. If there are no existing issues which seem to be about the same problem using the same storage backend, please open a new issue.

@unclejack unclejack closed this Aug 20, 2014

@moby moby locked and limited conversation to collaborators Aug 20, 2014
