docker kill leaves directories behind. #197

Closed
simonjohansson opened this issue Mar 26, 2013 · 21 comments
@simonjohansson

Running docker kill UUID has left some directories behind in /var/lib/docker/containers.
Running ls on one of them reveals:

$ ls /var/lib/docker/containers/0a50ba2e6217fe8234fe6a29f84e97b541631697777515f92259f276d7f83d3e/
ls: cannot access /var/lib/docker/containers/0a50ba2e6217fe8234fe6a29f84e97b541631697777515f92259f276d7f83d3e/rootfs: Stale NFS file handle
rootfs

I am running docker inside a rather slow VirtualBox VM (Ubuntu 12.04, 3.5.0-23-generic). I currently have 7 of these directories: two come from containers where I made big changes (apt-get update), and the other five are from "echo hello world" containers.

Relevant IRC-chat

23:11 < DinMamma> Ah, this is interesting, when looking into the cointaners in /var/lib/docker/containers I get "ls: cannot access 
                  rootfs: Stale NFS file handle"
23:11 < DinMamma> So I wonder if this is a issue with my system rather than docker.
23:11 <@shykes> DinMamma: no, this is a known issue with aufs, which we thought we had neutralized
23:12 <@shykes> basically aufs umount is asynchronous
23:12 <@shykes> it does background cleanup
23:12 <@shykes> if you remove the mountpoint too quickly before aufs is done with cleanup, it gets stuck
23:12 <@shykes> and you get that error message
23:13 < DinMamma> I should say that I am running my tests inside a rather slow virtualbox-vm.
23:13 <@shykes> I'm surprised that you hit this. We have a workaround which includes checking the stat() on the mountpoint in a loop, 
                until its inode changes
23:19 <@shykes> DinMamma: so am I :)
23:19 <@shykes> mmm that could be it
23:20 <@shykes> DinMamma: did one of these containers have a lot of filesystem changes on them?
23:20 <@shykes> like a big apt-get, or something like that?
23:20 < DinMamma> Yep
23:20 < DinMamma> Two of them.
23:20 <@shykes> maybe slow machine + lots of data on the aufs rw layer means -> our workaround timed out, and gave up waiting for aufs
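
The workaround shykes describes in the chat (calling stat() on the mountpoint in a loop until its inode changes, and giving up after a timeout) can be sketched roughly as below. This is an illustration in Python, not Docker's actual code; the function name, the timeout, and the polling interval are assumptions.

```python
import os
import time


def wait_for_inode_change(path, old_inode, timeout=5.0, interval=0.05):
    """Poll stat() on path until its inode differs from old_inode.

    Returns True if the inode changed before the timeout, False otherwise.
    The name and the default timeout/interval are hypothetical.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if os.stat(path).st_ino != old_inode:
                return True
        except OSError:
            # In this sketch a failing stat() (for example "Stale NFS file
            # handle") is treated as "cleanup not finished yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With any fixed timeout, a slow machine combined with a large rw layer could outlast the wait, which is the failure mode suggested at the end of the chat.
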
@shykes
Contributor

shykes commented Mar 26, 2013

Just an extra comment: it is normal for 'docker kill' to leave the container directory. By default all containers are stored, so you can inspect their filesystem state, commit them into images, restart them etc.

But of course it is not normal to see "stale NFS handle" errors :)

@vieux
Contributor

vieux commented Apr 11, 2013

I can't reproduce.

My host is Ubuntu 12.10 and I used base as the guest.
Can anybody reproduce this?

@hansent
Contributor

hansent commented Apr 15, 2013

Is there a way to manually repair the directory so I can delete the directories without rebooting the host?

@shykes
Contributor

shykes commented Apr 15, 2013

Not that I know of. Note that there is no known side-effect outside the
scope of that container.

On Monday, April 15, 2013, Thomas Hansen wrote:

Is there a way to manually repair the directory so I can delete the
directories without rebooting the host?

@shykes
Contributor

shykes commented Apr 23, 2013

As discussed earlier, this is probably due to the asynchronous nature of aufs unmount.

I'm downgrading this to minor bug, since:

a) it occurs very rarely (1 known occurrence so far)
b) it has no impact on the behavior of docker or the system,
c) it's very hard to reproduce

@ricardoamaro

+1 on a fix for this, since I just bumped into it:

~# docker rm 5cbb64c3279a
Error: Error destroying container 5cbb64c3279a: stat /var/lib/docker/containers/5cbb64c3279a76acaac4769e4a6c57c39a7fff6027b51d14ecff08040d252d13/rootfs: stale NFS file handle

@creack
Contributor

creack commented Jul 24, 2013

@simonjohansson Have you still seen the error since #816?

@vieux
Contributor

vieux commented Jul 30, 2013

ping @simonjohansson

@simonjohansson
Author

Hi guys, sorry I didn't see this until now. I have some holiday coming up in the next couple of days; I'll make sure to check whether #816 fixed the issue!

@dscape

dscape commented Aug 1, 2013

Just encountered the same issue:

root@dscape:~# docker ps -a | grep 'Exit' |  awk '{print $1}' | xargs docker rm
Error: Error destroying container 38b561af34e1: stat /var/lib/docker/containers/38b561af34e1bb0b3e92d7b1fe734aeabf223d6a5c36757be8925514e28e8b45/rootfs: stale NFS file handle

Error: Error destroying container 112a0c0b9c95: stat /var/lib/docker/containers/112a0c0b9c9546697f20dd7ed21899b789f981eb5195d189b1503ab1893184e4/rootfs: stale NFS file handle

Error: Error destroying container ef13c73b64a9: stat /var/lib/docker/containers/ef13c73b64a991e2b937fbcb1fae412d7b6404dcb67ae105c06ebd5b62926f35/rootfs: stale NFS file handle

Error: Error destroying container e0178615f6d8: stat /var/lib/docker/containers/e0178615f6d8be7ca343c89c398536713542413fa7ac04d172bb268f626a252a/rootfs: stale NFS file handle

Error: Error destroying container 3c8659a041c9: stat /var/lib/docker/containers/3c8659a041c9217e35c056e96da0fe5dc9d5eae43f37874ff372190ed8867277/rootfs: stale NFS file handle

Error: Error destroying container 99dee8e5a486: stat /var/lib/docker/containers/99dee8e5a486b8eeff3855e6750e1dee90ec4c8af022ed9a43304edda411b507/rootfs: stale NFS file handle

Error: Error destroying container b7ac0d3f3f79: stat /var/lib/docker/containers/b7ac0d3f3f79ae35883d09e796332726322e56bdd715e5484210bf84099cc513/rootfs: stale NFS file handle

Error: Error destroying container 7329c9be9795: stat /var/lib/docker/containers/7329c9be97957b187cdb6cbb825ab506e3a8610c01b4055ad5cc64fc58a6e985/rootfs: stale NFS file handle
root@dscape:~# docker version
Client version: 0.4.8
Server version: 0.4.8
Git commit: ??
Go version: go1.1.1
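
When many removals fail at once, as in the output above, it can help to collect the affected container IDs and rootfs paths from the error text. The sketch below does that in Python. The regular expression is derived only from the error lines quoted in this thread, not from a documented Docker output format, and the function name is hypothetical.

```python
import re

# Pattern for the error lines quoted in this thread; this is an assumption
# based on those lines, not a documented Docker output format.
STALE_RE = re.compile(
    r"^Error: Error destroying container ([0-9a-f]+): "
    r"stat (\S+): stale NFS file handle$"
)


def stale_containers(output):
    """Return (short_id, rootfs_path) pairs for each stale-handle error line."""
    found = []
    for line in output.splitlines():
        match = STALE_RE.match(line.strip())
        if match:
            found.append((match.group(1), match.group(2)))
    return found
```

The function takes the captured text of a docker rm run; lines that do not match the pattern, such as the docker version output, are ignored.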

@simonjohansson
Author

I cannot reproduce anymore.

Client version: 0.5.0
Server version: 0.5.0
Git commit: 51f6c4a
Go version: go1.1.1

GG :)

@creack
Contributor

creack commented Aug 7, 2013

@dscape can you try again with docker 0.5.1?

@dtabuenc

I keep seeing this issue over and over using docker inside VirtualBox. I usually run docker rm $(docker ps -a |cut -d " " -f 1) to remove all containers but many of them fail with stale NFS file handle.

@paulosuzart

Just to add, I tried brute-force removing the directories of such containers. After that, trying to remove them via docker rm still prints the same message.

I managed to remove them after restarting the docker host.

@ricardoamaro

This seems fixed to me.
Using:

# docker version
Client version: 0.5.3
Server version: 0.5.3
Git commit: 5d25f32
Go version: go1.1.1

Also make sure you have no bash running inside the container path.
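
One way to check the last point on Linux is to look at the /proc/<pid>/cwd symlinks and see whether any process has its working directory inside the container path. The following is a minimal Python sketch of that check; the function name and the proc_root parameter are hypothetical, and the thread does not establish that such a process is the cause of the stale handle.

```python
import os


def pids_with_cwd_under(path, proc_root="/proc"):
    """Return PIDs whose current working directory is path or below it.

    Reads the <proc_root>/<pid>/cwd symlinks (Linux). Processes whose cwd
    cannot be read (permissions, or the process exited) are skipped.
    """
    target = os.path.realpath(path)
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            cwd = os.readlink(os.path.join(proc_root, entry, "cwd"))
        except OSError:
            continue
        if cwd == target or cwd.startswith(target + os.sep):
            pids.append(int(entry))
    return sorted(pids)
```

For example, pids_with_cwd_under('/var/lib/docker/containers') run as root would list shells that are still sitting inside a container directory.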

@ghost

ghost commented Aug 16, 2013

Was the asynchronous unmount theory ever proven? I wonder if this is the "deleted a container's image while the container is running" bug:

# Pane 1
$ docker run -i -t foo /bin/bash
root@d6d23b36b613:/#

# Pane 2
$ docker rmi foo
Untagged: 1cfaa4fe8724
Deleted: 1cfaa4fe8724
$

# Pane 1
root@d6d23b36b613:/# exit
$ docker rm `docker ps -l -q`
Error: Error destroying container d6d23b36b613: stat /var/lib/docker/containers/d6d23b36b613337b8e8bbc2ee90af11da3c5fab78a07a01a43ba7262359292ca/rootfs: stale NFS file handle

$

@pungoyal

@dsissitka I think that is exactly what it is. It happened to me.

 $ docker version
Go version (client): go1.1.1
Go version (server): go1.1.1
Last stable version: 0.6.3

How can the container be removed now?

@crosbymichael
Contributor

The original issue is resolved in 0.7 because kill does not do an umount anymore. Containers are unmounted when the daemon is stopped.
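
Since container filesystems now stay mounted until the daemon stops, it may be useful to see which rootfs directories are still mounted before touching them. The sketch below parses text in the /proc/mounts format and lists mounts below the containers directory. The function name and the default prefix are assumptions for illustration.

```python
def mounts_under(mounts_text, prefix="/var/lib/docker/containers/"):
    """Return (mountpoint, fstype) pairs for mounts below prefix.

    mounts_text is the content of /proc/mounts, where each line has the
    fields: device, mountpoint, fstype, options, dump, pass.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mountpoint, fstype = fields[1], fields[2]
        if mountpoint.startswith(prefix):
            result.append((mountpoint, fstype))
    return result
```

On the docker host it could be fed with the result of open('/proc/mounts').read().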

@eliasp
Contributor

eliasp commented Nov 30, 2013

In case anyone has a /var/lib/docker/volumes directory full of orphaned volumes, feel free to use the following Python script (make sure to understand what it does before executing it):

#!/usr/bin/python

import json
import os
import shutil
import subprocess
import re

dockerdir = '/var/lib/docker'
volumesdir = os.path.join(dockerdir, 'volumes')

# IDs of all existing containers (running or not), untruncated.
# Note: newer Docker versions spell this flag --no-trunc.
containers = set(subprocess.check_output('docker ps -a -q -notrunc', shell=True, universal_newlines=True).splitlines())

# Names of the immediate subdirectories of the volumes directory.
volumes = next(os.walk(os.path.join(volumesdir, '.')))[1]
for volume in volumes:
    # Only touch directories whose name is a full 64-character hex ID.
    if not re.match('[0-9a-f]{64}$', volume):
        print(volume + ' is not a valid volume identifier, skipping...')
        continue
    with open(os.path.join(volumesdir, volume, 'json')) as metadata_file:
        volume_metadata = json.load(metadata_file)
    container_id = volume_metadata['container']
    if container_id in containers:
        print('Container ' + container_id[:12] + ' does still exist, not clearing up volume ' + volume)
        continue
    print('Deleting volume ' + volume + ' (container: ' + container_id[:12] + ')')
    volumepath = os.path.join(volumesdir, volume)
    print('Volumepath: ' + volumepath)
    shutil.rmtree(volumepath)

@mindreframer

Thanks for the script! I fixed the indentation and a small bug:

container_id = volume_metadata['id'] # (not container anymore)

https://gist.github.com/mindreframer/7787702

@eliasp
Contributor

eliasp commented Dec 4, 2013

Thanks! No idea why the indentation was messed up in my post, edited + fixed it.

I used volume_metadata['container'] because I was still on 0.6.6 when I wrote the script, but anyone using 0.7.0 (or later) should use your changes.
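
A script that has to work on both sides of that change could look the key up defensively. The sketch below relies only on the two key names reported in this thread ('container' on 0.6.6, 'id' on 0.7.0); the helper name is hypothetical, and the contents of the metadata file on other versions are not verified here.

```python
def container_id_from_metadata(metadata):
    """Return the container ID from a volume's 'json' metadata dict.

    Tries the 'container' key (reported for 0.6.6 in this thread) and then
    the 'id' key (reported for 0.7.0). Returns None if neither is present.
    """
    for key in ("container", "id"):
        value = metadata.get(key)
        if value:
            return value
    return None
```

A caller should skip the volume rather than delete it when the helper returns None.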
