cannot destroy dataset: dataset is busy #1810

Closed
seletskiy opened this Issue Oct 24, 2013 · 14 comments

Contributor

seletskiy commented Oct 24, 2013

I've seen similar issues, but all of them were closed.

I'm experiencing this kind of problem right now:

# zfs destroy zroot/2013-10-15T065955229209
cannot destroy 'zroot/2013-10-15T065955229209': dataset is busy

# zfs umount zroot/2013-10-15T065955229209
cannot unmount 'zroot/2013-10-15T065955229209': not currently mounted

# zfs list | grep zroot/2013-10-15T065955229209
zroot/2013-10-15T065955229209                2.86G  25.0G  11.0G  /var/lib/heaver/instances/2013-10-15T065955229209

# umount /var/lib/heaver/instances/2013-10-15T065955229209
umount: /var/lib/heaver/instances/2013-10-15T065955229209: not mounted

# pacman -Qi zfs | grep Version
Version        : 0.6.1_3.9.9-1

# uname -a
Linux hub.host.s 3.9.9-1-apparmor #1 SMP PREEMPT Thu Jul 11 17:45:29 NOVT 2013 x86_64 GNU/Linux

It's quite annoying, because I need to constantly reboot the host to get it to work.

I can provide any debug information needed.

Contributor

DeHackEd commented Oct 24, 2013

It's a bit of a shot in the dark, but try this:

# grep zroot/2013-10-15T065955229209 /proc/*/mounts

This searches for other instances of the mount in other namespaces. I've had this with LXC and since you have such a new kernel I assume you're running an OS that might use it as well.
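If it helps, the grep above can be wrapped so that it also prints which process holds the mount. This is only a rough sketch: the function name find_mount_holders and its optional second argument (an alternative proc root, handy for trying it out on a fake directory) are made up for illustration, and it matches the dataset name exactly instead of as a substring like the grep does.

```shell
#!/bin/sh
# Sketch (hypothetical helper): list processes whose mount table still
# contains a given dataset.
# Usage: find_mount_holders DATASET [PROC_ROOT]
find_mount_holders() {
    dataset=$1
    proc_root=${2:-/proc}
    for mounts in "$proc_root"/[0-9]*/mounts; do
        # Skip unreadable entries and the unexpanded glob when nothing matches.
        [ -r "$mounts" ] || continue
        # The first field of each mounts line is the mount source,
        # which for ZFS is the dataset name.
        if awk -v ds="$dataset" '$1 == ds { found = 1 } END { exit !found }' "$mounts" 2>/dev/null; then
            pid_dir=${mounts%/mounts}
            pid=${pid_dir##*/}
            comm=$(cat "$pid_dir/comm" 2>/dev/null || echo '?')
            # Print "PID<TAB>command name" for every holder.
            printf '%s\t%s\n' "$pid" "$comm"
        fi
    done
}
```

Run as root, e.g. find_mount_holders zroot/2013-10-15T065955229209, so that the mount tables of all processes are readable.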

Contributor

lundman commented Oct 24, 2013

We had something similar to this. It's a small chance, but check

zfs holds $snapshotname

to see if it has any holds, and if so, use zfs release to remove the hold.
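A small sketch of how the holds check could be turned into release commands. It assumes a ZFS version whose zfs holds accepts -H (tab-separated output without headers, with columns NAME, TAG, TIMESTAMP); the helper name print_release_commands is made up, and it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch (hypothetical helper): read `zfs holds -H` style lines on stdin
# (NAME<TAB>TAG<TAB>TIMESTAMP) and print one `zfs release` command per hold.
# Nothing is executed; review the output before running it.
print_release_commands() {
    awk -F '\t' -v q="'" 'NF >= 2 {
        # zfs release takes the tag first, then the snapshot name.
        printf "zfs release %s%s%s %s%s%s\n", q, $2, q, q, $1, q
    }'
}
```

Typical use would be zfs holds -H "$snapshotname" | print_release_commands. Tags that themselves contain a single quote would need extra escaping, which this sketch does not attempt.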

Contributor

seletskiy commented Oct 24, 2013

@DeHackEd: Oh, man, thank you! That's the issue! It's grabbed by ntpd... Yep, we use LXC as well.

@seletskiy seletskiy closed this Oct 24, 2013

Contributor

seletskiy commented Oct 24, 2013

Hmmm, it's quite interesting. For whatever reason, after unmounting any ZFS filesystem it is still present in /proc/<ntp_pid>/mounts:

# zfs create zroot/test

# zfs list | grep zroot/test
zroot/test                                     30K   108G    30K  /test

# systemctl status ntpd
ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled)
   Active: active (running) since Thu 2013-10-24 12:07:42 UTC; 4s ago
  Process: 17371 ExecStart=/usr/bin/ntpd -g -u ntp:ntp (code=exited, status=0/SUCCESS)
 Main PID: 17372 (ntpd)
   CGroup: name=systemd:/system/ntpd.service
           └─17372 /usr/bin/ntpd -g -u ntp:ntp

# grep zroot/test /proc/17372/mounts
zroot/test /test zfs rw,relatime,xattr 0 0

# zfs umount zroot/test

# grep zroot/test /proc/17372/mounts
zroot/test /test zfs rw,relatime,xattr 0 0

# grep zroot/test /proc/*/mounts
/proc/17372/mounts:zroot/test /test zfs rw,relatime,xattr 0 0   <-- it's still only here, WTF?

# zfs destroy zroot/test
cannot destroy 'zroot/test': dataset is busy

# systemctl stop ntpd

# zfs destroy zroot/test && echo $?
0

It doesn't make any sense.

@seletskiy seletskiy reopened this Oct 24, 2013

Contributor

DeHackEd commented Oct 24, 2013

There's a Linux feature called mount namespaces. You can make one by running unshare -m /bin/sh. In this shell, all actions performed by mount and umount are independent of the main host. Inside this container you can unmount /home even if you're logged in elsewhere. Likewise, if /home gets unmounted on the main system, within the namespace it's still mounted, since the two are independent.

It's a sandboxing technique. LXC uses this, but it's not strictly an LXC feature. Unfortunately the ZFS tools can't tell this has happened and say the dataset is unmounted, because /etc/mtab (and even /proc/mounts) says all is well. I guess ntpd is doing this. systemd has this as a feature too, so maybe that's at play as well.
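To check whether a given process really lives in its own mount namespace, one option is to compare the /proc/<pid>/ns/mnt links, which exist on Linux 3.8 and later. A minimal sketch, with a made-up function name same_mnt_ns:

```shell
#!/bin/sh
# Sketch (hypothetical helper): compare the mount namespaces of two PIDs.
# Returns 0 if they share a namespace, 1 if they differ, and 2 if either
# link cannot be read (missing PID, old kernel, or insufficient privileges).
same_mnt_ns() {
    ns_a=$(readlink "/proc/$1/ns/mnt") || return 2
    ns_b=$(readlink "/proc/$2/ns/mnt") || return 2
    # The link target looks like "mnt:[<inode>]"; equal targets mean
    # both processes see the same mount table.
    [ "$ns_a" = "$ns_b" ]
}
```

For example, running same_mnt_ns 1 17372 as root and getting exit status 1 would mean that PID 17372 has a mount namespace separate from init's, so a umount on the host does not affect it.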

Contributor

seletskiy commented Oct 25, 2013

Further investigation revealed that it looks like a bug in systemd. It's reproducible even with a loop-mounted ext4 filesystem. It appears when running ntpd because of PrivateTmp=true in the corresponding systemd unit.

So I'm closing the issue; I think it's not a ZFS bug.
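To see which other services might behave the same way, the unit files can be searched for PrivateTmp. A rough sketch: the function name units_with_private_tmp is made up, the default directory is the one shown in the systemctl status output earlier in this thread, and only lowercase spellings of the boolean are matched.

```shell
#!/bin/sh
# Sketch (hypothetical helper): list .service files that enable PrivateTmp.
# Usage: units_with_private_tmp [UNIT_DIR]
units_with_private_tmp() {
    unit_dir=${1:-/usr/lib/systemd/system}
    # systemd accepts several spellings for a true boolean value.
    grep -lE '^[[:space:]]*PrivateTmp[[:space:]]*=[[:space:]]*(true|yes|on|1)[[:space:]]*$' \
        "$unit_dir"/*.service 2>/dev/null
}
```

This only inspects the unit files in one directory. Overrides under /etc/systemd/system and other options that also create a private mount namespace are not covered.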

@seletskiy seletskiy closed this Oct 25, 2013

Contributor

seletskiy commented Oct 25, 2013

@DeHackEd Thanks for the assistance. I've filed a bug if you're interested: https://bugs.freedesktop.org/show_bug.cgi?id=70856

Contributor

seletskiy commented Nov 5, 2013

Just encountered this issue again, but now there is no process that holds the mount.

# zfs rename zroot/virtubuntu-sphinxed zroot/test  
cannot rename 'zroot/virtubuntu-sphinxed': dataset is busy

# grep zroot/virtubuntu-sphinxed /proc/*/mounts
... nothing ...

@seletskiy seletskiy reopened this Nov 5, 2013

Owner

behlendorf commented Nov 6, 2013

@seletskiy This may be a duplicate of #1792. A fix for this, 7ec0928, was merged into master a few days ago. Could you try the latest code?

Owner

behlendorf commented Dec 6, 2013

@seletskiy Can you still reproduce this in master? We believe it was fixed; unless I hear otherwise, I'll close this one out in a few days.

Contributor

seletskiy commented Dec 9, 2013

@behlendorf: Looks like the issue is fixed. Thanks a lot!

@seletskiy seletskiy closed this Dec 9, 2013

@behlendorf is master (or what will become 0.6.3) effectively what can be found in Ubuntu Daily Builds?

Or would I have to build from source to get this fix?

Member

dajhorn commented Apr 22, 2014

@RLovelett, the Trusty daily builds are tracking master and are current. The packages in ppa:zfs-native/daily for Precise are stale, but they do have this particular fix.

Whatever fix was done here isn't enough for my case.

I ran timedatectl set-ntp true and then started running into this error on Debian Jessie with Docker 17.06.1-ce. Now when I try to remove a dataset (when docker+zfs cleans up old containers), it errors. Following the grep suggestion above led me to systemd. Setting set-ntp back to false works, but my clock is drifting, so I would prefer not to do that. I'm not sure what the repercussions of changing PrivateTmp=true to something else are, or even what file to change to do that. Any suggestions? I read the systemd bug report and they closed it as "not a bug." I think the fix might need to be in Docker somewhere, but I'm not sure about this.

Steps to reproduce:

  1. On a Debian Jessie host with docker 17.06.1-ce installed, run timedatectl set-ntp true

  2. Create a docker-compose.yml:

version: '3'

volumes:
  data:

services:
  test:
    image: alpine
    command: tail -f /dev/null
    restart: always
    volumes:
      - data:/data/

  3. Run docker-compose up -d and everything should work fine.

  4. Now modify the docker-compose.yml so compose will want to restart it:

version: '3'

services:
  test:
    image: alpine
    command: tail -f /dev/null
    restart: always

  5. Run docker-compose up -d again and it will fail the same way the original report showed:

$ docker-compose up -d
Recreating quick_test_1 ... 
Recreating quick_test_1 ... error

ERROR: for quick_test_1  driver "zfs" failed to remove root filesystem for 5f310a1d949f02084468a58142b8f00a70c7dce612076e188c3ba47d28dca737: exit status 1: "/sbin/zfs zfs destroy -r storage/docker/01689fe6f409b0486a136023b150f4436d7d1a83bb9102d76970aa4e11cd82e0" => cannot destroy 'storage/docker/01689fe6f409b0486a136023b150f4436d7d1a83bb9102d76970aa4e11cd82e0': dataset is busy


ERROR: for test  driver "zfs" failed to remove root filesystem for 5f310a1d949f02084468a58142b8f00a70c7dce612076e188c3ba47d28dca737: exit status 1: "/sbin/zfs zfs destroy -r storage/docker/01689fe6f409b0486a136023b150f4436d7d1a83bb9102d76970aa4e11cd82e0" => cannot destroy 'storage/docker/01689fe6f409b0486a136023b150f4436d7d1a83bb9102d76970aa4e11cd82e0': dataset is busy
# grep storage/docker/01689fe6f409b0486a136023b150f4436d7d1a83bb9102d76970aa4e11cd82e0 /proc/*/mounts
/proc/919/mounts:storage/docker/01689fe6f409b0486a136023b150f4436d7d1a83bb9102d76970aa4e11cd82e0 /var/lib/docker/zfs/graph/01689fe6f409b0486a136023b150f4436d7d1a83bb9102d76970aa4e11cd82e0 zfs rw,relatime,xattr,noacl 0 0

# ps -p 919
  PID TTY          TIME CMD
  919 ?        00:00:00 systemd-timesyn

Let me know if I should open this task for the docker team instead. For now I'll disable ntp.

@WyseNynja WyseNynja referenced this issue in moby/moby Dec 3, 2017

Merged

zfs: fix ebusy on umount etc #35674
