Skip to content

Commit 8aef188

Browse files
author
Al Viro
committed
VFS: Fix vfsmount overput on simultaneous automount
[Kudos to dhowells for tracking that crap down] If two processes attempt to cause automounting on the same mountpoint at the same time, the vfsmount holding the mountpoint will be left with one too few references on it, causing a BUG when the kernel tries to clean up. The problem is that lock_mount() drops the caller's reference to the mountpoint's vfsmount in the case where it finds something already mounted on the mountpoint as it transits to the mounted filesystem and replaces path->mnt with the new mountpoint vfsmount. During a pathwalk, however, we don't take a reference on the vfsmount if it is the same as the one in the nameidata struct, but do_add_mount() doesn't know this. The fix is to make sure we have a ref on the vfsmount of the mountpoint before calling do_add_mount(). However, if lock_mount() doesn't transit, we're then left with an extra ref on the mountpoint vfsmount which needs releasing. We can handle that in follow_managed() by not making assumptions about what we can and what we cannot get from lookup_mnt() as the current code does. The callers of follow_managed() expect that reference to path->mnt will be grabbed iff path->mnt has been changed. follow_managed() and follow_automount() keep track of whether such reference has been grabbed and assume that it'll happen in those and only those cases that'll have us return with changed path->mnt. That assumption is almost correct - it breaks in case of racing automounts and in even harder to hit race between following a mountpoint and a couple of mount --move. The thing is, we don't need to make that assumption at all - after the end of loop in follow_manage() we can check if path->mnt has ended up unchanged and do mntput() if needed. The BUG can be reproduced with the following test program: #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <unistd.h> #include <sys/wait.h> int main(int argc, char **argv) { int pid, ws; struct stat buf; pid = fork(); stat(argv[1], &buf); if (pid > 0) wait(&ws); return 0; } and the following procedure: (1) Mount an NFS volume that on the server has something else mounted on a subdirectory. For instance, I can mount / from my server: mount warthog:/ /mnt -t nfs4 -r On the server /data has another filesystem mounted on it, so NFS will see a change in FSID as it walks down the path, and will mark /mnt/data as being a mountpoint. This will cause the automount code to be triggered. !!! Do not look inside the mounted fs at this point !!! (2) Run the above program on a file within the submount to generate two simultaneous automount requests: /tmp/forkstat /mnt/data/testfile (3) Unmount the automounted submount: umount /mnt/data (4) Unmount the original mount: umount /mnt At this point the kernel should throw a BUG with something like the following: BUG: Dentry ffff880032e3c5c0{i=2,n=} still in use (1) [unmount of nfs4 0:12] Note that the bug appears on the root dentry of the original mount, not the mountpoint and not the submount because sys_umount() hasn't got to its final mntput_no_expire() yet, but this isn't so obvious from the call trace: [<ffffffff8117cd82>] shrink_dcache_for_umount+0x69/0x82 [<ffffffff8116160e>] generic_shutdown_super+0x37/0x15b [<ffffffffa00fae56>] ? nfs_super_return_all_delegations+0x2e/0x1b1 [nfs] [<ffffffff811617f3>] kill_anon_super+0x1d/0x7e [<ffffffffa00d0be1>] nfs4_kill_super+0x60/0xb6 [nfs] [<ffffffff81161c17>] deactivate_locked_super+0x34/0x83 [<ffffffff811629ff>] deactivate_super+0x6f/0x7b [<ffffffff81186261>] mntput_no_expire+0x18d/0x199 [<ffffffff811862a8>] mntput+0x3b/0x44 [<ffffffff81186d87>] release_mounts+0xa2/0xbf [<ffffffff811876af>] sys_umount+0x47a/0x4ba [<ffffffff8109e1ca>] ? trace_hardirqs_on_caller+0x1fd/0x22f [<ffffffff816ea86b>] system_call_fastpath+0x16/0x1b as do_umount() is inlined. However, you can see release_mounts() in there. Note also that it may be necessary to have multiple CPU cores to be able to trigger this bug. Tested-by: Jeff Layton <jlayton@redhat.com> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1 parent 50338b8 commit 8aef188

File tree

1 file changed

+16
-8
lines changed

1 file changed

+16
-8
lines changed

fs/namei.c

Lines changed: 16 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -812,19 +812,21 @@ static int follow_automount(struct path *path, unsigned flags,
812812
if (!mnt) /* mount collision */
813813
return 0;
814814

815+
if (!*need_mntput) {
816+
/* lock_mount() may release path->mnt on error */
817+
mntget(path->mnt);
818+
*need_mntput = true;
819+
}
815820
err = finish_automount(mnt, path);
816821

817822
switch (err) {
818823
case -EBUSY:
819824
/* Someone else made a mount here whilst we were busy */
820825
return 0;
821826
case 0:
822-
dput(path->dentry);
823-
if (*need_mntput)
824-
mntput(path->mnt);
827+
path_put(path);
825828
path->mnt = mnt;
826829
path->dentry = dget(mnt->mnt_root);
827-
*need_mntput = true;
828830
return 0;
829831
default:
830832
return err;
@@ -844,9 +846,10 @@ static int follow_automount(struct path *path, unsigned flags,
844846
*/
845847
static int follow_managed(struct path *path, unsigned flags)
846848
{
849+
struct vfsmount *mnt = path->mnt; /* held by caller, must be left alone */
847850
unsigned managed;
848851
bool need_mntput = false;
849-
int ret;
852+
int ret = 0;
850853

851854
/* Given that we're not holding a lock here, we retain the value in a
852855
* local variable for each dentry as we look at it so that we don't see
@@ -861,7 +864,7 @@ static int follow_managed(struct path *path, unsigned flags)
861864
BUG_ON(!path->dentry->d_op->d_manage);
862865
ret = path->dentry->d_op->d_manage(path->dentry, false);
863866
if (ret < 0)
864-
return ret == -EISDIR ? 0 : ret;
867+
break;
865868
}
866869

867870
/* Transit to a mounted filesystem. */
@@ -887,14 +890,19 @@ static int follow_managed(struct path *path, unsigned flags)
887890
if (managed & DCACHE_NEED_AUTOMOUNT) {
888891
ret = follow_automount(path, flags, &need_mntput);
889892
if (ret < 0)
890-
return ret == -EISDIR ? 0 : ret;
893+
break;
891894
continue;
892895
}
893896

894897
/* We didn't change the current path point */
895898
break;
896899
}
897-
return 0;
900+
901+
if (need_mntput && path->mnt == mnt)
902+
mntput(path->mnt);
903+
if (ret == -EISDIR)
904+
ret = 0;
905+
return ret;
898906
}
899907

900908
int follow_down_one(struct path *path)

0 commit comments

Comments
 (0)