Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow mounting datasets more than once #7207

Closed
wants to merge 1 commit into from

Conversation

alek-p
Copy link
Contributor

@alek-p alek-p commented Feb 21, 2018

This is a refresh of #7120
I've added a couple extra tests along with the test that is described below.
Let's see what buildbot thinks.

Currently mounting an already mounted zfs dataset results in an
error, whereas it is typically allowed with other filesystems.
This causes some bad interactions with mount namespaces. Take
this sequence for example:

  • Create a dataset
  • Create a snapshot of the dataset
  • Create a clone of the snapshot
  • Create a new mount namespace
  • Rename the original dataset

The rename results in unmounting and remounting the clone in the
original mount namespace, however the remount fails because the
dataset is still mounted in the new mount namespace. (Note that
this means the mount in the new mount namespace is never being
unmounted, so perhaps the unmount/remount of the clone isn't
actually necessary.)

The problem here is a result of the way mounting is implemented
in the kernel module. Since it is not mounting block devices it
uses mount_nodev() instead of the usual mount_bdev(). However,
mount_nodev() is written for filesystems for which each mount is
a new instance (i.e. a new super block), and zfs should be able
to detect when a mount request can be satisfied using an existing
super block.

Change zpl_mount() to call sget() directly with it's own test
callback. Passing the objset_t object as the fs data allows
checking if a superblock already exists for the dataset, and in
that case we just need to return a new reference for the sb's
root dentry.

Signed-off-by: Seth Forshee seth.forshee@canonical.com
Closes #5796

Description

Motivation and Context

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

return 0
}

log_assert "Verify recovery from a lazy unmount is possilbe"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: possible.


log_must pkill -P $PPID tail

log_assert "Recovering from a lazy unmount is possilbe"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log_pass?


log_must pkill -P $PPID tail

log_assert "Recovering from a lazy unmount is possilbe"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: possible.

@@ -98,7 +98,7 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_6ARGS_SECURITY_INODE_INIT_SECURITY
ZFS_AC_KERNEL_CALLBACK_SECURITY_INODE_INIT_SECURITY
ZFS_AC_KERNEL_MOUNT_NODEV
ZFS_AC_KERNEL_FST_MOUNT
Copy link
Contributor

@vozhyk- vozhyk- Feb 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Looks like the .m4 file defining this is missing.

@vozhyk-
Copy link
Contributor

vozhyk- commented Feb 21, 2018

Is this also going to allow multiple normal mounts of a dataset in a single namespace?

mount -t zfs -o zfsutil pool/ds /mnt/first
mount -t zfs -o zfsutil pool/ds /mnt/second

#define SB_SILENT MS_SILENT
#endif

#ifndef SB_ACTIVE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should be a tab.

if (IS_ERR(s))
return (ERR_CAST(s));

if (!s->s_root) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s->s_root == NULL.

log_onexit cleanup

# 1. Create fs
TESTFS="$TESTPOOL/multi-mount-test"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than redefine TESTFS, which is already defined in default.cfg, how about using the existing definitition and $TESTPOOL/$TESTFS. Or alternately TESTDS=$TESTPOOL/multi-mount-test to avoid the reusing the variable name which could otherwise be confusing.


# 2. Create and hold open file in filesystem
FILENAME="$MNTPFS/file"
log_must dd if=/dev/urandom of=$FILENAME bs=128k count=1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mkfile is perhaps a simpler alternative here.

TAILPPID=$!

# 3. Lazy umount
log_must umount -l $MNTPFS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also check that it was removed from the namespace.

# 5. Verify multiple mounts of the same dataset are possible
log_must mkdir $MNTFS2
log_must mount -t zfs zfsutil $TESTFS $MNTPFS2
log_must mkdir $MNTFS2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean $MNTFS3 here.


# 5. Verify multiple mounts of the same dataset are possible
log_must mkdir $MNTFS2
log_must mount -t zfs zfsutil $TESTFS $MNTPFS2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be -o zfsutil.

@behlendorf
Copy link
Contributor

Is this also going to allow multiple normal mounts of a dataset in a single namespace?

@vozhyk- yes.

@alek-p alek-p force-pushed the multimount branch 4 times, most recently from 12843f1 to 93ba799 Compare February 21, 2018 23:58
@alek-p
Copy link
Contributor Author

alek-p commented Feb 22, 2018

I think I've addressed all of the review comments now


static struct super_block *
zpl_mount_impl(struct file_system_type *fs_type, int flags,
zfs_mnt_t zm)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this fits on the previous line, and you should pass this as a zfs_mnt_t *.

}
#else
static int
zpl_get_sb(struct file_system_type *fs_type, int flags,
const char *osname, void *data, struct vfsmount *mnt)
{
zfs_mnt_t zm = { .mnt_osname = osname, .mnt_data = data };

return (get_sb_nodev(fs_type, flags, &zm, zpl_fill_super, mnt));
struct super_block *sb = zpl_mount_impl(zm);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be, zpl_mount_impl(fs_type, flags, &zm).

@alek-p alek-p force-pushed the multimount branch 5 times, most recently from 690a924 to 058ed29 Compare February 22, 2018 01:41
@alek-p
Copy link
Contributor Author

alek-p commented Feb 22, 2018

thanks for taking another look, I've updated the patch with the latest feedback and I've also added the "bind mount, then rename" test from Seth's original PR

objset_t *os;
int err;

err = dmu_objset_hold(zm.mnt_osname, FTAG, &os);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/zm./zm->/

return (ERR_CAST(s));

if (s->s_root == NULL) {
err = zpl_fill_super(s, &zm, flags & SB_SILENT ? 1 : 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/&zm/zm/

@alek-p
Copy link
Contributor Author

alek-p commented Feb 23, 2018

I think I've addressed all the build issues, not sure what's going on w/ the tests though.

@behlendorf
Copy link
Contributor

@alek-p my guess is they all panicked while running the zpool_create test group. I'd suggest giving it a spin locally and see if you can reproduce the issue.

sudo ./scripts/zfs.sh
./scripts/zfs-tests.sh -vx -T zpool_create

@alek-p
Copy link
Contributor Author

alek-p commented Feb 23, 2018

no panic, but zpool_create_024_pos is locking up the kernel. I'll look into it

static struct dentry *
zpl_mount(struct file_system_type *fs_type, int flags,
const char *osname, void *data)
{
zfs_mnt_t zm = { .mnt_osname = osname, .mnt_data = data };

return (mount_nodev(fs_type, flags, &zm, zpl_fill_super));
struct super_block *sb = zpl_mount_impl(fs_type, flags, &zm);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There needs to be an:

if (IS_ERR(sb))
        return (PTR_ERR(sb));

check here to avoid NULL dereferencing sb->s_root in the next line. Otherwise this can happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for all the help @behlendorf
I've added something similar here and in zpl_test_super(). I'm now seeing test failures in the cli_root group, looking into those.

@alek-p
Copy link
Contributor Author

alek-p commented Mar 7, 2018

zpool_import test group has zpool_import_012_pos and zpool_import_rename_001_pos failing.
zpool_import_errata3 now seems to cause a NULL pointer dereference so will be looking into that first

Copy link
Contributor

@tcaputi tcaputi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I think this is correct. I am not an expert on the Linux kernel's mounting code, but I understand the fix and this seems reasonable. Just a few things I would like to see cleaned up before it gets merged.

}
s->s_flags |= SB_ACTIVE;
} else if ((flags ^ s->s_flags) & SB_RDONLY) {
return (ERR_PTR(-EBUSY));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaks locked super?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, yes we're missing a call to deactivate_locked_super here.

log_must zfs rename $TESTDS $RENAMEFS
log_must zfs rename $RENAMEFS $TESTDS

log_pass "Multiple mounts are possible"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In sections 5 and 6 I would want a check for the created files as well (just to be sure).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alek-p in addition to addressing @tcaputi's comment when you refreshed this you accidentally dropped the previous fix and need to add it back in.

}
s->s_flags |= SB_ACTIVE;
} else if ((flags ^ s->s_flags) & SB_RDONLY) {
return (ERR_PTR(-EBUSY));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, yes we're missing a call to deactivate_locked_super here.

log_must zfs rename $TESTDS $RENAMEFS
log_must zfs rename $RENAMEFS $TESTDS

log_pass "Multiple mounts are possible"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

@behlendorf
Copy link
Contributor

@alek-p when you get a chance the only thing holding up this PR is to run down the CentOS 6 zfs_multi_mount test failure.

Test: /usr/share/zfs/zfs-tests/tests/functional/cli_root/zfs_mount/zfs_multi_mount (run as root) [10:00] [KILLED]
21:49:56.41 ASSERTION: Verify multiple mounts into one namespace are possible
21:49:56.45 SUCCESS: zfs create testpool/multi-mount-test
21:49:56.46 SUCCESS: mkfile 128k /testpool/multi-mount-test/file
21:49:56.47 SUCCESS: umount -l /testpool/multi-mount-test
21:49:56.47 tail: cannot watch `/testpool/multi-mount-test/file': No such file or directory

@alek-p
Copy link
Contributor Author

alek-p commented Apr 2, 2018

I haven't forgotten about this PR, but I was preempted with other work. I should be able to return to this task shortly.

@alek-p alek-p force-pushed the multimount branch 5 times, most recently from 50cf75a to 230be2c Compare April 10, 2018 01:32
@alek-p
Copy link
Contributor Author

alek-p commented Apr 11, 2018

Thanks for all the help here @behlendorf, I think this one is ready to go now. I had to change the test so that execution order of commands is guaranteed.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thank's for wrapping this up!

if [ ! -f $FILENAME ]; then
log_fail "Rename failed"
fi
log_must zfs rename $RENAMEFS $TESTDS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, one last thing. We need a cleanup function to make sure this new filesystem has been unmounted and destroyed when this test case exits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


function cleanup
{
datasetexists $TESTDS && datasetdestroy $TESTDS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no datasetdestroy funciton. You can use the destroy_dataset hlper function for this. You should also make sure to explicitly unmount the additional mountpoints first.

        destroy_dataset "$TESTDS" "-f"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn it, I left out log_must on that destroy so didn't notice that function didn't exist... Made updates to cleanup and it's looking better now:

SUCCESS: umount /testpool/multi-mount-test
SUCCESS: umount /testpool/multi-mount-test-second
SUCCESS: umount /testpool/multi-mount-test-third
SUCCESS: zfs destroy -f testpool/multi-mount-test
SUCCESS: destroy_dataset testpool/multi-mount-test -f```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better! Thanks, just waiting on the test results now.

Currently mounting an already mounted zfs dataset results in an                                                                              error, whereas it is typically allowed with other filesystems.                                                                               This causes some bad interactions with mount namespaces. Take                                                                                this sequence for example:                                                                                                                                                                                                                                                                 - Create a dataset                                                                                                                           - Create a snapshot of the dataset                                                                                                           - Create a clone of the snapshot                                                                                                             - Create a new mount namespace                                                                                                               - Rename the original dataset                                                                                                                                                                                                                                                            The rename results in unmounting and remounting the clone in the                                                                             original mount namespace, however the remount fails because the                                                                              dataset is still mounted in the new mount namespace. (Note that                                                                              this means the mount in the new mount namespace is never being                                                                               unmounted, so perhaps the unmount/remount of the clone isn't                                                                                 actually necessary.)                                                                                                                                                                                                                                                                      The problem here is a result of the way mounting is implemented                                                                              in the kernel module. Since it is not mounting block devices it                                                                              uses mount_nodev() instead of the usual mount_bdev(). However,                                                                               mount_nodev() is written for filesystems for which each mount is                                                                             a new instance (i.e. a new super block), and zfs should be able                                                                              to detect when a mount request can be satisfied using an existing                                                                            super block.                                                                                                                                                                                                                                                                              Change zpl_mount() to call sget() directly with it's own test                                                                                callback. Passing the objset_t object as the fs data allows                                                                                  checking if a superblock already exists for the dataset, and in                                                                              that case we just need to return a new reference for the sb's                                                                                root dentry.                                                                                                                                                                                                                                                                              Signed-off-by: Seth Forshee <seth.forshee@canonical.com>                                                                                     Closes openzfs#5796
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tcaputi can you look at this one final time.

@tcaputi
Copy link
Contributor

tcaputi commented Apr 12, 2018

LGTM (to the best of my ability).

@codecov
Copy link

codecov bot commented Apr 13, 2018

Codecov Report

Merging #7207 into master will increase coverage by 0.19%.
The diff coverage is 84%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7207      +/-   ##
==========================================
+ Coverage    76.2%   76.39%   +0.19%     
==========================================
  Files         330      330              
  Lines      104294   104259      -35     
==========================================
+ Hits        79474    79647     +173     
+ Misses      24820    24612     -208
Flag Coverage Δ
#kernel 76.33% <84%> (+0.39%) ⬆️
#user 65.72% <ø> (+0.31%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7fab636...12bffa9. Read the comment docs.

@alek-p alek-p deleted the multimount branch April 13, 2018 18:17
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request May 3, 2018
Currently mounting an already mounted zfs dataset results in an
error, whereas it is typically allowed with other filesystems.
This causes some bad interactions with mount namespaces. Take
this sequence for example:

- Create a dataset
- Create a snapshot of the dataset
- Create a clone of the snapshot
- Create a new mount namespace
- Rename the original dataset

The rename results in unmounting and remounting the clone in the
original mount namespace, however the remount fails because the
dataset is still mounted in the new mount namespace. (Note that
this means the mount in the new mount namespace is never being
unmounted, so perhaps the unmount/remount of the clone isn't
actually necessary.)

The problem here is a result of the way mounting is implemented
in the kernel module. Since it is not mounting block devices it
uses mount_nodev() instead of the usual mount_bdev(). However,
mount_nodev() is written for filesystems for which each mount is
a new instance (i.e. a new super block), and zfs should be able
to detect when a mount request can be satisfied using an existing
super block.

Change zpl_mount() to call sget() directly with it's own test
callback. Passing the objset_t object as the fs data allows
checking if a superblock already exists for the dataset, and in
that case we just need to return a new reference for the sb's
root dentry.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Closes openzfs#5796
Closes openzfs#7207
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request May 3, 2018
Currently mounting an already mounted zfs dataset results in an
error, whereas it is typically allowed with other filesystems.
This causes some bad interactions with mount namespaces. Take
this sequence for example:

- Create a dataset
- Create a snapshot of the dataset
- Create a clone of the snapshot
- Create a new mount namespace
- Rename the original dataset

The rename results in unmounting and remounting the clone in the
original mount namespace, however the remount fails because the
dataset is still mounted in the new mount namespace. (Note that
this means the mount in the new mount namespace is never being
unmounted, so perhaps the unmount/remount of the clone isn't
actually necessary.)

The problem here is a result of the way mounting is implemented
in the kernel module. Since it is not mounting block devices it
uses mount_nodev() instead of the usual mount_bdev(). However,
mount_nodev() is written for filesystems for which each mount is
a new instance (i.e. a new super block), and zfs should be able
to detect when a mount request can be satisfied using an existing
super block.

Change zpl_mount() to call sget() directly with it's own test
callback. Passing the objset_t object as the fs data allows
checking if a superblock already exists for the dataset, and in
that case we just need to return a new reference for the sb's
root dentry.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Closes openzfs#5796
Closes openzfs#7207
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request May 4, 2018
Currently mounting an already mounted zfs dataset results in an
error, whereas it is typically allowed with other filesystems.
This causes some bad interactions with mount namespaces. Take
this sequence for example:

- Create a dataset
- Create a snapshot of the dataset
- Create a clone of the snapshot
- Create a new mount namespace
- Rename the original dataset

The rename results in unmounting and remounting the clone in the
original mount namespace, however the remount fails because the
dataset is still mounted in the new mount namespace. (Note that
this means the mount in the new mount namespace is never being
unmounted, so perhaps the unmount/remount of the clone isn't
actually necessary.)

The problem here is a result of the way mounting is implemented
in the kernel module. Since it is not mounting block devices it
uses mount_nodev() instead of the usual mount_bdev(). However,
mount_nodev() is written for filesystems for which each mount is
a new instance (i.e. a new super block), and zfs should be able
to detect when a mount request can be satisfied using an existing
super block.

Change zpl_mount() to call sget() directly with it's own test
callback. Passing the objset_t object as the fs data allows
checking if a superblock already exists for the dataset, and in
that case we just need to return a new reference for the sb's
root dentry.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Closes openzfs#5796
Closes openzfs#7207
tonyhutter pushed a commit that referenced this pull request May 10, 2018
Currently mounting an already mounted zfs dataset results in an
error, whereas it is typically allowed with other filesystems.
This causes some bad interactions with mount namespaces. Take
this sequence for example:

- Create a dataset
- Create a snapshot of the dataset
- Create a clone of the snapshot
- Create a new mount namespace
- Rename the original dataset

The rename results in unmounting and remounting the clone in the
original mount namespace, however the remount fails because the
dataset is still mounted in the new mount namespace. (Note that
this means the mount in the new mount namespace is never being
unmounted, so perhaps the unmount/remount of the clone isn't
actually necessary.)

The problem here is a result of the way mounting is implemented
in the kernel module. Since it is not mounting block devices it
uses mount_nodev() instead of the usual mount_bdev(). However,
mount_nodev() is written for filesystems for which each mount is
a new instance (i.e. a new super block), and zfs should be able
to detect when a mount request can be satisfied using an existing
super block.

Change zpl_mount() to call sget() directly with it's own test
callback. Passing the objset_t object as the fs data allows
checking if a superblock already exists for the dataset, and in
that case we just need to return a new reference for the sb's
root dentry.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Closes #5796
Closes #7207
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ZFS allows unmounting filesystem that is still in use
4 participants