bind mounts in fstab work improperly when pointing to ZFS #971
(I'm using bind mounts as a transitional mechanism without issues.) When it's not working, can you show what mount says?
Allow me to shed some light on this. Let's consider an old-school nfs4 export using a native Linux filesystem, one share called 'pmr':
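The fstab entries for this traditional scenario might look like the following (the device path /dev/sdb1 is an assumption for illustration):

```
# /etc/fstab - traditional NFS4 export of a native xfs filesystem
/dev/sdb1     /storage/pmr  xfs   defaults  0 0
/storage/pmr  /exports/pmr  none  bind      0 0
```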
When the system is booting, the xfs filesystem will be mounted first, followed by a bind mount from /storage/pmr to /exports/pmr. The latter is then exported via /etc/exports using nfs4, and we're all happy. Now consider a zfs-based scenario. Since there are no zfs entries in fstab, it becomes:
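In the zfs-based scenario, only the bind entry remains in fstab (paths illustrative; the dataset mountpoint is handled by zfs itself):

```
# /etc/fstab - only the bind mount; the zfs dataset is mounted later
/storage/pmr  /exports/pmr  none  bind  0 0
```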
When the system boots, a bind-type mount will be created from /storage/pmr to /exports/pmr, which effectively mounts the underlying filesystem (most likely / ) to the bind point and exports that. The clients will see the contents of an empty directory, as the exporter uses the / bind mount. On the server, the confused administrator will see the actual zfs filesystem and will scratch their head.

I don't think this is a bug in zfs, but rather a race condition between the distribution's native localfs init script and zfs. Perhaps localfs should depend on zfs and not the other way around. Alternatively, the zfs service should parse some file that tells it how the binds go, and bind after mounting the zfs filesystem. Perhaps a file in /etc/zfs/ like 'binds' would work.

Personally (sysadmin cap on), /etc/zfs/binds would work for me (together with a bit of parsing in /etc/init.d/zfs), as it's sufficiently low-tech and doesn't require changes in the actual zfs stack.
Proposed patch (only the LSB script; the others are most likely derivative):

--- etc/init.d/zfs.lsb.in.orig	2013-07-15 12:47:20.055257882 +0100
+++ etc/init.d/zfs.lsb.in	2013-07-15 12:49:44.732137370 +0100
@@ -29,6 +29,7 @@
ZFS="@sbindir@/zfs"
ZPOOL="@sbindir@/zpool"
ZPOOL_CACHE="@sysconfdir@/zfs/zpool.cache"
+ZFS_NFS4_BINDS="@sysconfdir@/zfs/binds"
# Source zfs configuration.
[ -r '/etc/default/zfs' ] && . /etc/default/zfs
@@ -78,6 +79,26 @@
log_end_msg $?
fi
+ # Create (optional) binds to the NFS4 export tree
+ if [ -e "$ZFS_NFS4_BINDS" ] ; then
+ log_begin_msg "Binding NFS4 mounts"
+ sed -e "s/#.*//" -e "/^$/d" $ZFS_NFS4_BINDS | while read LINE
+ do
+ MODE="`echo $LINE | awk '{print $1}'`"
+ SRC="`echo $LINE | awk '{print $2}'`"
+ DEST="`echo $LINE | awk '{print $3}'`"
+ case $MODE in
+ bind) MOUNTPOINT="`zfs get mountpoint $SRC | grep "$SRC" | awk '{print $3}'`"
+ mount -o $MODE $MOUNTPOINT $DEST
+ log_end_msg $?
+ ;;
+ *) echo "Unknown bind mode ($MODE) in $ZFS_NFS4_BINDS. Aborting."
+ exit 4
+ ;;
+ esac
+ done
+ fi
+
touch "$LOCKFILE"
}
@@ -85,6 +106,25 @@
{
[ ! -f "$LOCKFILE" ] && return 3
+ if [ -e "$ZFS_NFS4_BINDS" ] ; then
+ log_begin_msg "Detaching NFS4 binds"
+ sed -e "s/#.*//" -e "/^$/d" $ZFS_NFS4_BINDS | while read LINE
+ do
+ MODE="`echo $LINE | awk '{print $1}'`"
+ SRC="`echo $LINE | awk '{print $2}'`"
+ DEST="`echo $LINE | awk '{print $3}'`"
+ case $MODE in
+ bind) MOUNTPOINT="`zfs get mountpoint $SRC | grep "$SRC" | awk '{print $3}'`"
+ umount $DEST
+ log_end_msg $?
+ ;;
+ *) echo "Unknown bind mode ($MODE) in $ZFS_NFS4_BINDS. Aborting."
+ exit 4
+ ;;
+ esac
+ done
+ fi
+
log_begin_msg "Unmounting ZFS filesystems"
"$ZFS" umount -a
log_end_msg $?

$MODE may look redundant, but perhaps it could be kept for future expansion - there could be other bind types. The /etc/zfs/binds file would look like this:
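A sketch of the file, matching the MODE/SRC/DEST fields the patch parses (the dataset and export path are illustrative, not part of the original proposal):

```
# mode  source-dataset  bind-destination
bind    storage/pmr     /exports/pmr
```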
Of course the distribution source would only contain the first line. I believe this is consistent with other files in /etc/zfs. Cheers,
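A quick check of the mountpoint lookup the patch uses: piping zfs get mountpoint through grep and awk relies on the default four-column output. The fragment below runs that parsing against a captured sample (a live system would call zfs instead); zfs get -H -o value mountpoint would return the value directly and is less fragile.

```shell
#!/bin/sh
# Captured sample of default `zfs get mountpoint` output; a live system
# would run: zfs get mountpoint storage/pmr
sample='NAME         PROPERTY    VALUE         SOURCE
storage/pmr  mountpoint  /storage/pmr  default'

# The patch's parsing: match the dataset line, take column 3 (VALUE)
MOUNTPOINT=$(printf '%s\n' "$sample" | grep 'storage/pmr' | awk '{print $3}')
echo "$MOUNTPOINT"    # prints: /storage/pmr
```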
@bhodgens I agree with @jzachwieja's first comment - this is not a fault in ZoL. ZoL simply runs later than the initial mounts (which are basically the first thing that happens at bootup). There's no way we can have ZoL run that early.

@jzachwieja I'm not sure I agree with your solution though. It seems way too 'hackish' - it shouldn't be the responsibility of the ZoL init script to do this. Instead, I think it's up to the system administrator to write sufficient code to solve this in rc.local.

I vote to close this as 'not a ZoL issue'. @behlendorf ?
@FransUrbo Are you suggesting that instead of editing a config file (present, documented) you would rather ask everyone to roll their own code and manually create bind mounts? That doesn't sound like a sane systems-management practice.

When a ZoL filesystem needs to be exported over NFS4, a bind mount must be created, and no standard mechanism in GNU/Linux will allow for it if the filesystem is not present in fstab.

If you don't like my solution, that's fine, but please provide a better one or show where exactly I am incorrect. Saying something is 'hackish' and then suggesting that sysadmins 'sort it out in rc.local' isn't constructive. Regards,
The init scripts are complicated enough as they are without adding even more complexity (which would have to be maintained - for a very limited number of users). The majority (?) isn't even using NFS, so why force ZoL to work around kludges in another piece of software (the nfs daemon etc.)? Soon(er rather than later), we'd have to do the same for iSCSI, Samba and what not. ZoL is not 'do everything for everyone'.
What use is a filesystem that cannot be exported over the network? NFS4 exports are different from NFS3 exports. There is an established standard for creating them in GNU/Linux - a well-documented process that is different from the Solaris-isms still present in ZoL. I wasn't aware [...].

What bugs in other software are you referring to? Exporting NFS4 works perfectly fine in GNU/Linux. Since ZoL provides PV, VG and LV management as well as filesystem mount points in a way that is abstracted from the current device paradigm on Linux, certain steps need to be taken to make those two work together. While you are free to disagree, I still haven't seen a patch that solves the problem.

GNU/Linux nfs-kernel-server (and this is ZFS on Linux) requires mount points bound into a central exports tree. Binding is done early (and you can't make the [...]). NFS4 provides capabilities like idmapd (how would you propose to integrate [...]?).

The logical way to do it (and I have consulted this with a number of Linux sysadmins before presenting it here) is for the init script to have a mechanism to create the required bound mounts to the exports tree. The section in the init script is self-contained, fails safe (no action if the config file isn't present) and introduces the required compatibility with the host operating system - in one file that is owned by the ZFS package.

If you continue to disagree, please produce a patch that solves the issue for NFS4 and ZoL, or provide a way of exporting NFS4 including all the required export options, like the following excerpt from a production environment: [...]

Please understand [...]
This is on the todo list for us to work out (we haven't decided if we should redesign libshare or move it to the zevent daemon). NFS (in ZoL) isn't my forte, but start by looking at #1029. If you have further discussion about this, take it to the list. This isn't a support forum, and I consider this beside the point - the issue is about bind mounts not working from fstab, and there's very little we can do about that. If you have a specific problem/bug/issue with NFS in ZoL (after reading the documentation, checking the Admin Guides AND actually testing your way around), then feel free to create a new issue about it.

Just for the record, I sometimes use NFS (both v3 and v4) for testing on my test rigs, and both work just fine. Although I use rather straightforward rules, nothing like what you showed, so it's perfectly possible that such a complex share won't be possible. But take it to the list and I'm sure someone will give you hints either way.
The issue #1029 you referred me to highlights exactly the problem I've solved: there is no way to correctly set up NFS4 shares using Solaris-isms under Linux. Unless you meant the four dots at the end of your comment, I'm not going to continue this conversation with you, as it's no longer productive.
@jzachwieja Just because you're incapable of doing it doesn't mean it isn't possible (I just did it again).... @bhodgens @behlendorf How should we proceed? Should we close this issue or possibly tag it as 'Documentation'?
Unfortunately, I don't think we can wash our hands of blame here so easily. The only reason this is an issue is because ZFS behaves differently than other Linux filesystems. That makes it our problem, like it or not. I also agree with @jzachwieja that forcing people to add something to their rc.local is not a reasonable answer. There are two cases: [...]

I'm not an NFS4 expert, but I don't see any reason why this logic can't be pulled in to the zfs utilities. My suggestion would be to add another property to the dataset called [...]. But as @FransUrbo alluded to, longer term we're seriously thinking about moving all of this infrastructure to the ZED. Our hope is that it will make all of this machinery more transparent, flexible, and maintainable. But we still have some infrastructure to build before that's possible. Then we'll want to prototype it out to see if it's a good idea or not. So it's a ways off.
@behlendorf Wouldn't adding a [...]? If we insist that this is something we should deal with (and I'm still not convinced), then how about a separate init script that deals with this? I think that's a bad idea, but at least it's better than adding a new property...
Btw, adding a new property shouldn't be required. We could always do what https://github.com/zfsonlinux/zfs-auto-snapshot does - use a user property.
Yes, if we added a [...]. That said, I think you're right: the best way to go about this for now would be to use a generic user property. That way we can avoid any changes to the ZFS code proper until we figure out the best way to handle this. We should probably use something like [...]
I suspect we'll need to add additional NFS-related properties at some point as well. The current translation scheme we're using only works for the simplest configurations. It would be nice to just be able to provide the native Linux options, many of which don't have Illumos equivalents. But I digress. As for a separate script, that's OK by me. The systemd support was broken up into multiple units, and the Ubuntu packaging ships zfs-mount and zfs-share scripts.
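The user-property idea can be sketched as follows. The property name org.example:nfs4bind and the sample data are assumptions for illustration, not an upstream interface; the loop only prints the mount commands it would run, and parses a captured sample of zfs list-style output so the logic works without a live pool:

```shell
#!/bin/sh
# Sketch of the user-property approach. On a real system the input would
# come from something like:
#   zfs list -H -o name,org.example:nfs4bind -t filesystem -r <pool>
# and the source path would be resolved with:
#   zfs get -H -o value mountpoint <dataset>
# Here we assume mountpoint = /<dataset> for simplicity.

tab=$(printf '\t')

# Read "dataset<TAB>destination" pairs on stdin; print the bind-mount
# command for each dataset. '-' means the property is unset, so skip it.
emit_binds() {
    while IFS="$tab" read -r dataset dest; do
        [ "$dest" = "-" ] && continue
        echo "mount --bind /$dataset $dest"
    done
}

# Captured sample standing in for real `zfs list -H` output:
printf 'storage/pmr\t/exports/pmr\nstorage/scratch\t-\n' | emit_binds
# prints: mount --bind /storage/pmr /exports/pmr
```

A real script would execute the mount instead of echoing it, but the flow - enumerate datasets, skip unset properties, bind into the exports tree - is the whole idea.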
Is there realistic potential for this issue to be addressed? I'm in the same scenario - NFSv4 + bind mounts - and would very much like to have a reasonable, manageable solution. |
@jbnance IMO, the correct Linux solution to this problem is described at the bottom of this article on the Arch wiki. Basically, you want to add a mount option in your fstab that will cause the system to wait for the file system to be mounted.
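With the paths used earlier in the thread (illustrative), such an fstab entry might look like this, using the option name from the Arch wiki approach:

```
# /etc/fstab - delay the bind until zfs-mount.service has run
/storage/pmr  /exports/pmr  none  bind,x-systemd.requires=zfs-mount.service  0 0
```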
@afontenot thanks for the info, I'll try it out. If it works as described on other distros (especially EL7+), I would agree. I've been using the [...]
I had this problem on a media NAS box, and the solution suggested by @afontenot worked perfectly. My nfs shares could not reach into the zpool without manually "fixing" them after boot, but adding the mount option gave me a clean boot that worked fine the first time. Adding a mount option (x-systemd.requires=zfs-mount.service) seems to me a simple and sensible solution for this issue. It probably just needs to be documented and better known.
@afontenot, just chiming in to say that the approach of changing the bind mounts in fstab worked for me as well.
I'd prefer to close this issue: 0.8 will have better systemd integration, and there is a workaround in #971 (comment).
If using bind mounts from within fstab to a location within a zpool, the mount point behaves oddly, not listing the full/actual contents of the destination (just directories, from what I can tell). Might this be because fstab is parsed and the source gets mounted before zfs takes control, with the odd behavior resulting?
If mounted via mount after the system has fully booted (e.g. mount -o bind /home /zpool/home), it behaves properly.
To replicate, make an entry in fstab and reboot (or export the pool, run the mount, and reimport):
/zpool/home /home none bind,defaults 0 0
Distro: Ubuntu 12.04.1
zfsonlinux: 0.6.0rc10 from PPA
I'm using the zfs-mount and zfs-share scripts, not the 'mountall' modified binary from the PPA (which appears to require the mount=legacy option?)