Support .zfs/snapshot access via NFS #616

Closed
behlendorf opened this Issue Mar 22, 2012 · 13 comments

Member

behlendorf commented Mar 22, 2012

While the .zfs/snapshot directory is supported in the latest source, it is not yet accessible via NFS. There are a few integration issues with the Linux NFS kernel server which still need to be worked out. This issue is to track that work.

Member

behlendorf commented Mar 31, 2012

The issue appears to be that zpl_encode_fh() is generating short rather than long fids for the snapshot entries. Short fids should be used for the normal zfs filesystem and the .zfs entries. However, long fids need to be used for the snapshot directories to ensure the object set is encoded. zfs_fid() is set up to do this, but it assumes that the .zfsctl entries are part of a different filesystem from the normal filesystem. This is true on Solaris but not on Linux due to an implementation detail. We're going to need a different way to make the same check; we may also need to trigger automounting of the snapshot here.

bubble1975 commented Nov 19, 2012

Just checking back on this item - has any progress been made? Maybe rc12 has some developments on it?

Member

behlendorf commented Nov 19, 2012

@bubble1975 I'd love to see this fixed, but so far nobody has found the time to work on it.

Contributor

imp commented Jan 28, 2013

This issue hurts me badly enough to make me look into it. @behlendorf Do you believe your comment above about Linux vs Solaris behavior still holds true? It feels like the latest changes to the .zfs/snapshot hierarchy invalidated it. I've experimented with different approaches, but so far without any particular success. I will summarize my observations and update this issue later.

Member

behlendorf commented Jan 29, 2013

@imp First off, thanks for starting to work on this. This is support I'd love to be using as well but I haven't had the time to dedicate to it. Doing it right is more complicated than it initially appears.

The above comments about the .zfs directory being part of the mounted dataset on Linux are true. But upon further reflection it's not clear to me that's part of the problem. Someone still needs to dig into this and determine exactly what the problem is, so I'm looking forward to your analysis. A few things to keep in mind when looking at this:

  • Under Linux the snapshot won't be automounted until the .zfs/snapshot directory gets traversed, which invokes the automounter. Under Solaris the automount is done as part of the lookup, so a stat(2) would trigger it. Exactly how this applies to the NFS kernel server isn't totally clear to me.
  • I believe when I last left this, zfs_fid() was correctly generating long FIDs which encoded the snapshot id. See include/sys/zfs_vfsops.h:106 for a little discussion about how the NFS fids are encoded.
  • There are the usual NFS issues about traversing file systems which need to be understood in this context.
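
The automount behavior in the first point can be observed locally on the server, independent of NFS. The following is only a minimal sketch and not part of any patch discussed in this issue: it reports whether a snapshot directory appears in /proc/mounts before and after a directory traversal. The SNAPDIR variable and the path /pool/test/.zfs/snapshot/first are placeholder assumptions; substitute a real snapshot path.

```shell
#!/bin/sh
# Sketch: observe whether traversing a snapshot directory triggers the automount.
# SNAPDIR is a hypothetical example path; override it in the environment.
SNAPDIR="${SNAPDIR:-/pool/test/.zfs/snapshot/first}"

# Succeeds if the given path is listed as a mount point in /proc/mounts.
is_mounted() {
    grep -qF -- " $1 " /proc/mounts 2>/dev/null
}

# Print "mounted" or "not mounted" for the given path.
mount_state() {
    if is_mounted "$1"; then echo "mounted"; else echo "not mounted"; fi
}

echo "before traversal: $(mount_state "$SNAPDIR")"
# Listing the directory traverses into it, which is the step expected to
# invoke the automounter on Linux (a plain stat would not).
ls "$SNAPDIR/" >/dev/null 2>&1 || true
echo "after traversal: $(mount_state "$SNAPDIR")"
```

If the second line still reports "not mounted" for an existing snapshot, the automount did not happen for that access path, which is the situation the NFS kernel server appears to run into.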

illenseer commented Feb 28, 2013

Have any improvements been made? This feature is very important for us; we need to export the snapshots via NFS. Are there any workarounds?

Member

behlendorf commented Feb 28, 2013

@illenseer Short of properly implementing this for Linux, the only workaround I can think of is to manually mount the snapshot with "mount -t zfs dataset mountpoint" and then explicitly export it via NFS.
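
A minimal sketch of that workaround follows, with loud assumptions: the dataset pool/test@first, the mountpoint /mnt/test-first, the client network 192.168.1.0/24, and the "ro" export option are all hypothetical examples, and the helper export_snapshot is made up for illustration. Some setups may need further export options. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the manual workaround: mount one snapshot and export it via NFS.
# All names below (dataset, snapshot, mountpoint, client network) are hypothetical.
# By default the commands are only printed; run with RUN= (empty) as root to execute.
RUN="${RUN-echo}"

export_snapshot() {
    snap="$1"    # snapshot name, e.g. pool/test@first
    mnt="$2"     # directory to mount the snapshot on
    client="$3"  # NFS client spec as understood by exportfs
    $RUN mkdir -p "$mnt"
    $RUN mount -t zfs "$snap" "$mnt"
    $RUN exportfs -o ro "$client:$mnt"
}

export_snapshot pool/test@first /mnt/test-first 192.168.1.0/24
```

Each snapshot has to be mounted and exported individually this way, so it does not scale to datasets with many snapshots.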

Contributor

imp commented Aug 15, 2013

Just want to give a heads up to all interested parties - my colleague @andrey-ve and I have spent a lot of time recently on this issue. The problem turned out to be quite entangled. We have a working solution which we are going to push for review shortly. (All the gory details are going to be in the review.)

sriccio commented Aug 15, 2013

Just want to let you know that I've compiled the zfs-snap branch of @andrey-ve including those patches, and now snapshots are accessible from an NFS export mounted on another Linux box. Good work :)

micw commented Feb 25, 2015

This might also affect bind mounts (e.g. when trying to access a snapshot from within a Linux container or a Docker box): http://comments.gmane.org/gmane.linux.file-systems.zfs.user/20179

@behlendorf, what's the reason for having the milestone set to 0.8.0? It seems there have been working patches for more than a year. Solving this would allow many new use cases for ZFS on Linux.

micw commented Feb 25, 2015

For bind mounts (e.g. in LXC containers) the issue can be worked around by calling

mount --make-shared /path/to/mounted/volume

before creating the bind mount.

Member

behlendorf commented Feb 27, 2015

@micw Yes, it's definitely not that far off. #2797 actually contains the latest version of the patches, and they're something I hope to merge shortly after the next tag. It would be great if you could try the latest patches with a recent kernel and verify they work properly.

@behlendorf behlendorf modified the milestones: 0.6.5, 0.8.0 Feb 27, 2015

segdy commented Apr 9, 2015

@micw Can you elaborate on the workaround with "--make-shared"? I was pretty excited when I saw it but unfortunately it does not work for me:

On the "root machine" (hardware node):

hwnode: # zfs create pool/test
hwnode: # zfs set snapdir=visible pool/test
hwnode: # mount --make-shared /pool/test
hwnode: # mount -n --bind /pool/test /var/lib/vz/root/200/mnt/test
hwnode: # zfs snapshot pool/test@first
hwnode: # zfs snapshot pool/test@second
hwnode: # touch /pool/test/testfile

Then, entering the OpenVZ container (conceptually should be the same as LXC):

container: # cd /mnt/test
container: # ls -a
.  ..  .zfs  testfile
container: # cd .zfs/snapshot
container: # ls -a
.  ..  first  second
container: # cd first
-bash: cd: first: Too many levels of symbolic links

... exactly the same result as without "--make-shared" :-(

@behlendorf behlendorf modified the milestones: 0.7.0, 0.6.5 Jul 16, 2015

behlendorf added a commit to behlendorf/zfs that referenced this issue Sep 3, 2015

Support accessing .zfs/snapshot via NFS
This patch is based on the previous work done by @andrey-ve and
@yshui.  It triggers the automount by using kern_path() to traverse
to the known snapshot mount point.  Once the snapshot is mounted
NFS can access the contents of the snapshot.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2797
Issue #1655
Issue #616

behlendorf added a commit to behlendorf/zfs that referenced this issue Sep 4, 2015

Support accessing .zfs/snapshot via NFS
This patch is based on the previous work done by @andrey-ve and
@yshui.  It triggers the automount by using kern_path() to traverse
to the known snapshot mount point.  Once the snapshot is mounted
NFS can access the contents of the snapshot.

Allowing NFS clients to access the .zfs/snapshot directory would
normally mean that a root user on a client mounting an export with
'no_root_squash' would be able to use mkdir/rmdir/mv to manipulate
snapshots on the server.  To prevent configuration mistakes a
zfs_admin_snapshot module option was added which disables the
mkdir/rmdir/mv functionality.  System administrators desiring this
functionality must explicitly enable it.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #2797
Issue #1655
Issue #616
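
For administrators who want to check how the zfs_admin_snapshot option described in this commit message is set on a server, a minimal sketch follows. It assumes the option is exposed at the usual sysfs location for module parameters (/sys/module/zfs/parameters/zfs_admin_snapshot); that path, the PARAM variable and the helper admin_snapshot_state are illustrative assumptions, not taken from the patch.

```shell
#!/bin/sh
# Sketch: report the zfs_admin_snapshot setting on the NFS server.
# PARAM assumes the usual sysfs location for module parameters; verify it locally.
PARAM="${PARAM:-/sys/module/zfs/parameters/zfs_admin_snapshot}"

# Print "disabled", "enabled" or "unknown" for the parameter file given as $1.
admin_snapshot_state() {
    if [ -r "$1" ]; then
        case "$(cat "$1")" in
            0) echo "disabled" ;;
            *) echo "enabled" ;;
        esac
    else
        echo "unknown"
    fi
}

echo "mkdir/rmdir/mv in .zfs/snapshot: $(admin_snapshot_state "$PARAM")"
# To opt in as root (assumption: the parameter is writable at runtime):
#   echo 1 > "$PARAM"
```

A persistent setting would normally go in a modprobe configuration file as a line of the form "options zfs zfs_admin_snapshot=1"; treat that as an assumption to confirm against the module documentation for the installed version.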

@behlendorf behlendorf closed this in 0500e83 Sep 4, 2015

tomgarcia added a commit to tomgarcia/zfs that referenced this issue Sep 11, 2015

Support accessing .zfs/snapshot via NFS
This patch is based on the previous work done by @andrey-ve and
@yshui.  It triggers the automount by using kern_path() to traverse
to the known snapshot mount point.  Once the snapshot is mounted
NFS can access the contents of the snapshot.

Allowing NFS clients to access the .zfs/snapshot directory would
normally mean that a root user on a client mounting an export with
'no_root_squash' would be able to use mkdir/rmdir/mv to manipulate
snapshots on the server.  To prevent configuration mistakes a
zfs_admin_snapshot module option was added which disables the
mkdir/rmdir/mv functionality.  System administrators desiring this
functionality must explicitly enable it.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2797
Closes #1655
Closes #616

ryao added a commit to ClusterHQ/zfs that referenced this issue Sep 16, 2015

Support accessing .zfs/snapshot via NFS
This patch is based on the previous work done by @andrey-ve and
@yshui.  It triggers the automount by using kern_path() to traverse
to the known snapshot mount point.  Once the snapshot is mounted
NFS can access the contents of the snapshot.

Allowing NFS clients to access the .zfs/snapshot directory would
normally mean that a root user on a client mounting an export with
'no_root_squash' would be able to use mkdir/rmdir/mv to manipulate
snapshots on the server.  To prevent configuration mistakes a
zfs_admin_snapshot module option was added which disables the
mkdir/rmdir/mv functionality.  System administrators desiring this
functionality must explicitly enable it.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #2797
Closes #1655
Closes #616

behlendorf added a commit to behlendorf/zfs that referenced this issue May 21, 2018

Fix cv_timedwait timeout
Perform the already past expiration time check before updating
cvp->cv_mutex with the provided mutex.  This check only depends
on local state.  Doing it first ensures that cvp->cv_mutex will not
be updated in the timeout case or if it's ever called with an
expire_time <= now.

Reviewed-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #616