
Snapshot Directory (.zfs) #173

Closed
edwinvaneggelen opened this Issue March 23, 2011 · 80 comments
edwinvaneggelen

I was unable to find the .zfs directory. Normally this directory is present after snapshots are created. However, I could not find it with the rc-2 release after creating snapshots.

Brian Behlendorf
Owner

The .zfs snapshot directory is not yet supported. It's on the list of development items which need to be worked on. Snapshots can still be mounted directly read-only using 'mount -t zfs pool/dataset@snap /mntpoint'.

http://zfsonlinux.org/zfs-development-items.html

Brian Behlendorf
Owner

Summary of Required Work

While snapshots do work, the .zfs snapshot directory has not yet been implemented. Snapshots can be manually mounted as needed with the mount command, mount -t zfs dataset@snap /mnt/snap. To implement the .zfs snapshot directory a special .zfs inode must be created. This inode will have custom hooks which allow it to list the available snapshots as part of readdir(), and when a listed snapshot is traversed the snapshot dataset must be mounted on demand. This should all be doable using the existing Linux automounter framework, which has the advantage of simplifying the zfs code.
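
For illustration only, a rough sketch of what the readdir hook for the .zfs/snapshot directory might look like; the helper zpl_snapshot_name() and the constant ZPL_SNAPDIR_INO_BASE are hypothetical placeholders for this example, not part of the existing code:

/*
 * Sketch: emit one directory entry per snapshot.  Traversing an entry
 * would then trigger the on-demand (auto)mount of that snapshot.
 * zpl_snapshot_name() is a hypothetical helper that copies the name of
 * the i-th snapshot into buf and returns 0, or -ENOENT when done.
 */
static int
zpl_snapdir_readdir(struct file *filp, void *dirent, filldir_t filldir)
{
        struct inode *dir = filp->f_dentry->d_inode;
        char buf[256];
        loff_t pos = filp->f_pos;

        if (pos == 0) {
                if (filldir(dirent, ".", 1, pos, dir->i_ino, DT_DIR))
                        return (0);
                filp->f_pos = ++pos;
        }

        if (pos == 1) {
                if (filldir(dirent, "..", 2, pos,
                    parent_ino(filp->f_dentry), DT_DIR))
                        return (0);
                filp->f_pos = ++pos;
        }

        while (zpl_snapshot_name(dir, pos - 2, buf, sizeof (buf)) == 0) {
                if (filldir(dirent, buf, strlen(buf), pos,
                    ZPL_SNAPDIR_INO_BASE - (pos - 2), DT_DIR))
                        break;
                filp->f_pos = ++pos;
        }

        return (0);
}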

bziller
bziller commented May 06, 2011

Following your advice to use automounter I came up with the following:

#!/bin/bash

# /etc/auto.zfs
# This file must be executable to work! chmod 755!

key="$1"
opts="-fstype=zfs"

for P in /bin /sbin /usr/bin /usr/sbin
do
    if [ -x $P/zfs ]
    then
        ZFS=$P/zfs
        break
    fi
done

[ -x $ZFS ] || exit 1

ZFS="$ZFS list -rHt snapshot -o name $key"

$ZFS | LC_ALL=C sort -k 1 | \
    awk -v key="$key" -v opts="$opts" -- '
    BEGIN   { ORS=""; first=1 }
        { if (first) { print opts; first=0 }; s=$1; sub(key, "", s); sub(/@/, "/", s); print " \\\n\t" s " :" $1 }
    END { if (!first) print "\n"; else exit 1 } ' 

and to /etc/auto.master add

/.zfs  /etc/auto.zfs

Snapshots can then be easily accessed through /.zfs/poolname/fsname/snapshotname.
While this does not give us the .zfs directory inside the filesystem itself, it at least provides an easy way to access the snapshots for now.

Brian Behlendorf
Owner

Neat. Thanks for posting this. As you say it provides a handy way to get to the snapshots until we get the .zfs directory in place.

Rohan Puri

Hi Brian, I would like to work on this.

Brian Behlendorf
Owner

Sounds good to me! I'd very much like to see this get done; it's a big deal for people who want to use ZFS for an NFS server. I've tried to describe my initial thinking at a high level in the previous comment:

https://github.com/behlendorf/zfs/issues/173#issuecomment-1095388

Why don't you dig into the nitty-gritty details of this, and we can discuss any concerns, problems, or issues you run into.

Rohan Puri

Thank you Brian, I will start on this and discuss if I have any problems. FYI, I have created a branch named 'snapshot' in my fork of your 'zfs' repo and will be working on this in that branch.

Rohan Puri

Hi Brian,

I am done with the snapshot automounting framework. When a snapshot, for example 'snap1', is created on a pool named 'tank' which is mounted by default on '/tank', one can access the contents of the snapshot by cd'ing to '/tank/.zfs/snapshot/snap1'. The implementation uses the Linux automount framework as you suggested. Also, when someone destroys this dataset, the snapshot is unmounted; and when someone destroys the pool while the snapshot is mounted, the pool can still be destroyed. Multiple snapshot mounts/unmounts work. The other places where snapshot unmount is called are rename and promote, and those work now as well.

But there is one issue: the functions which I am calling are GPL-exported Linux kernel functions, which conflicts with the ZFS CDDL license. Currently, to check the implementation, I changed the CDDL license to GPL.

One way of solving this issue is to write wrapper functions in the SPL module (which is GPL licensed), export them from the SPL, and use those in ZFS instead of directly calling the GPL-exported symbols of the Linux kernel. But I want to know your opinion on this.

These symbols are: vfs_kern_mount(), do_add_mount(), and mark_mounts_for_expiry().

BTW, I am currently working on access to the auto-mounted snapshots through NFS.

The link to the branch is: https://github.com/rohan-puri/zfs/tree/snapshot.

Please have a look at the branch if you get time, and let me know whether the implementation approach seems correct.

Brian Behlendorf
Owner

Hi Rohan,

Thanks again for working on this. Here are my initial review comments and questions:

  • This isn't exactly what I had in mind. I must not have explained myself well, so let me try again. What I want to do is integrate ZFS with the generic Linux automounter to manage snapshots as autofs mount points. See man automount. Basically, we should be able to set up an automount map (/etc/auto.zfs) which describes the .zfs snapshot directory for a dataset. We can then provide a map-type of program which gets passed the proper key and returns the correct snapshot entry. Then when the snapshots are accessed we can rely on the automounter to mount them read-only and umount them after they are idle. On the in-kernel ZFS side we need some minimal support to show this .zfs directory and the list of available snapshots. There are obviously some details still to be worked out, but that's basically the rough outline I'd like to pursue. This approach has several advantages.

    • Once we're using the standard Linux automounter there should be no need to check /proc/mounts (PROC_MOUNTS) instead of /etc/mnttab (MNTTAB). By using the standard mount utility and mount.zfs helper to mount the snapshots we ensure /etc/mnttab is properly locked and updated.

    • We don't need to use any GPL-only symbols such as vfs_kern_mount(), do_add_mount(), and mark_mounts_for_expiry().

    • This should require fewer changes to the ZFS kernel code and be less complex.

    • Using the automounter is the Linux way; it should be more understandable to the general Linux community.

  • Add the full CDDL header to zpl_snap.c. We want it to be clear that all of the code being added to the ZFS repository is licensed under the CDDL.

  • As an aside, if you want to change the module license for debugging purposes you can just update the 'License' line in the META file. Change CDDL to GPL and the whole module will be built as GPL. This way you don't need to modify all the files individually. Obviously, this is only for local debugging.

Jeremy Sanders

Would this automounter idea make it impossible to see the .zfs directories over nfs?

Brian Behlendorf
Owner

Good question, I hope not. We should see what the current Linux behavior is regarding automounts on NFS servers. There's no reason we can't use ext4 for this today and see if traversing into an automount-mapped directory via NFS triggers the automount. If it doesn't, we'll need to figure out what we can do about it; we absolutely want the .zfs directory to work for NFS mounts.

Gunnar Beutner
Collaborator

I've come across this issue while implementing libshare. Assuming auto-mounted directories aren't visible (haven't tested it yet) a possible work-around would be to use the "crossmnt" NFS option, although that would have the side effect of making other sub-volumes accessible via NFS which is different from what Solaris does.

Rohan Puri

Hello Brian,

I tried using bziller's script posted above and was able to mount the ZFS snapshots using the Linux automounter. We can make changes to the script and write minimal kernel code to show the .zfs and snapshot dir listings.

I agree with the approach you have provided.

The only thing we need to take care of is that unmounting happens not only when the mounted snapshot filesystem is idle, but also in the following cases:

  1. When snapshot is destroyed.
  2. When the filesystem of which the snapshot was taken is destroyed.
  3. When the pool is destroyed that had the snapshots of one or more filesystem/s.
  4. When a snapshot is renamed.
  5. When a promote command is used.

When we use the Linux automounter, to force expiry of a mountpoint we need to send the USR1 signal to automount. The command is:
killall -USR1 automount

It unmounts the unused snapshots; I have verified this.

Now the thing is we need to trigger this command for each of the above cases.

Need your feedback on this.

Brian Behlendorf
Owner

Wonderful, I'm glad to have you working on this.

I also completely agree we need to handle the 5 cases you're describing. I think the cleanest way to do this will be to update the zfs utilities to perform the umount. In all the cases you're concerned about, the umount is needed because a user ran some zfs command. In the context of that command you can call something like unshare_unmount() to do the umount. This has the advantage of also cleanly tearing down any NFS/SMB share. A sketch of the idea follows below.

I don't think you need to inform the automounter of this in any way, but I haven't checked, so I could be wrong about that.
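
For illustration only, a minimal userland sketch of that idea: build the snapshot's .zfs mountpoint path and unmount it from within the zfs command before the destroy/rename/promote proceeds. This is not the actual unshare_unmount() code path; the function and its arguments here are made up for the example, and real code would also tear down any NFS/SMB shares.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Sketch: unmount an automounted snapshot under <mntpt>/.zfs/snapshot/<snap>. */
static int
snapshot_unmount(const char *mntpt, const char *snap)
{
        char path[4096];

        (void) snprintf(path, sizeof (path), "%s/.zfs/snapshot/%s",
            mntpt, snap);

        /* EINVAL/ENOENT just mean it wasn't mounted; ignore those. */
        if (umount2(path, MNT_DETACH) != 0 &&
            errno != EINVAL && errno != ENOENT) {
                (void) fprintf(stderr, "cannot unmount '%s': %s\n",
                    path, strerror(errno));
                return (-1);
        }

        return (0);
}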

Gunnar Beutner
Collaborator

Ideally the snapshot directory feature should work across cgroups/OpenVZ containers, so (end-)users can access snapshots when using ZFS datasets to store the root filesystem for containers.

Rohan Puri

Hi Brian,

I have implemented the minimal in-kernel directory hierarchy for each filesystem in my fork (new snapshot branch) of your zfs repository, which supports creation of the .zfs dir, the snapshot dir, and the snapshot entry dirs.

I was playing with the Linux automounter and am facing some issues:

bziller above used an indirect map, in which we need to specify a mount-point in the auto.master file under which the ZFS snapshot datasets get mounted (the list is generated by passing the key, which in this case is the filesystem dataset, to the auto.zfs map file that is specific to ZFS).

In this case bziller solved the problem using /.zfs as the autofs mountpoint.

But each snapshot needs to be mounted under the .zfs/snapshot dir of its mount-point, which is different for each file system.

So this autofs mount-point has to be different for each individual zfs file system, under which we will mount snapshots related to that fs later on (using some kind of script auto.zfs as you said).

So the problem here is: which mountpoint do we need to specify in the auto.master file?

  1. Specify the mount-point as /fs-path-to-mntpt/.zfs/snapshot (in this case the in-kernel support for the minimal dir hierarchy is also not required, as autofs takes care of creation). The problem is that this list will vary, so on creation of each filesystem or pool (via the zfs/zpool utility) we need to edit the file and restart the automount service.

  2. Use '/' as the mount-point (we cannot do this).

  3. Can we execute shell commands in auto.master so that we can get the list and do some string appending to obtain the proper mount-point? (Execution of commands is not supported by auto.master; only auto.zfs itself is executable.)

Need your view on this.

Rudd-O
Rudd-O commented July 06, 2011

This will be much easier to do once we integrate with systemd. Systemd will take care of doing the right thing with respect to configuring the Linux kernel automounter -- all we will have to do is simply export the filesystems over NFS and voila.

Brian Behlendorf
Owner

In my view the cleanest solution will be your number 1 above.

When a new zfs filesystem is created it will be automatically added to /etc/auto.master and the automount daemon signaled to pick up the change. Conversely, when a dataset is destroyed it must be removed and the automount daemon signaled. For example, if we create the filesystem tank/fish the /etc/auto.master would be updated like this.

/tank/fish/.zfs/snapshot        /etc/auto.zfs        -zpool=tank

The /etc/auto.zfs script can then be used as a generic indirect map as described above. However, since it would be good to validate the key against the known set of snapshots we also need to make the pool name available to the script. I believe this can be done by passing it as an option to the map. The man page says arguments with leading dashes are considered options for the maps but I haven't tested this.

Sound good?

Ulrich Petri

To play the devil's advocate: I can think of quite a few sysadmins who wouldn't take kindly at all to "some random filesystem" changing system config files.

Isn't this a situation similar to the way mounting of zfs filesystems is handled?
They also "just get mounted" without zfs create modifying /etc/fstab.

Brian Behlendorf
Owner

The trouble is we want to leverage the automount daemon to automatically do the mount for us so we don't need to have all the snapshots mounted all the time. For that to work we need to keep the automount daemon aware of the available snapshots via the config file.

Ulrich Petri

I assumed (naively, I'm sure) that there would be some kind of API that would allow one to dynamically register/remove automount maps without having to modify the config file.

Brian Behlendorf
Owner

If only that were true. :)

K Henriksson

How about running multiple instances of the automount daemon? Then ZFS could have its own separate auto.master file just for handling its own snapshot mounts.

Rudd-O

I believe the automount daemon is being phased out in favor of systemd units. Those should be used first. We must write a small helper to inject units into systemd without touching configuration or unit files.


Rohan Puri

Hello Rudd-O, I agree that we should leverage systemd instead of making changes to the current infrastructure (the automount daemon). But not all systems come with systemd, in which case we must provide an alternate way.

Hello Brian,

I do agree with ulope's point. Also, when I was working on this earlier and trying to implement the first solution (#173 (comment)) that you mentioned, even after restarting the automount daemon I was not seeing the changes; they were reflected only after a reboot.

NFS, CIFS, and other such filesystems make use of in-kernel mounts, in which case we don't have to rely on automount for mounting.
As per #173 (comment), we can implement this support with an in-kernel mount: snapshots would be mounted only when they are accessed through their default mount path in the .zfs dir, and each mount would be associated with a timer. When the timer expires and the mountpoint dir is not in use, the unmount is triggered. A rough sketch of the expiry idea follows below.

All five of the above cases in which the unmount must be triggered can also be covered by this approach.
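
A minimal sketch of that expiry mechanism using a Linux delayed workqueue; the zfs_snap_mount_t structure, the zfs_snap_unmount() helper, and the timeout value are assumptions made up for this example, not existing code:

#include <linux/workqueue.h>

#define	ZFS_SNAP_EXPIRE_SEC	300	/* assumed 5 minute idle timeout */

typedef struct zfs_snap_mount {
        struct delayed_work	sm_work;	/* expiry timer */
        char			sm_name[256];	/* "pool/fs@snap" for unmount */
        int			sm_busy;	/* simplified in-use count */
} zfs_snap_mount_t;

static void
zfs_snap_expire(struct work_struct *work)
{
        struct delayed_work *dw = to_delayed_work(work);
        zfs_snap_mount_t *sm = container_of(dw, zfs_snap_mount_t, sm_work);

        if (sm->sm_busy) {
                /* Still in use: rearm the timer and check again later. */
                schedule_delayed_work(&sm->sm_work, ZFS_SNAP_EXPIRE_SEC * HZ);
                return;
        }

        zfs_snap_unmount(sm);	/* hypothetical unmount helper */
}

static void
zfs_snap_mounted(zfs_snap_mount_t *sm)
{
        /* Arm the expiry timer right after the on-demand mount succeeds. */
        INIT_DELAYED_WORK(&sm->sm_work, zfs_snap_expire);
        schedule_delayed_work(&sm->sm_work, ZFS_SNAP_EXPIRE_SEC * HZ);
}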

Need your input :)
Regards,
Rohan

Rudd-O

I totally agree there has got to be an alternate non-systemd way. It'll probably mean some code duplication for a couple of years. It's okay.

Rohan Puri

We can avoid this by following the approach described in #173 (comment); I need Brian's input on it though.

What's your opinion on that?

Brian Behlendorf
Owner

So I think there are a few things to consider here. I agree that using the automounter, while it seemed desirable on the surface, appears to be causing more trouble than it's worth. Implementing the .zfs snapshot directory by mounting the snapshot via a kernel upcall during .zfs path traversal seems like a reasonable approach.

However, it's now clear to me why Solaris does something entirely different. If we mount the snapshots like normal filesystems under .zfs they will not be available from nfsv3 clients because they will have a different fsid. Since this is probably a pretty common use case it may be worth reimplementing the Solaris solution. That said, I wouldn't object to including your proposed solution as a stop gap for the short to medium term.

Rudd-O
Rohan Puri

Thank you Brian, I agree with Rudd-O :)

I do agree with what you are saying. I had done some earlier work on it; I will start again and complete it. I understand there will be a problem with .zfs dir access through NFS, since the mounted snapshot will have its own super_block structure and hence also its own fsid.

I had a brief look at it and I think:

sb->s_dev is the value which we assign to the fsid field of the vattr struct and also return in statfs super_operations.

Tracing this field, I found that it is initialized in the mount_nodev()/get_sb_nodev() operation, which calls the sget() method to get the super_block. sget() takes a "set" function pointer as its third parameter, which is the setup callback.

This callback function is responsible for initializing the sb->s_dev field.

By default, sget() is called with the "set_anon_super" set callback, which itself takes care of setting the sb->s_dev field to the proper value.

Proposed solution:

If we write our own set callback, we can check for a snapshot, and if the mount is for a snapshot we set the sb->s_dev field to the sb->s_dev of the dataset the snapshot belongs to. This ensures that the fsid of the snapshot and the fsid of its parent dataset are the same, which solves our problem, and the snapshots will then be accessible through NFS as well.

Also, many filesystems such as ceph and cifs have written their own "set" callback functions. A sketch of the idea follows below.
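
For illustration, a sketch of such a set callback; the zfs_mntdata_t structure carrying the parent's super_block is a hypothetical stand-in for whatever mount data the zfs mount path would pass down:

/* Hypothetical mount data passed to sget() from the zfs mount path. */
typedef struct zfs_mntdata {
        boolean_t		zm_is_snapshot;
        struct super_block	*zm_parent_sb;	/* sb of the parent dataset */
} zfs_mntdata_t;

static int
zfs_set_super(struct super_block *sb, void *data)
{
        zfs_mntdata_t *zm = data;
        int error;

        /* Let the kernel assign an anonymous dev_t as usual. */
        error = set_anon_super(sb, NULL);
        if (error)
                return (error);

        /*
         * For a snapshot, reuse the parent dataset's s_dev so that the
         * fsid reported over NFS matches the parent filesystem.
         */
        if (zm->zm_is_snapshot)
                sb->s_dev = zm->zm_parent_sb->s_dev;

        return (0);
}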

Need your comments.

Brian Behlendorf
Owner

Normally I'd say that's a very bad idea. Most Linux filesystems use the fsid and the inode number to generate a unique server wide file handle for nfs. That means that if you have two filesystems exported with the same fsid you risk having non-unique handles and potentially accessing the wrong files. That's very dangerous.

However, I just took another look at the zfs_fid() function which is used by zfs to generate file handles for nfs, and it does things differently for exactly this reason. The file handles generated for the .zfs snapshot directory will include not only an inode and generation number but also the objset_id. So we won't have the issue described above.

So I'm on board with your proposed fix for setting the sb->s_dev field for snapshots to match their parents. This should allow the nfs kernel server to traverse the entire dataset and children as if they were one filesystem while at the same time making sure there are unique server wide file handles. However, rather than registering our own callback we should be able to safely set sb->s_dev in the dmu_objset_is_snapshot() case of zfs_domount(). This will be run as part of mount_nodev().
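
In other words, roughly the following, run from zfs_domount() after mount_nodev()/set_anon_super() has assigned an anonymous s_dev; this is a sketch only and the zfs_sb_t field names used here are approximate:

static void
zfs_snapshot_fix_fsid(struct super_block *sb, zfs_sb_t *zsb)
{
        if (dmu_objset_is_snapshot(zsb->z_os)) {
                /*
                 * Reuse the parent dataset's dev_t so the snapshot reports
                 * the same fsid; file handles remain unique server-wide
                 * because zfs_fid() also encodes the objset id.
                 */
                sb->s_dev = zsb->z_parent->z_sb->s_dev;
        }
}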

Rudd-O
Rohan Puri

Thank you Brian. Will get started on this :)

Turbo Fredriksson
Collaborator

There's a bug in the script:

--- /etc/auto.zfs~ 2011-10-20 23:53:39.000000000 +0200
+++ /etc/auto.zfs 2011-10-20 23:58:58.000000000 +0200
@@ -17,7 +17,7 @@

[ -x $ZFS ] || exit 1

-ZFS="zfs list -rHt snapshot -o name $key"
+ZFS="$ZFS list -rHt snapshot -o name $key"

$ZFS | LC_ALL=C sort -k 1 | \
awk -v key="$key" -v opts="$opts" -- '

Without that, looking for the binary above is pointless...

bziller

Updated the script above, thanks.

Rohan Puri

Hello Brian,

I have pushed the first set of changes to the new 'snapshots' branch. The basic operation, automounting the snapshot, is performed when the snapshot-named dir is accessed through .zfs. I temporarily changed the license to GPL; I will write wrappers for these GPL functions in the SPL and export them so they can be used in the ZFS module. In the upcoming patches I will also use autoconf-defined macros to select which automounting infrastructure to deploy.

There was this change introduced in kernel 2.6.38.

Need your review. :)

Brian Behlendorf
Owner

Sorry, I've been busy with other issues. I'll try to get to this soon so you can keep making progress. One quick comment about how mount is called, though: I would prefer that instead of using the SPL to wrap the GPL-only mount, you use a user mode helper to invoke the mount from user space. There are a couple of existing functions which already do this sort of thing for various reasons; see vdev_elevator_switch() as one example.
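
For reference, a minimal sketch of that usermodehelper approach; the function name is made up, and the snapshot and mountpoint strings are placeholders the real code would build from the dataset name:

#include <linux/kmod.h>

static int
zfs_snap_usermode_mount(const char *snapname, const char *mntpoint)
{
        /* argv must be NULL terminated for call_usermodehelper(). */
        char *argv[] = { "/bin/mount", "-t", "zfs",
            (char *)snapname, (char *)mntpoint, NULL };
        char *envp[] = { "HOME=/",
            "PATH=/sbin:/usr/sbin:/bin:/usr/bin", NULL };

        /*
         * UMH_WAIT_PROC: wait for the mount process to exit so the return
         * code reflects whether the mount actually succeeded.
         */
        return (call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC));
}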

Rohan Puri

It's ok. I looked at the vdev_elevator_switch() function and tried to make use of call_usermodehelper() to do the mount from user space in the zfs_snapshots_dir_mountpoint_follow_link() function in the zpl_snap.c file.

But it's not working. When I try to access the snapshot mountpoint in the .zfs/snapshot dir, e.g. for a snapshot named snap1, I see nested dirs named snap1 being created inside snap1 recursively. Maybe I am missing a termination condition.
Following is the patch, which can be applied directly on top of my last commit on the snapshot branch of my zfs repo.

diff --git a/module/zfs/zpl_snap.c b/module/zfs/zpl_snap.c
index e52e25a..5b76098 100644
--- a/module/zfs/zpl_snap.c
+++ b/module/zfs/zpl_snap.c
@@ -33,6 +33,7 @@
 struct inode *
 zfs_snap_linux_iget(struct super_block *sb, unsigned long ino);
 
+#if 0
 static struct vfsmount *
 zfs_do_automount(struct dentry *mntpt)
 {
@@ -55,9 +56,10 @@ zfs_do_automount(struct dentry *mntpt)
 	kfree(snapname);
 	return mnt;
 }
+#endif
 
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,38)
+//#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,38)
+#if 0
 struct vfsmount *zfs_d_automount(struct path *path)
 {
 	struct vfsmount *newmnt;
@@ -81,12 +83,43 @@ const struct dentry_operations zfs_dentry_ops = {
 	.d_automount = zfs_d_automount,
 };
 
-#else
+#endif /* #if 0 */
+//#else
 
 static void*
 zfs_snapshots_dir_mountpoint_follow_link(struct dentry *dentry,
     struct nameidata *nd)
 {
+	char *argv[] = { "/bin/mount", "-t", "zfs", NULL, NULL };
+	char *envp[] = { NULL };
+	int error;
+	//char *snapname = NULL;
+	char *zfs_fs_name = NULL;
+	zfs_sb_t *zsb = ITOZSB(dentry->d_inode);
+
+	printk("inode->i_ops : %p \n", dentry->d_inode->i_op);
+	ASSERT(dentry->d_parent);
+
+	zfs_fs_name = kzalloc(MAXNAMELEN, KM_SLEEP);
+	dmu_objset_name(zsb->z_os, zfs_fs_name);
+/*	snapname = kzalloc(strlen(zfs_fs_name) +
+	    strlen(mntpt->d_name.name) + 2, KM_SLEEP);
+	snapname = strncpy(snapname, zfs_fs_name, strlen(zfs_fs_name) + 1);
+	snapname = strcat(snapname, "@");
+	snapname = strcat(snapname, mntpt->d_name.name);
+*/
+	argv[3] = kmem_asprintf("%s@%s", zfs_fs_name, dentry->d_name.name);
+	argv[4] = kmem_asprintf("/%s/.zfs/snapshot/%s", zfs_fs_name,
+	    dentry->d_name.name);
+	kfree(zfs_fs_name);
+	printk(" zfs_snap.c : argv[3] : %s\n", argv[3]);
+	printk("zfs_snap.c : argv[4] : %s\n", argv[4]);
+	error = call_usermodehelper(argv[0], argv, envp, 1);
+	ASSERT(!error);
+	strfree(argv[4]);
+	return ERR_PTR(error);
+
+#if 0
 	struct vfsmount *mnt = ERR_PTR(-ENOENT);
 	mnt = zfs_do_automount(dentry);
 	mntget(mnt);
@@ -135,13 +168,14 @@ out_err:
 
 out:
 	return ERR_PTR(rc);
+#endif /* #if 0 */
 }
 
 const struct inode_operations zfs_snapshots_dir_inode_operations = {
 	.follow_link = zfs_snapshots_dir_mountpoint_follow_link,
 };
 
-#endif
+//#endif
 static int
 zfs_snap_dir_readdir(struct file *filp, void *dirent, filldir_t filldir)
 {
@@ -208,7 +242,7 @@ zfs_snap_dir_lookup(struct inode *dir,struct dentry *dentry,
 		return ERR_CAST(ip);
 	}
 	dentry_to_return = d_splice_alias(ip, dentry);
-	d_set_d_op(dentry, &zfs_dentry_ops);
+//	d_set_d_op(dentry, &zfs_dentry_ops);
 	return dentry_to_return;
 }
 
@@ -338,11 +372,11 @@ zfs_snap_linux_iget(struct super_block *sb, unsigned long ino)
 		inode->i_fop = &zfs_snap_dir_file_operations;
 	}
 	else {
-#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,38)
+//#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,38)
 		inode->i_op = &zfs_snapshots_dir_inode_operations;
-#else
-		inode->i_flags |= S_AUTOMOUNT;
-#endif
+//#else
+//		inode->i_flags |= S_AUTOMOUNT;
+//#endif
 		inode->i_fop = &simple_dir_operations;
 	}
 	unlock_new_inode(inode);
@@ -357,7 +391,6 @@ zfs_snap_create(zfs_sb_t *zsb)
 	struct dentry *dentry_ctl_dir = NULL;
 	struct dentry *dentry_snap_dir = NULL;
 
-	printk("in zfs_snap_create \n");
 	ip_ctl_dir = zfs_snap_linux_iget(zsb->z_sb, ZFSCTL_INO_ROOT);
 
 	ASSERT(!IS_ERR(ip_ctl_dir));
Also, the pushed code on my snapshot branch seems to work completely fine except for the promote case.

Rohan Puri

Now, instead of invoking mount from the usermodehelper in follow_link(), I wrote that code in the lookup of the snapshot dir. The recursive problem is now gone, but the mount is not happening; the snapshot directories are being created and are accessible.

I verified the arguments to call_usermodehelper() and they are correct.

Following is the patch:

diff --git a/module/zfs/zpl_snap.c b/module/zfs/zpl_snap.c
index e52e25a..8a55f8e 100644
--- a/module/zfs/zpl_snap.c
+++ b/module/zfs/zpl_snap.c
@@ -33,6 +33,7 @@
 struct inode *
 zfs_snap_linux_iget(struct super_block *sb, unsigned long ino);
 
+#if 0
 static struct vfsmount *
 zfs_do_automount(struct dentry *mntpt)
 {
@@ -55,9 +56,10 @@ zfs_do_automount(struct dentry *mntpt)
 	kfree(snapname);
 	return mnt;
 }
+#endif
 
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,38)
+//#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,38)
+#if 0
 struct vfsmount *zfs_d_automount(struct path *path)
 {
 	struct vfsmount *newmnt;
@@ -81,12 +83,43 @@ const struct dentry_operations zfs_dentry_ops = {
 	.d_automount = zfs_d_automount,
 };
 
-#else
+#endif /* #if 0 */
+//#else
 
 static void*
 zfs_snapshots_dir_mountpoint_follow_link(struct dentry *dentry,
     struct nameidata *nd)
 {
+	char *argv[] = { "/bin/mount", "-t", "zfs", NULL, NULL };
+	char *envp[] = { NULL };
+	int error;
+	//char *snapname = NULL;
+	char *zfs_fs_name = NULL;
+	zfs_sb_t *zsb = ITOZSB(dentry->d_inode);
+
+	printk("inode->i_ops : %p \n", dentry->d_inode->i_op);
+	ASSERT(dentry->d_parent);
+
+	zfs_fs_name = kzalloc(MAXNAMELEN, KM_SLEEP);
+	dmu_objset_name(zsb->z_os, zfs_fs_name);
+/*	snapname = kzalloc(strlen(zfs_fs_name) +
+	    strlen(mntpt->d_name.name) + 2, KM_SLEEP);
+	snapname = strncpy(snapname, zfs_fs_name, strlen(zfs_fs_name) + 1);
+	snapname = strcat(snapname, "@");
+	snapname = strcat(snapname, mntpt->d_name.name);
+*/
+	argv[3] = kmem_asprintf("%s@%s", zfs_fs_name, dentry->d_name.name);
+	argv[4] = kmem_asprintf("/%s/.zfs/snapshot/%s", zfs_fs_name,
+	    dentry->d_name.name);
+	kfree(zfs_fs_name);
+	printk(" zfs_snap.c : argv[3] : %s\n", argv[3]);
+	printk("zfs_snap.c : argv[4] : %s\n", argv[4]);
+	error = call_usermodehelper(argv[0], argv, envp, 1);
+	ASSERT(!error);
+	strfree(argv[4]);
+	return ERR_PTR(error);
+
+#if 0
 	struct vfsmount *mnt = ERR_PTR(-ENOENT);
 	mnt = zfs_do_automount(dentry);
 	mntget(mnt);
@@ -135,13 +168,14 @@ out_err:
 
 out:
 	return ERR_PTR(rc);
+#endif /* #if 0 */
 }
 
 const struct inode_operations zfs_snapshots_dir_inode_operations = {
 	.follow_link = zfs_snapshots_dir_mountpoint_follow_link,
 };
 
-#endif
+//#endif
 static int
 zfs_snap_dir_readdir(struct file *filp, void *dirent, filldir_t filldir)
 {
@@ -195,6 +229,13 @@ zfs_snap_dir_lookup(struct inode *dir,struct dentry *dentry,
 	struct dentry *dentry_to_return = NULL;
 	zfs_sb_t *zsb = ITOZSB(dir);
 
+	char *argv[] = { "/bin/mount", "-t", "zfs", NULL, NULL };
+	char *envp[] = { NULL };
+	int error;
+	//char *snapname = NULL;
+	char *zfs_fs_name = NULL;
+	zfs_sb_t *zsb1 = NULL;
+
 	if (dentry->d_name.len >= MAXNAMELEN) {
 		return ERR_PTR(-ENAMETOOLONG);
 	}
@@ -204,11 +245,28 @@ zfs_snap_dir_lookup(struct inode *dir,struct dentry *dentry,
 	}
 	ip = zfs_snap_linux_iget(zsb->z_sb, ZFSCTL_INO_SHARES - id);
+	zsb1 = ITOZSB(ip);
+
+	printk("inode->i_ops : %p \n", ip->i_op);
+	// ASSERT(dentry->d_parent);
+
+	zfs_fs_name = kzalloc(MAXNAMELEN, KM_SLEEP);
+	dmu_objset_name(zsb1->z_os, zfs_fs_name);
+	dentry_to_return = d_splice_alias(ip, dentry);
+	argv[3] = kmem_asprintf("%s@%s", zfs_fs_name, dentry->d_name.name);
+	argv[4] = kmem_asprintf("/%s/.zfs/snapshot/%s", zfs_fs_name,
+	    dentry->d_name.name);
+	kfree(zfs_fs_name);
+	printk(" zfs_snap.c : argv[3] : %s\n", argv[3]);
+	printk("zfs_snap.c : argv[4] : %s\n", argv[4]);
+	error = call_usermodehelper(argv[0], argv, envp, 1);
+	ASSERT(!error);
+	strfree(argv[4]);
+//	return ERR_PTR(error);
 	if(unlikely(IS_ERR(ip))) {
 		return ERR_CAST(ip);
 	}
-	dentry_to_return = d_splice_alias(ip, dentry);
-	d_set_d_op(dentry, &zfs_dentry_ops);
+//	d_set_d_op(dentry, &zfs_dentry_ops);
 	return dentry_to_return;
 }
 
@@ -338,11 +396,12 @@ zfs_snap_linux_iget(struct super_block *sb, unsigned long ino)
 		inode->i_fop = &zfs_snap_dir_file_operations;
 	}
 	else {
-#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,38)
-		inode->i_op = &zfs_snapshots_dir_inode_operations;
-#else
-		inode->i_flags |= S_AUTOMOUNT;
-#endif
+//#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,38)
+//		inode->i_op = &zfs_snapshots_dir_inode_operations;
+		inode->i_op = &simple_dir_inode_operations;
+//#else
+//		inode->i_flags |= S_AUTOMOUNT;
+//#endif
 		inode->i_fop = &simple_dir_operations;
 	}
 	unlock_new_inode(inode);
@@ -357,7 +416,6 @@ zfs_snap_create(zfs_sb_t *zsb)
 	struct dentry *dentry_ctl_dir = NULL;
 	struct dentry *dentry_snap_dir = NULL;
 
-	printk("in zfs_snap_create \n");
 	ip_ctl_dir = zfs_snap_linux_iget(zsb->z_sb, ZFSCTL_INO_ROOT);
 
 	ASSERT(!IS_ERR(ip_ctl_dir));

Brian Behlendorf
Owner

Hi Rohan, I've started to look at this but it's a little difficult since the changes are scattered in your branch. I'll provide an initial round of high level comments here, but can you please rebase your changes in a new branch as a series of commits against master. Then you can open a pull request and I'll be able to annotate the patch (or patches) with detailed comments inline.

Comments and questions:

  • The branch fails to build because it appears to be missing 'linux/snapshots_automount.h'.

  • Updating the utilities to check /proc/mounts instead of /etc/mtab is reasonable because /proc/mounts will be authoritative. The userspace mount utilities are responsible for updating /etc/mtab, so of course if you're doing mounts from within the kernel /etc/mtab won't be updated. By using a usermodehelper we could keep /etc/mtab up to date, although I don't think we want to for snapshots.

  • Please remember to remove all the stray whitespace which occurs throughout the patch. I've no idea what editor you prefer, but I personally use vim, and most good editors can be configured to automatically mark this sort of thing. For example see:

    http://vim.wikia.com/wiki/Highlight_unwanted_spaces

  • In zfs_domount() you can't move the zfs root inode and dentry allocations earlier in the function without also adjusting their error return paths. It's unsafe to call zfs_umount() until after zfs_sb_setup() is called. Some care needs to be taken with how all this is torn down on error.

  • Why do we need to change these #defines?

-#define        ZFSCTL_INO_ROOT         0x1
-#define        ZFSCTL_INO_SNAPDIR      0x2
-#define        ZFSCTL_INO_SHARES       0x3
+#define        ZFSCTL_INO_ROOT         0xFFFFFFFFFFFFFFFF
+#define        ZFSCTL_INO_SNAPDIR      0xFFFFFFFFFFFFFFFE
+#define        ZFSCTL_INO_SHARES       0xFFFFFFFFFFFFFFFD
  • Why did you need to invert this logic?
-#ifdef HAVE_SNAPSHOT
        zfs_sb_t *zsb = sb->s_fs_info;
 
-       if (zsb && dmu_objset_is_snapshot(zsb->z_os))
+       if (zsb && !dmu_objset_is_snapshot(zsb->z_os))
                zfs_snap_destroy(zsb);
-#endif /* HAVE_SNAPSHOT */
  • As for why your usermodehelper isn't working I'd need to see the error to be sure, but it looks like your argv[] array isn't properly NULL terminated which would cause problems.

  • Please use a zpl_* prefix for the functions in zpl_snap.c. I've tried to maintain this as a clean layer on top of the zfs_* sources with zpl_* functions only calling lower level zfs_* functions. The zfs_* functions must have Solaris like positive errnos returned, and the Linux zpl_* functions will return negative errors for Linux.

  • Use the crgetuid() and crgetgid() helpers from the spl to avoid the need for the #ifdef here.

#ifdef HAVE_CRED_STRUCT
        inode->i_uid = current->cred->uid;
        inode->i_gid = current->cred->gid;
#else
        inode->i_uid = current->uid;
        inode->i_gid = current->gid;
#endif
  • Avoid the following sorts of conditionals which rely on a specific version number. This sort of check is prone to breakage because each distribution cherry-picks patches from upstream. RHEL is the biggest offender; their 2.6.32 kernel doesn't look anything like a vanilla 2.6.32 kernel because it has so much backported to it. You will need to write a specific autoconf check for the functionality you need and add it as config/kernel*.m4.
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,38)
  • This is just a quick start; I'll keep chewing on the actual contents of zpl_snap.c and get back to you. It would be best if I could do that commentary inline with an updated patch and pull request.

Thanks again for working on this.

Rohan Puri

Hello Brian,

I have rebased the changes onto the new branch named "snapshot" for both spl & zfs. You can review this code. I still have to make other changes, namely removal of unwanted spaces and autoconf macro checks instead of kernel version checks. I will do all that and then send you the pull request.

I had updated master with the new changes which you had committed and later merged master into my branch. Because of that, I think, the problem arose. I should have rebased my branch onto master instead. Right?

Thanks for the review.

  1. The branch failed to build because it needs to be compiled against the snapshots branch of spl as well, which contains the snapshots_automount.h file.

  2. In zfs_domount() I agree I need to look at the error cases in more detail, thanks for pointing that out. Basically, before zfs_snap_create() is called the root inode should be allocated; in zfs_snap_create() I need it to allocate the dentry for the .zfs dir, which is a child of the root dir. I will fix these error-case bugs.

  3. The root inode I printed for a dataset is 4. Now when I create dirs/files inside, say, a pool named tank, they get their inode numbers in an incrementing manner; an inode's number is basically the object number of the znode.

This object number is incremented for each file/dir creation: basically we make a call to zfs_mknode(), and this in turn calls dmu_object_alloc(), which does the incrementing.

So suppose a snapshot is created and accessed, so it has a particular inode number (which would be on the lower end of the range); if I continuously create files/dirs, their inode numbers will increase, and soon a case will come where a file/dir wants an inode number that has already been given to a snapshot. This results in a panic.

So the idea is to allocate the inodes for the automounted snapshot dirs from the top of the range in decreasing order (analogous to a stack), starting from 2^64. A sketch of this follows below.
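
For illustration, a sketch of the descending allocation; the three constants mirror the defines quoted in Brian's review above, the helper name is made up, and the offset for snapshot entries (ZFSCTL_INO_SHARES - id, as in the lookup hunk above) is simply the scheme from my patch:

/*
 * Sketch: control directory inodes are carved out from the top of the
 * 64-bit inode space, growing downward, so they can never collide with
 * znode object numbers, which dmu_object_alloc() hands out from the
 * bottom growing upward.
 */
#define	ZFSCTL_INO_ROOT		0xFFFFFFFFFFFFFFFFULL	/* .zfs          */
#define	ZFSCTL_INO_SNAPDIR	0xFFFFFFFFFFFFFFFEULL	/* .zfs/snapshot */
#define	ZFSCTL_INO_SHARES	0xFFFFFFFFFFFFFFFDULL	/* .zfs/shares   */

/* Each snapshot entry dir gets the next number below those. */
static inline uint64_t
zfsctl_snapdir_inode(uint64_t id)
{
        return (ZFSCTL_INO_SHARES - id);
}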

Trace of the panic if I use only the object number:

[32247.146744] root znode inode no : 4
[32262.458387] snapname : snap1 , inode no: 34 
[32262.464803] root znode inode no : 4
[32262.478896] snapname : snap1 , inode no: 34 
[32330.091775] snapname : snap2 , inode no: 38 
[32330.091802] snapname : snap1 , inode no: 34 
[32330.096832] root znode inode no : 4
[32330.108680] snapname : snap2 , inode no: 38 
[32330.108704] snapname : snap1 , inode no: 34 
[32331.272921] snapname : snap2 , inode no: 38 
[32331.272952] snapname : snap1 , inode no: 34 
[32331.285107] snapname : snap2 , inode no: 38 
[32331.285129] snapname : snap1 , inode no: 34 
[32348.152581] snapname : snap3 , inode no: 43 
[32348.152608] snapname : snap2 , inode no: 38 
[32348.152631] snapname : snap1 , inode no: 34 
[32348.157633] root znode inode no : 4
[32348.167320] snapname : snap3 , inode no: 43 
[32348.167343] snapname : snap2 , inode no: 38 
[32348.167362] snapname : snap1 , inode no: 34 
[32348.601668] snapname : snap3 , inode no: 43 
[32348.601694] snapname : snap2 , inode no: 38 
[32348.601720] snapname : snap1 , inode no: 34 
[32348.613351] snapname : snap3 , inode no: 43 
[32348.613383] snapname : snap2 , inode no: 38 
[32348.613405] snapname : snap1 , inode no: 34 
[32349.657005] snapname : snap3 , inode no: 43 
[32349.657034] snapname : snap2 , inode no: 38 
[32349.657058] snapname : snap1 , inode no: 34 
[32349.668230] snapname : snap3 , inode no: 43 
[32349.668252] snapname : snap2 , inode no: 38 
[32349.668272] snapname : snap1 , inode no: 34 
[32367.025889] rohan inode is : 7 
[32374.594891] rohan1 inode is : 8 
[32375.986640] rohan2 inode is : 9 
[32377.602464] rohan3 inode is : 10 
[32378.899405] rohan4 inode is : 11 
[32380.194271] rohan5 inode is : 12 
[32381.762687] rohan6 inode is : 13 
[32383.074447] rohan7 inode is : 14 
[32384.485695] rohan8 inode is : 15 
[32386.066885] rohan9 inode is : 16 
[32387.762166] rohan10 inode is : 17 
[32389.363479] rohan11 inode is : 18 
[32391.016345] rohan12 inode is : 19 
[32394.241993] rohan13 inode is : 20 
[32396.242524] rohan14 inode is : 21 
[32397.699871] rohan15 inode is : 22 
[32399.747182] rohan16 inode is : 23 
[32401.506395] rohan17 inode is : 24 
[32404.898176] rohan18 inode is : 25 
[32406.803612] rohan19 inode is : 26 
[32408.882151] rohan20 inode is : 27 
[32411.682436] rohan21 inode is : 28 
[32414.018510] rohan22 inode is : 29 
[32415.683234] rohan23 inode is : 30 
[32418.177889] rohan24 inode is : 31 
[32420.210657] rohan25 inode is : 32 
[32421.841839] rohan26 inode is : 33 
[32423.411262] ------------[ cut here ]------------
[32423.411284] WARNING: at fs/inode.c:901 unlock_new_inode+0x6c/0x80()
[32423.411295] Hardware name: HP dx2700 MT(RC738AV)
[32423.411302] Modules linked in: zfs(P) zcommon(P) zunicode(P) znvpair(P) zavl(P) splat spl zlib_deflate binfmt_misc snd_hda_codec_realtek i915 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi drm_kms_helper snd_seq_midi_event snd_seq drm ppdev r8169 snd_timer snd_seq_device psmouse usbhid hid snd parport_pc serio_raw i2c_algo_bit video soundcore lp snd_page_alloc parport [last unloaded: spl]
[32423.411423] Pid: 30102, comm: mkdir Tainted: P            3.0.4 #2
[32423.411434] Call Trace:
[32423.411454]  [] warn_slowpath_common+0x7f/0xc0
[32423.411472]  [] warn_slowpath_null+0x1a/0x20
[32423.411512]  [] unlock_new_inode+0x6c/0x80
[32423.411586]  [] zfs_znode_alloc+0x3c9/0x4e0 [zfs]
[32423.411690]  [] zfs_mknode+0x826/0xcd0 [zfs]
[32423.411761]  [] zfs_mkdir+0x44c/0x580 [zfs]
[32423.411854]  [] zpl_mkdir+0x9e/0xf0 [zfs]
[32423.411900]  [] vfs_mkdir+0xa4/0x100
[32423.411917]  [] sys_mkdirat+0x125/0x140
[32423.411941]  [] ? ftrace_call+0x5/0x2b
[32423.411982]  [] sys_mkdir+0x18/0x20
[32423.411994]  [] system_call_fastpath+0x16/0x1b
[32423.412006] ---[ end trace ea4d887a1277cc89 ]---
[32423.412062] general protection fault: 0000 [#1] SMP 
[32423.412169] CPU 0 
[32423.412205] Modules linked in: zfs(P) zcommon(P) zunicode(P) znvpair(P) zavl(P) splat spl zlib_deflate binfmt_misc snd_hda_codec_realtek i915 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi drm_kms_helper snd_seq_midi_event snd_seq drm ppdev r8169 snd_timer snd_seq_device psmouse usbhid hid snd parport_pc serio_raw i2c_algo_bit video soundcore lp snd_page_alloc parport [last unloaded: spl]
[32423.413118] 
[32423.413149] Pid: 30102, comm: mkdir Tainted: P        W   3.0.4 #2 Hewlett-Packard HP dx2700 MT(RC738AV)/0A78h
[32423.413369] RIP: 0010:[]  [] zfs_inode_destroy+0xbd/0xe0 [zfs]
[32423.413565] RSP: 0018:ffff8800065c5948  EFLAGS: 00010286
[32423.413644] RAX: ffff88000edf28f0 RBX: ffff88000edf2918 RCX: dead000000100100
[32423.413749] RDX: dead000000200200 RSI: dead000000200200 RDI: ffff88000d6fd3f8
[32423.413859] RBP: ffff8800065c5968 R08: 8018000000000000 R09: 000edf29400c0000
[32423.413962] R10: ffd320da77be5003 R11: 0000000000000001 R12: ffff88000edf27a0
[32423.414119] R13: ffff88000d6fd000 R14: ffff88000d6fd3f8 R15: 0000000000000000
[32423.414223] FS:  00007f74050ed7c0(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
[32423.414344] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[32423.414435] CR2: 00000000004023a0 CR3: 0000000006506000 CR4: 00000000000006f0
[32423.414539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[32423.414643] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[32423.414750] Process mkdir (pid: 30102, threadinfo ffff8800065c4000, task ffff88007b975bc0)
[32423.414871] Stack:
[32423.414908]  ffff88000edf2918 ffff88000edf2938 ffff88000d3ce000 ffffffffa03fc6e0
[32423.415046]  ffff8800065c5978 ffffffffa03f00be ffff8800065c5998 ffffffff8117b1ec
[32423.415204]  ffff88000edf2918 ffff88000edf2918 ffff8800065c59b8 ffffffff8117b7a6
[32423.415340] Call Trace:
[32423.415436]  [] zpl_inode_destroy+0xe/0x10 [zfs]
[32423.415541]  [] destroy_inode+0x3c/0x70
[32423.415627]  [] evict+0xe6/0x170
[32423.415704]  [] iput+0xea/0x1c0
[32423.415837]  [] zfs_znode_alloc+0x3d1/0x4e0 [zfs]
[32423.416002]  [] zfs_mknode+0x826/0xcd0 [zfs]
[32423.416175]  [] zfs_mkdir+0x44c/0x580 [zfs]
[32423.416344]  [] zpl_mkdir+0x9e/0xf0 [zfs]
[32423.416441]  [] vfs_mkdir+0xa4/0x100
[32423.416529]  [] sys_mkdirat+0x125/0x140
[32423.416617]  [] ? ftrace_call+0x5/0x2b
[32423.416721]  [] sys_mkdir+0x18/0x20
[32423.416805]  [] system_call_fastpath+0x16/0x1b
[32423.416925] Code: b5 f8 03 00 00 4c 89 f7 e8 d1 d2 1e e1 4c 89 e0 49 03 85 e0 03 00 00 48 be 00 02 20 00 00 00 ad de 4c 89 f7 48 8b 08 48 8b 50 08 
[32423.417384]  89 51 08 48 89 0a 48 b9 00 01 10 00 00 00 ad de 48 89 08 48 
[32423.417643] RIP  [] zfs_inode_destroy+0xbd/0xe0 [zfs]
[32423.417807]  RSP 
[32423.487769] ---[ end trace ea4d887a1277cc8a ]---

Regarding dmu_objset_is_snapshot(): it returns true if the objset is a snapshot, and zfs_snap_destroy() should be called only for zfs filesystems, since it cleans up the .zfs & .zfs/snapshot dirs, which exist only for normal filesystems and not for snapshots; hence the inversion.

I will take a closer look at the usermodehelper.

I will fix the issues related to naming conventions in zpl_snap.c.

The cred uid & gid issue is fixed.

I will make the checks using autoconf.

I need to make more changes to the patches per all your comments; I will make them ASAP so that I can send the pull request.

Thank you for helping and reviewing. :)

Gunnar Beutner
Collaborator

Hi,

I had to apply the following patch to make 'ls -l' in dataset root dirs work:

diff --git a/module/zfs/zfs_vnops.c b/module/zfs/zfs_vnops.c
index f2f3af4..1c98a23 100644
--- a/module/zfs/zfs_vnops.c
+++ b/module/zfs/zfs_vnops.c
@@ -2035,7 +2035,7 @@ zfs_readdir(struct inode *ip, void *dirent, filldir_t filldir,
                        dmu_prefetch(os, objnum, 0, 0);
                }

-               if (*pos >= 2) {
+               if (*pos > 2 || (*pos == 2 && !zfs_show_ctldir(zp))) {
                        zap_cursor_advance(&zc);
                        *pos = zap_cursor_serialize(&zc);
                } else {

Regards,
Gunnar

Brian Behlendorf
Owner

Thanks for the clarifications Rohan, I'm looking forward to taking a look at an updated patch rebased on master.

Rohan Puri

Fixed the trailing whitespace errors.

I am having some issues with the autoconf check for whether dentry_operations has the d_automount function pointer or not. I defined a new file in the config dir and also included it in almost all of the Makefile.in files, but the check's message is never printed.

My kernel-d-op-auto.m4 file is as follows:

dnl #
dnl # 2.6.38 API change
dnl # Dentry automount operation available
dnl #
AC_DEFUN([ZFS_AC_KERNEL_D_OP_AUTO], [
        AC_MSG_CHECKING([whether dentry_operations has d_automount])
        ZFS_LINUX_TRY_COMPILE([
                #include <linux/dcache.h>
        ],[
                struct dentry_operations d_op __attribute__ ((unused));
                d_op.d_automount = NULL;
        ],[
                AC_MSG_RESULT(yes)
                AC_DEFINE(HAVE_D_AUTO, 1, [struct dentry_operations has d_automount])
        ],[
                AC_MSG_RESULT(no)
        ])
])

I will send you the pull request with everything else fixed.

Brian Behlendorf behlendorf referenced this issue from a commit in behlendorf/zfs November 11, 2011
Rohan Puri Add .zfs control directory
Brian Behlendorf
Owner

Alright Rohan (and others), sorry about the long delay. I've got an updated version of your patch for you to look at. I've taken your initial implementation and run with it a bit further. It's still not 100% complete, but I was hoping you could help with that!

I wanted to stay as close to the Solaris implementation as was reasonable so I basically took the zfs_ctldir.[ch] files from Solaris and dropped them in to the tree. I then went through and removed all the HAVE_SNAPSHOT defines from the ZoL code. These are all the integration points which the Solaris version expects. I also readded some snapshot code from the original Solaris version which accidentally got dropped from ZoL. Finally, I updated the zfs_ctldir.c and zpl_ctldir.c files based on your initial implementation and the Solaris implementation.

You can see the updated version here:

https://github.com/behlendorf/zfs/tree/snapshot
behlendorf@6e2c435

This is still a work in progress, I would say it's roughly 90% complete but it needs additional testing, verification, and some more cleanup. However, it's in good enough shape to start getting some feedback on. I've done testing with RHEL6.2 where everything works, but there are a few lingering issues:

  • The .zfs/snapshot directory automount support requires a 2.6.37 or newer kernel. The exception is RHEL6.2 which has backported the d_automount patches. Support for older kernels is possible with follow_link() but it needs to be done carefully. Getting all the reference counting right and avoiding GPL-only functions will be tricky.

  • Support for mkdir/rmdir/mv has been implemented in the .zfs/snapshot directory just like Solaris (although I've never seen this functionality documented anywhere; I only learned about it by reading the source). This functionality is only available to root until zfs delegations are finished.

    • mkdir - create a snapshot
    • rmdir - destroy a snapshot
    • mv - rename a snapshot
  • The .zfs/shares directory is created but none of the smb functionality is implemented.

  • No testing via nfs over zfs has yet been done. However, I did lay the ground work such that we should be able to traverse in to the .zfs directory successfully. It's mainly a matter of trying it and fixing the problems.

  • It's currently unsafe to manually unmount an automounted snapshot before it expires. This should be an unlikely event since the mounts are suppressed from /etc/mtab but we still need to handle it safely. My feeling is the best way to go about this will be to rework the unmount path so that the zfs_umount() call for a snapshot removes the matching entry from the parent z_ctldir_snaps tree. This should allow for additional cleanup of the tree insertion/removal code since all removals manual and automatic will use the same call path.

  • As for cleanup, I don't care for having to set MNT_SHRINKABLE in zpl_getattr(). We should be able to achieve this in the mount function by using follow_down() to get the new vfsmount. We could then correctly return this instead of signaling a race by returning NULL. However, getting the reference counting right might be tricky. This could also be done when the code is refactored to support follow_link().

Turbo Fredriksson
Collaborator

I wanted to stay as close to the Solaris implementation as was reasonable so I basically took the zfs_ctldir.[ch] files from Solaris and dropped them in to the tree. I then went through and removed all the HAVE_SNAPSHOT defines from the ZoL code.

Sometimes I wonder why we bother... :)

Brian Behlendorf
Owner

It does make cherry-picking fixes and features from Illumos easier. But you're right, sometimes it's not worth the pain.

Turbo Fredriksson
Collaborator

I rather meant why we bother :). You'll throw away most of it and take it from upstream/Solaris anyway :). Gunnar is doing the same thing with my iSCSI fixes.. He kept maybe... 10-15% of what I wrote and replaced it with stuff from ZoS...

Turbo Fredriksson
Collaborator

NOT that that's a bad thing, but while you're at it, just take all of it since you're familiar with the code anyway. I have a huge problem trying to understand the code (it jumps back and forth, to somewhere else - pure spaghetti :).

Brian Behlendorf
Owner

Please keep trying! We need the help! But it's mostly just a matter of consistency and I admit those rules really aren't written down anywhere. They should be, if I ever find a spare day or two I should write down basic contributor guidelines. I'd prefer not to have to tweak patches. :) Honestly, I prefer to just provide feedback on most patches and iterate with the original author. That's at least an approach that scales.

Rohan Puri

Thanks Brian, I will go through the patch and give my comments ASAP.

Rohan Puri

Just one word, superb!!!

Brian Behlendorf
Owner

Thanks, but it's not quite ready to be merged yet. I'd like to at least try and get the follow_link() support added for pre-2.6.37 kernels. Plus some basic NFS export testing needs to happen to see if that support works at all. Do you have any time to help finalize the patch?

Rohan Puri

Yes sir, I always have time to work on ZFS :) Tell me what to do. Also, this code is on your fork of zfsonlinux's zfs. I have also forked the repo from zfsonlinux, so how do I get these changes of yours merged into my repo so I can get started with it?

Brian Behlendorf
Owner

@rohan-puri Could you look into adding the follow_link() support? This may or may not be possible; I haven't spent the time reading the kernel code for how this hack was previously implemented. You'll probably want to refactor the mount helper code a little bit too so it can be used by both d_automount and follow_link. Additionally, finding a clean way to resolve the manual umount issue needs to happen before this can be merged.

I've pushed the snapshot branch to zfsonlinux so go ahead and branch off of that. I believe you'll be able to open pull request with your proposed additions.


Rohan Puri

Sorry Brian, for the delayed response. I will look into the follow_link() stuff this weekend. Will update you on this ASAP. Thanks for the patience :)

LVLAaron

I'm glad you guys are working on this. I was scratching my head for quite a while trying to find .zfs :)

Turbo Fredriksson
Collaborator

Has anyone else had a problem destroying snapshots after this patch? This is not my only patch, so I'm trying to figure out which one is the problem.

My thread on the list
http://thread.gmane.org/gmane.linux.file-systems.zfs.user/2118

Turbo Fredriksson
Collaborator

So it's definitely this patch; I removed all the others and first tried Rohan's and then Brian's.

It seems that the difference between Rohan's patch and Brian's additions is that with Brian's code, I get:

debianzfs:~# zfs destroy -r share/home/datorer
cannot destroy 'share/home/datorer@20111021': dataset does not exist

but with Rohan's:

debianzfs:~# zfs destroy -r share/home/datorer
cannot destroy 'share/home/datorer': dataset is busy

Destroying a filesystem that does NOT have any snapshots works, even if it has children:

debianzfs:~# zfs create share/test4
debianzfs:~# zfs create share/test4/test4b
debianzfs:~# zfs destroy share/test4
cannot destroy 'share/test4': filesystem has children
use '-r' to destroy the following datasets:
share/test4/test4b
debianzfs:~# zfs destroy -r share/test4
debianzfs:~#

This happens with both patches. However, with Rohan's patch, I CAN delete the snapshot manually, IF I use -r:

debianzfs:~# zfs create share/tests/destroy
debianzfs:~# zfs snapshot share/tests/destroy@20120205
debianzfs:~# zfs destroy share/tests/destroy
cannot destroy 'share/tests/destroy': filesystem has children
use '-r' to destroy the following datasets:
share/tests/destroy@20120205
debianzfs:~# zfs destroy -r share/tests/destroy
cannot destroy 'share/tests/destroy': dataset is busy
debianzfs:~# zfs destroy share/tests/destroy@20120205
debianzfs:~# zfs list -t filesystem,snapshot | grep tests/destroy
share/tests/destroy 38.6K 14.3T 38.6K /share/tests/destroy
share/tests/destroy@20120205 0 - 38.6K -
debianzfs:~# zfs destroy -r share/tests/destroy@20120205
debianzfs:~# zfs list -t filesystem,snapshot | grep tests/destroy
share/tests/destroy 38.6K 14.3T 38.6K /share/tests/destroy
debianzfs:~# zfs destroy share/tests/destroy
debianzfs:~#

So it seems (from command #5) that everything went ok, but the list shows that it was never actually destroyed. However, adding the -r flag when destroying the snapshot (command #7) is successful.

This I can't do with Brian's patch. There, it simply does not exist:

debianzfs:~# zfs create share/tests/destroy
debianzfs:~# zfs snapshot share/tests/destroy@20120205
debianzfs:~# zfs destroy -r share/tests/destroy
cannot destroy 'share/tests/destroy@20120205': dataset does not exist
debianzfs:~# zfs destroy share/tests/destroy@20120205
cannot destroy 'share/tests/destroy@20120205': dataset does not exist

PS. This on a 3.1.6 kernel.

Andrew Barnes b333z referenced this issue from a commit in b333z/zfs February 07, 2012
Andrew Barnes Fix destroy snapshots and race on zfs diff
Fixes "dataset not found" error on zfs destory <snapshot> see #173.
Fixes race in dsl_dataset_user_release_tmp() when the temp snapshot
from zfs diff dataset@snap command is used see #481.
2d69db2
Andrew Barnes

I think I have found what was causing the "dataset does not exist" issue and have submitted a pull request.

Brian Behlendorf
Owner

Refreshed version of the patch, which includes the zfs diff fix. Once we sort out any .zfs/snapshot issues over NFS this is ready to be merged. behlendorf/zfs@f34cd0a

Andrew Barnes

Am doing some testing on accessing snapshots over NFS, but currently am unable to access the .zfs directory from the client, after: zfs set snapdir=visible system, I can now see the .zfs directory on the NFS client, but attempting to access it gives: ls: cannot open directory /system/.zfs: No such file or directory...

So far it looks to be some issue looking up the attrs on the .zfs inode? I'm seeing this from the NFS client: nfs_revalidate_inode: (0:17/-1) getattr failed, error=-2

Below are some debug messages from NFS on both sides and some output from a systemtap script I'm using to explore the issue. I'm a bit stuck on where to go from here... Does anyone have this working, or have any suggestions for directions I can take to further debug the issue?

  • Client:
# cat /etc/fstab | grep system
192.168.128.15:/system  /system         nfs             nfsvers=3,tcp,rw,noauto 0 0

# umount /system; mount /system; ls -la /system
total 39
drwxr-xr-x  3 root root    7 Feb 10 14:32 .
drwxr-xr-x 23 root root 4096 Dec 17 04:20 ..
dr-xr-xr-x  1 root root    0 Jan  1  1970 .zfs
-rw-r--r--  1 root root    5 Jan  5 12:21 file45
drwxr-xr-x  2 root root    7 Dec  6 01:37 logrotate.d
-rw-r--r--  1 root root    5 Dec  6 01:33 test.txt
-rw-r--r--  1 root root   10 Dec  6 01:40 test2.txt
-rw-r--r--  1 root root    8 Feb 10 14:32 testet.etet


# tail -f /var/log/messages &
# rpcdebug -m nfs -s all
# ls -la /system/.zfs
ls: cannot open directory /system/.zfs: No such file or directory
Feb 11 18:06:27 b13 kernel: [60889.473116] NFS: permission(0:17/4), mask=0x1, res=0
Feb 11 18:06:27 b13 kernel: [60889.473133] NFS: nfs_lookup_revalidate(/.zfs) is valid
Feb 11 18:06:27 b13 kernel: [60889.473144] NFS: dentry_delete(/.zfs, c018)
Feb 11 18:06:27 b13 kernel: [60889.473169] NFS: permission(0:17/4), mask=0x1, res=0
Feb 11 18:06:27 b13 kernel: [60889.473176] NFS: nfs_lookup_revalidate(/.zfs) is valid
Feb 11 18:06:27 b13 kernel: [60889.473185] NFS: dentry_delete(/.zfs, c018)
Feb 11 18:06:27 b13 kernel: [60889.474539] NFS: permission(0:17/4), mask=0x1, res=0
Feb 11 18:06:27 b13 kernel: [60889.474549] NFS: revalidating (0:17/-1)
Feb 11 18:06:27 b13 kernel: [60889.474555] NFS call  getattr
Feb 11 18:06:27 b13 kernel: [60889.476370] NFS reply getattr: -2
Feb 11 18:06:27 b13 kernel: [60889.476379] nfs_revalidate_inode: (0:17/-1) getattr failed, error=-2
Feb 11 18:06:27 b13 kernel: [60889.476391] NFS: nfs_lookup_revalidate(/.zfs) is invalid
Feb 11 18:06:27 b13 kernel: [60889.476398] NFS: dentry_delete(/.zfs, c018)
Feb 11 18:06:27 b13 kernel: [60889.476409] NFS: lookup(/.zfs)
Feb 11 18:06:27 b13 kernel: [60889.476415] NFS call  lookup .zfs
Feb 11 18:06:27 b13 kernel: [60889.477680] NFS: nfs_update_inode(0:17/4 ct=2 info=0x7e7f)
Feb 11 18:06:27 b13 kernel: [60889.477689] NFS reply lookup: 0
Feb 11 18:06:27 b13 kernel: [60889.477699] NFS: nfs_update_inode(0:17/-1 ct=1 info=0x7e7f)
Feb 11 18:06:27 b13 kernel: [60889.477704] NFS: nfs_fhget(0:17/-1 ct=1)
Feb 11 18:06:27 b13 kernel: [60889.477719] NFS call  access
Feb 11 18:06:27 b13 kernel: [60889.478948] NFS reply access: -2
Feb 11 18:06:27 b13 kernel: [60889.478988] NFS: permission(0:17/-1), mask=0x24, res=-2
Feb 11 18:06:27 b13 kernel: [60889.478995] NFS: dentry_delete(/.zfs, c010)
  • Server:
# rpcdebug -m nfsd -s all
# tail -f /var/log/messages &
Feb 11 18:11:40 b15 kernel: [61924.682697] nfsd_dispatch: vers 3 proc 4
Feb 11 18:11:40 b15 kernel: [61924.682712] nfsd: ACCESS(3)   8: 00010001 00000064 00000000 00000000 00000000 00000000 0x1f
Feb 11 18:11:40 b15 kernel: [61924.682724] nfsd: fh_verify(8: 00010001 00000064 00000000 00000000 00000000 00000000)
Feb 11 18:11:40 b15 kernel: [61924.684475] nfsd_dispatch: vers 3 proc 1
Feb 11 18:11:40 b15 kernel: [61924.684490] nfsd: GETATTR(3)  20: 01010001 00000064 ffff000a ffffffff 00000000 00000000
Feb 11 18:11:40 b15 kernel: [61924.684501] nfsd: fh_verify(20: 01010001 00000064 ffff000a ffffffff 00000000 00000000)

     0 nfsd(12660):->zfsctl_is_node ip=0xffff880037ca0e48
    19 nfsd(12660):<-zfsctl_is_node return=0x1
     0 nfsd(12660):->zfsctl_fid ip=0xffff880037ca0e48 fidp=0xffff880078817154
    79 nfsd(12660): ->zfsctl_fid ip=0xffff880037ca0e48 fidp=0xffff880078817154
    93 nfsd(12660): <-zfsctl_fid return=0x0
   101 nfsd(12660):<-zfsctl_fid return=0x0
        zpl_ctldir.c:136 error=?
     0 nfsd(12660):->zpl_root_getattr mnt=0xffff880075dc4e00 dentry=0xffff88007b781000 stat=0xffff88007305bd30
    14 nfsd(12660): ->zpl_root_getattr mnt=0xffff880075dc4e00 dentry=0xffff88007b781000 stat=0xffff88007305bd30
        zpl_ctldir.c:139 error=?
    74 nfsd(12660):  ->simple_getattr mnt=0xffff880075dc4e00 dentry=0xffff88007b781000 stat=0xffff88007305bd30
    83 nfsd(12660):  <-simple_getattr return=0x0
        zpl_ctldir.c:140 error=0x0
        zpl_ctldir.c:143 error=0x0
        zpl_ctldir.c:143 error=0x0
   168 nfsd(12660): <-zpl_root_getattr return=0x0
   175 nfsd(12660):<-zpl_root_getattr return=0x0
Andrew Barnes

I think I may have tracked down the issue with not being able to do ls -la /system/.zfs.

The path is something like:

getattr -> nfs client -> nfs server -> nfsd3_proc_getattr -> zpl_fh_to_dentry -> zfs_vget

In zfs_vget() an attempt is made to retrieve a znode, but as this is a control directory it doesn't have a backing znode, so a normal lookup should not be done. This condition is identified in zfs_vget() here:

        /* A zero fid_gen means we are in the .zfs control directories */
        if (fid_gen == 0 &&
            (object == ZFSCTL_INO_ROOT || object == ZFSCTL_INO_SNAPDIR)) {

            ...

            ZFS_EXIT(zsb);
            return (0);
        }

But from my traces I found this condition was not triggered.

I did a trace of the locals in zfs_vget() and got the following:

       157 nfsd(3367):   zfs_vfsops.c:1293 zsb=? zp=0xffff88007c137c20 object=? fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       166 nfsd(3367):   zfs_vfsops.c:1294 zsb=? zp=0x286 object=? fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       174 nfsd(3367):   zfs_vfsops.c:1304 zsb=? zp=0x286 object=? fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       183 nfsd(3367):   zfs_vfsops.c:1302 zsb=0xffff88007b29c000 zp=0x286 object=? fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       192 nfsd(3367):   zfs_vfsops.c:1306 zsb=0xffff88007b29c000 zp=0x286 object=? fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       206 nfsd(3367):   zfs_vfsops.c:1326 zsb=0xffff88007b29c000 zp=0x286 object=? fid_gen=? gen_mask=? zp_gen=? i=? err=0x11270000 __func__=[...]
       216 nfsd(3367):   zfs_vfsops.c:1330 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0x0 fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       229 nfsd(3367):   zfs_vfsops.c:1329 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       243 nfsd(3367):   zfs_vfsops.c:1330 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       252 nfsd(3367):   zfs_vfsops.c:1329 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       261 nfsd(3367):   zfs_vfsops.c:1330 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       270 nfsd(3367):   zfs_vfsops.c:1329 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       279 nfsd(3367):   zfs_vfsops.c:1330 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       288 nfsd(3367):   zfs_vfsops.c:1329 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       297 nfsd(3367):   zfs_vfsops.c:1330 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       306 nfsd(3367):   zfs_vfsops.c:1329 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       316 nfsd(3367):   zfs_vfsops.c:1330 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       325 nfsd(3367):   zfs_vfsops.c:1329 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=? gen_mask=? zp_gen=? i=? err=? __func__=[...]
       335 nfsd(3367):   zfs_vfsops.c:1333 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       348 nfsd(3367):   zfs_vfsops.c:1332 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       362 nfsd(3367):   zfs_vfsops.c:1333 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       372 nfsd(3367):   zfs_vfsops.c:1332 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       381 nfsd(3367):   zfs_vfsops.c:1333 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       390 nfsd(3367):   zfs_vfsops.c:1332 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       399 nfsd(3367):   zfs_vfsops.c:1333 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       409 nfsd(3367):   zfs_vfsops.c:1332 zfid=? zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       418 nfsd(3367):   zfs_vfsops.c:1340 zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       432 nfsd(3367):   zfs_vfsops.c:1341 zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       441 nfsd(3367):   zfs_vfsops.c:1356 zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       459 nfsd(3367):   zfs_vfsops.c:1357 zsb=0xffff88007b29c000 zp=0x286 object=0xffffffffffff fid_gen=0x0 gen_mask=? zp_gen=? i=? err=? __func__=[...]
       550 nfsd(3367):   zfs_vfsops.c:1379 zsb=0xffff88007b29c000 zp=0x0 object=? fid_gen=0x0 gen_mask=? zp_gen=0xffffffffa0df2b34 i=? err=0x2 __func__=[...]
       559 nfsd(3367):   zpl_export.c:93 fid=? ip=? len_bytes=? rc=?
       572 nfsd(3367):   zpl_export.c:94 fid=? ip=? len_bytes=? rc=0x2
       585 nfsd(3367):   zpl_export.c:99 fid=? ip=0x0 len_bytes=? rc=?

Did a printk to confirm:

printk("zfs: %d == 0 && ( %llx == %llx || %llx == %llx )",
    fid_gen, object, ZFSCTL_INO_ROOT, object, ZFSCTL_INO_SNAPDIR);

[  204.649576] zfs: 0 == 0 && ( ffffffffffff == ffffffffffffffff || ffffffffffff == fffffffffffffffd )

So it looks like zlfid->zf_setid is not long enough, or perhaps it's been truncated by NFSv3, but either way object ends up with not enough f's (ffffffffffff), so it doesn't match ZFSCTL_INO_ROOT.
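
To make the truncation concrete, here is a small standalone sketch (my own illustration, not ZFS code) of the 6-byte object packing/unpacking that the file handle code in zfs_vfsops.c performs (obj[i] = obj >> (8 * i)); anything above 48 bits comes back as 0xffffffffffff:

/*
 * Sketch only: mimic the 6-byte zf_object packing in the file handle
 * and the matching unpack loop to show why a 64-bit control inode
 * number loses its top 16 bits.
 */
#include <stdio.h>
#include <stdint.h>

int
main(void)
{
    uint64_t ino = 0xFFFFFFFFFFFFFFFFULL;   /* e.g. ZFSCTL_INO_ROOT */
    uint8_t zf_object[6];
    uint64_t object = 0;
    int i;

    for (i = 0; i < 6; i++)                 /* pack: obj[i] = obj >> (8 * i) */
        zf_object[i] = (uint8_t)(ino >> (8 * i));

    for (i = 0; i < 6; i++)                 /* unpack, zfs_vget() style */
        object |= (uint64_t)zf_object[i] << (8 * i);

    /* prints: 0xffffffffffffffff -> 0xffffffffffff */
    printf("0x%llx -> 0x%llx\n",
        (unsigned long long)ino, (unsigned long long)object);
    return (0);
}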

As a test I adjusted the values for the control inode defines:

        diff --git a/include/sys/zfs_ctldir.h b/include/sys/zfs_ctldir.h
        index 5546aa7..46e7353 100644
        --- a/include/sys/zfs_ctldir.h
        +++ b/include/sys/zfs_ctldir.h
        @@ -105,10 +105,10 @@ extern void zfsctl_fini(void);
          * because these inode numbers are never stored on disk we can safely
          * redefine them as needed in the future.
          */
        -#define        ZFSCTL_INO_ROOT         0xFFFFFFFFFFFFFFFF
        -#define        ZFSCTL_INO_SHARES       0xFFFFFFFFFFFFFFFE
        -#define        ZFSCTL_INO_SNAPDIR      0xFFFFFFFFFFFFFFFD
        -#define        ZFSCTL_INO_SNAPDIRS     0xFFFFFFFFFFFFFFFC
        +#define        ZFSCTL_INO_ROOT         0xFFFFFFFFFFFF
        +#define        ZFSCTL_INO_SHARES       0xFFFFFFFFFFFE
        +#define        ZFSCTL_INO_SNAPDIR      0xFFFFFFFFFFFD
        +#define        ZFSCTL_INO_SNAPDIRS     0xFFFFFFFFFFFC

I then tested, and I am now able to traverse the .zfs directory and see the shares and snapshot dirs inside.

I noticed that other ZFS implementations use low values for the control dir inodes. Have we gone too large on these, or is this a limitation in NFSv3 itself? I'm wondering how best to deal with it.

Andrew Barnes

Now that I can get into the .zfs directory, trying to cd to the snapshot dir causes cd to hang and I get the following in dmesg (slowly getting there!):

        [  563.905750] ------------[ cut here ]------------
        [  563.905768] WARNING: at fs/inode.c:901 unlock_new_inode+0x31/0x53()
        [  563.905776] Hardware name: Bochs
        [  563.905780] Modules linked in: zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl scsi_wait_scan
        [  563.905805] Pid: 3358, comm: nfsd Tainted: P            3.0.6-gentoo #1
        [  563.905811] Call Trace:
        [  563.905843]  [<ffffffff81071a5d>] warn_slowpath_common+0x85/0x9d
        [  563.905853]  [<ffffffff81071a8f>] warn_slowpath_null+0x1a/0x1c
        [  563.905861]  [<ffffffff81141701>] unlock_new_inode+0x31/0x53
        [  563.905902]  [<ffffffffa046c14a>] snapentry_compare+0xcb/0x12f [zfs]
        [  563.905937]  [<ffffffffa046c44a>] zfsctl_root_lookup+0xc3/0x123 [zfs]
        [  563.905967]  [<ffffffffa047bd25>] zfs_vget+0x1f6/0x3e4 [zfs]
        [  563.905988]  [<ffffffff817475ce>] ? seconds_since_boot+0x1b/0x21
        [  563.905996]  [<ffffffff81748d17>] ? cache_check+0x57/0x2d0
        [  563.906021]  [<ffffffffa0491d93>] zpl_snapdir_rename+0x11e/0x455 [zfs]
        [  563.906052]  [<ffffffff811f160c>] exportfs_decode_fh+0x56/0x21e
        [  563.906060]  [<ffffffff811f4690>] ? fh_compose+0x367/0x367
        [  563.906079]  [<ffffffff813a370f>] ? selinux_cred_prepare+0x1f/0x36
        [  563.906094]  [<ffffffff8112a2ad>] ? __kmalloc_track_caller+0xee/0x101
        [  563.906103]  [<ffffffff813a370f>] ? selinux_cred_prepare+0x1f/0x36
        [  563.906112]  [<ffffffff811f4a58>] fh_verify+0x299/0x4d9
        [  563.906121]  [<ffffffff817475ce>] ? seconds_since_boot+0x1b/0x21
        [  563.906129]  [<ffffffff8174935b>] ? sunrpc_cache_lookup+0x146/0x16d
        [  563.906137]  [<ffffffff811f4f4c>] nfsd_access+0x2d/0xfa
        [  563.906145]  [<ffffffff81748f73>] ? cache_check+0x2b3/0x2d0
        [  563.906154]  [<ffffffff811fc469>] nfsd3_proc_access+0x75/0x80
        [  563.906164]  [<ffffffff811f1afd>] nfsd_dispatch+0xf1/0x1d5
        [  563.906172]  [<ffffffff817400b2>] svc_process+0x45e/0x665
        [  563.906181]  [<ffffffff811f1fa9>] ? nfsd_svc+0x170/0x170
        [  563.906190]  [<ffffffff811f209f>] nfsd+0xf6/0x13a
        [  563.906198]  [<ffffffff811f1fa9>] ? nfsd_svc+0x170/0x170
        [  563.906206]  [<ffffffff8108d357>] kthread+0x82/0x8a
        [  563.906216]  [<ffffffff817ece24>] kernel_thread_helper+0x4/0x10
        [  563.906225]  [<ffffffff8108d2d5>] ? kthread_worker_fn+0x158/0x158
        [  563.906233]  [<ffffffff817ece20>] ? gs_change+0x13/0x13
        [  563.906240] ---[ end trace c8c4cba0e76b487f ]---
        [  563.906272] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
        [  563.911695] IP: [<ffffffffa048819a>] zfs_inode_destroy+0x72/0xd1 [zfs]
        [  563.913129] PGD 780fc067 PUD 77074067 PMD 0 
        [  563.913857] Oops: 0002 [#1] SMP 
        [  563.914487] CPU 1 
        [  563.914565] Modules linked in: zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl scsi_wait_scan
        [  563.915111] 
        [  563.915111] Pid: 3358, comm: nfsd Tainted: P        W   3.0.6-gentoo #1 Bochs Bochs
        [  563.915111] RIP: 0010:[<ffffffffa048819a>]  [<ffffffffa048819a>] zfs_inode_destroy+0x72/0xd1 [zfs]
        [  563.915111] RSP: 0018:ffff8800704779c0  EFLAGS: 00010282
        [  563.915111] RAX: ffff880074945020 RBX: ffff880074945048 RCX: 0000000000000000
        [  563.915111] RDX: 0000000000000000 RSI: 0000000000014130 RDI: ffff880071b763e0
        [  563.915111] RBP: ffff8800704779e0 R08: ffffffff813a3e91 R09: 0000000000000000
        [  563.915111] R10: dead000000200200 R11: dead000000100100 R12: ffff880071b76000
        [  563.915111] R13: ffff880074944ea0 R14: ffff880071b763e0 R15: ffffffffa049d200
        [  563.915111] FS:  00007f8d978a2700(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
        [  563.915111] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        [  563.915111] CR2: 0000000000000008 CR3: 0000000079577000 CR4: 00000000000006e0
        [  563.915111] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [  563.915111] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        [  563.915111] Process nfsd (pid: 3358, threadinfo ffff880070476000, task ffff88007caaf800)
        [  563.915111] Stack:
        [  563.915111]  ffff880074945048 ffff880074945068 ffffffffa049dba0 ffff880074944ea0
        [  563.915111]  ffff8800704779f0 ffffffffa0492cbd ffff880070477a10 ffffffff81141df2
        [  563.915111]  ffff880074945048 ffff880074945048 ffff880070477a30 ffffffff81142370
        [  563.915111] Call Trace:
        [  563.915111]  [<ffffffffa0492cbd>] zpl_vap_init+0x525/0x59c [zfs]
        [  563.915111]  [<ffffffff81141df2>] destroy_inode+0x40/0x5a
        [  563.915111]  [<ffffffff81142370>] evict+0x130/0x135
        [  563.915111]  [<ffffffff81142788>] iput+0x173/0x17b
        [  563.915111]  [<ffffffffa046c155>] snapentry_compare+0xd6/0x12f [zfs]
        [  563.915111]  [<ffffffffa046c44a>] zfsctl_root_lookup+0xc3/0x123 [zfs]
        [  563.915111]  [<ffffffffa047bd25>] zfs_vget+0x1f6/0x3e4 [zfs]
        [  563.915111]  [<ffffffff817475ce>] ? seconds_since_boot+0x1b/0x21
        [  563.915111]  [<ffffffff81748d17>] ? cache_check+0x57/0x2d0
        [  563.915111]  [<ffffffffa0491d93>] zpl_snapdir_rename+0x11e/0x455 [zfs]
        [  563.915111]  [<ffffffff811f160c>] exportfs_decode_fh+0x56/0x21e
        [  563.915111]  [<ffffffff811f4690>] ? fh_compose+0x367/0x367
        [  563.915111]  [<ffffffff813a370f>] ? selinux_cred_prepare+0x1f/0x36
        [  563.915111]  [<ffffffff8112a2ad>] ? __kmalloc_track_caller+0xee/0x101
        [  563.915111]  [<ffffffff813a370f>] ? selinux_cred_prepare+0x1f/0x36
        [  563.915111]  [<ffffffff811f4a58>] fh_verify+0x299/0x4d9
        [  563.915111]  [<ffffffff817475ce>] ? seconds_since_boot+0x1b/0x21
        [  563.915111]  [<ffffffff8174935b>] ? sunrpc_cache_lookup+0x146/0x16d
        [  563.915111]  [<ffffffff811f4f4c>] nfsd_access+0x2d/0xfa
        [  563.915111]  [<ffffffff81748f73>] ? cache_check+0x2b3/0x2d0
        [  563.915111]  [<ffffffff811fc469>] nfsd3_proc_access+0x75/0x80
        [  563.915111]  [<ffffffff811f1afd>] nfsd_dispatch+0xf1/0x1d5
        [  563.915111]  [<ffffffff817400b2>] svc_process+0x45e/0x665
        [  563.915111]  [<ffffffff811f1fa9>] ? nfsd_svc+0x170/0x170
        [  563.915111]  [<ffffffff811f209f>] nfsd+0xf6/0x13a
        [  563.915111]  [<ffffffff811f1fa9>] ? nfsd_svc+0x170/0x170
        [  563.915111]  [<ffffffff8108d357>] kthread+0x82/0x8a
        [  563.915111]  [<ffffffff817ece24>] kernel_thread_helper+0x4/0x10
        [  563.915111]  [<ffffffff8108d2d5>] ? kthread_worker_fn+0x158/0x158
        [  563.915111]  [<ffffffff817ece20>] ? gs_change+0x13/0x13
        [  563.915111] Code: 35 e1 4c 89 e8 49 03 84 24 c0 03 00 00 49 bb 00 01 10 00 00 00 ad de 49 ba 00 02 20 00 00 00 ad de 4c 89
         f7 48 8b 08 48 8b 50 08 
        [  563.915111]  89 51 08 48 89 0a 4c 89 18 4c 89 50 08 49 ff 8c 24 d8 03 00 
        [  563.915111] RIP  [<ffffffffa048819a>] zfs_inode_destroy+0x72/0xd1 [zfs]
        [  563.915111]  RSP <ffff8800704779c0>
        [  563.915111] CR2: 0000000000000008
        [  563.966671] ---[ end trace c8c4cba0e76b4880 ]---
Brian Behlendorf
Owner

@b333z Nice job. Yes, you're exactly right; I'd forgotten about this NFS limit when selecting those object IDs. There's actually a very nice comment in the code detailing exactly where this limit comes from. So we're limited to 48 bits for NFSv2 compatibility reasons, and the DMU actually imposes a (not widely advertised) 48-bit object number limit too.

include/sys/zfs_vfsops.h:105


/*
 * Normal filesystems (those not under .zfs/snapshot) have a total
 * file ID size limited to 12 bytes (including the length field) due to
 * NFSv2 protocol's limitation of 32 bytes for a filehandle.  For historical
 * reasons, this same limit is being imposed by the Solaris NFSv3 implementation
 * (although the NFSv3 protocol actually permits a maximum of 64 bytes).  It
 * is not possible to expand beyond 12 bytes without abandoning support
 * of NFSv2.
 *
 * For normal filesystems, we partition up the available space as follows:
 *      2 bytes         fid length (required)
 *      6 bytes         object number (48 bits)
 *      4 bytes         generation number (32 bits)
 *
 * We reserve only 48 bits for the object number, as this is the limit
 * currently defined and imposed by the DMU.
 */
typedef struct zfid_short {
        uint16_t        zf_len;
        uint8_t         zf_object[6];           /* obj[i] = obj >> (8 * i) */
        uint8_t         zf_gen[4];              /* gen[i] = gen >> (8 * i) */
} zfid_short_t;

include/sys/zfs_znode.h:160

/*
 * The directory entry has the type (currently unused on Solaris) in the
 * top 4 bits, and the object number in the low 48 bits.  The "middle"
 * 12 bits are unused.
 */
#define ZFS_DIRENT_TYPE(de) BF64_GET(de, 60, 4)
#define ZFS_DIRENT_OBJ(de) BF64_GET(de, 0, 48)

So the right fix is going to have to be to use smaller values for the ZFSCTL_INO_* constants... along with a very good comment explaining why those values are what they are.

As you noted they are quite a bit larger than their upstream counterparts. The reason is that the upstream code creates a separate namespace for the small .zfs/ directory and then mounts the snapshots on top of it. Under Linux it was far easier to just create these directories in the same namespace as the original zfs filesystem. However, since they are in the same namespace (unlike upstream) we needed to make sure the object ids never conflicted, so they used the uppermost object ids. Since zfs allocates all of its object ids from 1 in a monotonically increasing fashion there wouldn't be a conflict.

The second issue looks like it's caused by trying to allocate a new inode for the .zfs/snapshot directory when one already exists in the namespace. Normally, this wouldn't occur in the usual vfs callpaths but the NFS paths differ. We're going to need to perform a lookup and only create the inode when the lookup fails. See zfsctl_snapdir_lookup() as an example of this.
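
In other words, something like the following (a minimal sketch of the lookup-before-allocate idea, not the final fix; zfsctl_snapdir_inode() is a hypothetical helper name and the zfs_sb_t/zsb naming is assumed from the surrounding code, while ilookup() is the stock VFS helper and zfsctl_inode_alloc() is called with the arguments already used for the snapdir inode):

/*
 * Sketch: reuse the .zfs/snapshot inode if it is already hashed on
 * this superblock, and only allocate a new one when the lookup fails.
 */
static struct inode *
zfsctl_snapdir_inode(zfs_sb_t *zsb)
{
    struct inode *ip;

    ip = ilookup(zsb->z_sb, ZFSCTL_INO_SNAPDIR);
    if (ip == NULL)
        ip = zfsctl_inode_alloc(zsb, ZFSCTL_INO_SNAPDIR,
            &zpl_fops_snapdir, &zpl_ops_snapdir);

    return (ip);
}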

I'd really like to get this code done and merged into master but I don't have the time to run down all these issues right now. If you can work on this and resolve the remaining NFS bugs that would be great; I'm happy to iterate with you on this in the bug and merge it once it's done.

Andrew Barnes

Sounds good Brian, I'll expand my test env to include nfs2 and nfs4 and work towards resolving any remaining issues.

Andrew Barnes
b333z commented March 17, 2012

I'm making some slow progress on this; I can traverse the control directory structure down to the snapshots now (still on NFSv3). As you said, it looks like it was trying to create new inodes for the control directories when they were already there, so adding a lookup as you suggested seems to have done the trick.

These are the changes that I have so far:

diff --git a/include/sys/zfs_ctldir.h b/include/sys/zfs_ctldir.h
index 5546aa7..46e7353 100644
--- a/include/sys/zfs_ctldir.h
+++ b/include/sys/zfs_ctldir.h
@@ -105,10 +105,10 @@ extern void zfsctl_fini(void);
  * because these inode numbers are never stored on disk we can safely
  * redefine them as needed in the future.
  */
-#define    ZFSCTL_INO_ROOT     0xFFFFFFFFFFFFFFFF
-#define    ZFSCTL_INO_SHARES   0xFFFFFFFFFFFFFFFE
-#define    ZFSCTL_INO_SNAPDIR  0xFFFFFFFFFFFFFFFD
-#define    ZFSCTL_INO_SNAPDIRS 0xFFFFFFFFFFFFFFFC
+#define    ZFSCTL_INO_ROOT     0xFFFFFFFFFFFF
+#define    ZFSCTL_INO_SHARES   0xFFFFFFFFFFFE
+#define    ZFSCTL_INO_SNAPDIR  0xFFFFFFFFFFFD
+#define    ZFSCTL_INO_SNAPDIRS 0xFFFFFFFFFFFC

#define ZFSCTL_EXPIRE_SNAPSHOT  300

diff --git a/module/zfs/zfs_ctldir.c b/module/zfs/zfs_ctldir.c
index 6abbedf..61dea94 100644
--- a/module/zfs/zfs_ctldir.c
+++ b/module/zfs/zfs_ctldir.c
@@ -346,12 +346,30 @@ zfsctl_root_lookup(struct inode *dip, char *name, struct inode **ipp,
    if (strcmp(name, "..") == 0) {
        *ipp = dip->i_sb->s_root->d_inode;
    } else if (strcmp(name, ZFS_SNAPDIR_NAME) == 0) {
-       *ipp = zfsctl_inode_alloc(zsb, ZFSCTL_INO_SNAPDIR,
-           &zpl_fops_snapdir, &zpl_ops_snapdir);
+       *ipp = ilookup(zsb->z_sb, ZFSCTL_INO_SNAPDIR);
+       if (!*ipp)
+       {
+           *ipp = zfsctl_inode_alloc(zsb, ZFSCTL_INO_SNAPDIR,
+           &zpl_fops_snapdir, &zpl_ops_snapdir);
+       }
    } else if (strcmp(name, ZFS_SHAREDIR_NAME) == 0) {
-       *ipp = zfsctl_inode_alloc(zsb, ZFSCTL_INO_SHARES,
-           &zpl_fops_shares, &zpl_ops_shares);
+       *ipp = ilookup(zsb->z_sb, ZFSCTL_INO_SHARES);
+       if (!*ipp)
+       {         
+           *ipp = zfsctl_inode_alloc(zsb, ZFSCTL_INO_SHARES,
+               &zpl_fops_shares, &zpl_ops_shares);
+       }
    } else {
        *ipp = NULL;
        error = ENOENT;
    }
diff --git a/module/zfs/zfs_vfsops.c b/module/zfs/zfs_vfsops.c
index f895f5c..3197243 100644
--- a/module/zfs/zfs_vfsops.c
+++ b/module/zfs/zfs_vfsops.c
@@ -1336,25 +1336,42 @@ zfs_vget(struct super_block *sb, struct inode **ipp, fid_t *fidp)
        return (EINVAL);
    }

-   /* A zero fid_gen means we are in the .zfs control directories */
-   if (fid_gen == 0 &&
-       (object == ZFSCTL_INO_ROOT || object == ZFSCTL_INO_SNAPDIR)) {
-       *ipp = zsb->z_ctldir;
-       ASSERT(*ipp != NULL);
-       if (object == ZFSCTL_INO_SNAPDIR) {
-           VERIFY(zfsctl_root_lookup(*ipp, "snapshot", ipp,
-               0, kcred, NULL, NULL) == 0);
+   printk("zfs.zfs_vget() - Decoded: fid_gen: %llx object: %llx\n",
+       fid_gen, object);
+
+   if (fid_gen == 0) {
+       if (object == ZFSCTL_INO_ROOT || object == ZFSCTL_INO_SNAPDIR || object == ZFSCTL_INO_SHARES) {
+           *ipp = zsb->z_ctldir;
+           ASSERT(*ipp != NULL);
+           if (object == ZFSCTL_INO_SNAPDIR) {
+               VERIFY(zfsctl_root_lookup(*ipp, ZFS_SNAPDIR_NAME, ipp,
+                   0, kcred, NULL, NULL) == 0);
+           } else if (object == ZFSCTL_INO_SHARES) {
+               VERIFY(zfsctl_root_lookup(*ipp, ZFS_SHAREDIR_NAME, ipp,
+                   0, kcred, NULL, NULL) == 0);
+           } else if (object == ZFSCTL_INO_ROOT) {
+               igrab(*ipp);
+           }
+           ZFS_EXIT(zsb);
+           return (0);
        } else {
-           igrab(*ipp);
+           printk("zfs.zfs_vget() - Not .zfs,shares,snapshot must be snapdir doing lookup...\n");
+           *ipp = ilookup(zsb->z_sb, object);
+           if (*ipp) {
+               printk("zfs.zfs_vget() - Found snapdir Node\n");
+               ZFS_EXIT(zsb);
+               return (0);
+           } else {
+               printk("zfs.zfs_vget() - snapdir Node not found continuing...\n");
+           }
        }
-       ZFS_EXIT(zsb);
-       return (0);
    }

    gen_mask = -1ULL >> (64 - 8 * i);

    dprintf("getting %llu [%u mask %llx]\n", object, fid_gen, gen_mask);
    if ((err = zfs_zget(zsb, object, &zp))) { 
        ZFS_EXIT(zsb);
        return (err);
    }

I have tried a few combinations in zfs_vget for dealing with a snapshot directory (currently an ilookup), but so far all I get is the "." and ".." directories inside, with a 1970 timestamp.

I had tried traversing the directory first via the local zfs mount to ensure the snapshot is mounted, then traversing via NFS, but I still get an empty directory.

My current thinking is that nfsd refuses to export anything below that point as it's a new mount point. I did some experimentation with forcing/ensuring that getattr returned the same stat->dev as the parent filesystem, but that didn't seem to help. I will start doing some tracing on the nfs code so I can see what it's doing.

I then did some experimentation with the crossmnt nfs option, which seems to have some promise: it appears to attempt to traverse the mount but gives a stale file handle error, so it is at least trying.

Anyhow, I'm slowly getting my head around it all and planning to keep improving my tracing so I hopefully get to the bottom of it soon. Let us know if you have any ideas or tips!

Brian Behlendorf
Owner

Sounds good. Since I want to avoid this branch getting any staler than it already is, I'm seriously considering merging this change without the NFS support now that -rc7 has been tagged. We can continue to work on the NFS issues in another bug. Any objections?

As for the specific NFS issues you're seeing, the idea here is basically to fake out NFS for snapshots. The snapshot filesystems should be created with the same fsid as their parent so NFS can't tell the difference. Then it should allow traversal even without the crossmnt option. The NFS handles themselves are constructed in such a way as to avoid collisions, so lookups will be performed in the proper dataset. That said, clearly that's all not working quite right under Linux. We'll still need to dig into why.
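
Purely to illustrate the fsid point (a hypothetical sketch, not the actual ZFS code; snap_statfs(), struct snap_info, and parent_fsid are made-up names), a snapshot superblock's ->statfs() could simply report the parent dataset's fsid so NFS sees a single exported filesystem:

#include <linux/fs.h>
#include <linux/statfs.h>

/* hypothetical per-snapshot private data hung off sb->s_fs_info */
struct snap_info {
    __kernel_fsid_t parent_fsid;    /* fsid of the parent dataset */
};

static int
snap_statfs(struct dentry *dentry, struct kstatfs *statp)
{
    struct snap_info *si = dentry->d_sb->s_fs_info;
    int error;

    /* fill in the generic fields first */
    error = simple_statfs(dentry, statp);
    if (error)
        return (error);

    /* report the parent's fsid instead of a unique one */
    statp->f_fsid = si->parent_fsid;

    return (0);
}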

Andrew Barnes
b333z commented March 19, 2012

I have no objections; it would be great to get this code merged. There's some great functionality there even without NFS support, so if I can assist in any way, let us know.

I'll continue to dig deeper on the nfs stuff and see what I can find.

Brian Behlendorf behlendorf referenced this issue from a commit March 20, 2012
Commit has since been removed from the repository and is no longer available.
Brian Behlendorf behlendorf closed this issue from a commit November 11, 2011
Brian Behlendorf Add .zfs control directory
Add support for the .zfs control directory.  This was accomplished
by leveraging as much of the existing ZFS infrastructure as possible
and updating it for Linux as required.  The bulk of the core
functionality is now all there with the following limitations.

*) The .zfs/snapshot directory automount support requires a 2.6.37
   or newer kernel.  The exception is RHEL6.2 which has backported
   the d_automount patches.

*) Creating/destroying/renaming snapshots with mkdir/rmdir/mv
   in the .zfs/snapshot directory works as expected.  However,
   this functionality is only available to root until zfs
   delegations are finished.

      * mkdir - create a snapshot
      * rmdir - destroy a snapshot
      * mv    - rename a snapshot

The following issues are known deficiencies, but we expect them to
be addressed by future commits.

*) Add automount support for kernels older than 2.6.37.  This should
   be possible using follow_link() which is what Linux did before.

*) Accessing the .zfs/snapshot directory via NFS is not yet possible.
   The majority of the ground work for this is complete.  However,
   finishing this work will require resolving some lingering
   integration issues with the Linux NFS kernel server.

*) The .zfs/shares directory exists but no further smb functionality
   has yet been implemented.

Contributions-by: Rohan Puri <rohan.puri15@gmail.com>
Contributions-by: Andrew Barnes <barnes333@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #173
ebe7e57
Brian Behlendorf behlendorf closed this in ebe7e57 March 22, 2012
Brian Behlendorf
Owner

The deed is done. Thanks for being patient with me to make sure this was done right. The core .zfs/snapshot code has been merged into master with the following limitations.

  • A kernel with d_automount support is required, 2.6.37+ or RHEL6.2

  • Snapshots may not yet be accessed over NFS, issue #616

  • The .zfs/shares directory exists but is not yet functional.

Please open new issues for any problems you observe.

Darik Horn dajhorn referenced this issue from a commit in zfsonlinux/pkg-zfs March 21, 2012
Darik Horn Add: Add-.zfs-control-directory.patch
Issue #173
2bac37d
Richard Yao ryao referenced this issue from a commit April 18, 2012
Commit has since been removed from the repository and is no longer available.
baquar

Can anyone help me?
I set up ZFS on two systems with GlusterFS replicating data from one volume to the other. On system 1 I take a snapshot, but it is not visible and is not replicating. I have tried many things: ls -a, looking in all the directories, and zfs set snapdir=visible zpool/zfilesystem, but I am still not able to find .zfs. Please, anyone?

Brian Behlendorf
Owner

@baquar I'm not exactly sure what you're asking. ZFS is a local filesystem; if you take a snapshot it will just be visible in .zfs/snapshot on that system. Gluster, which layers on top of ZFS, will not replicate it to your other systems. You can however manually ship it to the other system with send/recv.

baquar

@behlendorf Hi Brian, I'm sorry, you didn't get me. The issue was that I am unable to find the snapshot location in zfs.
I also want to learn the SPL and ZFS code; if you can give me some useful tips or advice I would be pleased. I'm really passionate about learning the ZFS and SPL code.
Thanks,
baquar

baquar

@behlendorf One more question: I am mounting a snapshot with the command mount -t zfs datapool@dara /export/queue-data/ to share the directory, but it is read-only. Could you please tell me how to set permissions on a snapshot in ZFS?
Thank you

Andrey Kudinov

@baquar You can't write to a snapshot; make a clone if you need to write. And please use the mailing list for questions.

baquar

@aikudinov Thank you, I really appreciate your warm response, but my question is: can we set full permissions on a snapshot? I am sending it to another system and restoring it.

Brian Behlendorf
Owner

@baquar IMHO the best way to get familiar with the spl/zfs code is to pick an open issue you care about and see if you can fix it. We're always happy to have the help and are willing to provide advice and hints.

However, let me second @aikudinov and point you to the zfs-discuss@zfsonlinux.org mailing list. There are lots of helpful people reading the list who can probably answer your questions very quickly. As for your question about the snapshots: they are by definition immutable. If you need a read-write copy you must clone the snapshot and mount the clone.
