Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove dummy znode from zvol_state #4510

Closed
wants to merge 3 commits into from
Closed

Remove dummy znode from zvol_state #4510

wants to merge 3 commits into from

Conversation

tuxoko
Copy link
Contributor

@tuxoko tuxoko commented Apr 12, 2016

struct zvol_state contains a dummy znode, which is around 1KB on x64, only for
zfs_range_lock. But in reality, other than z_range_lock and z_range_avl,
zfs_range_lock only need znode on regular file, which means we add 1KB on a
structure and gain nothing.

In this patch, we remove the dummy znode for zvol_state. In order to do that,
we also need to refactor zfs_range_lock a bit. We move z_range_lock and
z_range_avl pair out of znode_t to form zfs_rlock_t. This new struct replaces
znode_t as the main handle inside the range lock functions. Since regular
files still need znode for RL_WRITER and RL_APPEND, we make znode an optional
argument in zfs_range_lock_impl.

To reduce possible merge conflict, we retain the prototype of zfs_range_lock.
And zvol now should call zvol_range_lock instead.

Signed-off-by: Chunwei Chen david.chen@osnexus.com

@behlendorf
Copy link
Contributor

This work nicely complements a patch proposed by @bprotopopov to verify the kernel module range lock implementation in ztest. I've never liked how the znode_t was coupled to the range lock to I'm all for this improvement. But we should coordinate these two changes so we can 1) eliminate the znode_t from the a zvol_state_t, and 2) update ztest to use a single version of the range locks.

The kernel built-in builder also detecting a small build issue where you need to update an existing trace point.

http://build.zfsonlinux.org/builders/Kernel.org%20Built-in%20x86_64%20%28BUILD%29/builds/4012/steps/shell_1/logs/make

@tuxoko
Copy link
Contributor Author

tuxoko commented Apr 13, 2016

I fixed the trace part. I'm not sure what's the status of #4024. Is it going to be merged? Or is it waiting for illumos?

@behlendorf
Copy link
Contributor

The ztest changes are independent of the bug being fixed. Ideally all of these changed should be submitted back upstream.

@bprotopopov
Copy link
Contributor

Yes, this could work out quite well; the user-space functions could also be #defined like zvol_*() functions, and the zp_is_zvol member would no linger be needed in the user-space znode.

I can help with the port if needed.

@tuxoko
Copy link
Contributor Author

tuxoko commented Apr 13, 2016

@bprotopopov Can you rebase the ztest patch on this one, then I'll pull it in this pr.

@bprotopopov
Copy link
Contributor

Yes, I can look into it.

@behlendorf
Copy link
Contributor

@tuxoko I found some time to get back to this and do a more careful review. I like where this is going, but if we're going to change this code we need to go farther and make it totally generic by eliminating the znode entirely from the range locking code. Let me suggest a way to do this which builds on the work you've already done.

If we assume that range locks will always be embedded in another struct as they are they today. Then we can extend the zfs_rlock_t structure you've introduced to include uint64_t pointers to the z_size, z_blksz, and z_max_blksz fields in the containing structure. This way the zfs_range_lock_writer() function will have safe access to this information for whatever containing structure is using the lock without the znode_t baggage. The zfs_rlock_init() function could be easily extended to take this values.

@tuxoko
Copy link
Contributor Author

tuxoko commented Apr 26, 2016

@behlendorf
I'm not sure putting pointers in zfs_rlock_t will be any better than having znode in zfs_range_lock_writer. Those pointers and the logic in the function is still very znode specific anyway.

@behlendorf
Copy link
Contributor

behlendorf commented Apr 26, 2016

How so? The logic here is just for appending to a range lock or changing the objects block size. Neither concept is specific to a znode.

@bprotopopov
Copy link
Contributor

I have rebased against @tuxoko changes on my branch at https://github.com/bprotopopov/zfs/commits/issue-4023

@tuxoko
Copy link
Contributor Author

tuxoko commented May 4, 2016

Rebase to master, add pointers to z_size and stuff, add ztest patch.

@@ -60,8 +70,12 @@ typedef struct rl {
* or exclusive (RL_WRITER or RL_APPEND). RL_APPEND is a special type that
* is converted to RL_WRITER that specified to lock from the start of the
* end of file. Returns the range lock structure.
*
* Filesystem should call zfs_range_lock.
* Zvol should call zvol_range_lock.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should be removed, there is no zvol_range_lock() and zfs_range_lock() should be usable by all dmu consumers.

@bprotopopov
Copy link
Contributor

@tuxoko, I say LGTM.

@behlendorf
Copy link
Contributor

behlendorf commented May 13, 2016

@tuxoko @bprotopopov LGTM to me this is a nice bit of cleanup.

I'd suggest exporting the symbols so external consumers can use these range locks. This is functionality that filesystems like Lustre which are layered on the DMU like the ZPL could take advantage of right away. @adilger what are your thoughts on this?

zfs_range_lock()
zfs_range_unlock()
zfs_range_reduce()
zfs_range_compare()

@adilger
Copy link
Contributor

adilger commented May 13, 2016

@bzzz77 was already looking at adding range locking for the ZFS OSD (http://review.whamcloud.com/15542) so if this was provided directly by ZFS it would avoid duplicating this functionality in Lustre and would definitely be welcome.

@adilger
Copy link
Contributor

adilger commented May 13, 2016

Also, if the range locks are exported up to the ZPL then it is possible to implement multi-threaded read/write operations from the VFS instead of i_mutex serializing all access to the file.

@behlendorf
Copy link
Contributor

behlendorf commented May 13, 2016

In that case, @bzzz77 could you looked over the public interface in zfs_rlock.h and verify it's going to meet your needs. It should.

@tuxoko could you refresh this patch to export the needed symbols. Then this work is ready to merge.

@adilger right, we've always depended on these range lock in the ZPL instead of the i_mutex. It's just that previously they were wired in to the znode a little to tightly which prevented them from being used elsewhere.

Chunwei Chen and others added 3 commits May 13, 2016 15:17
struct zvol_state contains a dummy znode, which is around 1KB on x64, only for
zfs_range_lock. But in reality, other than z_range_lock and z_range_avl,
zfs_range_lock only need znode on regular file, which means we add 1KB on a
structure and gain nothing.

In this patch, we remove the dummy znode for zvol_state. In order to do that,
we also need to refactor zfs_range_lock a bit. We move z_range_lock and
z_range_avl pair out of znode_t to form zfs_rlock_t. This new struct replaces
znode_t as the main handle inside the range lock functions. We also add
pointers to z_size, z_blksz, and z_max_blksz so range lock code doesn't depend
on znode_t.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
@tuxoko
Copy link
Contributor Author

tuxoko commented May 13, 2016

Done

@behlendorf
Copy link
Contributor

@tuxoko thanks! Merged as:

e3a07cd Use zfs range locks in ztest
d88895a Remove dummy znode from zvol_state

@bzzz77 let us know if you encounter any issues using the new interface with Lustre. You should be able to use the same approach taken by the ZPL and embed the range lock in your osd object.

@behlendorf behlendorf added this to the 0.6.5.8 milestone May 17, 2016
ryao pushed a commit to ClusterHQ/zfs that referenced this pull request Jun 7, 2016
struct zvol_state contains a dummy znode, which is around 1KB on x64,
only for zfs_range_lock. But in reality, other than z_range_lock and
z_range_avl, zfs_range_lock only need znode on regular file, which
means we add 1KB on a structure and gain nothing.

In this patch, we remove the dummy znode for zvol_state. In order to
do that, we also need to refactor zfs_range_lock a bit. We move
z_range_lock and z_range_avl pair out of znode_t to form zfs_rlock_t.
This new struct replaces znode_t as the main handle inside the range
lock functions.

We also add pointers to z_size, z_blksz, and z_max_blksz so range lock
code doesn't depend on znode_t.  This allows non-ZPL consumers like
Lustre to use the range locks with their equivalent znode_t structure.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#4510
nedbass pushed a commit to nedbass/zfs that referenced this pull request Aug 26, 2016
struct zvol_state contains a dummy znode, which is around 1KB on x64,
only for zfs_range_lock. But in reality, other than z_range_lock and
z_range_avl, zfs_range_lock only need znode on regular file, which
means we add 1KB on a structure and gain nothing.

In this patch, we remove the dummy znode for zvol_state. In order to
do that, we also need to refactor zfs_range_lock a bit. We move
z_range_lock and z_range_avl pair out of znode_t to form zfs_rlock_t.
This new struct replaces znode_t as the main handle inside the range
lock functions.

We also add pointers to z_size, z_blksz, and z_max_blksz so range lock
code doesn't depend on znode_t.  This allows non-ZPL consumers like
Lustre to use the range locks with their equivalent znode_t structure.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#4510
@nedbass
Copy link
Contributor

nedbass commented Sep 2, 2016

For future reference, this was added to the 0.6.5.8 milestone so Lustre could take advantage of the ZFS range lock implementation.

nedbass pushed a commit to nedbass/zfs that referenced this pull request Sep 5, 2016
struct zvol_state contains a dummy znode, which is around 1KB on x64,
only for zfs_range_lock. But in reality, other than z_range_lock and
z_range_avl, zfs_range_lock only need znode on regular file, which
means we add 1KB on a structure and gain nothing.

In this patch, we remove the dummy znode for zvol_state. In order to
do that, we also need to refactor zfs_range_lock a bit. We move
z_range_lock and z_range_avl pair out of znode_t to form zfs_rlock_t.
This new struct replaces znode_t as the main handle inside the range
lock functions.

We also add pointers to z_size, z_blksz, and z_max_blksz so range lock
code doesn't depend on znode_t.  This allows non-ZPL consumers like
Lustre to use the range locks with their equivalent znode_t structure.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#4510
nedbass pushed a commit to nedbass/zfs that referenced this pull request Sep 5, 2016
struct zvol_state contains a dummy znode, which is around 1KB on x64,
only for zfs_range_lock. But in reality, other than z_range_lock and
z_range_avl, zfs_range_lock only need znode on regular file, which
means we add 1KB on a structure and gain nothing.

In this patch, we remove the dummy znode for zvol_state. In order to
do that, we also need to refactor zfs_range_lock a bit. We move
z_range_lock and z_range_avl pair out of znode_t to form zfs_rlock_t.
This new struct replaces znode_t as the main handle inside the range
lock functions.

We also add pointers to z_size, z_blksz, and z_max_blksz so range lock
code doesn't depend on znode_t.  This allows non-ZPL consumers like
Lustre to use the range locks with their equivalent znode_t structure.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#4510
tuxoko pushed a commit to tuxoko/zfs that referenced this pull request Sep 8, 2016
struct zvol_state contains a dummy znode, which is around 1KB on x64,
only for zfs_range_lock. But in reality, other than z_range_lock and
z_range_avl, zfs_range_lock only need znode on regular file, which
means we add 1KB on a structure and gain nothing.

In this patch, we remove the dummy znode for zvol_state. In order to
do that, we also need to refactor zfs_range_lock a bit. We move
z_range_lock and z_range_avl pair out of znode_t to form zfs_rlock_t.
This new struct replaces znode_t as the main handle inside the range
lock functions.

We also add pointers to z_size, z_blksz, and z_max_blksz so range lock
code doesn't depend on znode_t.  This allows non-ZPL consumers like
Lustre to use the range locks with their equivalent znode_t structure.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Boris Protopopov <boris.protopopov@actifio.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#4510
@tuxoko tuxoko deleted the zrl branch March 3, 2017 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants