Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First pile of ceph_statx patches #10842

Closed
wants to merge 27 commits into from
Closed

First pile of ceph_statx patches #10842

wants to merge 27 commits into from

Conversation

jtlayton
Copy link
Contributor

This patchset is the first pile of patches to add ceph_statx support to the userland client. The basic idea is to add a new interface patterned after the proposed statx() interface for Linux. The idea there is to extend the traditional stat() interface in several ways:

  1. allow access to more inode attribute fields, in particular the birthtime (btime) and a change attribute (a'la NFSv4), in a way that is extensible in the future

  2. allow applications to request just specific fields in the ceph_statx struct be filled out. Note that you can (and will) often get more than you request, but you should always get what is requested or an error.

  3. allow applications to do a "lazy" statx, where we fill out the attribute fields from whatever we have in cache, without querying the server. This is handy for performance reasons or in cases where we don't really need up-to-date information.

In addition, a ceph_setattrx has also been added, which allows the setting of the btime as well as the other fields that ceph_setattr allows. I also have a companion patchset for samba that converts it to use these new interfaces. That seems to work properly, though I still need to roll a testcase to verify that setting the btime via smb actually works.

Pay close attention to the first patch in the file as well, which I think may help performance in some cases by allowing us to request the caps needed to do an operation during the lookup phase of path-based operations.

I've also started some work to do a patchset for ganesha, but that requires adding a number of new ceph_ll_* calls. I think it's probably best to get this merged first and then do that part afterward.

Testing this has been somewhat problematic. I've been fighting a number of unrelated testcase failures, but the new code seems to do the right thing so far.

jtlayton and others added 27 commits August 23, 2016 16:07
ll_walk expects to get a set of attributes out of a path_walk. Pass a
caps mask parameter into path_walk, and then apply it when we reach the
last component of the path.

This may prevent us from having to further iteract with the server after
the pathwalk, in some cases.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
...in preparation for adding a ll_getattrx.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
We'll need them in Client.cc now in addition to FUSE specific code.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
New interfaces for fetching extended (and selective) stat information.
Additionally, applications can specify AT_NO_ATTR_SYNC in the flags to
indicate that they want to do a "lazy" statx that just hands out the
inode info from the cache, or AT_SYMLINK_NOFOLLOW to avoid following
symlinks when walking the path.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Create 2 clients. Create a file in client1, and do a lookup of it in
client2, and then ll_getattrx it from client2. chmod the file from
client1, ll_getattrx it client2 (this time with AT_NO_ATTR_SYNC) and
ensure that the ctime change is not seen.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Currently we don't have a mechanism to set the btime, but we will need
that eventually. If we want to allow the client to cache that change, we
need to be able to pass it back and forth between client and server.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
...that is mapped to BTIME feature.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
...and pass it around appropriately.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
The semantics for a change_attr are that it should be incremented
whenever there is a change to the ctime in the inode. Add those
increments for the simple case of regular files. Directories however can
be fragmented so we'll need to do something more elaborate there.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
To give us change attribute support for directories. Whenever we gather
the dirfrags, we just select the largest change_attr out of the set.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Unfortunately, the only option here is to rev the MClientRequest
version as the ceph_mds_request_head is not currently versioned. Add a
ceph_mds_request_head_new, which contains a ceph_mds_request_args_new
structure.

The ceph_mds_request_head_new is now versioned via a __le16
at the beginning of it, and then the args structure is expanded to hold
the btime. The btime is also stored within a separate field of the
MClientRequest, and the encode and decode place the value in there as
needed.

Reluctantly-Signed-off-by: Jeff Layton <jlayton@redhat.com>
This adds a new set of libcephfs calls: ceph_ll_setattrx and
ceph_setattrx. This allows clients to set the btime in addition to other
values that are typically settable via ceph_setattr calls.

Currently, the setattrx mask uses the same CEPH_SETATTR values that the
ceph_setattr interface uses. I'm not sure this is what we will want
though. Would it be better to rephrase that via STATX_* constants?

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
@jtlayton jtlayton changed the title First pile of statx patches First pile of ceph_statx patches Aug 24, 2016
@ukernel ukernel added the cephfs Ceph File System label Aug 24, 2016
@@ -9623,6 +9628,7 @@ Inode *Client::open_snapdir(Inode *diri)
in->ctime = diri->ctime;
in->btime = diri->btime;
in->size = diri->size;
in->change_attr = diri->change_attr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't actually need this on a snapdir; it's just for consistency, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You appear to have removed this, and I didn't mean to say it was wrong, I just wanted to double-check. :)
(And actually we do allow snaps to be renamed at present, which could maybe bump the change attr...not sure how that will play out in future or if it's a thing we want to support).

@jtlayton jtlayton closed this Aug 25, 2016
@jtlayton jtlayton reopened this Aug 25, 2016
@jtlayton jtlayton closed this Aug 25, 2016
@jtlayton
Copy link
Contributor Author

I'll close this for now, until the changes are done and have been tested. Thanks for the comments so far!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cephfs Ceph File System feature
Projects
None yet
6 participants