Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libcephfs client API overhaul and update #11647

Merged
merged 31 commits into from Nov 7, 2016
Merged

libcephfs client API overhaul and update #11647

merged 31 commits into from Nov 7, 2016

Conversation

jtlayton
Copy link
Contributor

This is the first pile of patches for a proposed overhaul of libcephfs client API. The brunt of the changes are there to support @gregsfortytwo 's UserPerm work. Applications can now set up a UserPerm object and pass that in to different calls.

Additionally, this plumbs in the use of ceph_statx in more calls, in particular, the readdirplus code.

All of these API changes add up to backward incompatible changes to the libcephfs ABI as well, so this also bumps the SOVERSION on the library from 1 to 2 (per @liewegas's suggestion on the mailing list), and changes the packaging accordingly.

I have nfs-ganesha changes that will allow it build against this. I also have some older samba patches, but those may need some work after the latest changes.

Note that this is just the first pile. I think we also will want to make some other changes to better support ganesha here (in particular, allowing it to pass in a userperm on read, write and fsync), but this is a good spot to stop and take a pause.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
@jtlayton jtlayton added the cephfs Ceph File System label Oct 25, 2016
...and convert ceph_fsetattr callers to use it instead.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
…eph_fsetattr

...there are no more callers.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Add args for the gids_count and gids list, and give them default
values so that callers can populate it correctly. We'll need this
for ganesha so it can populate the UserPerm from a RPC AUTH object.

Note that the gidlist pointer must be valid for the lifetime of
the created object!

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
…inal

Signed-off-by: Jeff Layton <jlayton@redhat.com>
…Perm

For now, we leave the old ->ll_lookup method in place, as FUSE needs
it. We could do a ceph_statx -> stat conversion, but that's extra
copies and I don't think we want the perf hit in FUSE.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
In addition to acquiring the right caps for the requested attributes, we
can also do a path walk that terminates on an existing symlink without
following the link now.

Is that useful? No idea...

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Again, for now we leave the underlying method called ll_setattrx, since
FUSE is wired to use struct stat.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
It's just a wrapper around ceph_ll_setattr that is only called from the
ganesha SETATTR handler. There's no need for special casing truncate
handling.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
…erPerms pointer

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Make a common function that can be called and move the handling of the
out inode reference and attribute handling into the caller.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
The underlying plumbing needs to remain the same for FUSE though.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
…eter

The main user of this API (ganesha) doesn't do anything with the
returned attributes, so there's no real point in returning them
there.

Also, we're not guaranteed to have any caps on the target inode
after the link operation, so in the case of FUSE (which does
require the post-op attributes) we should really do a getattr
to get the latest attributes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
...that defaults to true since that's what most of the callers want.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Make ceph_readdirplus_r take a ceph_statx, a want and a flags parm. With
this, we can allow applications to express an interest in subset of the
attributes, and can allow for a "lazy" readdirplus.

Drop the stmask. It ends up returning the caps that the client holds on
the inode. That's not well defined, and we can now express that in a
better way via the stx_mask, which applications can use to tell which
fields in the ceph_statx are actually valid.

For now, the want mask is ignored. I don't see a way to ask for a set
of caps in a ceph readdir request on the wire. Maybe we could add that?

Signed-off-by: Jeff Layton <jlayton@redhat.com>
...so we can ensure that we have the necessary caps when filling out
the ceph_statx for each dentry's inode. In order to only do this when
completely necessary, we have want default to 0 and the flags default
to AT_NO_ATTR_SYNC. The only codepath where we pass in a non-default
set of args there is ceph_readdirplus_r as it's the only codepath that
cares about fields in the ceph_statx that aren't immutable.

For now, since we have no support for requesting caps during a readdir
call, we simply issue getattrs prior to calling fill_statx. If we
already have the necessary caps, or are doing a lazy statx then this
becomes a no-op.

Note too that I _think_ the MDS will recall caps on the entries when
satisfying a readdir, so we avoid calling getattr when we're populating
the ceph_statx out of a just-acquired readdir response.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Ganesha needs an inode reference in addition to the attributes when it
calls readdirplus. Other callers however don't need an inode reference.

We could just take one universally and pass it to the callback, but most
callers don't need that reference and would need to put it in the
callback. That's cumbersome and mutex-thrashy.

So, we need to fix the readdir engine to only conditionally take this
extra reference, when the callback will actually use it. Add a bool to
readdir_r_cb that defaults to false and indicates that the caller wants
an inode reference for each dentry returned. When that bool is true
we'll pass a pointer to the inode to the callback after taking a
reference. Otherwise, NULL is passed to the callback.

Next, add a return double pointer arg to ceph_readdirplus_r that
indicates whether the caller wants an inode reference and where to put
the pointer to the inode. Almost all callers will set that to NULL, but
ganesha can set it to a non-NULL value to get the inode reference that
it wants on each call.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
The new libcephfs API is incompatible with the old. Move to SOVERSION
2.0.0 so that any programs built against the old headers will fail at
load time.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
@jtlayton
Copy link
Contributor Author

jtlayton commented Oct 26, 2016

Test run mostly passed. A couple of jobs ended up dead for reasons that aren't clear to me, but I suspect that those are unrelated to the changes in this PR:

http://pulpito.ceph.com/jlayton-2016-10-25_18:12:41-fs-wip-jlayton-cephfs---basic-mira/

@jtlayton
Copy link
Contributor Author

Note that there are companion PRs for teuthology and ceph-qa-suite that should be merged at the same time as this one:

ceph/ceph-qa-suite#1223
ceph/teuthology#972

Copy link
Contributor

@jcsp jcsp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy. Looks like it just needs a squash before merging for Kraken.

@jcsp jcsp merged commit f80c7a8 into master Nov 7, 2016
@jcsp jcsp deleted the wip-jlayton-cephfs branch November 7, 2016 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cephfs Ceph File System
Projects
None yet
2 participants