Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: Clear setuid bits when writing or truncating #12412

Merged
merged 5 commits into from Dec 19, 2016
Merged

Conversation

jtlayton
Copy link
Contributor

@jtlayton jtlayton commented Dec 9, 2016

This is the remainder of the fix for this tracker bug:

http://tracker.ceph.com/issues/18131

The basic idea is to declare a new CEPH_SETATTR_KILL_SGUID flag. When this is set in a setattr request, the MDS will clear those bits out of the mode. This should allow us to fix the mode properly without relying on a read/modify/write cycle of the mode (which would be racy).

Truncates always go to the server so there we just set the KILL_SGUID flag if we don't have Ax caps.

Writes will now take As caps along with Fw in order to check whether to issue a setattr. I think that should be mostly harmless since we'll already have them in most cases, but there could be some small performance impact there. I'm open to suggestions if anyone sees a better way to do this.

Also, there are a couple of bugfixes in here for problems that I noticed by inspection, and I beefed up the new ClearSetuid test to test truncate and write behavior.

Once this is in, we should be set to change ceph-fuse over to handling setuid clearing itself, which will close some races that can occur now.

There's a noticable up to tick-length hang that occurs when you call
ceph_shutdown with open files still in play. Ensure that we always
call ceph_ll_close after a successful ceph_ll_create.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
btime lives under auth caps, but the client drops file caps when setting
it. Change it to drop auth caps instead.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Add tests for write and truncate and switch to symbolic constants.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
…re set

Declare a new CEPH_SETATTR_KILL_SGUID flag. When that bit is set in
the mask, then that tells the server to clear out the S_ISUID and
S_ISGID bits. Doing that is less racy and problematic than trying
to do a read/modify/write cycle on the mode. Note that this flag is
ignored if the mode is being set in the same setattr call (uncommon,
but possible from something like ganesha or samba).

Then, change the client library write code to get As caps when issuing a
write, so we can check whether the mode has the setuid/setgid bits set.
If it does, then call __setattrx to clear them using
CEPH_SETATTR_KILL_SGUID.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
On a size change, try to clobber the S_ISUID and S_ISGID bits if we
have Ax caps. If we don't have them, then ask the server to do it
for us using CEPH_SETATTR_KILL_SGUID.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
@jtlayton
Copy link
Contributor Author

jtlayton commented Dec 9, 2016

I kicked off a test last night that has mostly passed, but the queue seems to be paused or something at the moment:

http://pulpito.ceph.com/jlayton-2016-12-08_23:15:40-fs-wip-jlayton-suid---basic-smithi/

Copy link
Member

@gregsfortytwo gregsfortytwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple comments on code, a few below based on the commit messages. And a more general question:
I get the impression from your verbal descriptions that this clearing is really supposed to be atomic with respect to the file being changed. We can do that with the metadata changes, but with write it's obviously going to be two separate steps we do our best to disguise as one thing. Here, it's being done with a setattr request to the MDS. But it seems like it would be more reliable to have the MDS locally clear ISUID|ISGID when it hands out the write capability on files with it set, and to only hand out that cap on demand. (I don't remember if we give out Fw proactively or not.) That would also mean we don't need As on every write, which even though it should be cheap seems inimical to some realistic workloads (like maybe some rsync configs?). Did you look at any options like that? Do you think the client-side best effort is good enough?

"There's a noticable up to tick-length hang that occurs when you call ceph_shutdown with open files still in play." Are there pieces of tick() we should invoke from shutdown()?

"test: flesh out the setuid/setgit clearing test" s/setgit/setgid/ ? :)

"Note that this flag is ignored if the mode is being set in the same setattr call (uncommon, but possible from something like ganesha or samba)." I'm not seeing that.

@@ -3800,7 +3800,7 @@ void Server::handle_client_setattr(MDRequestRef& mdr)

if (mask & CEPH_SETATTR_MODE)
pi->mode = (pi->mode & ~07777) | (req->head.args.setattr.mode & 07777);
else if ((mask & (CEPH_SETATTR_UID|CEPH_SETATTR_GID)) &&
else if ((mask & (CEPH_SETATTR_UID|CEPH_SETATTR_GID|CEPH_SETATTR_KILL_SGUID)) &&
S_ISREG(pi->mode)) {
pi->mode &= ~S_ISUID;
if ((pi->mode & (S_ISGID|S_IXGRP)) == (S_ISGID|S_IXGRP))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is from the previous set of commits but I guess I didn't ask — why are we unconditionally removing S_ISUID and doing a check before removing S_ISGID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the S_ISGID bit set on a file that does not have the group execute bit is (oddly enough) used to indicate that the file is a candidate for mandatory locking. So you only want to clear that bit if the group execute bit is also set.

struct ceph_statx stx = { 0 };

put_cap_ref(in, CEPH_CAP_AUTH_SHARED);
r = __setattrx(in, &stx, CEPH_SETATTR_KILL_SGUID, f->actor_perms);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is asynchronously sending off an MDS request and then doing the normal write. That does mean under nasty failure conditions we could send off the write, fail to send off the setattr, and then crash...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that's a synchronous call. do_setattr calls make_request, which appears to wait on the reply (like we would for any other setattr). Am I missing something there?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...wow, I'm not sure how I got that confused. You're right of course.

@gregsfortytwo gregsfortytwo removed their assignment Dec 13, 2016
@jtlayton
Copy link
Contributor Author

I don't see how this could be racy. Once we have As caps we know that the mode can't change until the client_mutex is released, which happens after the write is complete. It is possible that we might crash before we ever issue the write after doing the setattr, but I don't think we can make it any more atomic than that and if we end up clearing some mode bits in that case, then I think we can live with it (after all, we were going to do a write).

I did consider whether we might just clear them when the client takes Fw caps, and have the MDS recall Fw when we need to change the mode. The problem there is that POSIX is pretty clear that you need to clear the bits on the actual write. The client though, will request Fw when it opens a file for write. It's possible (maybe even likely) that some applications might open files O_RDWR and only issue reads to that fd. If we clear the bits on Fw acquisition then we'd clear the bits in that situation as well.

That said, I'm open to the idea if we want to go beyond what POSIX requires there. Just be forewarned that it has the potential to break some weird corner cases.

@jtlayton
Copy link
Contributor Author

Oof. We don't actually hold on to As caps until the write is complete as we release them before doing the setattr call. That's hard to fix as we need to release As caps in order for the server to clear the bits in the mode for us.

So, it's possible we'd check the mode, and see it without any setuid bits, and before we do the write, a setattr from another client races in and sets those bits in the mode. I guess the question there is...does it matter? After all, we end up with the same result if the mode change gets in just after the write.

I tend to think this is good enough, and not likely to hurt performance in the common case.

@gregsfortytwo
Copy link
Member

I'm not so concerned about making writes slow when you have the bits set — it's more about imposing costs when they aren't, and I really don't like forcing a check every time.
In terms of the racing setattr, that also shouldn't be a problem since it requires the user to have overlapping syscalls. ;)

So, how about we change the client so it doesn't request write caps on a file with these mode bits set until it actually gets into the write syscall (and/or the MDS ignores Fw cap requests that are bundled up with opens on those files)? That doesn't seem too hard once we know the bits are set, which we we can handle with a one-time look at the mode (which we probably already have caps on if we're opening).

@jtlayton
Copy link
Contributor Author

I agree about not imposing extra penalty on the common case where these aren't set. That said, I'm not convinced that what you're suggesting will be any better in that regard.

If we do that then we'll also need to ensure that the MDS recalls all outstanding Fw caps if a client has a request to change the mode in such a way that either setuid bit is set. The client will also need to proactively drop Fw on any mode changes that set the setuid bits as well.

Also, if the client has Ax and tries to set the mode to something with either bit set, then we'll need to take steps to ensure that the MDS recalls Fw caps in that case as well (probably by just dropping Ax at that point and sending the mode change directly to the MDS).

So yeah, with that you don't need As caps on the client to do a write. But...once we get As we're unlikely to lose them. uid/gid/mode/btime changes are quite rare in most workloads, so getting and holding As caps in most cases should be no big deal. Having to sort out As caps on the initial write doesn't seem any more costly than having to acquire Fw in the same situation.

Also, what about legacy clients? Are we ok with someone mounting their kraken share up with their jewel client and stripping off the setuid bits when they accidentally open a file with O_RDWR? I'm not yuuuugely concerned there, but we should be aware that that is the likely outcome in that situation.

Thoughts?

@gregsfortytwo
Copy link
Member

I guess that's a good enough argument for me. We can do something more sophisticated if it causes trouble.
I went through where we actually do stuff that revokes As and I'm not seeing any other cases we need to worry about.

@liewegas, can you think of anything that would make it really bad for our writers to maintain an As cap on files they are actively writing?

@gregsfortytwo
Copy link
Member

It's kind of a bummer about older clients not stripping off the setuid/gid bits properly — I guess for them we just have to hope the kernel FUSE is doing the right thing?

@jtlayton
Copy link
Contributor Author

Yes, older fuse clients should do the right thing as FUSE will send down a setattr request when the kernel needs to strip those bits. Aside from not handling that correctly on chown on recent buggy kernels, that should keep working as it always has.

@liewegas liewegas changed the title Clear setuid bits when writing or truncating client: Clear setuid bits when writing or truncating Dec 19, 2016
Copy link
Member

@liewegas liewegas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jtlayton jtlayton merged commit 20f936e into master Dec 19, 2016
@jtlayton jtlayton deleted the wip-jlayton-suid branch December 19, 2016 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cephfs Ceph File System needs-review
Projects
None yet
4 participants