Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix attribute handling at lookup time #10386

Merged
merged 7 commits into from Aug 2, 2016
Merged

Fix attribute handling at lookup time #10386

merged 7 commits into from Aug 2, 2016

Conversation

jtlayton
Copy link
Contributor

This should be a fix for the bug that Frank reported here:

http://tracker.ceph.com/issues/16668

The root cause of the problem is due to several factors:

  1. _lookup doesn't check for the correct caps before filing out the stat structure
  2. when an inode has multiple dentries, leases on the other dentries are not revoked after unlinking one
  3. the MDS doesn't send any is_target info in the UNLINK reply trace

All of this means that when you remove one link of a hardlinked inode and then do a ll_lookup on the remaining dentry, it will end up filling out the struct stat with stale attributes.

This also has a fix for a minor bug in the handling of the ctime in fill_stat.

Teuthology run is in progress here:

http://pulpito.ceph.com/jlayton-2016-07-21_07:31:09-fs-wip-jlayton-nlink---basic-mira/

@jtlayton
Copy link
Contributor Author

Rebase onto current master.

@gregsfortytwo
Copy link
Member

Code looks great. But I'm a little confused -- is "parm" an abbreviation I've never seen before or a very common typo for you, @jtlayton?

@gregsfortytwo
Copy link
Member

Oh, and can you tag the appropriate commit with the Fixes: line for that fracker bug.

@jtlayton
Copy link
Contributor Author

Yeah, "parm" is just an abbreviation of "parameter". I'll make it a bit more clear in the changelogs.

@jtlayton
Copy link
Contributor Author

I've also kicked off a couple of "fs" test suite runs, but a lot of the tests in there are failing, and I don't think it's due to any of the changes in this PR. I'm doing a control run now from the head of the tree sans these patches to see if it fares any better.

@jtlayton
Copy link
Contributor Author

jtlayton commented Jul 26, 2016

Found a bug in one of the commits, so I'm rerunning the build and unit tests now.

@jtlayton jtlayton force-pushed the wip-jlayton-nlink branch 3 times, most recently from 25ae623 to 3248dc5 Compare August 1, 2016 13:26
@jtlayton
Copy link
Contributor Author

jtlayton commented Aug 1, 2016

Ran a "fs" test run over the weekend, and all but 9 tests passed. Most of the failures were SSH errors, but there were a few others too:

http://pulpito.ceph.com/jlayton-2016-07-29_18:51:42-fs-wip-jlayton-nlink---basic-mira/

I opened bugs for most of the failures and kicked off a new set of tests for the failing runs. Those all passed, except for one:

http://pulpito.ceph.com/jlayton-2016-08-01_04:07:30-fs-wip-jlayton-nlink---basic-mira/

...so I kicked that one off again after rebaing onto the current master branch, and it still fails with the same error. This is a new forward scrub test however, so I suspect that something is broken in the test itself (or maybe the tighter caps handling in this PR is throwing off the results?).

I'm rerunning the test again now without any of my patches now to see whether it passes.

@jtlayton
Copy link
Contributor Author

jtlayton commented Aug 1, 2016

The only test that is consistently failing also fails with a recent pull from the master branch. It's an MDS related test and this is strictly a client-side pull request (now). So, I don't think that's a regression in this set.

@gregsfortytwo
Copy link
Member

@jtlayton, you mean the "wrongly marked free" inode on scrub, right?

@jcsp ran c24d480 (newer than your wip-jlayton-control's 1194331) and it isn't failing there, although I'm not sure what merged PR might have resolved it. I suggest you wait for @jcsp's next test branch if you're sure this isn't something you did, or else rebase/merge with more current master.

goto hit_dn;
if (dn->cap_shared_gen == dir->shared_gen &&
(!dn->inode || dn->inode->caps_issued_mask(mask)))
goto hit_dn;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You appear to have some weird tabbing/spacing rules here, btw — not sure why this line is indented 6(?) spaces instead of 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I see the problem there? The goto line is spaced in two spaces inside the conditional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I open this in my emacs and have it indent things it's indenting "goto hit_dn;" 2 spaces past the initial "if" line (it, before the "!dn->inode" bit); it looks like whatever setup you've got is indenting it 2 past the final line. shrug

The CEPH_CAP_SFLOCK shift value (and the other constants derived from
it) are entirely unused. Ditto for CEPH_CAP_BITS. Just remove them.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
...and use the lowlevel calls to emulate what ganesha would do. In
particular, it dumps its inode from the cache after an unlink. So
do a second lookup afterward and verify that the nlink field in the
struct stat is valid.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
We need to allow callers to specify caps to acquire during a lookup, as
they may need to scrape certain info out of the inode later. Allow them
to pass in a mask. For now, _lookup just passes in 0 for the mask,
but verify_reply_trace passes in the regetattr_mask to match what we'd
request in the _getattr request if there were a traceless reply.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
...and attempt to pass in a sane value there, based on what we'll do
with the resulting inode.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
…ary caps

If we don't have the caps we'll need later, then we must reissue the
lookup to get them regardless of whether we have a lease on the
dentry.

Fixes: http://tracker.ceph.com/issues/16668
Signed-off-by: Jeff Layton <jlayton@redhat.com>
…he mtime

The current code just looks at the sec field, but the difference may
only be evident in the nsec field.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
@jtlayton
Copy link
Contributor Author

jtlayton commented Aug 2, 2016

Thanks @gregsfortytwo!

Ok, fixed up the doc to remove the dangling sentence fragment, and also purged all mention of the FLOCK permissions. I've also added a patch to remove the SFLOCK related constants. It's just confusing having that sort of cruft in there.

On the indentation in _lookup... I dunno, it looks fine to me. Could you send me a patch based upon this branch that shows how you think it ought to be indented?

@jcsp jcsp merged commit 01cd578 into master Aug 2, 2016
@jcsp jcsp deleted the wip-jlayton-nlink branch August 2, 2016 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants