mds: trim null dentries proactively #10606
Okay, so this is really only pushing them to the bottom on replay, right? The commit message concerned me (as obviously we sometimes want to keep them around for fast responses on incomplete dirs).
Tested by ceph/ceph-qa-suite#1111
e5eae60 to c419878
Right, the extra touch_dentry_bottom calls are only in replay. There are other places where we already do that, like unlinks and renames, so those will also get their stray dentries thrown away immediately instead of waiting for the cache to fill up. The (important) case of null dentries that cache lookup misses should be unaffected though.
LGTM
…o greg-fs-testing #10606 Reviewed-by: Greg Farnum <gfarnum@redhat.com>
http://pulpito.ceph.com/gregf-2016-08-25_19:08:49-fs-greg-fs-testing-825---basic-mira/385454/ may be this PR's fault; testing without it.
http://pulpito.ceph.com/gregf-2016-08-29_04:22:27-fs-greg-fs-testing-828---basic-mira/ has the same branch minus this PR running that test job.
Yeah, looks like it's busted for some reason. I unzipped the MDS logs and looked a little bit and I'm not quite sure why they're stuck, but it does indeed involve stray inodes and trimming on the export. You may just have lucked into revealing a bug that was being disguised by timing and isn't any more; not sure.
Instead of leaving null dentries (e.g. left behind from unlinks) in the cache until they fall out of the LRU, actively push them to the bottom of the LRU and then consume all nulls at the bottom in trim() even if the cache is not oversized yet.

This fixes the case where standby replay daemons would otherwise accumulate a cache full of null dentries resulting from unlinks, and it makes the behaviour of active daemons more deterministic.

Fixes: http://tracker.ceph.com/issues/16919
Signed-off-by: John Spray <john.spray@redhat.com>
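
For readers unfamiliar with the mechanism, the behaviour described in the commit message can be sketched roughly as follows. This is a toy model in Python, not the MDS code: the class `DentryLRU` and its methods `touch`, `touch_bottom`, and `trim` are hypothetical names chosen for illustration (only `touch_dentry_bottom` and `trim()` are mentioned in this PR), and a null dentry is represented simply as an entry whose inode is `None`.

```python
from collections import OrderedDict


class DentryLRU:
    """Toy LRU (hypothetical, for illustration only).

    The first key is the bottom (next to be trimmed); the last key is the top.
    A value of None stands in for a null dentry.
    """

    def __init__(self, max_size):
        self.max_size = max_size
        self.entries = OrderedDict()  # dentry name -> inode number or None

    def touch(self, name, inode):
        # Normal access: insert or refresh the entry at the top of the LRU.
        # A cached lookup miss is a null dentry that is touched this way.
        self.entries[name] = inode
        self.entries.move_to_end(name, last=True)

    def touch_bottom(self, name):
        # Unlink-style path: the dentry becomes null and is pushed to the
        # bottom so that the next trim() discards it.
        self.entries[name] = None
        self.entries.move_to_end(name, last=False)

    def trim(self):
        trimmed = []
        # Step 1: consume every null dentry sitting at the bottom,
        # even if the cache is not oversized.
        while self.entries:
            name, inode = next(iter(self.entries.items()))
            if inode is not None:
                break
            del self.entries[name]
            trimmed.append(name)
        # Step 2: ordinary size-based trimming from the bottom.
        while len(self.entries) > self.max_size:
            name, _ = self.entries.popitem(last=False)
            trimmed.append(name)
        return trimmed


cache = DentryLRU(max_size=10)
cache.touch("a", 1)
cache.touch("b", 2)
cache.touch("missing", None)  # null dentry caching a lookup miss
cache.touch_bottom("a")       # "a" was unlinked
trimmed = cache.trim()        # only "a" is dropped; "missing" stays cached
```

In this sketch the null dentry left by the unlink is dropped on the next trim although the cache is far below its size limit, while the null dentry that caches a lookup miss stays near the top and is only subject to ordinary size-based eviction.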
c419878 to 86f6522
The offending test (TestStrays.test_migration_on_shutdown) is passing with the latest update to this patch. I suspect there was an underlying bug here that was triggered by the extra trimming, so I'm going to create a branch that sets cache size to zero to see if it triggers it.
@gregsfortytwo quick re-review?
(the bit that changed was the extra |
Reviewed-by: |