common, libcephfs: Fixes for LibCephFS.ShutdownRace test failures #18139
Conversation
Mostly passing fs suite run here: http://pulpito.ceph.com/jlayton-2017-10-05_18:49:13-fs-wip-jlayton-21512-distro-basic-smithi/
Ahh, this is now failing the env_to_vec test (which I didn't see before). It looks like that relies on the behavior that we're explicitly preventing here with the patch to env_to_vec. Asking @dachary for his input here: how should we fix this? Once we've started handing out pointers to elements in str_vec, we can't just clean out and overwrite that array. Are there any programs that actually rely on being able to set environment variables and then call down into env_to_vec again?
The only setenv() callers are in the rdma code, and that's meant for external consumers, nothing in our process. So... just the test code needs fixing?
In that case we should probably just remove the env_to_vec test altogether, as it doesn't really work with what we have now. SQUASH patch pushed that does that. We may be able to resurrect it with a new routine to explicitly clear out str_vec. @dachary added that test in 2014, so I'd like to have his input and ack here before deciding what to do.
Can you explain what motivates that shutdown thrasher? I haven't traced its history, but it really seems like you're testing a bunch of stuff that nobody expects to work, and I'm not sure why. Is it invoked when remounting CephFS inside of a process? Or just motivated by some tests you wrote that violated the rules?
Initially, I hit some of these problems when writing the delegation tests. Those spawn threads and create clients, and sometimes I'd have two of them racing to shut down at the same time. I expect Manila+ganesha to do this (but even less predictably). Manila is expected to set up new shares/exports, each with a different cephx id and key. Each of those gets a different client, and they can be added and removed at any time (IIUC), quite possibly even in parallel. Do we expect that to typically be run with lockdep? Probably not, but selectively disabling it for the test turned out to be very difficult. It was simpler to just fix lockdep to cope with this sort of situation.
```diff
-  memset((void*) &free_ids[0], 255, sizeof(free_ids));
+  if (!free_ids_inited) {
+    free_ids_inited = true;
+    memset((void*) &free_ids[0], 255, sizeof(free_ids));
+  }
```
tab
Sorry, not sure what you mean here. This whole pile of code is using 2-space indent. I agree that it's hideous, but it's the code style in use...
Now that I've slept on it, I'm not sure I really like the env_to_vec patch. It'll break the case where we call env_to_vec with different env variable names. That said, AFAICT, nothing ever passes anything but nullptr to that function (aside from the selftests), so it universally uses $CEPH_ARGS. The right way to fix it would be to instantiate a new copy of str_vec every time this function is called, but I don't see a good way to ensure that those strings are eventually cleaned up. I guess we could rework env_to_vec to populate a vector with string objects instead of pointers, but that means quite a bit of code churn.
retest this please
Docs build check failed on some curl error. I don't think it's related to any of the changes in this PR.
Looks like the test failed doing this:
That timeout=20 is suspicious -- I assume that's in seconds? I ran this command on my box in an empty dir and it took well over 20s to run. |
retest this please
retest this please |
retest this please |
This latest set adds a clear_g_str_vec() function that will clean out the global str_vec. The testcase can then use that to explicitly erase the global str_vec. It's still hideous, but since we have callers in the field that already use ceph_conf_set_env(), I think this is the best we can do. The missing SoB is in a SQUASH patch that will be dropped soon anyhow. I think everything else is OK.
After it has been called once and we have outstanding CephContexts with pointers into str_vec, we can't call get_str_vec on it again. Add a static local mutex to protect access to str_vec. Tracker: http://tracker.ceph.com/issues/21512 Signed-off-by: Jeff Layton <jlayton@redhat.com>
Prefix str_vec and str_vec_lock with "g_" to make it clear that they are truly global values. Add a new clear_g_str_vec function to allow it to be explicitly cleaned out by callers that need that functionality (mostly testcase for now). Tracker: http://tracker.ceph.com/issues/21512 Signed-off-by: Jeff Layton <jlayton@redhat.com>
It's possible for the teardown of g_lockdep_ceph_ctx to occur, followed by a new context being registered as the lockdep context. When that occurs, we can end up reusing lock IDs that were previously handed out to consumers. Those IDs need to be persistent across lockdep enablement and disablement. Make both the free_ids table and the lock_refs map persistent across lockdep_unregister_ceph_context and lockdep_register_ceph_context cycles. Entries in those tables will only be deleted by the destruction of the associated mutex. When lockdep_unregister is called, do the refcounting like we normally would, but only clear out the state when the lock id is registered in the lock_names hash. Finally, we still need to handle the case where g_lockdep has gone false even when there are outstanding references after the decrement; only log the message if that's not the case. With this, we can deal with multiple clients enabling and disabling lockdep in an unsynchronized way. Tracker: http://tracker.ceph.com/issues/21512 Signed-off-by: Jeff Layton <jlayton@redhat.com>
Have each thread do the startup and shutdown in a loop for a specified number of times. Tracker: http://tracker.ceph.com/issues/21512 Signed-off-by: Jeff Layton <jlayton@redhat.com>
Force-pushed from 9e92898 to f877e36
Squashed down the later changes into earlier patches. I think this is now ready for merge, unless there are objections.
retest this please |
* refs/pull/18139/head:
  * test: make the LibCephFS.ShutdownRacer test even more thrashy
  * lockdep: free_ids and lock_ref hashes must be truly global
  * common: add a clear_g_str_vec() function to clear g_str_vec
  * common: make it safe to call env_to_vec multiple times

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
@batrick drew my attention to some ShutdownRace test failures here:
http://tracker.ceph.com/issues/21512
The reported problem should be fixed by the first patch in the series. While testing that, I beefed up the ShutdownRace test a bit, and found and fixed another issue with lockdep:
The main problem there is that lockdep hands out ids that get attached to a Mutex or RWLock. Since those survive lockdep being shut down, we end up with collisions and false positives for lock recursion.
When lockdep is shut down, we can clear out most of the state, but we do need to keep track of which ids are currently assigned out. For that, we need to keep the free ids table, and the refcounting for it persistent past the shutdown.
That means we need to take a little extra care in lockdep_unregister to clear out the ids in the right way when we're handed one after lockdep has been shut down. Once all locks that were allocated a valid ID are freed, the lock_refs map should empty out naturally.