Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osdc: Fix race condition with tick_event and shutdown #7151

Merged
merged 1 commit into from Jan 13, 2016
Merged

Conversation

adamemerson
Copy link
Contributor

  • Clear the tick_event whether it was in the timer queue or not.
  • Make sure we don't schedule a new tick_event if someone calls shutdown while
    tick() is running
  • Make tick_event atomic so we can check it without a lock/while only holding
    a read lock

Fixes #14256

if (tick_event) {
if (timer.cancel_event(tick_event)) {
if (tick_event.read()) {
if (timer.cancel_event(tick_event.read())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. We're holding Objecter::rwlock here, and the timer is potentially in the Objecter::tick callback, blocking on acquiring rwlock. If we cancel the event here, that's a no-op because timer has already removed the event from the queue. We'll go ahead and zero it, and eventually give up rwlock. Objecter::tick will acquire rwlock, and immediately assert out on the condition that tick_event is nonzero. Probably the assertion inside Objecter::tick needs to move the other side of the initialized check so that it never gets hit in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching that. I think it's better just to drop that assert in tick() entirely. It's fragile and doesn't really guard against anything except someone calling 'tick' directly.

  - Clear the tick_event whether it was in the timer queue or not.
  - Make sure we don't schedule a new tick_event if someone calls shutdown while
    tick() is running
  - Get rid of some assertions that aren't relevant

Fixes #14256
@@ -4795,7 +4794,6 @@ Objecter::~Objecter()
assert(check_latest_map_ops.empty());
assert(check_latest_map_commands.empty());

assert(!tick_event);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we clear tick_event in tick() and when we cancel both, can't we keep this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably yes. I got rid of this since getting rid of the asserts was the suggested way to solve the problem of not having a lock on the class when reading them without making them atomic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregsfortytwo gregsfortytwo added the cephfs Ceph File System label Jan 12, 2016
liewegas added a commit that referenced this pull request Jan 13, 2016
osdc: Fix race condition with tick_event and shutdown

Reviewed-by: John Spray <jspray@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
@liewegas liewegas merged commit ed82212 into master Jan 13, 2016
@liewegas liewegas deleted the wip-14256 branch January 13, 2016 02:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants