Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon: check is_shutdown() in timer callbacks #14919

Merged
merged 2 commits into from May 8, 2017
Merged

Conversation

tchaikov
Copy link
Contributor

@tchaikov tchaikov commented May 3, 2017

Fixes: http://tracker.ceph.com/issues/19825
Signed-off-by: Kefu Chai kchai@redhat.com

@tchaikov
Copy link
Contributor Author

tchaikov commented May 4, 2017

retest this please.

@tchaikov
Copy link
Contributor Author

tchaikov commented May 4, 2017

@liewegas mind taking a look at this one also?

@liewegas
Copy link
Member

liewegas commented May 4, 2017

We have lots of code like

void Elector::shutdown()
{
  if (expire_event)
    mon->timer.cancel_event(expire_event);
}

They won't break (cancel_event behaves if the event is already gone), but it makes me nervous having these dangling pointers to freed memory. I think this will be safer than maintaining all of the shutdown timer checks, though!

@@ -899,6 +899,10 @@ void Monitor::shutdown()

wait_for_paxos_write();

// stop timer earlier so the event callbacks won't panic seeing an inactive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/earlier/early/

liewegas
liewegas previously approved these changes May 4, 2017
@tchaikov
Copy link
Contributor Author

tchaikov commented May 4, 2017

@liewegas how about exposing the cancel_event() methods of all timer consumers in monitor and call them before calling timer.shutdown()?

@tchaikov
Copy link
Contributor Author

tchaikov commented May 4, 2017

changelog

  • s/earlier/early/

@liewegas
Copy link
Member

liewegas commented May 4, 2017 via email

@tchaikov
Copy link
Contributor Author

tchaikov commented May 4, 2017

even if Paxos cancels all its events, there is still chance that the lease_timeout_event callback can be called after the is_leader() returns false. please note we change the mon's state before we shutdown the paxos.

maybe we can change mon's state to STATE_SHUTDOWN after paxos is shutdown?

@tchaikov
Copy link
Contributor Author

tchaikov commented May 4, 2017

changelog

  • change mon's state to STATE_SHUTDOWN after paxos is shutdown.

@tchaikov tchaikov dismissed liewegas’s stale review May 4, 2017 14:55

the latest change is different now. needs review.

@tchaikov tchaikov changed the title mon/Monitor: stop timer earlier in shutdown() [DNM] mon/Monitor: stop timer earlier in shutdown() May 4, 2017
@tchaikov
Copy link
Contributor Author

tchaikov commented May 4, 2017

adding [DNM], i haven't tested this yet.

@liewegas
Copy link
Member

liewegas commented May 4, 2017 via email

@tchaikov
Copy link
Contributor Author

tchaikov commented May 4, 2017

@liewegas there are a handful call backs checking the monitor state, i think we could check if (mon->is_shutdown()) in all of them, but if this approach works. i'd prefer go with postponing the "state = STATE_SHUTDOWN".

(i just skimmed through the code, seems it's safe to do so)

@tchaikov tchaikov changed the title [DNM] mon/Monitor: stop timer earlier in shutdown() mon/Monitor: stop timer earlier in shutdown() May 4, 2017
@tchaikov tchaikov requested a review from liewegas May 4, 2017 16:07
@tchaikov tchaikov changed the title mon/Monitor: stop timer earlier in shutdown() mon/Monitor: switch state to STATE_SHUTDOWN after stopping all services May 4, 2017
@@ -899,8 +899,6 @@ void Monitor::shutdown()

wait_for_paxos_write();

state = STATE_SHUTDOWN;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do this either :( because we use is_shutdown() in places like ms_dispatch to drop new work on the floor. Without it, we will take in a start new work even after systems are shutting down..

@liewegas
Copy link
Member

liewegas commented May 4, 2017

Looking a bit more it seems like checking for is_shutdown() in the callacks is the safest thing?

instead of doing it manually.

Signed-off-by: Kefu Chai <kchai@redhat.com>
@tchaikov tchaikov changed the title mon/Monitor: switch state to STATE_SHUTDOWN after stopping all services mon: check is_shutdown() in timer callbacks May 5, 2017
introduce a helper class: C_MonContext, and initialize all timer events
using it, to ensure that they do check is_shutdown() before doing their
work.

Fixes: http://tracker.ceph.com/issues/19825
Signed-off-by: Kefu Chai <kchai@redhat.com>
@tchaikov
Copy link
Contributor Author

tchaikov commented May 5, 2017

changelog

  • check is_shutdown() in events hooked on monitor's timer

@liewegas fixed and repushed.

@tchaikov tchaikov requested a review from liewegas May 5, 2017 05:57
void finish(int r) override {
ps->proposal_timer = 0;
proposal_timer = new C_MonContext(mon, [this](int r) {
proposal_timer = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't run in is_shutdown(). I think it's okay in this case since timer tolerates a cancel_event on a non-event, though, and who cares during shutdown.

expire_event = new C_ElectionExpire(this);
expire_event = new C_MonContext(mon, [this](int) {
expire();
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omg so much better

@liewegas
Copy link
Member

liewegas commented May 8, 2017

@liewegas liewegas merged commit 916c5e3 into ceph:master May 8, 2017
@tchaikov tchaikov deleted the wip-19825 branch May 8, 2017 04:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants