Objecter: failed assert(tick_event==NULL) at osdc/Objecter.cc #4169

Closed

@wonzhq wonzhq commented Mar 25, 2015

The assert failure is caused by a locking issue. The Objecter timer
erases tick_event from its events queue and then calls tick() to dispatch
it. If Objecter::rwlock is held by shutdown() at that point, tick() blocks
waiting for the rwlock. Meanwhile, shutdown() checks tick_event and tries
to cancel it. The cancel_event function returns false because tick_event
has already been removed from the events queue. Thus tick_event is never
set to NULL, which leads to this assertion failure.

Fix issue #11183.

Signed-off-by: Zhiqiang Wang <zhiqiang.wang@intel.com>
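The interleaving described above can be replayed deterministically in a small single-threaded model. This is only a sketch under stated assumptions: `ToyTimer`, `ToyObjecter`, `dequeue_for_dispatch` and both helper functions are hypothetical names invented for illustration, an `int` stands in for the event pointer, and the early return in `tick()` follows the giant behaviour described in this thread rather than the actual Ceph source.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the timer: it only tracks which events are queued.
struct ToyTimer {
  std::set<int> events;
  void add_event(int id) { events.insert(id); }
  // Returns false when the event is no longer queued, e.g. because the
  // timer thread already dequeued it for dispatch.
  bool cancel_event(int id) { return events.erase(id) > 0; }
  // Step the timer thread takes just before invoking the callback.
  void dequeue_for_dispatch(int id) { events.erase(id); }
};

// Hypothetical, heavily reduced model of the Objecter state involved.
struct ToyObjecter {
  ToyTimer timer;
  bool initialized = true;
  int tick_event = 0;  // 0 plays the role of NULL

  void schedule_tick() {
    tick_event = 1;
    timer.add_event(tick_event);
  }

  // Assumed giant-style tick(): returns early once shutdown has started.
  void tick() {
    if (!initialized) return;  // tick_event is left untouched here
    tick_event = 0;
  }

  void shutdown() {
    initialized = false;
    if (tick_event && timer.cancel_event(tick_event))
      tick_event = 0;
    // The real code asserts tick_event == NULL at this point.
  }
};

// Replays the race: dequeue, then shutdown(), then the delayed tick().
inline int tick_event_after_race() {
  ToyObjecter o;
  o.schedule_tick();
  o.timer.dequeue_for_dispatch(o.tick_event);  // timer thread, before tick()
  o.shutdown();                                // holds the lock; cancel fails
  o.tick();                                    // finally runs, returns early
  return o.tick_event;
}

// Without the race, shutdown() cancels the queued event successfully.
inline int tick_event_without_race() {
  ToyObjecter o;
  o.schedule_tick();
  o.shutdown();
  return o.tick_event;
}
```

In this model `tick_event_after_race()` returns a non-zero value, which corresponds to the state in which the assertion would fire, while `tick_event_without_race()` returns 0.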
@loic-bot

SUCCESS: the output of run-make-check.sh on centos-7 for a9ecdd6 is http://paste2.org/PfVcHjZZ

:octocat: Sent from GH.

@tchaikov
Contributor

@wonzhq could you please tag this commit with Fixes: #11183 instead of Fix issue #11183, unless you are sure that the latter is understood by the tracker?

@wonzhq
Contributor Author

wonzhq commented Mar 25, 2015

After digging more into this issue, it looks like this is a giant-specific issue that has already been fixed in the current master by removing the following code in Objecter::tick().

  if (!initialized.read())
    return;
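A minimal sketch of why dropping that check matters, assuming that tick() in giant bails out before clearing tick_event and that tick() in master clears it unconditionally. `TickState` and the three functions below are hypothetical illustrations, not the actual Ceph code, and the real tick() does considerably more work.

```cpp
#include <cassert>

// Hypothetical reduction of the two fields that matter for the assertion.
struct TickState {
  bool initialized = true;
  bool tick_event_set = false;
};

// Assumed giant behaviour: bail out before tick_event is cleared.
inline void tick_with_early_return(TickState& s) {
  if (!s.initialized) return;
  s.tick_event_set = false;
}

// Assumed master behaviour after the check was removed.
inline void tick_without_early_return(TickState& s) {
  s.tick_event_set = false;
}

// State left behind when shutdown() has started and cancel_event() has
// already failed: initialized is cleared but tick_event is still set.
inline TickState state_after_failed_cancel() {
  TickState s;
  s.tick_event_set = true;
  s.initialized = false;
  return s;
}
```

Starting from `state_after_failed_cancel()`, the variant with the early return leaves `tick_event_set` true, while the variant without it clears the flag, so a later check equivalent to assert(tick_event == NULL) would pass.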

@tchaikov
Contributor

> After digging more into this issue, it looks like this is a giant-specific issue

Agreed. timer.shutdown() ensures that the callback gets called, and tick_event always resets itself even when initialized is set to 0. Hence assert(tick_event == NULL) holds.

@tchaikov
Contributor

I am labeling this PR with the 'giant' milestone. @wonzhq, could you create another PR targeting giant instead?

@tchaikov tchaikov added this to the giant milestone Mar 25, 2015
@wonzhq
Contributor Author

wonzhq commented Mar 25, 2015

You mean create another PR and push it to the giant branch?

@tchaikov
Contributor

@wonzhq yes, I am not sure whether there is a better way to re-target a PR to another branch though...

BTW, I bisected the commits. It turns out the fix was introduced in 8253ead and the buggy part was introduced in d790833. We might want to backport (some of) it to giant?

@wonzhq
Contributor Author

wonzhq commented Mar 25, 2015

@tchaikov, I created #4175. Please take a look. I think simply removing those 2 lines of code should fix this problem in giant.

@tchaikov tchaikov closed this Mar 25, 2015