
clear data and payload after removed from ops_in_flight #3217

Merged
merged 1 commit into ceph:firefly on Jan 9, 2015

Conversation

boydc2014

@ghost ghost added bug-fix core labels Dec 19, 2014
@liewegas liewegas added this to the firefly milestone Dec 19, 2014
@liewegas
Member

Reviewed-by:

@boydc2014
Author

Updated <> to an email address.

@ghost

ghost commented Dec 22, 2014

bot timed out, increasing timeout

@boydc2014
Author

@dachary how can we see the result after the timeout is increased?

@ghost

ghost commented Dec 26, 2014

@boydc2014 ok, I'll run the bot again

@loic-bot

FAIL: the output of run-make-check.sh on 3e7abac is http://paste.pound-python.org/show/eolLxZVMWwbAKrtpyBEw/

:octocat: Sent from GH.

@ghost

ghost commented Dec 26, 2014

You should try to run the failed tests locally to get more information. I suspect that running a vstart.sh cluster manually will also show problems.

PASS: unittest_on_exit
FAIL: test/erasure-code/test-erasure-code.sh
PASS: unittest_bufferlist.sh
PASS: test/encoding/check-generated.sh
PASS: test/mon/osd-pool-create.sh
PASS: test/mon/misc.sh
PASS: test/mon/osd-crush.sh
PASS: test/mon/osd-erasure-code-profile.sh
PASS: test/mon/mkfs.sh
PASS: test/ceph-disk.sh
PASS: test/mon/mon-handle-forward.sh
FAIL: test/vstart_wrapped_tests.sh 

@liewegas
Member

This fails make check; test/erasure-code/test-erasure-code.sh is enough to reproduce in my case.

@loic-bot

SUCCESS: the output of run-make-check.sh on 725391e is http://paste.pound-python.org/show/ZrNGT4lYlkg0Qz1vyIXA/

:octocat: Sent from GH.

@boydc2014
Author

@dachary @liewegas Putting it under the lock, instead of changing the order, passed the tests.

@boydc2014
Author

@liewegas I should explain why QA failed on my first commit:

First, I wanted to clear data and payload after removing the op from ops_in_flight, using the following code:
    {
      Mutex::Locker locker(ops_in_flight_lock);
      assert(i->xitem.get_list() == &ops_in_flight);
      utime_t now = ceph_clock_now(cct);
      i->xitem.remove_myself();
      history.insert(now, TrackedOpRef(i)); // here is the problem
    }
    i->request->clear_data();
    i->request->clear_payload();

The reason QA fails is that after history.insert the destructor of this op may already have run, so when I then try to clear the data and payload it crashes with a segmentation fault.

I was still investigating why the destructor is called even though the op is still tracked by history.

===== Update =====
After further investigation: the destructor runs after inserting the op into history because the op gets evicted from the op history when the history_size limit is hit, which drops the last reference.

I may open another pull request to refine the ordering of the op history.
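
For reference, a minimal sketch of the ordering the merged fix implies, inferred from the PR title and the discussion above. The member names are taken from the snippet above; the enclosing function name (OpTracker::unregister_inflight_op) is an assumption:

    // Assumed context: the op tracker's unregister path.
    // Clear data and payload while the op is still alive and under the
    // lock, BEFORE history.insert() can hand the last reference to the
    // history (which may immediately evict and destroy the op).
    void OpTracker::unregister_inflight_op(TrackedOp *i)
    {
      Mutex::Locker locker(ops_in_flight_lock);
      assert(i->xitem.get_list() == &ops_in_flight);
      utime_t now = ceph_clock_now(cct);
      i->xitem.remove_myself();
      i->request->clear_data();    // op not handed off yet, so this is safe
      i->request->clear_payload();
      history.insert(now, TrackedOpRef(i)); // may destroy i if history is full
    }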

@boydc2014
Author

ping @dachary @liewegas

@liewegas
Member

liewegas commented Jan 8, 2015

This looks correct to me! Alternatively, you could take a ref and do the free afterwards, but this is simpler, and clear should be fast.

Reviewed-by:
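
A rough sketch of the alternative mentioned above, not part of the merged change; it assumes history.insert() accepts a TrackedOpRef by value and that the local copy keeps the op alive:

    // Hypothetical alternative: hold a local ref across the insert so the
    // op cannot be destroyed until after the clears, then free afterwards.
    TrackedOpRef ref(i);            // single ref; history copies this one
    {
      Mutex::Locker locker(ops_in_flight_lock);
      i->xitem.remove_myself();
      history.insert(ceph_clock_now(cct), ref);
    }
    ref->request->clear_data();     // op still alive: the local ref holds it
    ref->request->clear_payload();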

@liewegas
Member

liewegas commented Jan 8, 2015

Can you redo this pull request, targeting the master branch? And put

Backport: giant, firefly

in the git commit message so that we can backport it. Thanks!

@liewegas liewegas closed this Jan 8, 2015
@boydc2014
Author

@liewegas Thanks Sage, this has been fixed in master in 11082f7.
I have just redone it for firefly.

@liewegas liewegas reopened this Jan 9, 2015
liewegas added a commit that referenced this pull request Jan 9, 2015
clear data and payload after removed from ops_in_flight

Reviewed-by: Sage Weil <sage@redhat.com>
@liewegas liewegas merged commit c26ebd3 into ceph:firefly Jan 9, 2015