New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

librados: redirect balanced reads to acting primary when targeting object isn't recovered #15489

Merged
merged 1 commit into from Jun 30, 2017

Conversation

Projects
None yet
5 participants
@xxhdx1985126

xxhdx1985126 commented Jun 5, 2017

Redirect balanced read requests to acting primary when the
targeting object hasn't been recovered on replica OSDs.

Fixes: http://tracker.ceph.com/issues/17968
Signed-off-by: Xuehan Xu xxhdx1985126@163.com

@liewegas liewegas added common feature bug fix and removed feature labels Jun 5, 2017

@@ -3412,6 +3412,21 @@ void Objecter::handle_osd_op_reply(MOSDOpReply *m)
if (rc == -EAGAIN) {
ldout(cct, 7) << " got -EAGAIN, resubmitting" << dendl;
if (op->target.flags & CEPH_OSD_FLAG_BALANCE_READS && op->target.acting_primary != op->target.osd) {

This comment has been minimized.

@liewegas

liewegas Jun 5, 2017

Member

add parens around & expression pls

@liewegas

parens

@xxhdx1985126

This comment has been minimized.

xxhdx1985126 commented Jun 5, 2017

parens added:-)

@xxhdx1985126

This comment has been minimized.

xxhdx1985126 commented Jun 13, 2017

Uh, hi, is there anything else that I should do?

@liewegas liewegas added the needs-qa label Jun 13, 2017

@liewegas

This comment has been minimized.

Member

liewegas commented Jun 13, 2017

I won't really trust any of this code until we have stress tests like ceph_test_rados that exercise the BALANCE_READS flag. Until then this probably moves us closer to stable.

@xxhdx1985126

This comment has been minimized.

xxhdx1985126 commented Jun 15, 2017

I got it, where should I upload the stress test result?

@liewegas

This comment has been minimized.

Member

liewegas commented Jun 15, 2017

If you want to expand the testing, the tool to look at is ceph_test_rados in test/osd/RadosModel. The trick is that when you do parallel reads they can be reordered wrt other reads randomly, so in that mode you'll need to relax the ordering checks (if any). And add an option to set the flag for reads. And update qa/tasks/rados.py to pass it through.

IIRC there are some bigger issues that need to be fixed up in the OSD for replicas before it will work, though, although I forget the details. But we may as well get the testing in place and see what breaks!

In the meantime this pr is queued for qa!

@@ -3412,6 +3412,21 @@ void Objecter::handle_osd_op_reply(MOSDOpReply *m)
if (rc == -EAGAIN) {
ldout(cct, 7) << " got -EAGAIN, resubmitting" << dendl;
if ((op->target.flags & CEPH_OSD_FLAG_BALANCE_READS) && (op->target.acting_primary != op->target.osd)) {

This comment has been minimized.

@tchaikov

tchaikov Jun 15, 2017

Contributor

better off wrapping at 79 chars.

@xxhdx1985126

This comment has been minimized.

xxhdx1985126 commented Jun 16, 2017

Wrapped at 57 chars:-)

@liewegas

This comment has been minimized.

Member

liewegas commented Jun 16, 2017

/home/jenkins-build/build/workspace/ceph-pull-requests/src/osdc/Objecter.cc: In member function ‘void Objecter::handle_osd_op_reply(MOSDOpReply*)’:
/home/jenkins-build/build/workspace/ceph-pull-requests/src/osdc/Objecter.cc:3418:16: error: ‘struct std::atomic’ has no member named ‘dec’
  num_in_flight.dec();
                ^

we just switched to std::atomic<>, so this needs to be --num_in_flight now

Xuehan Xu
redirect balanced reads to acting primary when necessary
Redirect balanced read requests to acting primary when the
targeting object hasn't been recovered on replica OSDs.

Fixes: http://tracker.ceph.com/issues/17968
Signed-off-by: Xuehan Xu <xxhdx1985126@163.com>

@xxhdx1985126 xxhdx1985126 reopened this Jun 18, 2017

@xxhdx1985126

This comment has been minimized.

xxhdx1985126 commented Jun 18, 2017

I'm sorry that I forgot to pull the latest code before I modified the code and tried to build it, I've changed it to 'num_in_fligh--' now.

@xxhdx1985126

This comment has been minimized.

xxhdx1985126 commented Jun 20, 2017

Is there anything else that I should do? :-)

@yuyuyu101

This comment has been minimized.

Member

yuyuyu101 commented Jun 20, 2017

if we want to push this, I hope we can actually make balanced/localized read op work. we need to add qa test to cover this at least. otherwise, we only add a lot of useless codes and make other changes complexity.

@liewegas

This comment has been minimized.

Member

liewegas commented Jun 20, 2017

FWIW this is what @athanatos had when he was working on this a year or so back: https://github.com/athanatos/ceph/commits/wip-replica-read

@tchaikov

This comment has been minimized.

@xxhdx1985126

This comment has been minimized.

xxhdx1985126 commented Jun 29, 2017

Uh, hi, sorry for not responding for so long. I wonder if this test contains only this PR? I'm asking this because I saw that some test failed because of assert(weak_refs.empty()), it seems to be some kind of memory leak, and it seems not very probable for this PR to cause this kind of problem. I just want to confirm whether the failure is definitely caused by this PR. Thank you:-)

@tchaikov

This comment has been minimized.

Contributor

tchaikov commented Jun 29, 2017

@xxhdx1985126 could you point me to the failed test due to assert(weak_refs.empty())?

no, my test batch contains other PRs in addition to this one. and i don't think the failed tests are relevant to it.

i am holding back on merging this PR to wait for your response to @yuyuyu101 's query.

@xxhdx1985126

This comment has been minimized.

xxhdx1985126 commented Jun 29, 2017

Hi, @tchaikov , here is the log.

2017-06-25T17:36:09.457 INFO:tasks.ceph.osd.0.smithi137.stderr:  -861> 2017-06-25 17:36:03.521554 354c0700 -1 osd.0 9 *** Got signal Terminated ***
2017-06-25T17:36:09.458 INFO:tasks.ceph.osd.0.smithi137.stderr:  -806> 2017-06-25 17:36:04.459556 354c0700 -1 osd.0 9 shutdown
2017-06-25T17:36:09.469 INFO:tasks.ceph.osd.0.smithi137.stderr:  -476> 2017-06-25 17:36:08.738963 354c0700 -1 osd.0 9 pgid 0.7 has ref count of 2
2017-06-25T17:36:09.469 INFO:tasks.ceph.osd.0.smithi137.stderr:  -473> 2017-06-25 17:36:08.741605 354c0700 -1 osd.0 9 pgid 0.1 has ref count of 2
2017-06-25T17:36:09.469 INFO:tasks.ceph.osd.0.smithi137.stderr:  -470> 2017-06-25 17:36:08.742085 354c0700 -1 osd.0 9 pgid 0.6 has ref count of 2
2017-06-25T17:36:09.469 INFO:tasks.ceph.osd.0.smithi137.stderr:  -465> 2017-06-25 17:36:08.780034 354c0700 -1 osd.0 9 pgid 0.4 has ref count of 2
2017-06-25T17:36:09.469 INFO:tasks.ceph.osd.0.smithi137.stderr:  -462> 2017-06-25 17:36:08.780422 354c0700 -1 osd.0 9 pgid 0.3 has ref count of 2
2017-06-25T17:36:09.469 INFO:tasks.ceph.osd.0.smithi137.stderr:  -459> 2017-06-25 17:36:08.780803 354c0700 -1 osd.0 9 pgid 0.2 has ref count of 2
2017-06-25T17:36:09.470 INFO:tasks.ceph.osd.0.smithi137.stderr:  -456> 2017-06-25 17:36:08.781195 354c0700 -1 osd.0 9 pgid 0.0 has ref count of 2
2017-06-25T17:36:09.483 INFO:tasks.ceph.osd.0.smithi137.stderr:    -1> 2017-06-25 17:36:09.140895 96ac180 -1 leaked refs:
2017-06-25T17:36:09.483 INFO:tasks.ceph.osd.0.smithi137.stderr:dump_weak_refs 0xf88bb50 weak_refs: 9 = 0x36af76c0 with 14 refs
2017-06-25T17:36:09.483 INFO:tasks.ceph.osd.0.smithi137.stderr:
2017-06-25T17:36:09.484 INFO:tasks.ceph.osd.0.smithi137.stderr:     0> 2017-06-25 17:36:09.193020 96ac180 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.0.3-2085-g2e6b413/rpm/el7/BUILD/ceph-12.0.3-2085-g2e6b413/src/common/shared_cache.hpp: In function 'SharedLRU<K, V, C, H>::~SharedLRU() [with K = unsigned int; V = const OSDMap; C = std::less<unsigned int>; H = std::hash<unsigned int>]' thread 96ac180 time 2017-06-25 17:36:09.143126
2017-06-25T17:36:09.484 INFO:tasks.ceph.osd.0.smithi137.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.0.3-2085-g2e6b413/rpm/el7/BUILD/ceph-12.0.3-2085-g2e6b413/src/common/shared_cache.hpp: 108: FAILED assert(weak_refs.empty())

@yuyuyu101 Hi, do you mean that I should add qa tests for this PR?

@yuyuyu101

This comment has been minimized.

Member

yuyuyu101 commented Jun 29, 2017

yes....

@gregsfortytwo

This comment has been minimized.

Member

gregsfortytwo commented Jun 29, 2017

@tchaikov Is it the same as http://tracker.ceph.com/issues/20000 ? I am having some trouble seeing how this (client-side change) could impact the ref-counting on the OSDs. :)

If it improves things we should probably merge it as @liewegas suggested, even though we expect it not to be stable until it's tested in QA and has all the fixes for locking order and things we've discussed.

@tchaikov

This comment has been minimized.

Contributor

tchaikov commented Jun 30, 2017

@gregsfortytwo yeah. it does look the same. neither do i think it's related. just wanted to make sure i didn't miss anything obvious.

okay, i am merging it.

@tchaikov tchaikov merged commit 9b5f723 into ceph:master Jun 30, 2017

4 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
arm64 make check arm64 make check succeeded
Details
make check make check succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment