Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osdc/Objecter: fix bugs in explicit naming of op spg_t #13534

Merged
merged 5 commits into from Feb 24, 2017

Conversation

liewegas
Copy link
Member

@liewegas liewegas commented Feb 20, 2017

Fixed this (and tested) as part of wip-faster-dispatch.

@gregsfortytwo
Copy link
Member

Rather than recalculate every op on every map (which actually can take up a noticeable amount of time when you multiple every op by 10 microseconds), why not check if the pg count changed and recalculate if so?

@@ -1208,18 +1208,29 @@ void Objecter::handle_osd_map(MOSDMap *m)

cluster_full = cluster_full || _osdmap_full_flag();
update_pool_full_map(pool_full_map);

// check all outstanding requests on every epoch, including need_resend
for (auto p = need_resend.begin(); p != need_resend.end(); ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hang on, need_resend is empty here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a loop.. the bug comes up when we get multiple maps at once and fail to _calc_target on subsequent maps

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh duh. Yeah, this looks good.

@liewegas
Copy link
Member Author

liewegas commented Feb 20, 2017 via email

@jdurgin
Copy link
Member

jdurgin commented Feb 20, 2017

Could move the need_resend check outside the loop, inside a check for whether we processed multiple maps - it only needs to be done once right?

ldout(cct, 10) << __func__ << " checking op " << p->first << dendl;
int r = _calc_target(&op->target, nullptr);
if (r == RECALC_OP_TARGET_POOL_DNE) {
p = need_resend.erase(p);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we still need the _check_op_pool_dne() call to cleanup the callback? it won't gen scanned later since _scan_requests() reset op->session when it was added to the resend list

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, fixing with

          ldout(cct, 10) << __func__ << "  checking op " << p->first << dendl;
          int r = _calc_target(&op->target, nullptr);
          if (r == RECALC_OP_TARGET_POOL_DNE) {
-           p = need_resend.erase(p);
+           ++p;
+           OSDSession::unique_lock sl(op->session->lock);
+           _check_op_pool_dne(op, sl);
          } else {
            ++p;
          }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

op->session will have been set to NULL by _session_op_remove() when the op was added to need_resend...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ick

@liewegas
Copy link
Member Author

Hmm yeah, that's doable.. fixing

@liewegas
Copy link
Member Author

@jdurgin updated

ldout(cct, 10) << __func__ << " checking op " << p->first << dendl;
int r = _calc_target(&op->target, nullptr);
if (r == RECALC_OP_TARGET_POOL_DNE) {
++p;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still need to remove it from need_resend

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the rest looks good

Signed-off-by: Sage Weil <sage@redhat.com>
Check whether the request hobj maps to the current pg_t.  If we have the
osd_debug_misdirected_ops setting enabled (as teuthology does), assert out
as well so that the error is easy to spot.  This catches bugs in the
Objecter (especially the new code that explicitly names the spg_t for the
request).

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
We need to make sure we update the mapping and get an accurate actual_pgid
value by recalcuating the mapping on every map change.  Otherwise, we may
not notice a split (and subsequent actual_pgid change) and resend the same
op with a stale spg_t.  To fix this,

- _calc_target on need_resend
- update target regardless of current con

Signed-off-by: Sage Weil <sage@redhat.com>
@gregsfortytwo
Copy link
Member

lgtm

@liewegas liewegas merged commit 674ae80 into ceph:master Feb 24, 2017
@liewegas liewegas deleted the wip-objecter-fixes branch February 24, 2017 18:55
@@ -1713,6 +1713,17 @@ void PrimaryLogPG::do_op(OpRequestRef& op)
hobject_t head = m->get_hobj();
head.snap = CEPH_NOSNAP;

if (!info.pgid.pgid.contains(
info.pgid.pgid.get_split_bits(pool.info.get_pg_num()), head)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @liewegas , I am using kernel rbd (kernel 3.10 in RHEL 7.4) as a client for a 12.2.12 ceph cluster. But I found an error message as below:

2019-07-02 03:58:43.724305 osd.3 osd.3 192.168.15.3:6800/37773 1 : cluster [WRN] 6.0 does not contain 6:06a674d9:::rbd_data.11d36b8b4567.0000000000000023:head op osd_op(client.4615.1:286132 6.9b2e6560 6:06a674d9:::rbd_data.11d36b8b4567.0000000000000023:head [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4096] snapc 0=[] ondisk+write+known_if_redirected e95) v4

meanwhile, a request in client is hanging and never to be complete.:

REQUESTS 1 homeless 0
286132    osd3    6.9b2e6560    [3,5,1]/3    [3,5,1]/3    rbd_data.11d36b8b4567.0000000000000023    0x400024    1    0'0    set-alloc-hint,write

At that time, we are doing pg splitting, and yes, the client op epoch is e95 but server epoch is already e99.

But we reach this warning because the kernel client in RHEL 7.4 is not supporting CEPH_FEATURE_OSD_POOLRESEND. So it was not dropped in can_discard_op():

  } else if (m->get_connection()->has_feature(CEPH_FEATURE_OSD_POOLRESEND)) {
    // < luminous client
    if (m->get_map_epoch() < pool.info.get_last_force_op_resend_preluminous()) {
      dout(7) << __func__ << " sent before last_force_op_resend_preluminous "
	      << pool.info.last_force_op_resend_preluminous
	      << ", dropping" << *m << dendl;
      return true;
    }
  }

My question is, what should we do in this case, when client is not supporting CEPH_FEATURE_OSD_POOLRESEND?
assert on that to force the client resend?

assert(m->get_connection()->has_feature(CEPH_FEATURE_OSD_POOLRESEND))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants