New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: meta sync thread crash at RGWMetaSyncShardCR #15660

Merged
merged 1 commit into from Jun 19, 2017

Conversation

Projects
None yet
3 participants
@fangyuxiangGL
Contributor

fangyuxiangGL commented Jun 13, 2017

@fangyuxiangGL

This comment has been minimized.

Contributor

fangyuxiangGL commented Jun 13, 2017

reference count err

somewhere in the codes, once RGWContinuousLeaseCR done(failed possible), RGWCoroutinesStack of it will be put twice (in RGWCoroutinesManager::run and RGWCoroutinesStack::collect_next) and then self deleted in RefCountedObject::put().

In RGWMetaSyncShardCR::collect_children(), we can see:

` while (collect_next(&child_ret, &child)) {
auto iter = stack_to_pos.find(child);
if (iter == stack_to_pos.end()) {
/* some other stack that we don't care about */
continue;
}

  string& pos = iter->second;

  if (child_ret < 0) {
    ldout(sync_env->cct, 0) << *this << ": child operation stack=" << child << " entry=" << pos << " returned " << child_ret << dendl;
  }

  map<string, string>::iterator prev_iter = pos_to_prev.find(pos);
  assert(prev_iter != pos_to_prev.end());

  /*
   * we should get -EAGAIN for transient errors, for which we want to retry, so we don't
   * update the marker and abort. We'll get called again for these. Permanent errors will be
   * handled by marking the entry at the error log shard, so that we retry on it separately
   */
  if (child_ret == -EAGAIN) {
    can_adjust_marker = false;
  }

  if (pos_to_prev.size() == 1) {
    if (can_adjust_marker) {
      sync_marker.marker = pos;
    }
    pos_to_prev.erase(prev_iter);
  } else {
    assert(pos_to_prev.size() > 1);
    pos_to_prev.erase(prev_iter);
    prev_iter = pos_to_prev.begin();
    if (can_adjust_marker) {
      sync_marker.marker = prev_iter->second;
    }
  }

  ldout(sync_env->cct, 0) << *this << ": adjusting marker pos=" << sync_marker.marker << dendl;
  stack_to_pos.erase(iter);
}`

collect_next will return a RGWCoroutinesStack of the lease_cr that is already released~

@liewegas liewegas added the rgw label Jun 13, 2017

@cbodley cbodley added the bug fix label Jun 14, 2017

@cbodley cbodley self-requested a review Jun 15, 2017

@fangyuxiangGL

This comment has been minimized.

Contributor

fangyuxiangGL commented Jun 16, 2017

jenkins test this please

@cbodley

This comment has been minimized.

Contributor

cbodley commented Jun 16, 2017

make check failures should be fixed with #15730, can you rebase please?

@cbodley

looks good except for minor comments. passes test_multi.py under valgrind 👍

@@ -939,7 +939,7 @@ class RGWRunBucketSyncCoroutine : public RGWCoroutine {
logger.init(sync_env, "Bucket", bs.get_key());
}
~RGWRunBucketSyncCoroutine() override {
if (lease_cr) {
if (lease_cr.get()) {

This comment has been minimized.

@cbodley

cbodley Jun 16, 2017

Contributor

intrusive_ptr has conversion to bool, this can remain if (lease_cr) {. same with other if statements below

lock_name, lock_duration, this);
lease_cr->get();
lease_stack = spawn(lease_cr, false);
lock_name, lock_duration, this));

This comment has been minimized.

@cbodley

cbodley Jun 16, 2017

Contributor

fix indentation under RGWContinuousLeaseCR( please

rgw_raw_obj(store->get_zone_params().log_pool, status_oid),
"sync_lock",
cct->_conf->rgw_sync_lease_period,
this);
lease_stack = spawn(lease_cr.get(), false);
this));

This comment has been minimized.

@cbodley

cbodley Jun 16, 2017

Contributor

indentation

lock_name, lock_duration, this);
lease_cr->get();
lease_stack = spawn(lease_cr, false);
lock_name, lock_duration, this));

This comment has been minimized.

@cbodley

cbodley Jun 16, 2017

Contributor

indentation

lock_name, lock_duration, this);
lease_cr->get();
lease_stack = spawn(lease_cr, false);
lock_name, lock_duration, this));

This comment has been minimized.

@cbodley

cbodley Jun 16, 2017

Contributor

indentation

lock_name, lock_duration, this);
lease_cr->get();
lease_stack = spawn(lease_cr, false);
lock_name, lock_duration, this));

This comment has been minimized.

@cbodley

cbodley Jun 16, 2017

Contributor

indentation

lock_name, lock_duration, this);
lease_cr->get();
lease_stack = spawn(lease_cr, false);
lock_name, lock_duration, this));

This comment has been minimized.

@cbodley

cbodley Jun 16, 2017

Contributor

indentation

rgw: meta sync thread crash at RGWMetaSyncShardCR
Fixes: http://tracker.ceph.com/issues/20251

Signed-off-by: fang yuxiang fang.yuxiang@eisoo.com
@fangyuxiangGL

This comment has been minimized.

Contributor

fangyuxiangGL commented Jun 19, 2017

Hi @cbodley

already rebased and modified according to your comments.

@cbodley cbodley merged commit 8557916 into ceph:master Jun 19, 2017

4 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
arm64 make check arm64 make check succeeded
Details
make check make check succeeded
Details
@cbodley

This comment has been minimized.

Contributor

cbodley commented Jun 19, 2017

thanks!

@fangyuxiangGL fangyuxiangGL deleted the fangyuxiangGL:meta_sync_crash branch Jan 10, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment