Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd: make snapmapper warn+clean up instead of assert #20040

Merged
merged 5 commits into from Mar 29, 2018

Conversation

liewegas
Copy link
Member

@liewegas liewegas commented Jan 21, 2018

src/osd/PG.cc Outdated
@@ -4182,9 +4182,10 @@ void PG::_scan_snaps(ScrubMap &smap)
<< "...repaired";
}
snap_mapper.add_oid(hoid, obj_snaps, &_t);
r = osd->store->apply_transaction(osr.get(), std::move(t));
// we don't particularly care when this is visible or committed.
r = osd->store->queue_transaction(osr.get(), std::move(t), nullptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to worry about skipping a snaptrim, because it gets processed before the repair is complete?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think that race can happen (although it'd be pretty extreme).

The alternative fix here is to change apply_transaction() to use onreadable_sync so that we don't deadlock against items ahead of os waiting for pg->lock, but that affects apply_transaction behavior everywhere.

Probably the best thing here is to make a apply-transaction variant (or hard code it) to wait on onreadable_sync instead of onreadable. I'll do the latter I think, since apply_transaction goes away in wip-kill-onreadable anyway so this is really a luminous-only fix!

@liewegas liewegas force-pushed the wip-snapmapper branch 3 times, most recently from 1824b87 to e17493b Compare January 31, 2018 23:12
@gregsfortytwo
Copy link
Member

This looks good to me, but it's still marked WIP so I'm not marking it as approved. Feel free to do so when ready!

@liewegas liewegas changed the title WIP: osd: make snapmapper warn+clean up instead of assert osd: make snapmapper warn+clean up instead of assert Mar 13, 2018
Old option is unused

Signed-off-by: Sage Weil <sage@redhat.com>
Better to clean up than to crash the OSD.

Signed-off-by: Sage Weil <sage@redhat.com>
This shouldn't ever happen.

Signed-off-by: Sage Weil <sage@redhat.com>
Wait for our repair to apply before continuing to avoid reading the
broken state again.

Signed-off-by: Sage Weil <sage@redhat.com>
@@ -243,6 +243,7 @@ void SnapMapper::add_oid(
MapCacher::Transaction<std::string, bufferlist> *t)
{
dout(20) << __func__ << " " << oid << " " << snaps << dendl;
assert(!snaps.empty());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is the case, we might want to assert(0 == "decode snaps failure") at https://github.com/ceph/ceph/blob/master/src/osd/PG.cc#L3747 in PG::update_snap_map().

Signed-off-by: Sage Weil <sage@redhat.com>
@tchaikov tchaikov merged commit 7dbc67f into ceph:master Mar 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants