New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd: populate last_epoch_split during build_initial_pg_history #16519

Merged
merged 1 commit into from Jul 26, 2017

Conversation

Projects
None yet
7 participants
@liewegas
Member

liewegas commented Jul 24, 2017

If the request was sent before a split then this isn't actually a bug.

Fixes: http://tracker.ceph.com/issues/20754
Signed-off-by: Sage Weil sage@redhat.com

@liewegas liewegas added this to the luminous milestone Jul 24, 2017

@liewegas liewegas requested a review from jdurgin Jul 24, 2017

@@ -1848,7 +1848,8 @@ void PrimaryLogPG::do_op(OpRequestRef& op)
<< std::hex << head.get_hash() << std::dec << dendl;
osd->clog->warn() << info.pgid.pgid << " does not contain " << head
<< " op " << *m;
assert(!cct->_conf->osd_debug_misdirected_ops);
assert(!cct->_conf->osd_debug_misdirected_ops ||
m->get_epoch() < info.history.last_epoch_split);

This comment has been minimized.

@xiexingguo

xiexingguo Jul 24, 2017

Member

does not compile, s/m->get_epoch()/m->get_map_epoch()/

@gregsfortytwo

Shouldn't the op have been "split" as well as the PG?

Or if the send epoch was across a boundary when the message was received, we should have detected that and dropped it because the client will resend, right?

osd: populate last_epoch_split during build_initial_pg_history
If we get a pg create and have to generate the pg_history_t fields,
populate the last_epoch_split field too.  This is needed discard ops sent
from before the last split.  Specifically, PG::can_discard_op() looks at
it when the client has the RESEND_ON_SPLIT feature bit.  If we don't
discard, we may run afoul of assertions later (e.g., that the object
belongs to the PG at all) or potentially process an op out of order.

Fixes: http://tracker.ceph.com/issues/20754
Signed-off-by: Sage Weil <sage@redhat.com>
@liewegas

This comment has been minimized.

Member

liewegas commented Jul 24, 2017

You're right.. I thought this was before that check but it's not. core reveals that the last_epoch_split was 0 (should be 80). Pushed a new fix.

@gregsfortytwo

This comment has been minimized.

Member

gregsfortytwo commented Jul 24, 2017

jenkins test this please

@liewegas liewegas added needs-qa and removed wip-sage-testing labels Jul 24, 2017

@liu-chunmei

This comment has been minimized.

Contributor

liu-chunmei commented Jul 24, 2017

Does this PR will avoid the assert? if the send epoch across the boundary when pg split, it should be detected and resend, right?

@liewegas

This comment has been minimized.

Member

liewegas commented Jul 24, 2017

The client will already resend; this PR fixes it so that we properly ignore the op sent before the split.

@idryomov

This comment has been minimized.

Contributor

idryomov commented Jul 24, 2017

@liewegas This explains my kernel client paste from last week: https://pastebin.com/B3Tvx2kG. Should have pressed on, I guess ;)

@liewegas

This comment has been minimized.

Member

liewegas commented Jul 25, 2017

@liewegas liewegas changed the title from osd/PrimaryLogPG: fix bad assert to osd: populate last_epoch_split during build_initial_pg_history Jul 25, 2017

@tchaikov tchaikov merged commit 483efaf into ceph:master Jul 26, 2017

4 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
make check (arm64) make check succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment