mds: fix dropping events in standby replay #12077
dout(10) << " segment seq=" << seg->seq << " " << seg->offset <<
  "~" << seg->end - seg->offset << dendl;

if (seg->end > journaler->get_read_pos()) {
seg->end is set to journaler->get_read_pos() after reading each log event. How can this test be true?
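To make the reviewer's point concrete, here is a minimal sketch, assuming (as the comment states) that replay sets the segment's end to the journaler's read position after every event. `Segment`, `FakeJournaler`, and `count_end_ahead_of_read_pos` are hypothetical stand-ins for illustration, not the real Ceph classes.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the MDS log segment and the journaler.
struct Segment {
  uint64_t offset = 0;
  uint64_t end = 0;
};

struct FakeJournaler {
  uint64_t read_pos = 0;
  uint64_t get_read_pos() const { return read_pos; }
  // Pretend to consume one event of the given length.
  void read_event(uint64_t len) { read_pos += len; }
};

// Replays events of the given lengths and counts how often
// "seg.end > journaler.get_read_pos()" holds right after an event.
int count_end_ahead_of_read_pos(const std::vector<uint64_t>& event_lens) {
  FakeJournaler journaler;
  Segment seg;
  int hits = 0;
  for (uint64_t len : event_lens) {
    journaler.read_event(len);
    seg.end = journaler.get_read_pos();  // assumed replay behaviour
    if (seg.end > journaler.get_read_pos()) {
      ++hits;  // never reached under the assumption above
    }
  }
  return hits;
}
```

Under that assumption the branch in the diff never executes, which is why it could be removed.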
You're right, I wrote this branch when experimenting and forgot to remove it. The "segments.size() == 1" below is the part that really fixes the bug.
@ukernel does it look good now?
If I understand the code correctly, it tries to trim all segments except the head one. This seems incorrect. (If a log segment is not expired by the active MDS, updates in the segment haven't
Ensure that we never drop the last segment during standby replay -- this avoids the case where we start ignoring events because we think we're still waiting to see a subtreemap.

Fixes: http://tracker.ceph.com/issues/17954
Signed-off-by: John Spray <john.spray@redhat.com>
Sorry, I messed this up when committing it. I originally meant to remove the "seg->end > journaler->get_read_pos()" condition and leave the "seg->end > expire_pos" condition, but I removed the wrong one. As you say, the expire_pos check still needs to be there.
(it's fixed now)
Reviewed-by: Yan, Zheng <zyan@redhat.com>
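The trimming rule the thread converges on can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: `Segment`, `trim_replayed_segments`, and the `std::map` keyed by sequence number are hypothetical, and only the two conditions ("seg->end > expire_pos" and "segments.size() == 1") come from the discussion.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for an MDS log segment.
struct Segment {
  uint64_t seq;
  uint64_t offset;
  uint64_t end;
};

// Drops replayed segments from the front of the map and returns how many
// were dropped. A segment is kept if the active MDS has not expired it yet
// (its end lies beyond expire_pos) or if it is the last remaining segment.
std::size_t trim_replayed_segments(std::map<uint64_t, Segment>& segments,
                                   uint64_t expire_pos) {
  std::size_t dropped = 0;
  while (!segments.empty()) {
    const Segment& seg = segments.begin()->second;
    if (seg.end > expire_pos) {
      break;  // not expired by the active MDS, so keep it and everything after
    }
    if (segments.size() == 1) {
      break;  // never drop the last segment during standby replay
    }
    segments.erase(segments.begin());
    ++dropped;
  }
  return dropped;
}
```

Keeping the last segment means the replaying daemon always has a segment to attach incoming events to, so it does not fall back to ignoring events while waiting for a subtreemap.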