osd: fix rados write op hang #11143

Merged
tchaikov merged 1 commit into ceph:master on Oct 24, 2016

@wycbox wycbox commented Sep 20, 2016

If the primary OSD handles a write op and do_osd_ops returns an error code,
record_write_error tries to record the error in the pg log and sends a log
message to the replica.
If the replica OSD crashes at that point, the primary OSD calls
on_change to free all repops, and do_update_log_missing_reply is never
called, so the write op is lost and hangs.

This patch fixes that.

Signed-off-by: Yunchuan Wen <yunchuan.wen@kylin-cloud.com>
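
The failure mode described above can be sketched as a small standalone model. This is not the code from the patch and not Ceph's real classes: `PrimaryPG`, `ClientOp`, `RepOp`, `handle_replica_ack`, `process_requeued` and `client_gets_reply` are hypothetical names, and requeueing the client op on an interval change is only one assumed way to avoid the hang. The sketch shows that when the in-flight repops are dropped, the completion callback never runs and the client never gets a reply.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a client write op waiting on the primary.
struct ClientOp {
  bool replied = false;  // set once the client gets an answer
};

// Hypothetical stand-in for an in-flight replicated log update.
struct RepOp {
  ClientOp* op;                       // client op that triggered it
  std::function<void()> on_complete;  // runs once the replica has acked
};

class PrimaryPG {
public:
  // Models record_write_error: the error is logged and the reply to the
  // client is deferred until the replica acknowledges the log update.
  void record_write_error(ClientOp* op, int tid) {
    in_flight[tid] = RepOp{op, [op] { op->replied = true; }};
  }

  // Models do_update_log_missing_reply: the replica acked, so reply.
  void handle_replica_ack(int tid) {
    auto it = in_flight.find(tid);
    if (it == in_flight.end()) return;  // repop already freed
    auto cb = std::move(it->second.on_complete);
    in_flight.erase(it);
    cb();
  }

  // Models on_change after a replica crash. With requeue == false the
  // repops are simply freed (the hang). With requeue == true the client
  // ops are kept so they can be retried (an assumed remedy).
  void on_change(bool requeue) {
    for (auto& entry : in_flight) {
      if (requeue) requeued.push_back(entry.second.op);
    }
    in_flight.clear();
  }

  // Retries requeued ops once the PG is usable again. In this toy model
  // the retry just answers the client.
  void process_requeued() {
    for (ClientOp* op : requeued) op->replied = true;
    requeued.clear();
  }

private:
  std::map<int, RepOp> in_flight;
  std::vector<ClientOp*> requeued;
};

// Runs the crash scenario and reports whether the client got a reply.
bool client_gets_reply(bool requeue_on_change) {
  ClientOp op;
  PrimaryPG pg;
  pg.record_write_error(&op, 1);
  pg.on_change(requeue_on_change);  // replica crashes before acking
  pg.handle_replica_ack(1);         // no-op: the repop is already gone
  pg.process_requeued();
  return op.replied;
}
```

In this model, `client_gets_reply(false)` corresponds to the hang reported here, and `client_gets_reply(true)` to a primary that keeps track of the client op across the interval change.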

@liewegas
Member

@jdurgin ping

@liewegas liewegas changed the title from "BUGFIX: rados write op hangs" to "osd: fix rados write op hang" on Sep 20, 2016
@jdurgin
Member

jdurgin commented Sep 23, 2016

nice catch! looks good to me

@tchaikov
Contributor

@tchaikov tchaikov merged commit b0e2028 into ceph:master Oct 24, 2016