osd: fix waiting_for_peered vs flushing #17759

Merged
merged 1 commit into from Oct 10, 2017

liewegas
Member

on_flushed() requeues waiting_for_peered, but we flush twice on the
primary during peering, and we don't want to requeue on the first of
those flushes (when we have the master pg log merged).

Fix by moving waiting_for_peered to waiting_for_flush if we aren't
already flushed on _activate_committed. If we get an op and are
peered but not flushed, queue the op there. (We can simplify this
check a bit since pgbackend inactive message handling doesn't care
whether we are flushed or not.) When flushed, we requeue
waiting_for_flush.

The waiting_for_flush, waiting_for_peered, and waiting_for_active
lists are all mutually exclusive, so this mostly serves to
clarify what we are waiting for (not to keep items separate). And
it means that on_flushed() will only requeue things that were
waiting for it specifically.

Fixes: http://tracker.ceph.com/issues/21407
Signed-off-by: Sage Weil <sage@redhat.com>
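
For readers following the description above, here is a minimal toy model of the requeue rules it describes. It is a sketch only, not the Ceph code: `ToyPG`, `start_flush()`, `handle_op()`, `dispatched`, and `replay_two_flushes()` are hypothetical names made up for illustration, and only the list names `waiting_for_peered` and `waiting_for_flush` and the events `on_flushed` and `_activate_committed` come from the description. It assumes ops are plain strings and that "requeue" simply means appending to a dispatch list.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Toy stand-in for a PG (hypothetical; not the real Ceph class).
struct ToyPG {
  bool peered = false;
  bool flushed = true;  // assumption: no flush outstanding at start

  std::deque<std::string> waiting_for_peered;
  std::deque<std::string> waiting_for_flush;
  std::vector<std::string> dispatched;  // ops that have been requeued

  // Hypothetical helper: a flush has been started and not yet completed.
  void start_flush() { flushed = false; }

  // An op sits on exactly one list, named after what it is waiting for.
  void handle_op(const std::string& op) {
    if (!peered) {
      waiting_for_peered.push_back(op);
    } else if (!flushed) {
      waiting_for_flush.push_back(op);
    } else {
      dispatched.push_back(op);
    }
  }

  // Models _activate_committed: if a flush is still outstanding, move the
  // peering waiters to waiting_for_flush instead of requeueing them.
  void activate_committed() {
    peered = true;
    if (flushed) {
      requeue(waiting_for_peered);
    } else {
      waiting_for_flush.insert(waiting_for_flush.end(),
                               waiting_for_peered.begin(),
                               waiting_for_peered.end());
      waiting_for_peered.clear();
    }
  }

  // Models on_flushed: requeue only what was waiting for the flush itself.
  void on_flushed() {
    flushed = true;
    requeue(waiting_for_flush);
  }

 private:
  void requeue(std::deque<std::string>& q) {
    dispatched.insert(dispatched.end(), q.begin(), q.end());
    q.clear();
  }
};

// Replays the "flush twice during peering" sequence from the description.
std::vector<std::string> replay_two_flushes() {
  ToyPG pg;
  pg.start_flush();                // first flush, before peering completes
  pg.handle_op("a");               // not peered yet: waits for peering
  pg.on_flushed();                 // first flush done: "a" must stay queued
  assert(pg.dispatched.empty());
  pg.start_flush();                // second flush
  pg.activate_committed();         // peered but not flushed: "a" moves lists
  pg.handle_op("b");               // peered but not flushed: waits for flush
  pg.on_flushed();                 // second flush done: both are requeued
  return pg.dispatched;
}
```

In this model the first `on_flushed()` leaves `waiting_for_peered` untouched, which is the behaviour the fix is after; the ops are only released by the flush that completes after `_activate_committed`.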

@tchaikov
Contributor

retest this please.

@liewegas
Member Author

retest this please

@liewegas liewegas changed the title osd: fix waiting_for_peered vs flushing WIP osd: fix waiting_for_peered vs flushing Sep 19, 2017
@tchaikov
Contributor

retest this please.

@tchaikov
Contributor

/usr/bin/timeout 360 rados --pool rbd put 50816810-afc5-49fe-93d9-2405a68c56e3 td/test-ceph-disk/50816810-afc5-49fe-93d9-2405a68c56e3
/home/jenkins-build/build/workspace/ceph-pull-requests/src/ceph-disk/tests/ceph-disk.sh:259: read_write:  return 1

The ceph-disk.sh test fails consistently.

@tchaikov tchaikov removed the needs-qa label Sep 28, 2017
@liewegas liewegas changed the title WIP osd: fix waiting_for_peered vs flushing osd: fix waiting_for_peered vs flushing Oct 6, 2017
@liewegas liewegas force-pushed the wip-21407 branch 2 times, most recently from 5c9b955 to 1b8267c Compare October 9, 2017 20:23
@liewegas liewegas merged commit 85055e8 into ceph:master Oct 10, 2017
@liewegas liewegas deleted the wip-21407 branch October 10, 2017 18:17