osd: fix waiting_for_peered vs flushing #17759

Merged
merged 1 commit into from Oct 10, 2017

3 participants
@liewegas
Member

commented Sep 15, 2017

on_flushed() requeues waiting_for_peered, but we flush twice on the
primary during peering, and we don't want to requeue after the first
flush (when we have the master pg log merged).

Fix by moving waiting_for_peered to waiting_for_flush in
_activate_committed if we haven't already flushed. If we get an op
and are peered but not flushed, queue it there. (We can simplify this
check a bit since pgbackend inactive message handling doesn't care
whether we are flushed or not.) When flushed, we requeue
waiting_for_flush.

The waiting_for_flush, waiting_for_peered, and waiting_for_active
lists are all mutually exclusive, so this mostly serves to
clarify what we are waiting for (not to keep items separate). And
it means that on_flushed() will only requeue things that were
waiting for it specifically.

Fixes: http://tracker.ceph.com/issues/21407
Signed-off-by: Sage Weil <sage@redhat.com>
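
For readers who don't have the OSD code in front of them, the queueing rule described above can be sketched roughly as follows. This is a minimal toy model, not the actual Ceph implementation: `PGModel`, `queue_op`, `start_flush`, `ready`, and `move_all` are hypothetical names invented for illustration, the `peered`/`flushed` booleans stand in for the real PG state, and `waiting_for_active` is left out. The ordering of the two flushes relative to activation is also an assumption drawn from the description.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>

// Hypothetical, heavily simplified model of a PG's op wait lists.
// List names mirror the PR description; nothing here is real Ceph code.
struct PGModel {
  using Op = std::string;  // stand-in for a real op reference

  bool peered = false;
  bool flushed = false;

  std::deque<Op> waiting_for_peered;
  std::deque<Op> waiting_for_flush;
  std::deque<Op> ready;  // stand-in for "requeued to the op queue"

  // Move every op from one list to the end of another, keeping order.
  static void move_all(std::deque<Op>& from, std::deque<Op>& to) {
    for (auto& op : from) to.push_back(std::move(op));
    from.clear();
  }

  // In this model at most one of the two wait lists is non-empty.
  void check_exclusive() const {
    assert(!peered || waiting_for_peered.empty());
    assert(peered || waiting_for_flush.empty());
  }

  // Route an incoming op by what it actually has to wait for.
  void queue_op(Op op) {
    if (!peered) {
      waiting_for_peered.push_back(std::move(op));
    } else if (!flushed) {
      waiting_for_flush.push_back(std::move(op));
    } else {
      ready.push_back(std::move(op));
    }
    check_exclusive();
  }

  // A new flush begins (assumed to happen twice on the primary).
  void start_flush() { flushed = false; }

  // Activation committed: the PG is peered. If the flush is still
  // outstanding, park the waiters on waiting_for_flush instead of
  // requeueing them right away.
  void activate_committed() {
    peered = true;
    move_all(waiting_for_peered, flushed ? ready : waiting_for_flush);
    check_exclusive();
  }

  // Flush completion: requeue only the ops that waited for the flush.
  void on_flushed() {
    flushed = true;
    move_all(waiting_for_flush, ready);
    check_exclusive();
  }
};
```

In this model, an op queued before peering stays on `waiting_for_peered` across the first `on_flushed()` call, moves to `waiting_for_flush` in `activate_committed()` while the second flush is outstanding, and only reaches `ready` on the second `on_flushed()`.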

@liewegas liewegas requested review from jdurgin and tchaikov Sep 15, 2017

@tchaikov

Contributor

commented Sep 18, 2017

retest this please.

@liewegas

Member Author

commented Sep 19, 2017

retest this please

@liewegas liewegas changed the title osd: fix waiting_for_peered vs flushing WIP osd: fix waiting_for_peered vs flushing Sep 19, 2017

@jdurgin jdurgin added the needs-qa label Sep 20, 2017

@tchaikov

Contributor

commented Sep 28, 2017

retest this please.

@tchaikov

Contributor

commented Sep 28, 2017

/usr/bin/timeout 360 rados --pool rbd put 50816810-afc5-49fe-93d9-2405a68c56e3 td/test-ceph-disk/50816810-afc5-49fe-93d9-2405a68c56e3
/home/jenkins-build/build/workspace/ceph-pull-requests/src/ceph-disk/tests/ceph-disk.sh:259: read_write:  return 1

The ceph-disk.sh test fails consistently.

@tchaikov tchaikov removed the needs-qa label Sep 28, 2017

@liewegas liewegas force-pushed the liewegas:wip-21407 branch from b5e0882 to dbac947 Oct 6, 2017

@liewegas liewegas changed the title WIP osd: fix waiting_for_peered vs flushing osd: fix waiting_for_peered vs flushing Oct 6, 2017

@liewegas liewegas force-pushed the liewegas:wip-21407 branch 2 times, most recently from 5c9b955 to 1b8267c Oct 9, 2017

@liewegas liewegas force-pushed the liewegas:wip-21407 branch from 1b8267c to 8f7dc8b Oct 9, 2017

@liewegas liewegas merged commit 85055e8 into ceph:master Oct 10, 2017

5 checks passed

Docs: build check: OK, docs built
Signed-off-by: all commits in this PR are signed
Unmodified Submodules: submodules for project are unmodified
make check: make check succeeded
make check (arm64): make check succeeded

@liewegas liewegas deleted the liewegas:wip-21407 branch Oct 10, 2017
