Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd: request new map from PG when needed #17795

Merged
merged 1 commit into from Sep 19, 2017

Conversation

Projects
None yet
4 participants
@jdurgin
Copy link
Member

commented Sep 19, 2017

The fast dispatch refactor in 3cc4827
eliminated the osdmap subscription in the ms_fast_dispatch path, which
meant ops could reach a PG without having the latest map. In a cluster
with few osdmap updates, where the monitor fails to send a new map to
an osd (it tries one random osd), this can result in indefinitely
blocked requests.

Fix this by adding an OSDService mechanism for scheduling a new osdmap
subscription request. Since we don't want to take the osd_lock within
the PG, process this request during OSD::tick().

Fixes: http://tracker.ceph.com/issues/21428
Signed-off-by: Josh Durgin jdurgin@redhat.com

{
std::lock_guard<std::mutex> l(osd->need_epoch_lock);
if (e > osd->subscribed_epoch) {
osd->need_epoch = e;

This comment has been minimized.

Copy link
@liewegas

liewegas Sep 19, 2017

Member

request_osdmap() doesn't need osd_lock.. it just grabs the latest osdmap and then calls into monc. Is there a lock ordering problem if we do that?

This comment has been minimized.

Copy link
@jdurgin

jdurgin Sep 19, 2017

Author Member

good point, I was just not looking hard enough through the existing calls which did hold the osd_lock. looks like there's no issue, so I'll do that.

@jdurgin jdurgin force-pushed the jdurgin:wip-21428 branch from ac3559e to a29ebe8 Sep 19, 2017

@jdurgin

This comment has been minimized.

Copy link
Member Author

commented Sep 19, 2017

backport is #17796

@gregsfortytwo
Copy link
Member

left a comment

We should also fix the monitor so that it doesn't ignore a failed map share, right?


// for PGs to request new osdmaps
std::mutex subscribed_epoch_lock;
epoch_t subscribed_epoch{0};

This comment has been minimized.

Copy link
@gregsfortytwo

gregsfortytwo Sep 19, 2017

Member

Don't we already have a variable somewhere for the epoch we've requested from the monitor? It shouldn't get dumb located.

This comment has been minimized.

Copy link
@jdurgin

jdurgin Sep 19, 2017

Author Member

not for the latest map. I'll move it to the osd from osdservice though

@jdurgin

This comment has been minimized.

Copy link
Member Author

commented Sep 19, 2017

the monitor resending isn't strictly necessary, though it's probably a good idea too. perhaps you could take a look at that in a separate change? I don't think it's as urgent for luminous if the osd side is fixed.

@liewegas

This comment has been minimized.

Copy link
Member

commented Sep 19, 2017

Yeah. When we do that we might also change it so we notify N random OSDs, not just one of them.

@liewegas
Copy link
Member

left a comment

This looks right to me, but I won't be super confident until it makes a run through lockdep.

@jdurgin

This comment has been minimized.

Copy link
Member Author

commented Sep 19, 2017

@liewegas lockdep won't work for std::mutex, we'd need to make helgrind work

@jdurgin

This comment has been minimized.

Copy link
Member Author

commented Sep 19, 2017

changing to Mutex so lockdep will tell us if there are monclient lock dependency issues

osd: request new map from PG when needed
The fast dispatch refactor in 3cc4827
eliminated the osdmap subscription in the ms_fast_dispatch path, which
meant ops could reach a PG without having the latest map. In a cluster
with few osdmap updates, where the monitor fails to send a new map to
an osd (it tries one random osd), this can result in indefinitely
blocked requests.

Fix this by adding an OSDService mechanism for scheduling a new osdmap
subscription request.

Fixes: http://tracker.ceph.com/issues/21428
Signed-off-by: Josh Durgin <jdurgin@redhat.com>

@jdurgin jdurgin force-pushed the jdurgin:wip-21428 branch from a29ebe8 to dd33360 Sep 19, 2017

@gregsfortytwo

This comment has been minimized.

Copy link
Member

commented Sep 19, 2017

This LGTM then.
(Also "dumb located" is apparently a pretty awesome autocorrect when you mis-type "duplicated".)

@jdurgin

This comment has been minimized.

@jdurgin jdurgin merged commit 311373a into ceph:master Sep 19, 2017

5 checks passed

Docs: build check OK - docs built
Details
Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
make check (arm64) make check succeeded
Details

@jdurgin jdurgin deleted the jdurgin:wip-21428 branch Sep 19, 2017

@@ -1646,6 +1646,7 @@ void PrimaryLogPG::do_request(
<< p->first << " not empty, queueing" << dendl;
p->second.push_back(op);
op->mark_delayed("waiting_for_map not empty");
osd->request_osdmap_update(op->min_epoch);

This comment has been minimized.

Copy link
@LiumxNL

LiumxNL Sep 20, 2017

Contributor

@jdurgin why not request_osdmap_update at the beginning when op must wait for map?
https://github.com/ceph/ceph/pull/17795/files#diff-fb41013d27e932534adb50eb3de2aaa5R1656

This comment has been minimized.

Copy link
@jdurgin

jdurgin Sep 20, 2017

Author Member

argh that's where it's meant to be, must have moved this around by accident... fixing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.