New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon: fix synchronise pgmap with others #14418

Merged
merged 1 commit into from Apr 18, 2017

Conversation

Projects
None yet
5 participants
@songbaisen

songbaisen commented Apr 10, 2017

mon: fix synchronise pgmap with others

leader mon down for a long time and then the local pgmap/osdmap was behind latest map a lot,
when it start and sync ,it will read full map from other mons;
if serveral osds/pgs were deleted during mon offline, these osds/pgs will be left over in pgmap until manully restart ceph-mon process

Solution:
reset pgmap before read pgmap full,make sure local maps are the same as the latest map

Signed-off-by: z09440 zhao.mingyue@h3c.com
Signed-off-by: song baisen song.baisen@zte.com.cn

@songbaisen songbaisen changed the title from mon: leader mon down for a long time and then the local pgmap/osdmap was behind latest map a lot, to mon: fix synchronise pgmap with others Apr 10, 2017

@tchaikov

This comment has been minimized.

Contributor

tchaikov commented Apr 10, 2017

could you keep @mingyuez 's Signed-off-by line in the commit message? he/she should have the credit of this contribution

Signed-off-by: z09440 <zhao.mingyue@h3c.com>
@songbaisen

This comment has been minimized.

songbaisen commented Apr 10, 2017

@tchaikov Yes

songbaisen
mon: fix synchronise pgmap with others
leader mon down for a long time and then the local pgmap/osdmap was behind latest map a lot,
when it start and sync ,it will read full map from other mons;
if serveral osds/pgs were deleted during mon offline, these osds/pgs will be left over in pgmap until manully restart ceph-mon process

Solution:
reset pgmap before read pgmap full,make sure local maps are the same as the latest map

Signed-off-by: z09440 <zhao.mingyue@h3c.com>
Signed-off-by: song baisen <song.baisen@zte.com.cn>
@@ -190,6 +190,8 @@ void PGMonitor::update_from_paxos(bool *need_bootstrap)
int r = get_version(pg_map.version + 1, bl);
if (r == -ENOENT) {
dout(10) << __func__ << " failed to read_incremental, read_full" << dendl;
// reset pg map
pg_map = PGMap();

This comment has been minimized.

@jecluis

jecluis Apr 10, 2017

Member

I think this will work as the patch intends.

Looking at this function however, made me realize we are not taking any steps if the incremental (or the full version) do not exist. @tchaikov do you know if that's by design or just something that (at somepoint) was overlooked?

This comment has been minimized.

@tchaikov

tchaikov Apr 10, 2017

Contributor

@jecluis if the incremental / full version pgmap is missing, we will have assert failure. i am not sure that's the best way to address this though.

This comment has been minimized.

@jecluis

jecluis Apr 10, 2017

Member

@tchaikov yeah, we assert on out. Missed that. As for not being the best way, when we get to the point of a full version, or an incremental, being missing, then something went wrong with paxos at some point (sync, recovery, or proposal/commit), and asserting out doesn't seem like the wrong way to go. Would be nicer to have a self-healing attempt round though, but in the absence of that I'd much rather the monitor die a horrible death than having an inconsistent store.

This comment has been minimized.

@tchaikov

tchaikov Apr 10, 2017

Contributor

yeah. /me agreed!

This comment has been minimized.

@songbaisen

@tchaikov tchaikov added the needs-qa label Apr 10, 2017

@yuriw

This comment has been minimized.

Contributor

yuriw commented Apr 12, 2017

test this pls

@yuriw

This comment has been minimized.

Contributor

yuriw commented Apr 18, 2017

@liewegas pls merge

@liewegas liewegas merged commit c6676a7 into ceph:master Apr 18, 2017

2 of 3 checks passed

default Build finished.
Details
Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment