osd: pglog trimming fixes #12882

Merged
merged 3 commits into from May 2, 2017

@wonzhq
Contributor

wonzhq commented Jan 11, 2017

Fixes cases where the pglog is not trimmed and too many log entries accumulate in memory, which leads to high OSD memory usage and other issues such as slow peering.

@@ -829,6 +829,7 @@ OPTION(osd_pg_epoch_persisted_max_stale, OPT_U32, 150) // make this < map_cache_
OPTION(osd_min_pg_log_entries, OPT_U32, 3000) // number of entries to keep in the pg log when trimming it
OPTION(osd_max_pg_log_entries, OPT_U32, 10000) // max entries, say when degraded, before we trim
OPTION(osd_force_recovery_pg_log_entries, OPT_U32, 13000) // max entries before force recovery

@liewegas

liewegas Jan 11, 2017

Member

Perhaps we should make this option a (float) multiple of osd_max_pg_log_entries? That way it will always be, say, 30% past whatever max is so that it doesn't also have to be configured if the min/max options have been adjusted.

@wonzhq

wonzhq Jan 11, 2017

Contributor

Yep, makes sense.

@athanatos

athanatos Jan 11, 2017

Contributor

+1
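
The suggestion in this thread (derive the force-recovery threshold from osd_max_pg_log_entries via a float multiplier) could look roughly like the sketch below. The helper name and the clamping of factors below 1.0 are assumptions made for illustration; they are not taken from the Ceph tree or from the merged patch.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not Ceph code): compute the force-recovery threshold
// as a multiple of the configured max log size, so it follows the max
// option automatically when that option is tuned.
inline uint64_t force_recovery_threshold(uint64_t max_pg_log_entries,
                                         double factor) {
  // Assumption of this sketch: a factor below 1.0 would trigger forced
  // recovery before the normal max is reached, so clamp it to 1.0.
  if (factor < 1.0) {
    factor = 1.0;
  }
  // Round to the nearest integer to avoid floating-point truncation.
  return static_cast<uint64_t>(
      std::llround(static_cast<double>(max_pg_log_entries) * factor));
}
```

With the defaults quoted in the diff above, a factor of 1.3 applied to a max of 10000 gives the same 13000 as the absolute option.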

@liewegas

Member

liewegas commented Jan 11, 2017

These all look reasonable to me, but @athanatos should take a look.

@liewegas liewegas requested a review from athanatos Jan 11, 2017

@tchaikov tchaikov self-assigned this Jan 11, 2017

min_version = peer_missing[peer].get_rmissing().begin()->first;
soid = peer_missing[peer].get_rmissing().begin()->second;
}
}

@athanatos

athanatos Jan 11, 2017

Contributor

I think you want to use PG::missing_loc for this.

@wonzhq

wonzhq Jan 12, 2017

Contributor

Is missing_loc helpful in this case? The purpose is to find the missing object with the smallest version_t.

@athanatos

athanatos Jan 12, 2017

Contributor

Oh, you're right, it doesn't maintain a reverse map. This is probably fine then.
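
For readers following the thread, the lookup under discussion (pick the missing object with the smallest version across all peers by reading the first entry of each peer's version-ordered reverse map) can be sketched as below. The types are simplified stand-ins chosen for this example: a plain integer replaces version_t, a string replaces the object id, and `oldest_missing` is a hypothetical name, not a function from the patch.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Simplified stand-ins (assumptions of this sketch, not Ceph types).
using Version = uint64_t;
// Reverse-missing map: version -> object name, ordered by version.
using RMissing = std::map<Version, std::string>;

// Find the missing object with the smallest version across all peers.
// Returns false if no peer is missing anything; otherwise fills the
// output parameters and returns true.
inline bool oldest_missing(const std::map<int, RMissing>& peer_rmissing,
                           Version* min_version, std::string* soid) {
  bool found = false;
  for (const auto& entry : peer_rmissing) {
    const RMissing& rmissing = entry.second;
    if (rmissing.empty()) {
      continue;  // this peer has nothing missing
    }
    // std::map is ordered, so begin() holds the smallest version.
    const auto& oldest = *rmissing.begin();
    if (!found || oldest.first < *min_version) {
      *min_version = oldest.first;
      *soid = oldest.second;
      found = true;
    }
  }
  return found;
}
```

Because each per-peer map is already ordered by version, only the first entry of each map needs to be inspected.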

dirty_info = true;
write_if_dirty(t);
int tr = osd->store->queue_transaction(osr.get(), std::move(t), NULL);
assert(tr == 0);

@athanatos

athanatos Jan 11, 2017

Contributor

write_if_dirty and queueing the transaction should be handled by the caller. The Recovered handler is in the PG state machine and is called by the peering_wq which will do that anyway.

@wonzhq

wonzhq Jan 12, 2017

Contributor

ok
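
The division of responsibility requested in this review comment (the event handler only marks state dirty; the caller writes and queues the transaction) can be illustrated with the toy model below. Every type and function name here is a hypothetical stand-in invented for the example; none of them is a Ceph class or reflects the real signatures.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not Ceph classes.
struct TransactionSketch {
  std::vector<std::string> ops;
};

struct StoreSketch {
  std::vector<TransactionSketch> queued;
  int queue_transaction(TransactionSketch t) {
    queued.push_back(std::move(t));
    return 0;
  }
};

struct PgSketch {
  bool dirty_info = false;

  // Event handler: records that info changed, but neither writes nor queues.
  void on_recovered() { dirty_info = true; }

  // Adds the info update to the transaction only if something changed.
  void write_if_dirty(TransactionSketch& t) {
    if (dirty_info) {
      t.ops.push_back("write_info");
      dirty_info = false;
    }
  }
};

// Caller (playing the work-queue role): runs the handler, then flushes
// dirty state and queues the transaction exactly once.
inline int handle_recovered_event(PgSketch& pg, StoreSketch& store) {
  TransactionSketch t;
  pg.on_recovered();
  pg.write_if_dirty(t);
  return store.queue_transaction(std::move(t));
}
```

Keeping the write in the caller means a handler cannot queue a second, redundant transaction for the same event.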

@athanatos

I left some comments.

Zhiqiang Wang added some commits Jan 4, 2017

Zhiqiang Wang
osd: trim pglog when pg is recovered
In the current code, the primary initiates a pglog trim, or a peer
informs the primary to trim, when recovery of the last object on it
finishes. However, the pg is still in the degraded/recovering/backfilling
state at that time, so the max number of log entries is kept. If there
is no IO on this pg, it holds up to the max number of log entries in
memory, even though it is fully recovered.

Signed-off-by: Zhiqiang Wang <zhiqiang@xsky.com>

Zhiqiang Wang
osd: trim primary pglog as well when recovery finishes on a peer
When recovery finishes on primary or on a peer, primary trims pglog
on the peer osds, or peer informs primary to trim pglog. In both of
these cases, trim primary pglog as well as pglog on the peers.

Signed-off-by: Zhiqiang Wang <zhiqiang@xsky.com>

Zhiqiang Wang
osd: force recover the oldest missing object if too many logs
When the oldest missing object of a pg is not recovered for a long
time, the pg log is not trimmed because min_last_complete_on_disk
does not advance. Too many log entries may then accumulate in memory.
Force recovery of the oldest missing object when the number of log
entries exceeds osd_force_recovery_pg_log_entries.

Signed-off-by: Zhiqiang Wang <zhiqiang@xsky.com>
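
Taken together, the three commit messages describe a size policy: keep up to the max number of entries while the pg is degraded/recovering/backfilling, trim down to the min once it is recovered, and force recovery of the oldest missing object once the log grows past the force-recovery threshold. A minimal sketch of that policy follows. The struct and function names are hypothetical, and the sketch deliberately ignores the min_last_complete_on_disk bound that limits how far a real trim can go.

```cpp
#include <cstdint>

// Hypothetical summary of a pg (not a Ceph type).
struct PgStateSketch {
  bool degraded_or_recovering;  // degraded, recovering, or backfilling
  uint64_t log_entries;         // current number of pg log entries
};

// Number of entries the log is allowed to keep in the current state.
inline uint64_t target_log_entries(const PgStateSketch& pg,
                                   uint64_t min_entries,
                                   uint64_t max_entries) {
  return pg.degraded_or_recovering ? max_entries : min_entries;
}

// How many entries a trim would remove (0 if already within target).
inline uint64_t entries_to_trim(const PgStateSketch& pg,
                                uint64_t min_entries,
                                uint64_t max_entries) {
  const uint64_t target = target_log_entries(pg, min_entries, max_entries);
  return pg.log_entries > target ? pg.log_entries - target : 0;
}

// True when the log has grown past the force-recovery threshold.
inline bool should_force_recovery(const PgStateSketch& pg,
                                  uint64_t force_threshold) {
  return pg.log_entries > force_threshold;
}
```

With the option defaults shown earlier (3000, 10000, 13000), a recovered pg holding 12000 entries would trim 9000 of them, while the same pg in a degraded state would trim only 2000.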
@wonzhq

Contributor

wonzhq commented Jan 13, 2017

Updated, please take another look

@wonzhq

Contributor

wonzhq commented Feb 9, 2017

Updated according to Sam's comments. Mind taking another look? @liewegas @athanatos

@liewegas liewegas changed the title from osd: pglog fixes to osd: pglog trimming fixes Apr 26, 2017

updated

@liewegas liewegas merged commit 5610444 into ceph:master May 2, 2017

4 checks passed

Signed-off-by: all commits in this PR are signed
Unmodified Submodules: submodules for project are unmodified
arm build: successfully built on arm
default: Build finished.