osd: reduce buffer pinning from EC entries #15120

Merged
liewegas merged 3 commits into ceph:master from liewegas:wip-ec-buffer on May 18, 2017

3 participants
@liewegas
Member

liewegas commented May 16, 2017

This reduces memory consumption by bufferlists (buffer_data) by about 3x.

mon/OSDMonitor: introduce debug option to allow filestore for ec overwrites

Signed-off-by: Sage Weil <sage@redhat.com>
@@ -440,6 +440,9 @@ struct PGLog : DoutPrefixProvider {
assert(get_can_rollback_to() == head);
}
// make sure our buffers don't pin bigger buffers
e.mod_desc.trim_bl();

@jdurgin

jdurgin May 16, 2017

Member

ah, this used to be here but was accidentally removed in 5e0ec06
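
For readers unfamiliar with the pinning problem, here is a minimal sketch of the idea behind the `trim_bl()` call. It uses a simplified reference-counted slice, not Ceph's actual bufferlist; `Slice`, `pinned_bytes`, `trim`, and `make_slice_of_big_buffer` are hypothetical names chosen for illustration. A small view into a large shared allocation keeps the whole allocation alive until the viewed bytes are copied into a right-sized buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Simplified stand-in for a reference-counted buffer slice. This is NOT
// Ceph's bufferlist/bufferptr; every name here is hypothetical.
struct Slice {
  std::shared_ptr<std::vector<char>> raw;  // shared backing allocation
  std::size_t off = 0;                     // start of the view within raw
  std::size_t len = 0;                     // length of the view

  // Bytes of backing memory kept alive by this slice.
  std::size_t pinned_bytes() const { return raw ? raw->size() : 0; }

  // Copy the viewed bytes into a right-sized allocation and drop the
  // reference to the original (possibly much larger) one.
  void trim() {
    if (!raw || (off == 0 && len == raw->size())) return;  // already tight
    assert(off + len <= raw->size());
    const auto first = raw->begin() + static_cast<std::ptrdiff_t>(off);
    const auto last = first + static_cast<std::ptrdiff_t>(len);
    auto tight = std::make_shared<std::vector<char>>(first, last);
    raw = std::move(tight);
    off = 0;
  }
};

// Build a slice that views `len` bytes of a freshly allocated `total`-byte
// buffer, the way a small log entry might view part of a large message.
inline Slice make_slice_of_big_buffer(std::size_t total, std::size_t off,
                                      std::size_t len) {
  Slice s;
  s.raw = std::make_shared<std::vector<char>>(total, 'x');
  s.off = off;
  s.len = len;
  return s;
}
```

In this model a 16-byte slice of a 4 MiB buffer pins the full 4 MiB until `trim()` is called, after which it pins only its own 16 bytes. The trade-off is one small copy per entry in exchange for releasing the large allocation.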

@@ -383,6 +383,7 @@ OPTION(mon_debug_dump_transactions, OPT_BOOL, false)
OPTION(mon_debug_dump_json, OPT_BOOL, false)
OPTION(mon_debug_dump_location, OPT_STR, "/var/log/ceph/$cluster-$name.tdump")
OPTION(mon_debug_no_require_luminous, OPT_BOOL, false)
OPTION(mon_debug_no_require_bluestore_for_ec_overwrites, OPT_BOOL, false)

@jdurgin

jdurgin May 16, 2017

Member

I'd rather not add another footgun if we don't need it

::encode(snaps, entry->snaps);
bufferlist bl(op.updated_snaps->second.size() * 8 + 8);
::encode(op.updated_snaps->second, bl);
ldpp_dout(dpp, 0) << "snap bl is " << bl << dendl;

@jdurgin

jdurgin May 16, 2017

Member

dout(0) accidentally left in

@markhpc

This is targeting the right areas in the code based on valgrind and heap profiling data. Prior to the change, OSDs were consuming around 6GB RSS after 7 minutes of random writes to an RBD volume on an EC4+2 pool with 1024 PGs spread across 16 NVMe-backed bluestore OSDs. After the change, tests on a new cluster with the same parameters showed roughly 3.5GB RSS used per OSD.

@liewegas

Member

liewegas commented May 16, 2017

liewegas added some commits May 16, 2017

osd: encode snaps more efficiently
1- encode into a sized buffer.
2- do not needlessly copy the set<> to a vector<> before encoding.
set<> and vector<> encode identically.  Since we are converting from sorted
set<> to unsorted vector<>, the order doesn't change either.

Signed-off-by: Sage Weil <sage@redhat.com>
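
The two points in the commit message above can be illustrated with a small standalone sketch. It does not use Ceph's `::encode`; `encode_u64s` is a hypothetical encoder that assumes a 4-byte little-endian element count followed by 8 little-endian bytes per element. Under that assumption, a sorted `std::set` and a `std::vector` built from it produce the same bytes, and reserving `size() * 8 + 8` up front (the same size hint as in the patch) avoids regrowing the output buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <set>
#include <string>
#include <vector>

// Hypothetical length-prefixed encoder: a 4-byte little-endian count, then
// each element as 8 little-endian bytes. This is an illustration only and
// is not Ceph's ::encode.
template <typename Container>
std::string encode_u64s(const Container& c) {
  assert(c.size() <= std::numeric_limits<std::uint32_t>::max());
  std::string out;
  out.reserve(c.size() * 8 + 8);  // size hint, mirroring size() * 8 + 8
  const std::uint32_t n = static_cast<std::uint32_t>(c.size());
  for (int i = 0; i < 4; ++i)
    out.push_back(static_cast<char>((n >> (8 * i)) & 0xff));
  // Iteration order is the only thing the container type contributes:
  // a set iterates in sorted order, a vector in insertion order.
  for (std::uint64_t v : c)
    for (int i = 0; i < 8; ++i)
      out.push_back(static_cast<char>((v >> (8 * i)) & 0xff));
  return out;
}
```

Because the vector in the old code was filled by iterating the set, both containers visit the elements in the same order, so the intermediate copy added allocation work without changing the encoded result.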
osd/PGLog: avoid pinning large buffers with ObjectModDesc
Accidentally removed by 5e0ec06.

Signed-off-by: Sage Weil <sage@redhat.com>

@liewegas liewegas merged commit 5a03220 into ceph:master May 18, 2017

@liewegas liewegas deleted the liewegas:wip-ec-buffer branch May 18, 2017
