Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

denc: add need_contiguous to denc_traits #15224

Merged

Conversation

Projects
None yet
6 participants
@tchaikov
Copy link
Contributor

tchaikov commented May 23, 2017

No description provided.

@tchaikov tchaikov requested review from liewegas and adamemerson May 23, 2017

@tchaikov

This comment has been minimized.

Copy link
Contributor Author

tchaikov commented May 23, 2017

we found performance degeneration in large cluster with bluestore. osdmaps are loaded from bluestore, and the bufferlists holding them are segmented. whenever an osdmap is decoded, a new temporary mem chunk is allocated, and the segmented bufferlist is deep copied to that temp mem chunk. this is expensive.

hence the fix.

@tchaikov tchaikov force-pushed the tchaikov:wip-denc-no-deep-copy-for-segmented-buffer branch from 70acab2 to 681afb7 May 23, 2017

uint64_t f=0) {
o = *(etype*)p.get_pos_add(sizeof(etype));
}
};

This comment has been minimized.

Copy link
@tchaikov

tchaikov May 23, 2017

Author Contributor

these are not purely cosmetic changes. it helps me decipher the error messages when the compiler is not content with my meta programming skill.

@@ -168,6 +168,7 @@ struct denc_traits<PExtentVector> {
static constexpr bool supported = true;
static constexpr bool bounded = false;
static constexpr bool featured = false;
static constexpr bool with_dencv = false;

This comment has been minimized.

Copy link
@tchaikov

tchaikov May 23, 2017

Author Contributor

assuming PExtentVector does not suffer from the deep-copy problem.

This comment has been minimized.

Copy link
@liewegas

liewegas May 23, 2017

Member

it comes out of the attrs, so it's not fragmented the way object data can be.

@tchaikov tchaikov force-pushed the tchaikov:wip-denc-no-deep-copy-for-segmented-buffer branch 2 times, most recently from 46ff3cd to 7a99d01 May 23, 2017

auto bl = p.get_bl();
auto remaining = bl.length() - p.get_off();
// it is expensive to rebuild a contigous buffer and drop it, so avoid this.
if (bl.get_num_buffers() > 1 && remaining > CEPH_PAGE_SIZE) {

This comment has been minimized.

Copy link
@liewegas

liewegas May 23, 2017

Member

we could do slightly better by checking whether p is the last buffer in the list

This comment has been minimized.

Copy link
@tchaikov

tchaikov May 23, 2017

Author Contributor

yeah! i am on it.

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented May 23, 2017

This looks great to me!

include/denc: templatize denc() for integer types
Signed-off-by: Kefu Chai <kchai@redhat.com>

@tchaikov tchaikov force-pushed the tchaikov:wip-denc-no-deep-copy-for-segmented-buffer branch from 7a99d01 to 0602154 May 23, 2017

@adamemerson
Copy link
Contributor

adamemerson left a comment

LaGrange Grants Theoretical Mastery

uint64_t f=0) {
o = *(T *)p.get_pos_add(sizeof(o));
}
};

This comment has been minimized.

Copy link
@adamemerson

adamemerson May 23, 2017

Contributor

Very nice, I like this.

uint64_t f=0) {
o = *(etype*)p.get_pos_add(sizeof(etype));
}
};

This comment has been minimized.

Copy link
@adamemerson

adamemerson May 23, 2017

Contributor

Also very good.

_size = 0;
for (const auto& i : m)
_size += i.second;
}

This comment has been minimized.

Copy link
@adamemerson

adamemerson May 23, 2017

Contributor

And this is something we desperately need.

This comment has been minimized.

Copy link
@badone

badone May 24, 2017

Contributor

@adamemerson ZZ Top FTW! =D

@tchaikov tchaikov force-pushed the tchaikov:wip-denc-no-deep-copy-for-segmented-buffer branch from 0602154 to 8752cc2 May 23, 2017

@tchaikov

This comment has been minimized.

Copy link
Contributor Author

tchaikov commented May 23, 2017

changelog

  • addressed @liewegas 's comment: check if the current iterator is in the last segment.
  • move the new denc test to test_denc.cc. it's where denc tests reside.

@tchaikov tchaikov added the needs-qa label May 23, 2017

@tchaikov

This comment has been minimized.

Copy link
Contributor Author

tchaikov commented May 23, 2017

@cbodley fwiw, in this PR, a machinery is added to avoid the copy_shallow() under some circumstances. this reminds me the short discussion we had over github the other day.

auto cp = tmp.begin();
traits::decode(o, cp);
p.advance((ssize_t)cp.get_offset());
auto bl = p.get_bl();

This comment has been minimized.

Copy link
@cbodley

cbodley May 23, 2017

Contributor

auto& bl = to avoid atomic inc/dec of ref counts

This comment has been minimized.

Copy link
@tchaikov

tchaikov May 24, 2017

Author Contributor

ahh, yeah. bufferlist is not refcounted. but i get your point!

t.copy_shallow(remaining, tmp);
auto cp = tmp.begin();
traits::decode(o, cp);
p.advance((ssize_t)cp.get_offset());

This comment has been minimized.

Copy link
@cbodley

cbodley May 23, 2017

Contributor

is it safe to splice this flattened buffer back into the original bufferlist? something like this?

if (p.get_current_ptr().get_raw() != t.get_current_ptr().get_raw()) {
  bufferlist newbl;
  newbl.substr_of(bl, 0, p.get_off());
  newbl.append(tmp);
  bl.swap(newbl);
  // update p
}

then later decodes wouldn't have to keep flattening those segments in bl

This comment has been minimized.

Copy link
@tchaikov

tchaikov May 24, 2017

Author Contributor

so, it's to change the content of a container while accessing it using an iterator. i am not sure that we can assume how the bufferlist is used by its caller. if it can be read by multiple threads in parallel, we are in the risk of racing.

i think, if the caller is concerned about the cost of repeatedly flatten the decoded buffer, it should cache the decoded object instead keeping the bufferlist around.

This comment has been minimized.

Copy link
@cbodley

cbodley May 24, 2017

Contributor

if it can be read by multiple threads in parallel, we are in the risk of racing.

true

i think, if the caller is concerned about the cost of repeatedly flatten the decoded buffer, it should cache the decoded object instead keeping the bufferlist around

by later decodes, i'm referring to the rest of the elements being decoded from this bufferlist. say you start with a bufferlist with 8 segments of 8 bytes each [8][8][8][8][8][8][8][8], and you're decoding four 16-byte objects. the first decode flattens the bufferlist into a new ptr [64] and decodes from that. the next decode still sees [8][8][8][8][8][8] left, and has to flatten again and decode from a ptr [48]. in the end, you're allocating and copying buffers for each object separately

(just clarifying my point from our earlier discussion, i like your pr as a whole!)

This comment has been minimized.

Copy link
@tchaikov

tchaikov May 25, 2017

Author Contributor

@cbodley yes, i do understand your point. your change does address the repeatedly flattening issue.

struct denc_traits {
static constexpr bool supported = false;
static constexpr bool featured = false;
static constexpr bool bounded = false;
static constexpr bool with_dencv = false;

This comment has been minimized.

Copy link
@cbodley

cbodley May 23, 2017

Contributor

it's not clear to me what dencv means here. maybe something like need_contiguous = true would better express its purpose?

This comment has been minimized.

Copy link
@tchaikov

tchaikov May 24, 2017

Author Contributor

a short explanation is in the commit message. and i thought it was short. but yeah, we'd better use an expressive name. will do.

denc: add need_contiguous to denc_traits
so the decode() will not try to copy the bufferlist for a continous one
if the bufferlist is segmented *and* its length is greater than
CEPH_PAGE_SIZE (4K) if the decoded object does not "need_contiguous" per
its denc_traits<>.

copying a memory chunk could be expensive if the decoded bufferlist is
huge, so we should try to avoid this. this could happen when we read the
buffer from bluestore.

and drop the partial specialization for denc() which tries to differentiate
traits::featured and !traits::featured, it does not matter to
decode() if the type supports feature or not. the encode() does.

also use denc() in interval_set<>::decode(). unlike encode(), decode(),
encode_nohead() or decode_nohead(), denc() is not part of our legacy
dencoder, so i think it's fine and encouraged to use it.

Signed-off-by: Kefu Chai <kchai@redhat.com>

@tchaikov tchaikov force-pushed the tchaikov:wip-denc-no-deep-copy-for-segmented-buffer branch from 8752cc2 to 77fbfc2 May 24, 2017

@tchaikov

This comment has been minimized.

Copy link
Contributor Author

tchaikov commented May 24, 2017

changelog

  • s/with_dencv/!need_contiguous/

i think it's safe to merge the new version even what what we test is the version before the substitution

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented May 24, 2017

@tchaikov tchaikov changed the title denc: add with_dencv to denc_traits denc: add need_contiguous to denc_traits May 24, 2017

@yuriw yuriw merged commit 516c3ec into ceph:master May 25, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
default Build finished.
Details

@tchaikov tchaikov deleted the tchaikov:wip-denc-no-deep-copy-for-segmented-buffer branch May 26, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.