buffer: raw_combined allocations buffer and ref count together #7612

Merged
merged 12 commits into from Mar 2, 2016

liewegas
Member

Before:

[----------] 1 test from Buffer
[ RUN ] Buffer.BenchAlloc
1000000 alloc of size 16384 in 0.505902
1000000 alloc of size 4096 in 0.268716
1000000 alloc of size 1024 in 0.198454
1000000 alloc of size 256 in 0.127251
1000000 alloc of size 32 in 0.100286
1000000 alloc of size 4 in 0.099957
[ OK ] Buffer.BenchAlloc (1301 ms)
[----------] 1 test from Buffer (1301 ms total)

[----------] 1 test from BufferList
[ RUN ] BufferList.BenchAlloc
100000 alloc of size 32768 in 0.335541
100000 alloc of size 25000 in 0.328742
100000 alloc of size 16384 in 0.323622
100000 alloc of size 10000 in 0.302479
100000 alloc of size 8192 in 0.302486
100000 alloc of size 6000 in 0.304038
100000 alloc of size 4096 in 0.304013
100000 alloc of size 1024 in 0.295486
100000 alloc of size 256 in 0.199804
100000 alloc of size 32 in 0.163652
100000 alloc of size 4 in 0.165094
[ OK ] BufferList.BenchAlloc (3025 ms)
[----------] 1 test from BufferList (3025 ms total)

After:

[----------] 1 test from Buffer
[ RUN ] Buffer.BenchAlloc
1000000 alloc of size 16384 in 0.677293
1000000 alloc of size 4096 in 0.232403
1000000 alloc of size 1024 in 0.114915
1000000 alloc of size 256 in 0.108958
1000000 alloc of size 32 in 0.091280
1000000 alloc of size 4 in 0.078020
[ OK ] Buffer.BenchAlloc (1303 ms)
[----------] 1 test from Buffer (1303 ms total)

[----------] 1 test from BufferList
[ RUN ] BufferList.BenchAlloc
100000 alloc of size 32768 in 0.335772
100000 alloc of size 25000 in 0.331753
100000 alloc of size 16384 in 0.328051
100000 alloc of size 10000 in 0.309004
100000 alloc of size 8192 in 0.309229
100000 alloc of size 6000 in 0.233140
100000 alloc of size 4096 in 0.204561
100000 alloc of size 1024 in 0.205416
100000 alloc of size 256 in 0.149929
100000 alloc of size 32 in 0.164062
100000 alloc of size 4 in 0.125882
[ OK ] BufferList.BenchAlloc (2697 ms)
[----------] 1 test from BufferList (2697 ms total)

I can't figure out why the buffer::create(16384) calls are slower in
Buffer.BenchAlloc. That path goes from a straight new raw_char(...)
to the new call, but even if I switch it back on the new branch it is
still slower than on the original branch... maybe it comes down to how
the code was generated vs instruction prefetch or something? Very confusing.

Anyway, aside from that regression, everything else is the same or
faster (in the 10-20% range), more so for small allocations.
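
For anyone who wants to reproduce numbers of this shape, here is a minimal timing-loop sketch. It is not the actual Buffer.BenchAlloc test: `bench_alloc` is a hypothetical helper, and it times plain `new char[]` / `delete[]` rather than buffer::create, so absolute numbers will not match the ones above.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: allocate and free `count` blocks of `size` bytes and
// report the elapsed wall-clock time in seconds.
double bench_alloc(std::size_t count, std::size_t size) {
  using clock = std::chrono::steady_clock;
  const auto start = clock::now();
  for (std::size_t i = 0; i < count; ++i) {
    char* p = new char[size];
    // Write through a volatile pointer so the compiler cannot elide the
    // allocation/free pair.
    *static_cast<volatile char*>(p) = 0;
    delete[] p;
  }
  const std::chrono::duration<double> elapsed = clock::now() - start;
  std::printf("%zu alloc of size %zu in %f\n", count, size, elapsed.count());
  return elapsed.count();
}
```

Calling it once per size (for example 16384, 4096, 1024, 256, 32, 4) gives one output line per size in the same format as the listings above.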

if (!align)
align = sizeof(size_t);
size_t rawlen = ROUND_UP_TO(sizeof(buffer::raw_combined), sizeof(size_t));
size_t datalen = ROUND_UP_TO(len, sizeof(size_t));
Contributor

does rawlen need to be padded?

for datalen, you might consider alignof(raw_combined) as a more explicit alternative to sizeof(size_t)
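
To make the padding question concrete, here is a minimal sketch of one possible combined layout, with the header placed after the data. The names `combined_hdr`, `round_up_to`, `create_combined`, and `destroy_combined` are hypothetical stand-ins, not the code in this PR. With this layout only the data length needs rounding (to `alignof` the header); the header itself needs no padding because nothing follows it.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a combined raw buffer: the ref-counted header
// lives in the same malloc'd block as the data it describes.
struct combined_hdr {
  char* data;        // start of the data region (also the malloc'd base)
  std::size_t len;   // requested length in bytes
  unsigned nref;     // reference count
};

constexpr std::size_t round_up_to(std::size_t n, std::size_t d) {
  return (n + d - 1) / d * d;
}

// Layout: [ data, padded to alignof(combined_hdr) ][ combined_hdr ]
combined_hdr* create_combined(std::size_t len) {
  const std::size_t datalen = round_up_to(len, alignof(combined_hdr));
  void* p = std::malloc(datalen + sizeof(combined_hdr));
  if (!p) throw std::bad_alloc();
  char* base = static_cast<char*>(p);
  // malloc returns storage aligned for any fundamental type, and datalen is
  // a multiple of alignof(combined_hdr), so base + datalen is aligned too.
  return new (base + datalen) combined_hdr{base, len, 0};
}

void destroy_combined(combined_hdr* h) {
  char* base = h->data;  // read the base before destroying the header
  h->~combined_hdr();
  std::free(base);
}
```

One malloc and one free cover both the data and the ref count, which is the saving the benchmarks above are measuring.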

@branch-predictor
Contributor

[Figures: write-1024, write-4096, write-65536]

These plots show how rados bench 120 write -t 128 went with and without the patch; the vertical axis is speed in MB/s, the horizontal axis is time in seconds. The cluster was configured to use the simple messenger.
The patch improved performance on small I/O, but the difference is smaller because disk flushing occurs more frequently. For large I/O, the difference is close to none. I assume that with bluestore the performance difference will be even greater.
(Complete data here: http://ceph.predictor.org.pl/bufferlist_patch.xlsx, raw data at http://ceph.predictor.org.pl/chunktest_org.tar.gz and http://ceph.predictor.org.pl/chunktest_patch.tar.gz)

@cbodley
Contributor

cbodley commented Feb 26, 2016

looks good to me 👍

@liewegas
Member Author

needs rebase

Do not assume there is a trailing null to terminate the string.

Signed-off-by: Sage Weil <sage@redhat.com>
- fix source
- include larger sizes

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
These eliminate most callers of buffers(), which exposes the
internal list<ptr>.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
This will let us put policy in create_aligned.

Signed-off-by: Sage Weil <sage@redhat.com>
If the alignment is on a page boundary, or the allocation is big,
a separate buffer::raw goes faster.  The rest of the time,
a raw_combined does.
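
A minimal sketch of that policy, for illustration only. The commit message does not give the page size or the cutoff for "big", so `kPageSize` and `kLargeThreshold` below are assumed values, and `choose_alloc` / `alloc_kind` are hypothetical names rather than anything in the patch.

```cpp
#include <cstddef>

// Assumed values; the commit message does not state either of them.
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kLargeThreshold = 4 * kPageSize;

enum class alloc_kind { separate_raw, combined };

// Pick a separate buffer::raw-style allocation for page-aligned or large
// requests, and a combined header+data allocation otherwise.
alloc_kind choose_alloc(std::size_t len, std::size_t align) {
  const bool page_aligned = align != 0 && align % kPageSize == 0;
  const bool big = len >= kLargeThreshold;
  return (page_aligned || big) ? alloc_kind::separate_raw
                               : alloc_kind::combined;
}
```

Keeping the decision in one function like this is what makes it possible to tune the cutoff against benchmarks without touching callers.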

Signed-off-by: Sage Weil <sage@redhat.com>
This may as well fit the input; this doesn't relate to the
append buffer.

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
…ions

We drop some unittest assertions about alloc buffer size.  Sorry!

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
@liewegas liewegas added this to the jewel milestone Mar 1, 2016
liewegas added a commit that referenced this pull request Mar 2, 2016
buffer: raw_combined allocations buffer and ref count together

Reviewed-by: Casey Bodley <cbodley@redhat.com>
@liewegas liewegas merged commit 67696f0 into ceph:master Mar 2, 2016
@liewegas liewegas deleted the wip-buffer-combined branch March 2, 2016 13:31