buffer: raw_combined allocations buffer and ref count together #7612
if (!align)
  align = sizeof(size_t);
size_t rawlen = ROUND_UP_TO(sizeof(buffer::raw_combined), sizeof(size_t));
size_t datalen = ROUND_UP_TO(len, sizeof(size_t));
does rawlen need to be padded?
for datalen, you might consider alignof(raw_combined)
as a more explicit alternative to sizeof(size_t)
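As a rough illustration of the size computation under discussion, here is a minimal sketch of a combined allocation. It assumes a layout with the data first and the header placed right after it; raw_combined_sketch, round_up_to, create_combined and destroy_combined are hypothetical names, not the code in this PR. It uses alignof for the data padding, as suggested above.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for buffer::raw_combined: a ref-counted header that
// lives in the same allocation as the data it describes.
struct raw_combined_sketch {
  char *data;
  std::size_t len;
  int nref;
  raw_combined_sketch(char *d, std::size_t l) : data(d), len(l), nref(0) {}
};

// Round n up to the next multiple of d (d must be non-zero).
constexpr std::size_t round_up_to(std::size_t n, std::size_t d) {
  return (n + d - 1) / d * d;
}

// One allocation holding both parts: [ data (padded) | header ].
raw_combined_sketch *create_combined(std::size_t len) {
  // Pad the data so that the header following it is suitably aligned.
  std::size_t datalen = round_up_to(len, alignof(raw_combined_sketch));
  // Nothing follows the header in this layout, so it is not padded here.
  std::size_t rawlen = sizeof(raw_combined_sketch);
  void *p = std::malloc(datalen + rawlen);
  if (!p)
    throw std::bad_alloc();
  char *base = static_cast<char *>(p);
  // Construct the header in place, directly after the padded data.
  return new (base + datalen) raw_combined_sketch(base, len);
}

// Destroy the header, then release the single underlying allocation.
void destroy_combined(raw_combined_sketch *r) {
  char *base = r->data;
  r->~raw_combined_sketch();
  std::free(base);
}
```

In this sketch the header ends up aligned because malloc returns storage aligned for any fundamental type and datalen is a multiple of alignof(raw_combined_sketch). Whether rawlen needs padding in the real code depends on whether anything is placed after the header there.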
This is how rados bench 120 write -t 128 went with and without the patch. The vertical axis is speed in MB/s, the horizontal axis is time in seconds. The cluster was configured to use the simple messenger.
looks good to me 👍
needs rebase
Do not assume there is a trailing null to terminate the string. Signed-off-by: Sage Weil <sage@redhat.com>
- fix source - include larger sizes Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
These eliminate most callers of buffers(), which exposes the internal list<ptr>. Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
This will let us put policy in create_aligned. Signed-off-by: Sage Weil <sage@redhat.com>
If the alignment is on a page boundary, or the allocation is big, a separate buffer::raw goes faster. The rest of the time, a raw_combined does. Signed-off-by: Sage Weil <sage@redhat.com>
This may as well fit the input; this doesn't relate to the append buffer. Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
…ions We drop some unittest assertions about alloc buffer size. Sorry! Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
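The allocation policy described in the commit messages above (a separate buffer::raw when the alignment is on a page boundary or the allocation is big, raw_combined otherwise) could be sketched roughly as follows. The names alloc_kind, choose_alloc, kPageSize and kBigThreshold are hypothetical, and the 4096-byte page size and two-page cutoff are assumptions for illustration, not values stated in this PR.

```cpp
#include <cstddef>

// Which allocation strategy to use for a request (hypothetical names).
enum class alloc_kind { separate_raw, combined };

// Assumed values for illustration only.
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kBigThreshold = 2 * kPageSize;

// Page-aligned or large requests get a separate raw buffer; everything else
// shares one allocation with its ref-count header.
constexpr alloc_kind choose_alloc(std::size_t len, std::size_t align) {
  return ((align != 0 && align % kPageSize == 0) || len >= kBigThreshold)
             ? alloc_kind::separate_raw
             : alloc_kind::combined;
}
```

With these assumed constants, a 32-byte request with 8-byte alignment takes the combined path, while a 16384-byte request or any page-aligned request takes the separate path.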
58621dc to aa2b891
buffer: raw_combined allocations buffer and ref count together Reviewed-by: Casey Bodley <cbodley@redhat.com>
Before:
[----------] 1 test from Buffer
[ RUN ] Buffer.BenchAlloc
1000000 alloc of size 16384 in 0.505902
1000000 alloc of size 4096 in 0.268716
1000000 alloc of size 1024 in 0.198454
1000000 alloc of size 256 in 0.127251
1000000 alloc of size 32 in 0.100286
1000000 alloc of size 4 in 0.099957
[ OK ] Buffer.BenchAlloc (1301 ms)
[----------] 1 test from Buffer (1301 ms total)
[----------] 1 test from BufferList
[ RUN ] BufferList.BenchAlloc
100000 alloc of size 32768 in 0.335541
100000 alloc of size 25000 in 0.328742
100000 alloc of size 16384 in 0.323622
100000 alloc of size 10000 in 0.302479
100000 alloc of size 8192 in 0.302486
100000 alloc of size 6000 in 0.304038
100000 alloc of size 4096 in 0.304013
100000 alloc of size 1024 in 0.295486
100000 alloc of size 256 in 0.199804
100000 alloc of size 32 in 0.163652
100000 alloc of size 4 in 0.165094
[ OK ] BufferList.BenchAlloc (3025 ms)
[----------] 1 test from BufferList (3025 ms total)
After:
[----------] 1 test from Buffer
[ RUN ] Buffer.BenchAlloc
1000000 alloc of size 16384 in 0.677293
1000000 alloc of size 4096 in 0.232403
1000000 alloc of size 1024 in 0.114915
1000000 alloc of size 256 in 0.108958
1000000 alloc of size 32 in 0.091280
1000000 alloc of size 4 in 0.078020
[ OK ] Buffer.BenchAlloc (1303 ms)
[----------] 1 test from Buffer (1303 ms total)
[----------] 1 test from BufferList
[ RUN ] BufferList.BenchAlloc
100000 alloc of size 32768 in 0.335772
100000 alloc of size 25000 in 0.331753
100000 alloc of size 16384 in 0.328051
100000 alloc of size 10000 in 0.309004
100000 alloc of size 8192 in 0.309229
100000 alloc of size 6000 in 0.233140
100000 alloc of size 4096 in 0.204561
100000 alloc of size 1024 in 0.205416
100000 alloc of size 256 in 0.149929
100000 alloc of size 32 in 0.164062
100000 alloc of size 4 in 0.125882
[ OK ] BufferList.BenchAlloc (2697 ms)
[----------] 1 test from BufferList (2697 ms total)
I can't figure out why the buffer::create(16384) calls are slower in
Buffer.BenchAlloc. It goes from being a straight new raw_char(...)
to the new call, and if I switch it back on the new branch it is still
slower than the original branch... maybe the way the code was generated
vs instruction prefetch or something? Very confusing.
Anyway, aside from that regression, everything else is the same or
faster (in the 10-20% range). More so for small allocations.
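
To put the Buffer.BenchAlloc timings above in relative terms, a small helper like the one below can compute the percent change per size. This is only a sketch: percent_change, sample, buffer_bench and print_changes are hypothetical names, and the numbers are copied from the Before/After output above.

```cpp
#include <cstdio>

// Relative change in elapsed time, in percent; negative means faster.
constexpr double percent_change(double before, double after) {
  return (after - before) / before * 100.0;
}

struct sample {
  unsigned size;   // allocation size in bytes
  double before;   // seconds for 1000000 allocations, without the patch
  double after;    // seconds for 1000000 allocations, with the patch
};

// Buffer.BenchAlloc timings copied from the benchmark output above.
constexpr sample buffer_bench[] = {
  {16384, 0.505902, 0.677293},
  {4096, 0.268716, 0.232403},
  {1024, 0.198454, 0.114915},
  {256, 0.127251, 0.108958},
  {32, 0.100286, 0.091280},
  {4, 0.099957, 0.078020},
};

// Print one line per allocation size with the signed percent change.
void print_changes() {
  for (const sample &s : buffer_bench)
    std::printf("size %5u: %+.1f%%\n", s.size,
                percent_change(s.before, s.after));
}
```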