-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-6450: [C++] Use 2x reallocation strategy in BufferBuilder instead of 1.5x #5270
Conversation
cpp/src/arrow/buffer_builder.h
Outdated
// Otherwise overallocate by 1.5 to keep a linear amortized cost. | ||
// TODO: revisit this? See comment in BufferOutputStream::Reserve. | ||
return std::max(new_capacity, current_capacity * 3 / 2); | ||
// Doubling capacity except for large Reserve requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider commenting that doubling works well with JEMalloc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to work well with the system allocator as well. I guess the 1.5 factor is a good choice for smallish allocations, but less so once the allocator (re)allocates entire pages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why "except for large Reserve requests"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason I thought we got rid of this a long time ago.
What are the benchmark results for |
Btw it's impressive how far behind the system allocator is. |
Here are some benchmarks (Ubuntu 18.04, Ryzen 7 1700): 1.5x and jemalloc
2x and jemalloc
|
I think the results I showed last night were affected by CPU throttling. I'm going to update the PR description with higher quality results, and use |
Updated the PR description. The jemalloc performance difference I show is similar to what @pitrou shows. The system allocator perf is much better for 2x than 1.5x, though, so that supports the decision (though we obviously want people to be using jemalloc) |
+1 |
As Antoine noted, 2x seems to have a small benefit with jemalloc and a more significant benefit with the system allocator.
1.5x with jemalloc (i9-9960X on Ubuntu 18.04)
2x and jemalloc
1.5x and system allocator
2x and system allocator