Lazily allocate TypedArena's first chunk #36592
|
(rust_highfive has picked a reviewer for you, use r? to override) |
|
Some more details. For hyper.0.5.0 this reduces cumulative heap allocations
(These measurements are from Valgrind's DHAT tool, which I used to identify This is due to rustc's frequent use of
The stage2 improvements are smaller, presumably because jemalloc is faster at |
|
Not an official Rust reviewer, but some general thoughts I had when looking through the code. |
| let prev_capacity = chunks.last().unwrap().storage.cap(); | ||
| let new_capacity = prev_capacity.checked_mul(2).unwrap(); | ||
| if chunks.last_mut().unwrap().storage.double_in_place() { | ||
| if chunks.len() == 0 { |
Mark-Simulacrum
Sep 20, 2016
Member
Prefer Vec::is_empty()
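A minimal sketch of the suggested style, covering both the empty and the non-empty test. The function names and the `Vec<u8>` element type are placeholders for illustration and are not taken from the PR.

```rust
/// True when no chunk has been allocated yet.
/// `chunks` stands in for the arena's chunk list (placeholder element type).
fn has_no_chunks(chunks: &[Vec<u8>]) -> bool {
    // Same result as `chunks.len() == 0`, but states the intent directly.
    chunks.is_empty()
}

/// True when at least one chunk exists.
fn has_chunks(chunks: &[Vec<u8>]) -> bool {
    // Same result as `chunks.len() > 0`. The negation is easy to drop by
    // accident, so it is worth double-checking when converting a test.
    !chunks.is_empty()
}
```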
| for mut chunk in chunks_borrow.drain(..last_idx) { | ||
| let cap = chunk.storage.cap(); | ||
| chunk.destroy(cap); | ||
| if chunks_borrow.len() > 0 { |
Mark-Simulacrum
Sep 20, 2016
Member
Prefer !chunks_borrow.is_empty().
| chunk.destroy(cap); | ||
| if chunks_borrow.len() > 0 { | ||
| let last_idx = chunks_borrow.len() - 1; | ||
| self.clear_last_chunk(&mut chunks_borrow[last_idx]); |
Mark-Simulacrum
Sep 20, 2016
Member
Why not chunks_borrow.last_mut()? It might conflict with the drain below, in which case you can use split_at_mut I think.
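A rough sketch of both options mentioned in this comment, using a hypothetical `Chunk` type with a plain `Vec<u32>` as storage (the real arena chunk is different). `last_mut` and `split_at_mut` are the standard slice methods; everything else is assumed for illustration.

```rust
/// Hypothetical stand-in for the arena's chunk type.
struct Chunk {
    entries: Vec<u32>,
}

/// Empties the most recent chunk, if there is one.
/// Returns `false` when the chunk list is empty.
fn clear_last_chunk(chunks: &mut [Chunk]) -> bool {
    // `last_mut` yields `None` for an empty slice, so neither a length
    // check nor a `len() - 1` index is needed.
    match chunks.last_mut() {
        Some(last) => {
            last.entries.clear();
            true
        }
        None => false,
    }
}

/// Empties the last chunk for reuse and releases the storage of all
/// earlier chunks, holding mutable borrows of both parts at once.
fn reset_chunks(chunks: &mut [Chunk]) {
    if chunks.is_empty() {
        return;
    }
    let last_idx = chunks.len() - 1;
    // `split_at_mut` hands out two non-overlapping mutable slices.
    let (earlier, last) = chunks.split_at_mut(last_idx);
    last[0].entries.clear(); // keeps the allocation
    for chunk in earlier {
        chunk.entries = Vec::new(); // drops the allocation
    }
}
```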
| for chunk in chunks_borrow.iter_mut() { | ||
| let cap = chunk.storage.cap(); | ||
| chunk.destroy(cap); | ||
| if chunks_borrow.len() > 0 { |
Mark-Simulacrum
Sep 20, 2016
Member
Prefer is_empty.
|
The code (both preexisting and current) also duplicates the pattern of popping the last element and then draining or mutably iterating over the rest of the chunks to destroy them. Can it be extracted into a helper function? I've discussed this with @nnethercote, who I think believes this would be best done as a follow-up PR; I'm leaving this comment here so the idea doesn't get lost. |
|
I replaced the |
|
In the libcollections convention, Alternatively, Either way, doc comments need updates to not say "preallocated" for new and with_capacity. |
|
The is_empty tests seem to be inverted |
|
Thank you for the comments, @bluss. I fixed the inverted is_empty tests and updated the comments for |
|
I did a quick grep over the rust source code and I don't think |
| let prev_capacity = chunks.last().unwrap().storage.cap(); | ||
| let new_capacity = prev_capacity.checked_mul(2).unwrap(); | ||
| if chunks.last_mut().unwrap().storage.double_in_place() { | ||
| if chunks.is_empty() { |
bluss
Sep 20, 2016
Member
I too would like to rewrite this to use the .last_mut() option for control flow. (None is the empty case). But it needs some wrangling to be able to call chunks.push at the end.
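One possible shape for this, sketched with hypothetical types: the `match` on `last_mut()` computes the next capacity (or returns early if the chunk grew in place), and the mutable borrow ends before `push` runs. `Chunk`, `try_double_in_place`, `can_grow_in_place`, and `FIRST_CHUNK_CAPACITY` are assumptions made for the example, not the PR's code.

```rust
/// Hypothetical capacity (in elements) of the first chunk.
const FIRST_CHUNK_CAPACITY: usize = 8;

/// Hypothetical stand-in for the arena's chunk type.
struct Chunk {
    capacity: usize,
    can_grow_in_place: bool,
}

impl Chunk {
    /// Stand-in for an in-place doubling that may or may not succeed.
    fn try_double_in_place(&mut self) -> bool {
        if self.can_grow_in_place {
            self.capacity *= 2;
            true
        } else {
            false
        }
    }
}

/// Grows the arena and returns the capacity of the chunk now in use.
fn grow(chunks: &mut Vec<Chunk>) -> usize {
    let new_capacity = match chunks.last_mut() {
        // Empty case: this is the lazily allocated first chunk.
        None => FIRST_CHUNK_CAPACITY,
        Some(last) => {
            if last.try_double_in_place() {
                return last.capacity;
            }
            last.capacity.checked_mul(2).expect("capacity overflow")
        }
    };
    // The borrow from `last_mut` has ended, so `push` is allowed here.
    chunks.push(Chunk { capacity: new_capacity, can_grow_in_place: false });
    new_capacity
}
```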
| let cap = chunk.storage.cap(); | ||
| chunk.destroy(cap); | ||
| if !chunks_borrow.is_empty() { | ||
| let mut last_chunk = chunks_borrow.pop().unwrap(); |
bluss
Sep 20, 2016
Member
pop's Option can be used for control flow
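A small sketch of what this could look like in a destructor-style routine, with a hypothetical `Chunk` that only records counts. `if let Some(..) = chunks.pop()` replaces the `is_empty` test followed by `pop().unwrap()`.

```rust
/// Hypothetical stand-in for the arena's chunk type.
struct Chunk {
    capacity: usize, // number of slots in the chunk
    used: usize,     // number of slots that hold live values
}

/// Destroys every chunk and returns how many live entries were dropped.
fn destroy_all(chunks: &mut Vec<Chunk>) -> usize {
    let mut destroyed = 0;
    // `pop` returns `None` when there are no chunks, so the empty case
    // falls through without a separate check.
    if let Some(last) = chunks.pop() {
        // Only the last chunk can be partially filled.
        destroyed += last.used;
        // Every earlier chunk is assumed to be full.
        for chunk in chunks.drain(..) {
            destroyed += chunk.capacity;
        }
    }
    destroyed
}
```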
| for mut chunk in chunks_borrow.drain(..last_idx) { | ||
| let cap = chunk.storage.cap(); | ||
| chunk.destroy(cap); | ||
| if !chunks_borrow.is_empty() { |
bluss
Sep 20, 2016
Member
We could use .pop() here too, drain all other chunks, then put the last chunk back (seems like the simplest way to keep the borrow checker happy).
bluss
Sep 20, 2016
Member
This is not more work than what drain already does.
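The pop, drain, and push-back sequence described in this thread might look roughly like the following. The `Chunk` type and its fields are hypothetical, and the returned count exists only so the sketch has something observable.

```rust
/// Hypothetical stand-in for the arena's chunk type.
struct Chunk {
    capacity: usize, // number of slots in the chunk
    used: usize,     // number of slots that hold live values
}

/// Clears the arena but keeps the last (largest) chunk for reuse.
/// Returns how many live entries were dropped.
fn clear(chunks: &mut Vec<Chunk>) -> usize {
    let mut destroyed = 0;
    if let Some(mut last) = chunks.pop() {
        // `last` is now owned, so it no longer borrows from `chunks`.
        destroyed += last.used;
        last.used = 0;
        // Destroy all earlier chunks, assumed to be full.
        for chunk in chunks.drain(..) {
            destroyed += chunk.capacity;
        }
        // Put the emptied last chunk back as the only chunk.
        chunks.push(last);
    }
    destroyed
}
```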
|
Using the Options for control flow will end up with prettier Rust code |
|
(I haven't ever used the review feature before. I haven't heard any news on how we want to use it in the project.) |
Currently `TypedArena` allocates its first chunk, which is usually 4096 bytes, as soon as it is created. If no allocations are ever made from the arena then this allocation (and the corresponding deallocation) is wasted effort. This commit changes `TypedArena` so it doesn't allocate the first chunk until the first allocation is made. This change speeds up rustc by a non-trivial amount because rustc uses `TypedArena` heavily: compilation speed (producing debug builds) on several of the rustc-benchmarks increases by 1.02--1.06x. The change should never cause a slow-down because the hot `alloc` function is unchanged. It does increase the size of `TypedArena` by one `usize` field, however. The commit also fixes some out-of-date comments.
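To illustrate the idea only, here is a simplified, safe sketch of an arena whose first chunk is allocated on the first `alloc` call. It is not the PR's implementation: `LazyArena`, its `Vec<Vec<T>>` storage, the `&mut self` API, and `chunk_count` are assumptions for the example. The extra `first_chunk_capacity` field mirrors the one-`usize` size increase mentioned above.

```rust
use std::cmp;
use std::mem;

/// Simplified arena: chunks are plain `Vec<T>`s that are never reallocated.
pub struct LazyArena<T> {
    chunks: Vec<Vec<T>>,
    /// Remembered so the first chunk can be sized when it is finally needed.
    first_chunk_capacity: usize,
}

impl<T> LazyArena<T> {
    /// Creates an arena without touching the heap: `Vec::new` does not allocate.
    pub fn new() -> Self {
        // Aim for a first chunk of roughly 4096 bytes.
        let elem_size = cmp::max(1, mem::size_of::<T>());
        LazyArena {
            chunks: Vec::new(),
            first_chunk_capacity: cmp::max(1, 4096 / elem_size),
        }
    }

    /// Stores `value`; the first chunk is allocated on the first call.
    pub fn alloc(&mut self, value: T) -> &mut T {
        let needs_chunk = match self.chunks.last() {
            None => true, // lazy case: nothing allocated yet
            Some(last) => last.len() == last.capacity(),
        };
        if needs_chunk {
            self.grow();
        }
        let last = self.chunks.last_mut().expect("grow pushed a chunk");
        last.push(value); // never reallocates: spare capacity was checked above
        last.last_mut().expect("value was just pushed")
    }

    /// Adds a chunk twice as large as the previous one.
    fn grow(&mut self) {
        let capacity = match self.chunks.last() {
            None => self.first_chunk_capacity,
            Some(last) => last.capacity().checked_mul(2).expect("capacity overflow"),
        };
        self.chunks.push(Vec::with_capacity(capacity));
    }

    /// Number of chunks allocated so far (0 for an unused arena).
    pub fn chunk_count(&self) -> usize {
        self.chunks.len()
    }
}
```

An arena that is created and dropped without any `alloc` call never allocates a chunk, which is the case the PR description says is common in rustc.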
|
I made the requested control flow changes. I haven't changed |
|
Note that |
|
@bors r+ |
|
|
|
Nice catch @nnethercote! The redundant arenas used to not matter because we had 1 |
|
Now that I have a better idea of how rustc-benchmarks works, here are some stage 1 (uses glibc malloc)
stage2 (uses jemalloc)
With glibc malloc they're mostly in the range 1.03--1.06x faster. With jemalloc |
|
@nnethercote FWIW I've been meaning to eventually move to a single common drop-less arena (instead of a dozen typed ones), but there were things to rework first to make that even possible - we're almost there, in fact |
| _own: PhantomData, | ||
| } | ||
| TypedArena { | ||
| first_chunk_capacity: cmp::max(1, capacity), |
eddyb
Sep 21, 2016
Member
If with_capacity isn't used, I think it'd be worth just not having first_chunk_capacity around at all.
nnethercote
Sep 21, 2016
Author
Contributor
Good suggestion. I'll file a follow-up PR to remove with_capacity once this one lands.
eddyb
Sep 21, 2016
Member
Well, this PR would be simpler if it also did that change, I'm saying. I'd r+ it immediately and this PR will have to wait at least half a day more before getting merged, so you have time now.
[breaking-change] Remove TypedArena::with_capacity This is a follow-up to #36592. The function is unused by rustc. Also, it doesn't really follow the usual meaning of a `with_capacity` function because the first chunk allocation is now delayed until the first `alloc` call. This change reduces the size of `TypedArena` by one `usize`. @eddyb: we discussed this on IRC. Would you like to review it?