Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upGitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
Avoid needless allocations in `liveness_of_locals`. #51869
Conversation
|
|
||
| /// Overwrite one `IdxSetBuf` with another of the same length. | ||
| pub fn overwrite(&mut self, other: &IdxSetBuf<T>) { | ||
| self.bits.clone_from_slice(&other.bits); |
nikomatsakis
Jun 28, 2018
Contributor
This method probably belongs on IdxSet rather than IdxSetBuf (note that IdxSetBuf derefs to IdxSet). The relationship between those two types is analogous to [] and Vec, so most operations that don't change the size or require allocation belong on IdxSet. There is however an existing clone_from method on that type that does the same thing:
But maybe we should rename that method (IdxSet::clone_from) to overwrite? I wonder if we are having shadowing problems because of Clone::clone_from....
This method probably belongs on IdxSet rather than IdxSetBuf (note that IdxSetBuf derefs to IdxSet). The relationship between those two types is analogous to [] and Vec, so most operations that don't change the size or require allocation belong on IdxSet. There is however an existing clone_from method on that type that does the same thing:
But maybe we should rename that method (IdxSet::clone_from) to overwrite? I wonder if we are having shadowing problems because of Clone::clone_from....
nnethercote
Jun 28, 2018
Author
Contributor
You are right about the shadowing. Here's the DHAT output:
==60110== -------------------- 1 of 500 --------------------
==60110== max-live: 217,728 in 1,701 blocks
==60110== tot-alloc: 77,545,592 in 1,330,065 blocks (avg size 58.30)
==60110== deaths: 1,330,065, at avg age 2,374,592 (0.00% of prog lifetime)
==60110== acc-ratios: 0.07 rd, 1.51 wr (5,953,680 b-read, 117,537,792 b-written)
==60110== at 0x4C2DECF: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so)
==60110== by 0x7C016B0: alloc (alloc.rs:62)
==60110== by 0x7C016B0: alloc (alloc.rs:123)
==60110== by 0x7C016B0: allocate_in<usize,alloc::alloc::Global> (raw_vec.rs:103)
==60110== by 0x7C016B0: with_capacity<usize> (raw_vec.rs:147)
==60110== by 0x7C016B0: with_capacity<usize> (vec.rs:362)
==60110== by 0x7C016B0: to_vec<usize> (slice.rs:167)
==60110== by 0x7C016B0: to_vec<usize> (slice.rs:369)
==60110== by 0x7C016B0: clone<usize> (vec.rs:1670)
==60110== by 0x7C016B0: clone<rustc::mir::Local> (indexed_set.rs:36)
==60110== by 0x7C016B0: clone_from<rustc_data_structures::indexed_set::IdxSetBuf<rustc::mir::Local>> (clone.rs:131)
==60110== by 0x7C016B0: rustc_mir::util::liveness::liveness_of_locals (liveness.rs:144)
==60110== by 0x7B22D4B: compute (liveness.rs:106)
==60110== by 0x7B22D4B: rustc_mir::borrow_check::nll::compute_regions (mod.rs:101)
==60110== by 0x7B33381: rustc_mir::borrow_check::do_mir_borrowck (mod.rs:238)
IdxSetBuf implements Clone, and therefore defines clone_from, which is a provided method that calls clone. IdxSet has a clone_from method, which is probably intended to be called, but the IdxSetBuf implementation is called instead.
I think renaming IdxSet::clone_from as IdxSet::overwrite is reasonable. I wonder if there are any other places like this, where the clone_from being called is not the one intended.
You are right about the shadowing. Here's the DHAT output:
==60110== -------------------- 1 of 500 --------------------
==60110== max-live: 217,728 in 1,701 blocks
==60110== tot-alloc: 77,545,592 in 1,330,065 blocks (avg size 58.30)
==60110== deaths: 1,330,065, at avg age 2,374,592 (0.00% of prog lifetime)
==60110== acc-ratios: 0.07 rd, 1.51 wr (5,953,680 b-read, 117,537,792 b-written)
==60110== at 0x4C2DECF: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so)
==60110== by 0x7C016B0: alloc (alloc.rs:62)
==60110== by 0x7C016B0: alloc (alloc.rs:123)
==60110== by 0x7C016B0: allocate_in<usize,alloc::alloc::Global> (raw_vec.rs:103)
==60110== by 0x7C016B0: with_capacity<usize> (raw_vec.rs:147)
==60110== by 0x7C016B0: with_capacity<usize> (vec.rs:362)
==60110== by 0x7C016B0: to_vec<usize> (slice.rs:167)
==60110== by 0x7C016B0: to_vec<usize> (slice.rs:369)
==60110== by 0x7C016B0: clone<usize> (vec.rs:1670)
==60110== by 0x7C016B0: clone<rustc::mir::Local> (indexed_set.rs:36)
==60110== by 0x7C016B0: clone_from<rustc_data_structures::indexed_set::IdxSetBuf<rustc::mir::Local>> (clone.rs:131)
==60110== by 0x7C016B0: rustc_mir::util::liveness::liveness_of_locals (liveness.rs:144)
==60110== by 0x7B22D4B: compute (liveness.rs:106)
==60110== by 0x7B22D4B: rustc_mir::borrow_check::nll::compute_regions (mod.rs:101)
==60110== by 0x7B33381: rustc_mir::borrow_check::do_mir_borrowck (mod.rs:238)
IdxSetBuf implements Clone, and therefore defines clone_from, which is a provided method that calls clone. IdxSet has a clone_from method, which is probably intended to be called, but the IdxSetBuf implementation is called instead.
I think renaming IdxSet::clone_from as IdxSet::overwrite is reasonable. I wonder if there are any other places like this, where the clone_from being called is not the one intended.
nikomatsakis
Jun 29, 2018
Contributor
There aren't many definitions of clone_from, so it wouldn't be hard to audit.
There aren't many definitions of clone_from, so it wouldn't be hard to audit.
The current situation is something of a mess. - `IdxSetBuf` derefs to `IdxSet`. - `IdxSetBuf` implements `Clone`, and therefore has a provided `clone_from` method, which does allocation and so is expensive. - `IdxSet` has a `clone_from` method that is non-allocating and therefore cheap, but this method is not from the `Clone` trait. As a result, if you have an `IdxSetBuf` called `b`, if you call `b.clone_from(b2)` you'll get the expensive `IdxSetBuf` method, but if you call `(*b).clone_from(b2)` you'll get the cheap `IdxSetBuf` method. `liveness_of_locals()` does the former, presumably unintentionally, and therefore does lots of unnecessary allocations. Having a `clone_from` method that isn't from the `Clone` trait is a bad idea in general, so this patch renames it as `overwrite`. This avoids the unnecessary allocations in `liveness_of_locals()`, speeding up most NLL benchmarks, the best by 1.5%. It also means that calls of the form `(*b).clone_from(b2)` can be rewritten as `b.overwrite(b2)`.
|
I have updated. r? @nikomatsakis |
|
So...seems ok. =) |
|
@bors r+ |
|
|
…tsakis Avoid needless allocations in `liveness_of_locals`. We don't need to replace the heap-allocated bitset, we can just overwrite its contents. This speeds up most NLL benchmarks, the best by 1.5%. r? @nikomatsakis
Rollup of 9 pull requests Successful merges: - #51511 (Stabilize Iterator::flatten in 1.29, fixes #48213.) - #51853 (Fix some doc links) - #51864 (Update liblibc) - #51869 (Avoid needless allocations in `liveness_of_locals`.) - #51883 (Suggest correct comparison against negative literal) - #51890 (Fix inconsequential typo in GlobalAlloc doc example) - #51920 (use literal span for concrete type suggestion) - #51921 (improve the error message when `#[panic_implementation]` is missing) - #51931 (Use in-tree libbacktrace on Fuchsia) Failed merges: r? @ghost
Avoid needless allocations in `liveness_of_locals`. We don't need to replace the heap-allocated bitset, we can just overwrite its contents. This speeds up most NLL benchmarks, the best by 1.5%. r? @nikomatsakis
|
|
We don't need to replace the heap-allocated bitset, we can just
overwrite its contents.
This speeds up most NLL benchmarks, the best by 1.5%.
r? @nikomatsakis