Avoid needless allocations in `liveness_of_locals`. #51869

nnethercote · 2018-06-28T09:46:43Z

We don't need to replace the heap-allocated bitset, we can just
overwrite its contents.

This speeds up most NLL benchmarks, the best by 1.5%.

nikomatsakis · 2018-06-28T10:21:30Z

src/librustc_data_structures/indexed_set.rs

+
+    /// Overwrite one `IdxSetBuf` with another of the same length.
+    pub fn overwrite(&mut self, other: &IdxSetBuf<T>) {
+        self.bits.clone_from_slice(&other.bits);


This method probably belongs on IdxSet rather than IdxSetBuf (note that IdxSetBuf derefs to IdxSet). The relationship between those two types is analogous to [] and Vec, so most operations that don't change the size or require allocation belong on IdxSet. There is however an existing clone_from method on that type that does the same thing:

https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/indexed_set/struct.IdxSet.html#method.clone_from

But maybe we should rename that method (IdxSet::clone_from) to overwrite? I wonder if we are having shadowing problems because of Clone::clone_from....

You are right about the shadowing. Here's the DHAT output:

==60110== -------------------- 1 of 500 --------------------
==60110== max-live: 217,728 in 1,701 blocks
==60110== tot-alloc: 77,545,592 in 1,330,065 blocks (avg size 58.30)
==60110== deaths: 1,330,065, at avg age 2,374,592 (0.00% of prog lifetime)
==60110== acc-ratios: 0.07 rd, 1.51 wr (5,953,680 b-read, 117,537,792 b-written)
==60110==    at 0x4C2DECF: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so)
==60110==    by 0x7C016B0: alloc (alloc.rs:62)
==60110==    by 0x7C016B0: alloc (alloc.rs:123)
==60110==    by 0x7C016B0: allocate_in<usize,alloc::alloc::Global> (raw_vec.rs:103)
==60110==    by 0x7C016B0: with_capacity<usize> (raw_vec.rs:147)
==60110==    by 0x7C016B0: with_capacity<usize> (vec.rs:362)
==60110==    by 0x7C016B0: to_vec<usize> (slice.rs:167)
==60110==    by 0x7C016B0: to_vec<usize> (slice.rs:369)
==60110==    by 0x7C016B0: clone<usize> (vec.rs:1670)
==60110==    by 0x7C016B0: clone<rustc::mir::Local> (indexed_set.rs:36)
==60110==    by 0x7C016B0: clone_from<rustc_data_structures::indexed_set::IdxSetBuf<rustc::mir::Local>> (clone.rs:131)
==60110==    by 0x7C016B0: rustc_mir::util::liveness::liveness_of_locals (liveness.rs:144)
==60110==    by 0x7B22D4B: compute (liveness.rs:106)
==60110==    by 0x7B22D4B: rustc_mir::borrow_check::nll::compute_regions (mod.rs:101)
==60110==    by 0x7B33381: rustc_mir::borrow_check::do_mir_borrowck (mod.rs:238)

IdxSetBuf implements Clone, and therefore defines clone_from, which is a provided method that calls clone. IdxSet has a clone_from method, which is probably intended to be called, but the IdxSetBuf implementation is called instead.
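The shadowing described here can be reproduced in a small standalone sketch. The names `Set`, `Buf`, `CLONES` and `clones_during` are hypothetical stand-ins, not the compiler's types: `Set` plays the role of `IdxSet` and `Buf` the role of `IdxSetBuf`. `Set` owns a `Vec` only to keep the sketch free of `unsafe`, so the layout is an assumption and does not mirror the real code. A thread-local counter on `Buf::clone` stands in for the allocations that DHAT reports.

```rust
use std::cell::Cell;
use std::ops::{Deref, DerefMut};

thread_local! {
    // Counts calls to `Buf::clone`; a stand-in for counting heap allocations.
    static CLONES: Cell<usize> = Cell::new(0);
}

/// Hypothetical stand-in for `IdxSet`: no `Clone` impl, only an inherent `clone_from`.
pub struct Set {
    bits: Vec<usize>,
}

impl Set {
    /// Cheap: copies the words into the existing storage (lengths must match).
    pub fn clone_from(&mut self, other: &Set) {
        self.bits.clone_from_slice(&other.bits);
    }

    pub fn words(&self) -> &[usize] {
        &self.bits
    }
}

/// Hypothetical stand-in for `IdxSetBuf`.
pub struct Buf {
    set: Set,
}

impl Buf {
    pub fn new(bits: Vec<usize>) -> Buf {
        Buf { set: Set { bits } }
    }
}

impl Deref for Buf {
    type Target = Set;
    fn deref(&self) -> &Set {
        &self.set
    }
}

impl DerefMut for Buf {
    fn deref_mut(&mut self) -> &mut Set {
        &mut self.set
    }
}

impl Clone for Buf {
    // Only `clone` is written here, so `Clone::clone_from` is the provided
    // default, which assigns `source.clone()` to `*self`.
    fn clone(&self) -> Buf {
        CLONES.with(|c| c.set(c.get() + 1));
        Buf::new(self.set.bits.clone())
    }
}

/// Runs `f` and returns how many times `Buf::clone` was called while it ran.
pub fn clones_during(f: impl FnOnce()) -> usize {
    let before = CLONES.with(|c| c.get());
    f();
    CLONES.with(|c| c.get()) - before
}

fn main() {
    let src = Buf::new(vec![0b1010, 0b0101]);
    let mut dst = Buf::new(vec![0, 0]);

    // Resolves to `<Buf as Clone>::clone_from`, found at `&mut Buf` before any deref.
    let via_buf = clones_during(|| dst.clone_from(&src));
    // The receiver is already a `Set`, so this is the inherent `Set::clone_from`.
    let via_set = clones_during(|| (*dst).clone_from(&src));

    println!("b.clone_from: {} clone(s), (*b).clone_from: {} clone(s)", via_buf, via_set);
}
```

In this sketch the first call goes through `Buf::clone` once and the second not at all, even though both leave `dst` with the same contents. Method lookup tries `Buf`, `&Buf` and `&mut Buf` before dereferencing to `Set`, so the trait method wins whenever the receiver is a `Buf`.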

I think renaming IdxSet::clone_from as IdxSet::overwrite is reasonable. I wonder if there are any other places like this, where the clone_from being called is not the one intended.

There aren't many definitions of clone_from, so it wouldn't be hard to audit.

The current situation is something of a mess.
- `IdxSetBuf` derefs to `IdxSet`.
- `IdxSetBuf` implements `Clone`, and therefore has a provided `clone_from` method, which does allocation and so is expensive.
- `IdxSet` has a `clone_from` method that is non-allocating and therefore cheap, but this method is not from the `Clone` trait.

As a result, if you have an `IdxSetBuf` called `b`, calling `b.clone_from(b2)` gets you the expensive `IdxSetBuf` method, but calling `(*b).clone_from(b2)` gets you the cheap `IdxSet` method. `liveness_of_locals()` does the former, presumably unintentionally, and therefore does lots of unnecessary allocations.

Having a `clone_from` method that isn't from the `Clone` trait is a bad idea in general, so this patch renames it as `overwrite`. This avoids the unnecessary allocations in `liveness_of_locals()`, speeding up most NLL benchmarks, the best by 1.5%. It also means that calls of the form `(*b).clone_from(b2)` can be rewritten as `b.overwrite(b2)`.
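A minimal sketch of the effect of the rename, under the same caveats: `Set`, `Buf` and `overwrite_keeps_buffer` are hypothetical names, and the `Vec`-backed `Set` is a simplification rather than the compiler's layout. Once the inherent method has a name that no trait on `Buf` provides, a plain `b.overwrite(b2)` auto-derefs to it, and the destination keeps its existing heap buffer.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical stand-in for `IdxSet`.
pub struct Set {
    bits: Vec<usize>,
}

impl Set {
    /// Overwrites this set's words with `other`'s.
    /// Panics if the lengths differ, as `clone_from_slice` does.
    pub fn overwrite(&mut self, other: &Set) {
        self.bits.clone_from_slice(&other.bits);
    }

    pub fn words(&self) -> &[usize] {
        &self.bits
    }
}

/// Hypothetical stand-in for `IdxSetBuf`.
pub struct Buf {
    set: Set,
}

impl Buf {
    pub fn new(bits: Vec<usize>) -> Buf {
        Buf { set: Set { bits } }
    }
}

impl Deref for Buf {
    type Target = Set;
    fn deref(&self) -> &Set {
        &self.set
    }
}

impl DerefMut for Buf {
    fn deref_mut(&mut self) -> &mut Set {
        &mut self.set
    }
}

// `Buf` is still `Clone`, but `Clone` has no method named `overwrite`,
// so there is nothing left to shadow the inherent method on `Set`.
impl Clone for Buf {
    fn clone(&self) -> Buf {
        Buf::new(self.set.bits.clone())
    }
}

/// Returns true if `dst` still uses the same heap buffer after `dst.overwrite(src)`.
pub fn overwrite_keeps_buffer(dst: &mut Buf, src: &Buf) -> bool {
    let before = dst.words().as_ptr();
    dst.overwrite(src); // auto-derefs to `Set::overwrite`; `&Buf` coerces to `&Set`
    before == dst.words().as_ptr()
}

fn main() {
    let src = Buf::new(vec![0b1010, 0b0101]);
    let mut dst = Buf::new(vec![0, 0]);

    let same_buffer = overwrite_keeps_buffer(&mut dst, &src);
    assert!(same_buffer);
    assert_eq!(dst.words(), src.words());
}
```

The point of the design is that the cheap path no longer depends on the caller remembering to write `(*b)`. The obvious spelling is also the non-allocating one.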

nnethercote · 2018-06-29T01:52:50Z

I have updated. r? @nikomatsakis

nikomatsakis · 2018-06-29T09:57:56Z

rg 'fn clone_from\b' reports:

src/liballoc/binary_heap.rs:288:    fn clone_from(&mut self, source: &Self) {
src/liballoc/boxed.rs:302:    fn clone_from(&mut self, source: &Box<T>) {
src/liballoc/borrow.rs:173:    fn clone_from(&mut self, source: &Cow<'a, B>) {
src/liballoc/vec.rs:1682:    fn clone_from(&mut self, other: &Vec<T>) {
src/liballoc/string.rs:1688:    fn clone_from(&mut self, source: &Self) {
src/libcore/clone.rs:130:    fn clone_from(&mut self, source: &Self) {
src/libcore/mem.rs:1040:    fn clone_from(&mut self, source: &Self) {
src/librustc_data_structures/indexed_set.rs:236:    pub fn clone_from(&mut self, other: &IdxSet<T>) {

So...seems ok. =)

nikomatsakis · 2018-06-29T09:58:19Z

@bors r+

bors · 2018-06-29T09:58:20Z

📌 Commit 08683f0 has been approved by nikomatsakis

bors · 2018-07-01T04:23:36Z

⌛ Testing commit 08683f0 with merge e953e46...

bors · 2018-07-01T06:37:18Z

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing e953e46 to master...

rust-highfive assigned nikomatsakis Jun 28, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 28, 2018

nnethercote mentioned this pull request Jun 28, 2018

[nll] fewer allocations in liveness, use dirty list #51819

Closed

nikomatsakis requested changes Jun 28, 2018

nikomatsakis added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 28, 2018

nnethercote force-pushed the rm-clone_from branch from 0352aee to 08683f0 June 29, 2018 01:50

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 29, 2018

kennytm mentioned this pull request Jun 30, 2018

Rollup of 9 pull requests #51948

Closed

bors merged commit 08683f0 into rust-lang:master Jul 1, 2018

bors mentioned this pull request Jul 1, 2018

introduce dirty list to liveness, eliminate ins vector #51896

Merged

nnethercote deleted the rm-clone_from branch July 1, 2018 23:38
