Make `Vec::push` as non-generic as possible #91848

nnethercote · 2021-12-13T04:03:01Z

Vec::push is a frequently instantiated generic function, and reducing the amount of generic code in functions it calls (e.g. RawVec::grow_amortized has been a performance win in the past, e.g. #72013, #91246. But some additional attempts also failed, e.g. #72189, #73912, #75093, #75129.

This PR goes deeper than those old failed attemps. It makes RawVec::grow_amortized() about as non-generic as possible. Everything remaining in it involves self, or a T-dependent operation, or is passing stuff to the non-generic finish_grow_amortized().

r? @ghost

As long as these functions are called after `needs_to_grow()`, the capacity overflow check has the same result as the special zero-sized type check. This commit removes that check, documents the `needs_to_grow()` constraint, and updates the test accordingly.

It reduces generated code size a bit more.

This factors out common code at the end of `RawVec::finish_grow_exact` and `RawVec::finish_grow_amortized`.

nnethercote · 2021-12-13T04:03:50Z

@bors try @rust-timer queue

cc @SimonSapin, viz. #72013 (comment)

rust-timer · 2021-12-13T04:03:51Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-12-13T04:03:58Z

⌛ Trying commit 24041c3 with merge f09ca0f6143be1dbe123a38a376f2400b4a86e57...

nnethercote · 2021-12-13T04:06:47Z

Results locally have been mixed. It definitely reduces the amount of LLVM IR generated. I think syn is the benchmark most affected, for a debug build the number of lines drops from 98,141 to 95,945, which is a 2.2% drop. But it feels like there is a lot of randomness in the results, things shifting around unpredictably, possibly due to changes in what is inlined, and possibly due to how CGUs are constructed. Let's see how the CI run goes.

bors · 2021-12-13T05:54:25Z

☀️ Try build successful - checks-actions
Build commit: f09ca0f6143be1dbe123a38a376f2400b4a86e57 (f09ca0f6143be1dbe123a38a376f2400b4a86e57)

rust-timer · 2021-12-13T05:54:26Z

Queued f09ca0f6143be1dbe123a38a376f2400b4a86e57 with parent f7fd79a, future comparison URL.

rust-timer · 2021-12-13T07:25:01Z

Finished benchmarking commit (f09ca0f6143be1dbe123a38a376f2400b4a86e57): comparison url.

Summary: This change led to large relevant mixed results 🤷 in compiler performance.

Large improvement in instruction counts (up to -3.8% on incr-unchanged builds of deep-vector)
Large regression in instruction counts (up to 3.0% on full builds of html5ever)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

nnethercote · 2021-12-14T00:18:32Z

The doc builds lead the regressions. I've learned that rustdoc calls push_str a lot, which ends up calling this sequence of functions: String::push_str / Vec::extend_from_slice / Vec::spec_extend /
Vec::append_elements / RawVec::reserve.

So now I need to determine why that code path got slower, and if I can fix it.

nnethercote · 2021-12-15T07:39:30Z

This is a tar pit. After many tries I can't do better than a mixture of improvements and regressions. Time to give up.

nnethercote added 5 commits December 13, 2021 11:36

Inline Layout::array() into RawVec::finish_grow().

bceeb90

Split finish_grow in two, and move capacity computations.

7b820b0

Remove RawVec::capacity_from_bytes and RawVec::set_ptr.

3ee8933

It reduces generated code size a bit more.

Add RawVec::finish_grow().

24041c3

This factors out common code at the end of `RawVec::finish_grow_exact` and `RawVec::finish_grow_amortized`.

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 13, 2021

rustbot added perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 13, 2021

camelid added T-libs Relevant to the library team, which will review and decide on the PR/issue. I-heavy Issue: Problems and improvements with respect to binary size of generated code. labels Dec 13, 2021

nnethercote closed this Dec 15, 2021

nnethercote deleted the RawVec-full-degen branch December 16, 2021 10:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make `Vec::push` as non-generic as possible #91848

Make `Vec::push` as non-generic as possible #91848

nnethercote commented Dec 13, 2021

nnethercote commented Dec 13, 2021

rust-timer commented Dec 13, 2021

bors commented Dec 13, 2021

nnethercote commented Dec 13, 2021 •

edited

Loading

bors commented Dec 13, 2021

rust-timer commented Dec 13, 2021

rust-timer commented Dec 13, 2021

nnethercote commented Dec 14, 2021

nnethercote commented Dec 15, 2021

Make Vec::push as non-generic as possible #91848

Make Vec::push as non-generic as possible #91848

Conversation

nnethercote commented Dec 13, 2021

nnethercote commented Dec 13, 2021

rust-timer commented Dec 13, 2021

bors commented Dec 13, 2021

nnethercote commented Dec 13, 2021 • edited Loading

bors commented Dec 13, 2021

rust-timer commented Dec 13, 2021

rust-timer commented Dec 13, 2021

nnethercote commented Dec 14, 2021

nnethercote commented Dec 15, 2021

Make `Vec::push` as non-generic as possible #91848

Make `Vec::push` as non-generic as possible #91848

nnethercote commented Dec 13, 2021 •

edited

Loading