Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Vec::push as non-generic as possible #91848

Closed
wants to merge 5 commits into from

Conversation

nnethercote
Copy link
Contributor

Vec::push is a frequently instantiated generic function, and reducing the amount of generic code in functions it calls (e.g. RawVec::grow_amortized has been a performance win in the past, e.g. #72013, #91246. But some additional attempts also failed, e.g. #72189, #73912, #75093, #75129.

This PR goes deeper than those old failed attemps. It makes RawVec::grow_amortized() about as non-generic as possible. Everything remaining in it involves self, or a T-dependent operation, or is passing stuff to the non-generic finish_grow_amortized().

r? @ghost

As long as these functions are called after `needs_to_grow()`, the
capacity overflow check has the same result as the special zero-sized
type check.

This commit removes that check, documents the `needs_to_grow()`
constraint, and updates the test accordingly.
It reduces generated code size a bit more.
This factors out common code at the end of `RawVec::finish_grow_exact`
and `RawVec::finish_grow_amortized`.
@nnethercote
Copy link
Contributor Author

@bors try @rust-timer queue

cc @SimonSapin, viz. #72013 (comment)

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 13, 2021
@bors
Copy link
Contributor

bors commented Dec 13, 2021

⌛ Trying commit 24041c3 with merge f09ca0f6143be1dbe123a38a376f2400b4a86e57...

@nnethercote
Copy link
Contributor Author

nnethercote commented Dec 13, 2021

Results locally have been mixed. It definitely reduces the amount of LLVM IR generated. I think syn is the benchmark most affected, for a debug build the number of lines drops from 98,141 to 95,945, which is a 2.2% drop. But it feels like there is a lot of randomness in the results, things shifting around unpredictably, possibly due to changes in what is inlined, and possibly due to how CGUs are constructed. Let's see how the CI run goes.

@bors
Copy link
Contributor

bors commented Dec 13, 2021

☀️ Try build successful - checks-actions
Build commit: f09ca0f6143be1dbe123a38a376f2400b4a86e57 (f09ca0f6143be1dbe123a38a376f2400b4a86e57)

@rust-timer
Copy link
Collaborator

Queued f09ca0f6143be1dbe123a38a376f2400b4a86e57 with parent f7fd79a, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (f09ca0f6143be1dbe123a38a376f2400b4a86e57): comparison url.

Summary: This change led to large relevant mixed results 🤷 in compiler performance.

  • Large improvement in instruction counts (up to -3.8% on incr-unchanged builds of deep-vector)
  • Large regression in instruction counts (up to 3.0% on full builds of html5ever)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

@rustbot rustbot added perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 13, 2021
@camelid camelid added T-libs Relevant to the library team, which will review and decide on the PR/issue. I-heavy Issue: Problems and improvements with respect to binary size of generated code. labels Dec 13, 2021
@nnethercote
Copy link
Contributor Author

The doc builds lead the regressions. I've learned that rustdoc calls push_str a lot, which ends up calling this sequence of functions: String::push_str / Vec::extend_from_slice / Vec::spec_extend /
Vec::append_elements / RawVec::reserve.

So now I need to determine why that code path got slower, and if I can fix it.

@nnethercote
Copy link
Contributor Author

This is a tar pit. After many tries I can't do better than a mixture of improvements and regressions. Time to give up.

@nnethercote nnethercote deleted the RawVec-full-degen branch December 16, 2021 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-heavy Issue: Problems and improvements with respect to binary size of generated code. perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants