Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not inline finish_grow #78682

Merged
merged 1 commit into from
Dec 15, 2020
Merged

Do not inline finish_grow #78682

merged 1 commit into from
Dec 15, 2020

Conversation

glandium
Copy link
Contributor

@glandium glandium commented Nov 2, 2020

Fixes #78471.

Looking at libgkrust.a in Firefox, the sizes for the gkrust.*.o file is:

  • 18584816 (text) 582418 (data) with unmodified master
  • 17937659 (text) 582554 (data) with Tiny Vecs are dumb. #72227 reverted
  • 17968228 (text) 582858 (data) with #[inline(never)] on grow_amortized and grow_exact, but that has some performance consequences
  • 17927760 (text) 582322 (data) with this change

So in terms of size, at least in the case of Firefox, this patch more than undoes the regression. I don't think it should affect performance, but we'll see.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 2, 2020
@glandium
Copy link
Contributor Author

glandium commented Nov 2, 2020

If I look at libgkrust.a in Firefox, the .text size for the gkrust.*.o file is:

  • 19118203 with current master
  • 18402141 with Tiny Vecs are dumb. #72227 reverted
  • 18356853 with #[inline(never)] on grow_amortized and grow_exact
  • 18337894 with this patch

So in terms of size, at least in the case of Firefox, this patch more than undoes the regression. I don't think it should affect performance, but we'll see.

@jyn514 jyn514 added I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Nov 2, 2020
@jyn514
Copy link
Member

jyn514 commented Nov 2, 2020

Would it make sense to instead mark grow_amortized and grow_exact as inline(never)?

@jyn514
Copy link
Member

jyn514 commented Nov 2, 2020

Oh I missed #78483, sorry.

@gparker42
Copy link

This looks like the right inlining boundary as intended by @nnethercote. finish_grow deliberately handles as much type-independent work as possible, so a single copy could be shared by all Vec types.

#78471 reported grow_amortized being inlined all the way back into callers of Vec::push. @glandium, do you see that in your libgkrust.a? That sounds unprofitable - I'd expect the call to RawVec::reserve to be another inlining boundary - but I couldn't reproduce the reported aggressive inlining in my own small tests.

@glandium
Copy link
Contributor Author

glandium commented Nov 3, 2020

It may only happen with -Clto.

@gparker42
Copy link

Ah, that looks like it: I need both -Clto and a sufficiently large inline-threshold (larger than the opt-level=3 default).

Can you build your libgkrust.a with an inlining boundary at RawVec::reserve? That reduces the Vec::push inlined size to around ten instructions. (I'm still getting set up for code size experiments here.)

@lcnr
Copy link
Contributor

lcnr commented Nov 3, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Nov 3, 2020

⌛ Trying commit 068782448bdf657c6be701c14835d03e1d322c90 with merge 452835b56923f7ef0f8d102f1e96c939cec7af07...

@bors
Copy link
Contributor

bors commented Nov 3, 2020

☀️ Try build successful - checks-actions
Build commit: 452835b56923f7ef0f8d102f1e96c939cec7af07 (452835b56923f7ef0f8d102f1e96c939cec7af07)

@rust-timer
Copy link
Collaborator

Queued 452835b56923f7ef0f8d102f1e96c939cec7af07 with parent 7b5a9e9, future comparison URL.

@glandium
Copy link
Contributor Author

glandium commented Nov 3, 2020

Ah, that looks like it: I need both -Clto and a sufficiently large inline-threshold (larger than the opt-level=3 default).

It happens in Firefox with -Clto -Copt-level=2.

Can you build your libgkrust.a with an inlining boundary at RawVec::reserve? That reduces the Vec::push inlined size to around ten instructions. (I'm still getting set up for code size experiments here.)

That would likely make things slower, similarly to #78483.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (452835b56923f7ef0f8d102f1e96c939cec7af07): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-perf

@gparker42
Copy link

Can you build your libgkrust.a with an inlining boundary at RawVec::reserve? That reduces the Vec::push inlined size to around ten instructions. (I'm still getting set up for code size experiments here.)

That would likely make things slower, similarly to #78483.

I'm betting that reserve can be a more profitable inlining boundary than grow_amortized was, but I'll tinker with that on my own.

@nnethercote
Copy link
Contributor

I'm betting that reserve can be a more profitable inlining boundary than grow_amortized was, but I'll tinker with that on my own.

I tried a number of things in and around reserve, results were sensitive and unpredictable and I couldn't get improvements.

This patch seems fine, though. I'm a little surprised I didn't never-inline finish_grow myself.

@bors r+

@bors
Copy link
Contributor

bors commented Nov 6, 2020

📌 Commit 068782448bdf657c6be701c14835d03e1d322c90 has been approved by nnethercote

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 6, 2020
@bors
Copy link
Contributor

bors commented Nov 6, 2020

⌛ Testing commit 068782448bdf657c6be701c14835d03e1d322c90 with merge 1052db144c002891ffbd4517ea0071a222b52070...

@bors
Copy link
Contributor

bors commented Nov 6, 2020

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Nov 6, 2020
@glandium
Copy link
Contributor Author

glandium commented Nov 6, 2020

/checkout/src/test/codegen/vec-iter-collect-len.rs:9:16: error: CHECK-NOT: excluded string found in input
 // CHECK-NOT: call
               ^
/checkout/obj/build/i686-unknown-linux-gnu/test/codegen/vec-iter-collect-len/vec-iter-collect-len.ll:134:2: note: found here
 call void @llvm.lifetime.start.p0i8(i64 12, i8* nonnull %0)
 ^~~~

How can one work around that?

@jyn514
Copy link
Member

jyn514 commented Nov 12, 2020

$ git log --oneline src/test/codegen/vec-iter-collect-len.rs
2a663555ddf Remove licenses
1001b2beee5 inore some codegen tests when debug assertions are enabled
f67453729c1 std: Ensure OOM is classified as `nounwind`
$ git show f67453729c1 src/test/codegen/vec-iter-collect-len.rs
commit f67453729c19b435686c94936d8145051e7f1284
Author: Alex Crichton <alex@alexcrichton.com>
Date:   Thu May 24 12:03:05 2018 -0700

    std: Ensure OOM is classified as `nounwind`
    
    OOM can't unwind today, and historically it's been optimized as if it can't
    unwind. This accidentally regressed with recent changes to the OOM handler, so
    this commit adds in a codegen test to assert that everything gets optimized away
    after the OOM function is approrpiately classified as nounwind
    
    Closes #50925


added: src/test/codegen/vec-iter-collect-len.rs
────────────────────────────────────────────────────────────────────────────────────

1
// Copyright 2017 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

// no-system-llvm
// compile-flags: -O
#![crate_type="lib"]

#[no_mangle]
pub fn get_len() -> usize {
    // CHECK-LABEL: @get_len
    // CHECK-NOT: call
    // CHECK-NOT: invoke
    [1, 2, 3].iter().collect::<Vec<_>>().len()
}

@alexcrichton do you understand why this change could have broken the test?

@glandium
Copy link
Contributor Author

glandium commented Dec 8, 2020

FTR, I already validated that it still optimizes the collect().len() case from vec-iter-collect-len.rs.

@glandium
Copy link
Contributor Author

glandium commented Dec 9, 2020

Should we queue another round of perf tests for the SpecFromIterNested change?

@jyn514
Copy link
Member

jyn514 commented Dec 9, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Dec 9, 2020

⌛ Trying commit 76bd145 with merge 1618e13494fc48a6834838c2d90011e29f0434ea...

@bors
Copy link
Contributor

bors commented Dec 9, 2020

☀️ Try build successful - checks-actions
Build commit: 1618e13494fc48a6834838c2d90011e29f0434ea (1618e13494fc48a6834838c2d90011e29f0434ea)

@rust-timer
Copy link
Collaborator

Queued 1618e13494fc48a6834838c2d90011e29f0434ea with parent c16d52d, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (1618e13494fc48a6834838c2d90011e29f0434ea): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-perf

@jyn514
Copy link
Member

jyn514 commented Dec 10, 2020

Wow, that is a really large savings in max-rss 🎉

@the8472
Copy link
Member

the8472 commented Dec 10, 2020

max-rss is really noisy, compare to the noise run which shows similarly-sized fluctuations on other crates.

@lcnr
Copy link
Contributor

lcnr commented Dec 14, 2020

i'm a bit out of the loop here, sry

It looks like it is fine to land this PR as is rn and I would be fine with landing if #78682 (comment) is still up to date. It might even make sense to move that summary into the PR description so that it's added to the merge commit.

@bors delegate+

@bors
Copy link
Contributor

bors commented Dec 14, 2020

✌️ @glandium can now approve this pull request

@glandium
Copy link
Contributor Author

I'll get new numbers and paste them in the PR message. I've never used bors myself, do I @-it with "rollup" when I'm done?

@jyn514
Copy link
Member

jyn514 commented Dec 14, 2020

@glandium you would say @bors r=lcnr rollup=never. rollup=always is only for PRs that don't affect perf or are trivial, like fixing typos in comments.

@bors
Copy link
Contributor

bors commented Dec 14, 2020

📌 Commit 76bd145 has been approved by lcnr

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 14, 2020
@jyn514
Copy link
Member

jyn514 commented Dec 14, 2020

damn it bors

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Dec 14, 2020
@glandium
Copy link
Contributor Author

@bors r=lcnr rollup=never

@bors
Copy link
Contributor

bors commented Dec 15, 2020

📌 Commit 76bd145 has been approved by lcnr

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 15, 2020
@bors
Copy link
Contributor

bors commented Dec 15, 2020

⌛ Testing commit 76bd145 with merge e261649...

@bors
Copy link
Contributor

bors commented Dec 15, 2020

☀️ Test successful - checks-actions
Approved by: lcnr
Pushing e261649 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 15, 2020
@bors bors merged commit e261649 into rust-lang:master Dec 15, 2020
@rustbot rustbot added this to the 1.50.0 milestone Dec 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-heavy Issue: Problems and improvements with respect to binary size of generated code. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Code bloat from RawVec::grow_amortized