bjorn3
Member

@bjorn3 bjorn3 commented Sep 5, 2025

This is likely the cause of the perf regression in #145955. It also caused some functional regressions.

Fixes #146235
Fixes #146239

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 5, 2025
@bjorn3
Member Author

bjorn3 commented Sep 5, 2025

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Sep 5, 2025
Make the allocator shim participate in LTO again
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 5, 2025
@bjorn3 bjorn3 force-pushed the lto_allocator_shim branch from 4ec9a9b to d0e65a9 Compare September 5, 2025 10:04
@bjorn3
Member Author

bjorn3 commented Sep 5, 2025

Forgot to revert the changes in exported_symbols_for_lto. This shouldn't affect the perf run other than possibly showing less of a performance improvement than it should give in the end.

@bjorn3 bjorn3 mentioned this pull request Sep 5, 2025
@rust-bors

rust-bors bot commented Sep 5, 2025

☀️ Try build successful (CI)
Build commit: 5ab6398 (5ab63980021f7c1ae280eba3261d66240d594007, parent: ad85bc524b1ad696e42061ad8338d382dffbdbe5)

@rustbot
Collaborator

rustbot commented Sep 5, 2025

r? @fee1-dead

rustbot has assigned @fee1-dead.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 5, 2025
@rustbot
Collaborator

rustbot commented Sep 5, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

@lqd
Member

lqd commented Sep 5, 2025

This test reproduces some form of these other two issues (without this PR, it fails with rust-lld: error: undefined hidden symbol: __rustc::__rg_oom). Can you add it to the PR?

//@ compile-flags: --crate-type cdylib -C lto 

use std::alloc::{GlobalAlloc, Layout};

struct MyAllocator;

unsafe impl GlobalAlloc for MyAllocator {
    unsafe fn alloc(&self, _layout: Layout) -> *mut u8 {
        todo!()
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
    }
}

#[global_allocator]
static GLOBAL: MyAllocator = MyAllocator;

> Forgot to revert the changes in exported_symbols_for_lto

You've since done this, IIUC.

@bjorn3
Member Author

bjorn3 commented Sep 5, 2025

Thanks! Had to modify it slightly to work with compiletest.

> > Forgot to revert the changes in exported_symbols_for_lto
>
> You've since done this, IIUC.

Correct

@lqd
Member

lqd commented Sep 5, 2025

Otherwise this looks good to me and fixes the regressions, so that's great, thanks!

I'm not sure we care about the perf results, but they should be available in 3-4 hours. You can r=me at your preference.

@bjorn3 bjorn3 force-pushed the lto_allocator_shim branch from e10f5b6 to e072d7d Compare September 5, 2025 15:02
@rust-timer
Collaborator

Finished benchmarking commit (5ab6398): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.2% | [0.1%, 0.3%] | 2 |
| Improvements ✅ (primary) | -1.3% | [-27.1%, -0.3%] | 229 |
| Improvements ✅ (secondary) | -1.1% | [-46.7%, -0.0%] | 264 |
| All ❌✅ (primary) | -1.3% | [-27.1%, -0.3%] | 229 |

Max RSS (memory usage)

Results (primary 1.8%, secondary -2.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 7.4% | [7.4%, 7.4%] | 1 |
| Regressions ❌ (secondary) | 2.1% | [1.4%, 2.7%] | 5 |
| Improvements ✅ (primary) | -0.9% | [-1.4%, -0.5%] | 2 |
| Improvements ✅ (secondary) | -4.5% | [-6.5%, -2.5%] | 9 |
| All ❌✅ (primary) | 1.8% | [-1.4%, 7.4%] | 3 |

Cycles

Results (primary -13.2%, secondary -8.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.9% | [2.9%, 2.9%] | 1 |
| Improvements ✅ (primary) | -13.2% | [-24.6%, -2.4%] | 6 |
| Improvements ✅ (secondary) | -10.8% | [-44.4%, -1.6%] | 6 |
| All ❌✅ (primary) | -13.2% | [-24.6%, -2.4%] | 6 |

Binary size

Results (primary 53.0%, secondary 59.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 70.9% | [70.9%, 71.0%] | 3 |
| Regressions ❌ (secondary) | 118.8% | [118.8%, 118.8%] | 1 |
| Improvements ✅ (primary) | -0.9% | [-0.9%, -0.9%] | 1 |
| Improvements ✅ (secondary) | -0.4% | [-0.4%, -0.4%] | 1 |
| All ❌✅ (primary) | 53.0% | [-0.9%, 71.0%] | 4 |

Bootstrap: 467.829s -> 466.151s (-0.36%)
Artifact size: 390.58 MiB -> 387.87 MiB (-0.69%)
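
For readers who want to check the percentages in the two summary lines above, the relative change is simply (after - before) / before. A minimal sketch follows; the helper names `percent_change` and `format_change` are made up for illustration and are not part of rust-timer.

```rust
/// Relative change from `before` to `after`, expressed in percent.
/// Hypothetical helper, not taken from the perf tooling.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// Formats the change with an explicit sign and two decimals,
/// matching the style of the summary lines.
fn format_change(before: f64, after: f64) -> String {
    format!("{:+.2}%", percent_change(before, after))
}
```

Calling `format_change(467.829, 466.151)` reproduces the bootstrap figure, and `format_change(390.58, 387.87)` reproduces the artifact size figure.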

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 5, 2025
@rustbot rustbot added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Sep 5, 2025
@bjorn3
Member Author

bjorn3 commented Sep 5, 2025

It improves things even more than it previously regressed.

$ echo 'fn main() {}' | RUSTC_BOOTSTRAP=1 rustc +nightly-2025-09-01  - -Zhuman-readable-cgu-names -O -Csave-temps -Clto=true && ls -l
-rwxrwxr-x 1 bjorn bjorn 1772392  5 sep 20:53 rust_out
-rw-rw-r-- 1 bjorn bjorn 2609776  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.bc
-rw-rw-r-- 1 bjorn bjorn 6979364  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.lto.after-restriction.bc
-rw-rw-r-- 1 bjorn bjorn 6979324  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.lto.input.bc
-rw-rw-r-- 1 bjorn bjorn    4512  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.no-opt.bc
-rw-rw-r-- 1 bjorn bjorn 3211728  5 sep 20:53 rust_out.rust_out.49aad25d9732bf04-cgu.0.rcgu.o
$ echo 'fn main() {}' | RUSTC_BOOTSTRAP=1 rustc +nightly  - -Zhuman-readable-cgu-names -O -Csave-temps -Clto=true && ls -l
-rwxrwxr-x 1 bjorn bjorn 1766032  5 sep 20:54 rust_out
-rw-rw-r-- 1 bjorn bjorn 2612612  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.bc
-rw-rw-r-- 1 bjorn bjorn 6982788  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.lto.after-restriction.bc
-rw-rw-r-- 1 bjorn bjorn 6982752  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.lto.input.bc
-rw-rw-r-- 1 bjorn bjorn    4512  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.no-opt.bc
-rw-rw-r-- 1 bjorn bjorn 3188680  5 sep 20:54 rust_out.rust_out.f8d6093b640b0034-cgu.0.rcgu.o
-rw-rw-r-- 1 bjorn bjorn    3484  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-crate.allocator.rcgu.bc
-rw-rw-r-- 1 bjorn bjorn    3224  5 sep 20:53 rust_out.rust_out.f8d6093b640b0034-crate.allocator.rcgu.o
$ echo 'fn main() {}' | RUSTC_BOOTSTRAP=1 rustc +5ab63980021f7c1ae280eba3261d66240d594007 - -Zhuman-readable-cgu-names -O -Csave-temps -Clto=true && ls -l
-rwxrwxr-x 1 bjorn bjorn 2264800  5 sep 20:55 rust_out
-rw-rw-r-- 1 bjorn bjorn    4512  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-cgu.0.rcgu.no-opt.bc
-rw-rw-r-- 1 bjorn bjorn 2619640  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-crate.allocator.rcgu.bc
-rw-rw-r-- 1 bjorn bjorn 6982380  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-crate.allocator.rcgu.lto.after-restriction.bc
-rw-rw-r-- 1 bjorn bjorn 6982344  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-crate.allocator.rcgu.lto.input.bc
-rw-rw-r-- 1 bjorn bjorn 3701008  5 sep 20:55 rust_out.rust_out.dbca3ea46f37a61b-crate.allocator.rcgu.o

I suspect what happened is that I didn't restore the check in fat LTO that ensures the allocator module is not used as the base module that all other modules get merged into. The allocator module is not configured to be optimized, so we probably skipped all optimizations after the module merging pass of fat LTO. I've added a new commit to fix this.
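
To illustrate the kind of check described here, this is a rough sketch of picking a base module for fat LTO while skipping the allocator shim. It is not the actual rustc code: `ModuleKind`, `Module`, the `cost` field and `pick_base_module` are hypothetical stand-ins, and the real selection logic in the compiler may differ.

```rust
/// Hypothetical stand-in for the compiler's module classification.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ModuleKind {
    Regular,
    Allocator,
}

/// Hypothetical module record; the real compiler types differ.
#[derive(Debug)]
struct Module {
    name: &'static str,
    kind: ModuleKind,
    /// Assumed size estimate used to rank merge candidates.
    cost: u64,
}

/// Returns the index of the costliest regular module, so the allocator
/// shim is never chosen as the base, or `None` if no regular module exists.
fn pick_base_module(modules: &[Module]) -> Option<usize> {
    modules
        .iter()
        .enumerate()
        // Only regular modules are eligible to become the merge base.
        .filter(|(_, m)| m.kind == ModuleKind::Regular)
        .max_by_key(|(_, m)| m.cost)
        .map(|(i, _)| i)
}
```

With a filter like this, the merged module inherits the configuration of a regular codegen unit rather than that of the unoptimized allocator shim, even when the shim happens to rank highest by cost.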

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

rust-bors bot added a commit that referenced this pull request Sep 5, 2025
Make the allocator shim participate in LTO again
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 5, 2025
@fee1-dead
Member

r? @lqd or codegen

@rustbot rustbot assigned lqd and unassigned fee1-dead Sep 5, 2025
@rust-bors

rust-bors bot commented Sep 5, 2025

☀️ Try build successful (CI)
Build commit: 499d4b9 (499d4b9d218a37f0dd293ed772cc9ee39f778836, parent: 9cd272dc85320e85a8c83a1a338870de52c005f3)

@rust-timer
Collaborator

Queued 499d4b9 with parent 9cd272d, future comparison URL.
There is currently 1 preceding artifact in the queue.
It will probably take at least ~1.6 hours until the benchmark run finishes.

Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-perf Status: Waiting on a perf run to be completed. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

Hidden symbol in nightly.
rustc emits an unexpected _rdl symbols for WASM with lto=true