monolithic lto link time very slow when building with several codegen units #48025

Closed
matthiaskrgr opened this issue Feb 6, 2018 · 5 comments · Fixed by #48163
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. I-compiletime Issue: Problems and improvements with respect to compile times.

Comments

@matthiaskrgr (Member) commented Feb 6, 2018

I noticed my project https://github.com/matthiaskrgr/cargo-cache took noticeably longer to build with `lto = true` at some point.
I benchmarked the link time (`touch src/main.rs; cargo build --release`), which appears to blow up as soon as we have several codegen units.

link times:

| profile.release | stable 1.23 | beta 1.24 | nightly 1.25 |
|---|---|---|---|
| lto = true, codegen-units = 1 | 45s | 65s | 48s |
| lto = true | 46s | 278s | 244s |
| (default) | 6.3s | 6.39s | 6.43s |
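For reference, the rows correspond to Cargo.toml profiles along these lines (a sketch only; opt-level is assumed to be the release default, as in the benchmark further down):

```toml
# Profile for the "lto = true, codegen-units = 1" row; drop the
# codegen-units line for the plain "lto = true" row, and drop both
# lines for the default release profile.
[profile.release]
lto = true
codegen-units = 1
```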

So it seems codegen-units = 1 makes for bad parallelism while building but speeds up LTO, because fewer translation units need to be merged.

Maybe it is possible to dynamically use as few codegen units as possible (but still enough to utilize all cores) when building? //cc #46910

@retep998 (Member) commented Feb 6, 2018

Or maybe just don't use multiple codegen units by default when monolithic LTO is enabled.

@retep998 retep998 added the I-compiletime Issue: Problems and improvements with respect to compile times. label Feb 6, 2018
@matthiaskrgr (Member, Author)

I think the optimal "balance" for link time/build time with monolithic LTO would be to (see the sketch below):

  1. compile all crates with codegen-units = 1 as long as we can compile several crates in parallel.
  2. if we have to compile a single crate at a time, use ${thread_number} codegen units.

However, this will probably introduce nondeterministic builds depending on the number of threads the host machine has. So a single-core machine would optimize better than an 8-core machine when building the same crate with monolithic LTO. :/
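A minimal sketch of that heuristic, purely illustrative: the function and parameter names below (`choose_codegen_units`, `crates_ready_to_build`, `available_threads`) are hypothetical and not part of rustc or Cargo; it only encodes the two rules above.

```rust
/// Illustrative only: pick a codegen-unit count for the crate about to be
/// compiled, given how much crate-level parallelism is currently available.
/// Assumes `crates_ready_to_build >= 1`.
fn choose_codegen_units(crates_ready_to_build: usize, available_threads: usize) -> usize {
    if crates_ready_to_build >= available_threads {
        // Enough crates in flight to keep every core busy: one codegen unit
        // per crate, so monolithic LTO has fewer modules to merge later.
        1
    } else {
        // Few crates left (e.g. the final leaf crate): split it across the
        // otherwise-idle cores, accepting a more expensive LTO merge.
        (available_threads / crates_ready_to_build).max(1)
    }
}
```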

@TimNN TimNN added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Feb 6, 2018
@matthiaskrgr (Member, Author)

I made some benchmarks (CPU: i5-2540M, dual core), rustc 1.25.0-nightly (b8398d9 2018-02-11):

```toml
[profile.release]
lto = true
opt-level = 3
codegen-units = x
```

| codegen-units | total build time |
|---|---|
| 1 | 7:29 |
| 2 | 7:56 |
| 4 | 7:45 |
| 8 | 8:28 |
| 16 | 10:00 |

@alexcrichton (Member)

I believe this is fixed in #48163

@alexcrichton (Member)

er, didn't mean to close yet

alexcrichton added a commit to alexcrichton/rust that referenced this issue Feb 12, 2018
rustc: Persist LLVM's `Linker` in Fat LTO

This commit updates our Fat LTO logic to tweak our custom wrapper around LLVM's
"link modules" functionality. Previously, whenever the
`LLVMRustLinkInExternalBitcode` function was called, it would call LLVM's
`Linker::linkModules` wrapper. Internally this would create an instance of a
`Linker`, which internally creates an instance of an `IRMover`. Unfortunately for
us, the creation of an `IRMover` is somewhat O(n) in the input module. This means
that every time we linked a module it was O(n) with respect to the entire module
we had built up!

Now the modules we build up during LTO are quite large, so this quickly started
creating an O(n^2) problem for us! Discovered in rust-lang#48025, it turns out this has
always been a problem and we just hadn't noticed it. It became particularly bad
recently, though, due to most libraries having 16x more object files than
they previously did (1 -> 16).

This commit fixes this performance issue by preserving the `Linker` instance
across all links into the main LLVM module. This means we only create one
`IRMover`, which allows LTO to progress much more quickly.

From the `cargo-cache` project in rust-lang#48025, a **full build** locally went from
5m15s to 2m24s. Looking at the timing logs, each object file was linked in within
single-digit milliseconds rather than hundreds, clearly a nice improvement!

Closes rust-lang#48025
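The shape of the fix can be illustrated with a small sketch. The types below are stand-ins rather than the real rustc/LLVM FFI surface (the real path goes through `LLVMRustLinkInExternalBitcode` and `Linker::linkModules`); the point is only the structural change from constructing the linker (and thus the O(n) `IRMover` setup) once per input module to constructing it once per LTO merge:

```rust
// Hypothetical stand-in types, not the real rustc/LLVM FFI API.
struct Module { functions: Vec<String> }

struct Linker<'a> { dest: &'a mut Module }

impl<'a> Linker<'a> {
    fn new(dest: &'a mut Module) -> Linker<'a> {
        // In LLVM this is where the expensive IRMover setup happens,
        // proportional to everything already linked into `dest`.
        Linker { dest }
    }

    fn link_in(&mut self, src: &Module) {
        self.dest.functions.extend(src.functions.iter().cloned());
    }
}

// Before: a fresh Linker per input module, so the O(n) setup runs against an
// ever-growing destination module -> O(n^2) total work.
fn fat_lto_before(dest: &mut Module, inputs: &[Module]) {
    for src in inputs {
        let mut linker = Linker::new(&mut *dest);
        linker.link_in(src);
    }
}

// After (#48163): one Linker persisted across all links, so the setup cost is
// paid only once.
fn fat_lto_after(dest: &mut Module, inputs: &[Module]) {
    let mut linker = Linker::new(dest);
    for src in inputs {
        linker.link_in(src);
    }
}
```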
kennytm added a commit to kennytm/rust that referenced this issue Feb 13, 2018

rustc: Persist LLVM's `Linker` in Fat LTO
kennytm added a commit to kennytm/rust that referenced this issue Feb 14, 2018

rustc: Persist LLVM's `Linker` in Fat LTO