#[inline] on generic functions #102539

nikic · 2022-10-01T10:14:09Z

Common Rust wisdom says that #[inline] does not need to be placed on small generic functions. This is because generic functions will get monomorphized in each crate anyway, so the attribute is not necessary for cross-crate inlining.

However, we also know that in practice placing #[inline] on generic functions does help optimization, even for tiny functions where the additional inlinehint this gives to LLVM really shouldn't be relevant. What gives? I believe there are two complications:

The main problem is that #[inline] forces an instantiation of the function in each CGU, while generic functions are normally only instantiated once per crate. This means that a definition of generic functions is available to crate-local LTO, but not to the pre-link optimization pipeline. Especially for trivial generic functions, this may significantly hamper pre-link optimization, and post-link optimization may not be able to recover from this.

The second complication occurs when optimizing for size. In this case, we currently enable -Z share-generics by default, which means that generic functions only get monomorphized once and are exported for downstream crates. This means that the function definition is not available even to crate-local LTO. It only becomes available during full cross-crate LTO.

The second point is something we can fix: We probably should not be enabling -Z share-generics by default in any optimized builds, including size-optimized builds.

The first one is trickier, as instantiating monomorphizations in each CGU by default is likely not desirable. Possibly we should just stop considering whether a function is generic or not when it comes to #[inline] placement.

The text was updated successfully, but these errors were encountered:

nikic · 2022-10-01T15:29:28Z

I did a bit of crude data analysis, based on calls to non-inlined small functions in ThinLTO inputs. Here's the result: https://gist.github.com/nikic/c1d441ca6468b8c2ee114ff4a11f5667

Some of these are not supposed to be inlined (various panic functions) and there are a lot of drop_in_place functions that really should be inlined but aren't due to noinline annotations (fixed by #102099).

But there are also some obvious candidates where per CGU instantiation is likely beneficial, and we'd probably want to add an #[inline] annotation. Cell::replace() looks like one of the worst offenders. From the ecosystem, SmallVec::size() stands out to me.

# Third 30 entries excluding drop_in_place and various panic helpers.
n=20649 s=3  <core::cell::Cell<isize>>::replace
n=18672 s=1  <alloc::vec::Vec<u8> as core::ops::drop::Drop>::drop
n=8691  s=1  <core::ptr::unique::Unique<u8> as core::convert::Into<core::ptr::non_null::NonNull<u8>>>::into
n=6194  s=7  <[u8] as core::cmp::PartialEq>::eq
n=5741  s=3  <core::cell::Cell<usize>>::replace
n=5350  s=1  <rustc_query_system::dep_graph::graph::DepNodeIndex as core::convert::Into<rustc_data_structures::profiling::QueryInvocationId>>::into
n=4693  s=2  alloc::alloc::box_free::<rustc_ast::ast::Ty, alloc::alloc::Global>
n=1950  s=1  <[rustc_middle::ty::subst::GenericArg; 8] as smallvec::Array>::size
n=1930  s=2  <rustc_ast::ptr::P<rustc_ast::ast::Expr> as core::ops::deref::Deref>::deref
n=1760  s=1  <[rustc_middle::ty::Ty; 8] as smallvec::Array>::size
n=1740  s=2  <rustc_errors::diagnostic_builder::DiagnosticBuilder<rustc_errors::ErrorGuaranteed>>::emit
n=1699  s=1  rustc_data_structures::sync::assert_sync::<rustc_middle::ty::context::tls::ImplicitCtxt>
n=1560  s=1  <rustc_hir_analysis::check::inherited::Inherited as core::ops::deref::Deref>::deref
n=1509  s=1  <core::hash::BuildHasherDefault<rustc_hash::FxHasher> as core::default::Default>::default
n=1498  s=2  alloc::alloc::box_free::<rustc_ast::ast::Pat, alloc::alloc::Global>
n=1481  s=6  <&str as core::convert::Into<alloc::borrow::Cow<str>>>::into
n=1119  s=2  <rustc_ast::ptr::P<rustc_ast::ast::Ty> as core::ops::deref::Deref>::deref
n=1112  s=1  <[rustc_middle::ty::sty::Binder<rustc_middle::ty::sty::ExistentialPredicate>; 8] as smallvec::Array>::size
n=1047  s=6  smallvec::infallible::<()>
n=923   s=2  <rustc_target::abi::TyAndLayout<rustc_middle::ty::Ty> as core::ops::deref::Deref>::deref
n=874   s=6  <std::path::Path>::new::<std::ffi::os_str::OsString>
n=850   s=1  <alloc::vec::Vec<rustc_span::span_encoding::Span> as core::ops::drop::Drop>::drop
n=848   s=6  _$LT$alloc..string..String$u20$as$u20$core..clone..Clone$GT$::clone::h3a6e79de4ee6835e
n=825   s=2  alloc::alloc::box_free::<syn::expr::Expr, alloc::alloc::Global>
n=779   s=2  <tracing_core::metadata::Metadata>::fields
n=761   s=3  <hashbrown::set::HashSet<rustc_target::asm::InlineAsmReg, core::hash::BuildHasherDefault<rustc_hash::FxHasher>>>::insert
n=706   s=1  <rustc_span::def_id::DefId as core::borrow::Borrow<rustc_span::def_id::DefId>>::borrow
n=696   s=3  <rustc_data_structures::atomic_ref::AtomicRef<fn(rustc_span::def_id::LocalDefId)> as core::ops::deref::Deref>::deref
n=694   s=5  alloc::alloc::box_free::<dyn core::error::Error + core::marker::Send + core::marker::Sync, alloc::alloc::Global>

Kobzol · 2022-10-01T15:51:48Z

Do you think that performing thin/fat LTO for librustc_driver (e.g. #101403) could help with this in general?

bjorn3 · 2022-10-01T16:10:03Z

Generic functions are not included in the same cgu as the user by default. Partitioning produces two cgu's for each module before merging. One containing functions that are actually part of the module and one containing all generic functions used by the module. #[inline] moves functions from the "generics" cgu to the "normal" cgu AFAIK and as such is not useless at all.

Mark Cell::replace() as #[inline] Giving this a try based on rust-lang#102539 (comment).

jachris · 2022-10-26T12:26:36Z

I closed #103499 as a duplicate, but reading this again, I'm not so sure. If I read it correctly, this issue is not so much about the behavior of -Zshare_generics=on, but rather about the general placement of generic inline functions, in particular for -Zshare_generics=off.

We probably should not be enabling -Z share-generics by default in any optimized builds, including size-optimized builds.

What are your thoughts on excluding inline generics from sharing?

Personally, I would have expected that instantiations of generic inline functions behave the same way as inline functions, and was surprised to learn that this is not the case. But then again, I just recently took a look at how generics are exactly codegened, so take that with a grain of salt.

Mark Cell::replace() as #[inline] Giving this a try based on rust-lang/rust#102539 (comment).

veber-alex · 2023-03-23T12:36:18Z

Can we have some compiler heuristic that applies #[inline] automatically?
Even if it is very conservative it will help in some cases to alleviate the need for developers to do it manually.

nikic mentioned this issue Oct 1, 2022

Tentatively #[inline] Option::from #102434

Closed

nikic mentioned this issue Oct 1, 2022

Mark Cell::replace() as #[inline] #102548

Merged

This was referenced Oct 1, 2022

Mark Array::size() as #[inline] servo/rust-smallvec#289

Merged

Tentatively inline(always) the trivial Into::into #102370

Closed

Mark Vec::drop() as #[inline] #102554

Closed

bors added a commit to rust-lang-ci/rust that referenced this issue Oct 2, 2022

Auto merge of rust-lang#102548 - nikic:inline-cell-replace, r=scottmcm

756e7be

Mark Cell::replace() as #[inline] Giving this a try based on rust-lang#102539 (comment).

nikic mentioned this issue Oct 9, 2022

slice: #[inline] a couple iterator methods. #96711

Merged

jachris mentioned this issue Oct 25, 2022

Generic inline functions are not always inline if generics are shared #103499

Closed

Dylan-DPC added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Feb 4, 2023

thomcc pushed a commit to tcdi/postgrestd that referenced this issue Feb 10, 2023

Auto merge of #102548 - nikic:inline-cell-replace, r=scottmcm

4206d37

Mark Cell::replace() as #[inline] Giving this a try based on rust-lang/rust#102539 (comment).

JohnBobbo96 mentioned this issue Apr 16, 2023

Add the inlined_generics lint. rust-lang/rust-clippy#10650

Closed

workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

#[inline] on generic functions #102539

#[inline] on generic functions #102539

nikic commented Oct 1, 2022

nikic commented Oct 1, 2022

Kobzol commented Oct 1, 2022

bjorn3 commented Oct 1, 2022

jachris commented Oct 26, 2022

veber-alex commented Mar 23, 2023

#[inline] on generic functions #102539

#[inline] on generic functions #102539

Comments

nikic commented Oct 1, 2022

nikic commented Oct 1, 2022

Kobzol commented Oct 1, 2022

bjorn3 commented Oct 1, 2022

jachris commented Oct 26, 2022

veber-alex commented Mar 23, 2023