
MIR inlining: allow backends to opt-in to inlining intrinsics #156398

Open
RalfJung wants to merge 1 commit into rust-lang:main from RalfJung:inline-intrinsics

Conversation

@RalfJung
Member

@RalfJung RalfJung commented May 10, 2026


This is particularly useful for type_id_eq which we currently inline by hand.
Cc @scottmcm who suggested this
Cc @saethlin because inlining

I was a bit confused that apparently the inlining pass checks three times whether something is an intrinsic and hence cannot be inlined (resolve_callsite, check_mir_is_available, is_inline_valid_on_fn). I didn't want to duplicate the new logic in 3 places and I didn't know which place would be the right one -- please have a look, I hope what I picked makes sense.
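For readers unfamiliar with where `type_id_eq` shows up in user code, here is a minimal sketch. It assumes (this is not stated in the thread) that comparing two `TypeId` values with `==` is the operation that ultimately reaches the `type_id_eq` intrinsic; the helper name `same_type` is made up for illustration.

```rust
use std::any::TypeId;

/// Hypothetical helper: returns true when `T` and `U` are the same type.
/// The `==` on `TypeId` is the comparison that, by assumption, is backed
/// by the `type_id_eq` intrinsic discussed in this PR.
fn same_type<T: 'static, U: 'static>() -> bool {
    TypeId::of::<T>() == TypeId::of::<U>()
}
```

Calling `same_type::<u32, u32>()` yields `true` and `same_type::<u32, i32>()` yields `false`. In generic code like this, inlining the comparison at the MIR level is what would let later passes see through it.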

@rustbot
Collaborator

rustbot commented May 10, 2026

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @oli-obk, @lcnr

@rustbot rustbot added the following labels May 10, 2026:

  • A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
  • S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.)
  • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
  • T-libs (Relevant to the library team, which will review and decide on the PR/issue.)

@rustbot
Collaborator

rustbot commented May 10, 2026

r? @adwinwhite

rustbot has assigned @adwinwhite.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: compiler
  • compiler expanded to 73 candidates
  • Random selection from 18 candidates

Comment thread compiler/rustc_codegen_llvm/src/lib.rs Outdated
}

fn fallback_intrinsics(&self) -> Vec<Symbol> {
vec![sym::type_id_eq, sym::rotate_left, sym::rotate_right]
Member Author

@RalfJung RalfJung May 10, 2026


The two "rotate" ones here are not a clear-cut choice: their fallback body uses funnel shifts. Cranelift and GCC don't have implementations for those. So if we inline the fallback body of rotate_left (because the standard library is compiled with the LLVM backend), then they'll end up compiling the funnel shift fallbacks instead of the rotation intrinsics.

I'm going to check whether this makes any perf difference in our benchmarks, though I am not sure if they exercise this codepath.

Cc @bjorn3 @antoyo @GuillaumeGomez -- how common is it to mix the LLVM-built standard library with cranelift/GCC for the remaining crates?
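To make the relationship between rotations and funnel shifts concrete, here is a small sketch for 32-bit values. The function names are hypothetical and this is not the standard library's actual fallback body; it only illustrates the identity that rotating `x` left by `n` equals a funnel shift left of the pair `(x, x)` by `n`.

```rust
/// Hypothetical stand-in for a 32-bit funnel shift left: concatenate
/// `hi:lo` into 64 bits, shift left by `n % 32`, and keep the high 32 bits
/// of the (truncated) 64-bit result.
fn funnel_shl_u32(hi: u32, lo: u32, n: u32) -> u32 {
    let n = n % 32;
    let wide = ((hi as u64) << 32) | (lo as u64);
    ((wide << n) >> 32) as u32
}

/// Rotate left expressed as a funnel shift of `x` with itself.
fn rotate_left_via_funnel(x: u32, n: u32) -> u32 {
    funnel_shl_u32(x, x, n)
}

/// Rotate right by `n` is a rotate left by `(32 - n) mod 32`.
fn rotate_right_via_funnel(x: u32, n: u32) -> u32 {
    funnel_shl_u32(x, x, (32 - n % 32) % 32)
}
```

A backend with a native rotate but no native funnel shift would prefer to see the rotate call itself rather than a body shaped like this, which is the concern raised above.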

Contributor


> Cc @bjorn3 @antoyo @GuillaumeGomez -- how common is it to mix the LLVM-built standard library with cranelift/GCC for the remaining crates?

This is the default for rustup distribution.

Member Author


Okay. So we'll have to figure out how to trade off optimizing for the common (LLVM) case against supporting alternative backends.

// The callee is the fallback body.
debug!("callsite is fallback body: {def_id:?}");
callee = ty::Instance { def: ty::InstanceKind::Item(def_id), args: callee.args };
}
Member Author

@RalfJung RalfJung May 10, 2026


It would have been slightly nicer to do this check in check_mir_is_available instead of what I did here because there we can log a reason for why we're not inlining -- but that's too late, since we have to actually replace the instance here to be able to later fetch the fallback body.
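The opt-in and the instance replacement described above can be modelled outside the compiler roughly as follows. This is a toy sketch, not rustc code: the `Backend` trait, the `Callee` enum, the backend structs, and the use of string names in place of `Symbol` and `ty::Instance` are all assumptions made for illustration. Only the method name `fallback_intrinsics` and the idea of swapping the callee for its fallback body come from the diff.

```rust
/// Hypothetical, simplified stand-in for a codegen backend.
trait Backend {
    /// Intrinsics whose fallback body this backend allows the inliner to use.
    fn fallback_intrinsics(&self) -> Vec<&'static str>;
}

struct LlvmLike;
struct CraneliftLike;

impl Backend for LlvmLike {
    fn fallback_intrinsics(&self) -> Vec<&'static str> {
        vec!["type_id_eq"]
    }
}

impl Backend for CraneliftLike {
    // This model backend opts in to nothing, so intrinsic calls stay calls.
    fn fallback_intrinsics(&self) -> Vec<&'static str> {
        Vec::new()
    }
}

#[derive(Debug, PartialEq, Eq)]
enum Callee {
    /// A call to an intrinsic, identified by name.
    Intrinsic(&'static str),
    /// An ordinary item with a MIR body (here: the fallback body).
    Item(&'static str),
}

/// Model of callsite resolution: an intrinsic callee is only inlinable if
/// the backend opted in, and in that case it is replaced by its fallback
/// body right away so that the body can be fetched later.
fn resolve_callsite(backend: &dyn Backend, callee: Callee) -> Option<Callee> {
    match callee {
        Callee::Intrinsic(name) => {
            if backend.fallback_intrinsics().contains(&name) {
                Some(Callee::Item(name))
            } else {
                None
            }
        }
        item => Some(item),
    }
}
```

In this model, `resolve_callsite(&LlvmLike, Callee::Intrinsic("type_id_eq"))` returns the fallback item, while the same call with `CraneliftLike` returns `None`. Doing the replacement at resolution time mirrors the reason given above for not deferring the check to `check_mir_is_available`.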



@RalfJung RalfJung force-pushed the inline-intrinsics branch 2 times, most recently from 4310104 to 741b2a6 Compare May 10, 2026 15:58
@RalfJung
Member Author

@bors try
@rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 10, 2026
rust-bors Bot pushed a commit that referenced this pull request May 10, 2026
MIR inlining: allow backends to opt-in to inlining intrinsics

@RalfJung RalfJung force-pushed the inline-intrinsics branch from 741b2a6 to 99f166e Compare May 10, 2026 16:56
@bjorn3
Member

bjorn3 commented May 10, 2026

Instead of doing the inlining already in the MIR inliner, would it be possible to do it inside the codegen backend, right after calling instance_mir, using a helper function similar to what optimize_use_clone does? By the way, it looks like someone forgot to add the optimize_use_clone call to cg_clif.

@RalfJung
Member Author

RalfJung commented May 10, 2026

The point of the MIR inliner is to do the work once in generic code rather than many times on monomorphized code, so that doesn't seem so useful to me. Also I don't like these ad-hoc transformations during monomorphization.

My plan was to just remove the rotation intrinsics from the list again, under the assumption that perf will come back neutral.


@RalfJung
Member Author

Uh, not sure what's up with that mir-opt test failure -- when I run it locally with --bless it works fine and there's no diff...

@rust-bors
Contributor

rust-bors Bot commented May 10, 2026

☀️ Try build successful (CI)
Build commit: 4337910 (4337910a70ebea23eb598a97d7694c6bda48ec2b, parent: 99eed207b47aca1fec5c665531db8e948a92d0ca)


@rust-timer
Collaborator

Finished benchmarking commit (4337910): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking means the PR may be perf-sensitive. Consider adding rollup=never if this change is not fit for rolling up.

@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.3% | [-0.3%, -0.2%] | 3 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (secondary -2.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.4% | [-2.4%, -2.4%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results (secondary -0.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.4% | [3.4%, 3.4%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.2% | [-2.2%, -2.2%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (primary 0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.6% | [0.6%, 0.6%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.6% | [0.6%, 0.6%] | 1 |

Bootstrap: 499.454s -> 505.539s (1.22%)
Artifact size: 397.23 MiB -> 397.17 MiB (-0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 10, 2026
@saethlin
Member

> similar to what optimize_use_clone does?

optimize_use_clone is a horrendous hack around our lack of post-mono MIR opts, and it makes the feature impose a large compile-time penalty. It should be removed when the feature is stabilized.

@RalfJung
Member Author

Doesn't look like the rotate ones are worth it so I'll remove them again for now.

@RalfJung RalfJung force-pushed the inline-intrinsics branch from 99f166e to b6edfdf Compare May 10, 2026 20:11

@RalfJung RalfJung force-pushed the inline-intrinsics branch from b6edfdf to ecdeff1 Compare May 10, 2026 21:08
@rustbot
Collaborator

rustbot commented May 10, 2026

The Cranelift subtree was changed

cc @bjorn3

The GCC codegen subtree was changed

cc @antoyo, @GuillaumeGomez

@adwinwhite
Contributor

r? wg-mir-opt

@rustbot rustbot assigned oli-obk and unassigned adwinwhite May 11, 2026

Labels

  • A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
  • S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.)
  • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
  • T-libs (Relevant to the library team, which will review and decide on the PR/issue.)


9 participants