MIR inlining: allow backends to opt-in to inlining intrinsics#156398
MIR inlining: allow backends to opt-in to inlining intrinsics#156398RalfJung wants to merge 1 commit into
Conversation
|
r? @adwinwhite rustbot has assigned @adwinwhite. Use Why was this reviewer chosen?The reviewer was selected based on:
|
| } | ||
|
|
||
| fn fallback_intrinsics(&self) -> Vec<Symbol> { | ||
| vec![sym::type_id_eq, sym::rotate_left, sym::rotate_right] |
There was a problem hiding this comment.
The two "rotate" ones here are not a clear-cut choice: their fallback body uses funnel shifts. Cranelift and GCC don't have implementations for those. So if we inline the fallback body of rotate_left (because the standard library is compiled with the LLVM backend), then they'll end up compiling the funnel shift fallbacks instead of the rotation intrinsics.
I'm going to check whether this makes any perf difference in our benchmarks, though I am not sure if they exercise this codepath.
Cc @bjorn3 @antoyo @GuillaumeGomez -- how common is it to mix the LLVM-built standard library with cranelift/GCC for the remaining crates?
There was a problem hiding this comment.
Cc @bjorn3 @antoyo @GuillaumeGomez -- how common is it to mix the LLVM-built standard library with cranelift/GCC for the remaining crates?
This is the default for rustup distribution.
There was a problem hiding this comment.
Okay. So we'll have to figure out how to tradeoff optimizing for the common (LLVM) case vs alternative backends.
| // The callee is the fallback body. | ||
| debug!("callsite is fallback body: {def_id:?}"); | ||
| callee = ty::Instance { def: ty::InstanceKind::Item(def_id), args: callee.args }; | ||
| } |
There was a problem hiding this comment.
It would have been slightly nicer to do this check in check_mir_is_available instead of what I did here because there we can log a reason for why we're not inlining -- but that's too late, since we have to actually replace the instance here to be able to later fetch the fallback body.
This comment has been minimized.
This comment has been minimized.
4310104 to
741b2a6
Compare
|
@bors try |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
MIR inlining: allow backends to opt-in to inlining intrinsics
This comment has been minimized.
This comment has been minimized.
741b2a6 to
99f166e
Compare
|
Instead of inlining the already in the MIR inliner, would it be possible to do the inlining inside the codegen backend right after calling |
|
The point of the MIR inliner is to do the work once in generic code rather than many times on monomorphized code, so that doesn't seem so useful to me. Also I don't like these ad-hoc transformations during monomorphization. My plan was to just remove the rotation intrinsics from the list again, under the assumption that perf will come back neutral. |
This comment has been minimized.
This comment has been minimized.
|
Uh, not sure what's up with that mir-opt test failure -- when I run it locally with |
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (4337910): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking means the PR may be perf-sensitive. Consider adding rollup=never if this change is not fit for rolling up. @rustbot label: -S-waiting-on-perf -perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (secondary -2.4%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary -0.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary 0.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 499.454s -> 505.539s (1.22%) |
|
|
Doesn't look like the rotate ones are worth it so I'll remove them again for now. |
99f166e to
b6edfdf
Compare
This comment has been minimized.
This comment has been minimized.
b6edfdf to
ecdeff1
Compare
|
The Cranelift subtree was changed cc @bjorn3 The GCC codegen subtree was changed |
|
r? wg-mir-opt |
View all comments
This is particularly useful for
type_id_eqwhich we currently inline by hand.Cc @scottmcm who suggested this
Cc @saethlin because inlining
I was a bit confused that apparently the inlining pass checks three times whether something is an intrinsic and hence cannot be inlined (
resolve_callsite,check_mir_is_available,is_inline_valid_on_fn). I didn't want to duplicate the new logic in 3 places and I didn't know which place would be the right one -- please have a look, I hope what I picked makes sense.