JIT: relax inliner heuristics for callees on [Intrinsic] types #127433
EgorBo merged 2 commits into dotnet:main
Pull request overview
Adjusts CoreCLR JIT inlining heuristics to avoid penalizing callees whose declaring types are marked [Intrinsic], improving codegen quality for common intrinsic-heavy APIs (e.g., Span<T>, SIMD vectors, HW intrinsics) even in cold blocks or when the inline budget is exhausted.
Changes:
- Introduces a new callee observation (CALLEE_IS_INTRINSIC_TYPE) based on CORINFO_FLG_INTRINSIC_TYPE.
- Relaxes multiplier "caps"/penalties for intrinsic-type callees in DefaultPolicy and ExtendedDefaultPolicy (rare callsites, profile-based penalties, no-return region cap).
- Allows over-budget inlining for intrinsic-type callees in DefaultPolicy.
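As a rough illustration of the second bullet, the following is a minimal toy model of a multiplier cap that is skipped for intrinsic-type callees. It is not the JIT's code: `CalleeInfo`, `AdjustMultiplier`, and the cap value are hypothetical names and placeholders chosen for this sketch, and the real policies weigh many more observations.

```cpp
// Toy model only; every name and constant here is hypothetical.
struct CalleeInfo
{
    bool isIntrinsicType; // declaring type is marked [Intrinsic]
    bool callsiteIsRare;  // call site sits in a cold block
};

// Placeholder cap on the benefit multiplier at rare call sites.
constexpr double kRareCallsiteCap = 1.3;

// Returns the multiplier after the rare-callsite cap is (or is not) applied.
// Callees on intrinsic types keep their full multiplier even when cold.
constexpr double AdjustMultiplier(double multiplier, const CalleeInfo& callee)
{
    return (callee.callsiteIsRare && !callee.isIntrinsicType && multiplier > kRareCallsiteCap)
               ? kRareCallsiteCap
               : multiplier;
}
```

In this model a cold call to an ordinary method is clamped to the cap, while a cold call to a method on an intrinsic type is evaluated as if the call site were not rare.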
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/coreclr/jit/inlinepolicy.h | Tracks intrinsic-type state (m_IsIntrinsicType) in the default policy data. |
| src/coreclr/jit/inlinepolicy.cpp | Applies intrinsic-type-aware budget and multiplier adjustments; emits the new state in XML dumps. |
| src/coreclr/jit/inline.def | Adds the IS_INTRINSIC_TYPE observation for the callee scope. |
| src/coreclr/jit/fgbasic.cpp | Records CALLEE_IS_INTRINSIC_TYPE from info.compClassAttr during inline candidate IL scanning. |
PTAL @AndyAyersMS. For real-world code this will only impact Span and ReadOnlySpan functions, which are all quite small.
AndyAyersMS left a comment:
Curious how you chose 50 as a limit... do we ever get anywhere close to that?
@AndyAyersMS good question. Basically, just this code snippet:

```csharp
static void Foo(ReadOnlySpan<int> src, Span<int> dst)
{
    Vector.Create(src.Slice(0 * Vector<int>.Count)).CopyTo(dst.Slice(0 * Vector<int>.Count));
    Vector.Create(src.Slice(1 * Vector<int>.Count)).CopyTo(dst.Slice(1 * Vector<int>.Count));
    Vector.Create(src.Slice(2 * Vector<int>.Count)).CopyTo(dst.Slice(2 * Vector<int>.Count));
    Vector.Create(src.Slice(3 * Vector<int>.Count)).CopyTo(dst.Slice(3 * Vector<int>.Count));
    Vector.Create(src.Slice(4 * Vector<int>.Count)).CopyTo(dst.Slice(4 * Vector<int>.Count));
    Vector.Create(src.Slice(5 * Vector<int>.Count)).CopyTo(dst.Slice(5 * Vector<int>.Count));
    Vector.Create(src.Slice(6 * Vector<int>.Count)).CopyTo(dst.Slice(6 * Vector<int>.Count));
    Vector.Create(src.Slice(7 * Vector<int>.Count)).CopyTo(dst.Slice(7 * Vector<int>.Count));
}
```

is already at 32 calls. For reference, the unsafe variant of this is 0 calls, since all of these are JIT intrinsics.

Also, I see a lot of

I decided to add the limit because in the past we had a terrible experience with the inliner in huge methods without guardrails.
As part of the "remove unsafe" work, we're introducing lots of new Span.Slice, Vector.Create, etc. calls. It turns out these negatively impact the inliner time budget and lead to bad regressions (I've hit it in #127429).

I suggest we exclude these from the budget check, just like we already do for small methods.
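The proposed budget exemption can be sketched as a toy model. This is not the JIT's implementation: `BudgetState`, `IsOverBudget`, and `MayInline` are hypothetical names, and the sketch assumes that the limit of 50 raised in review is a per-method cap on how many intrinsic-type callees may be inlined after the budget is exhausted. The source does not spell out exactly what the limit counts.

```cpp
// Toy model only; every name here is hypothetical.
struct BudgetState
{
    int timeBudget;              // total inlining time budget for the root method
    int timeSpent;               // estimated time already consumed by inlines
    int intrinsicOverBudgetUsed; // intrinsic-type inlines already allowed past the budget
};

// Assumed meaning of the "50" discussed in review: a guardrail on over-budget inlines.
constexpr int kIntrinsicOverBudgetLimit = 50;

// True when inlining a callee of the given estimated cost would exceed the budget.
constexpr bool IsOverBudget(const BudgetState& s, int calleeCost)
{
    return s.timeSpent + calleeCost > s.timeBudget;
}

// Ordinary callees must fit in the budget. Intrinsic-type callees may exceed it,
// but only until the guardrail counter reaches its limit.
constexpr bool MayInline(const BudgetState& s, int calleeCost, bool isIntrinsicType)
{
    return !IsOverBudget(s, calleeCost) ||
           (isIntrinsicType && s.intrinsicOverBudgetUsed < kIntrinsicOverBudgetLimit);
}
```

Under these assumptions, a method that has already used 32 over-budget intrinsic-type inlines can still take more, while one that has reached 50 falls back to the normal budget check.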
Just a few hits with PMI jit-diffs.