-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: enable inlining methods with EH #112998
base: main
Are you sure you want to change the base?
Conversation
Enable inlining of method with EH. Inlinee EH clauses are integrated into the root method EH table at the appropriate point (mid-table if the call site is in an EH region; at table end otherwise). Contributes to dotnet#108900.
SPMI is not going to be helpful here. PMI may not be either, as callees with EH may not look like attractive candidates w/o PGO. No special work done on profitability, we just use whatever rules the JIT has already. Also expecting (though perhaps not in our tests) the usual slew of surprises as certain methods no longer show up in stack traces. |
But technically, inliner should resist a bit more for such methods? I think it's fine for this PR not to care about it |
Seems like the x86 crossgen issues are that we have managed methods with both EH and unmanaged calling conventions and we try and inline these and get confused by the call argument reversal. For now I'll just block inlining of managed methods with unmanaged conventions. |
Found a couple places where we rely on |
Various folks have reminded me that the EH clause reporting for catches is token based, and this will need to be generalized somehow to handle cross-module inlines. The assumption today is that the class token in the clause is from the same module as the root method. For runtime dependent types we resolve these tokens into code via For now we can perhaps restrict inlines with EH to be same module. |
Also looks like we need to update |
Hard to say what fraction of the failures are catch-type related, but likely quite a few. So am going to restrict to just try-finally for now. Also x86 still not happy. |
For x86, we need to remove the shadow SP slots var if we manage to remove all EH. Seems like what we ought to do is defer allocating this until we're done changing EH regions, then determine if we need this and the max nesting level. Going to spin that off as a separate change as it leads to some diffs on its own. |
Split off some of the enabling changes here: #113023. |
Don't try an inline managed methods with unmanaged calling conventions. This mainly copes with x86 where unmanaged calling conventions use reversed arg order, but I've disabled it in general. No diffs as these methods seem to always include EH. Remove uses of `compXcptnsCount` as this goes stale whenever we clone or remove EH, or (eventually) inline methods with EH. Instead, rely on `compHndBBtabCount`. Defer allocating x86's shadow SP var and area until later in jitting, so this reflects any changes in EH table structure. In particular we often are able to eliminate EH in part or all together and this saves a low-offset allocation and so leads to some nice code size savings on x86. Also on x86 remove the runtime-dependent catch class case from the computation for keeping this alive, as we now transform such into runtime lookups in filters (that may well keep this alive).
Down to 300-ish failures Libraries:
X86 (widespread)
R2R
|
For the first assert. Before funclet splitting we have
and afterwards we've broken up T1:
Likely fallout from using Here we have
But EH0 comes from an inlinee, so these are not mutual-protect regions. |
Once we can inline methods with EH, IL ranges are no longer a reliable indicator of a mutual-protect try regions. Instead, after importation, we can rely on mutual-protect trys having the same start and end blocks. Also update other case where we were using `info.compXcptnsCount` in morph to decide if we needed a frame pointer. This lets us simplify the logic around frame pointers and EH (though I still think we're making up our minds too early). Contributes to dotnet#108900.
Latest batch of issues x86:
R2R: as above Arm64 Linux: widespread crypto failures |
@jkotas @davidwrighton The jit is triggering an assert in crossgen2 inside I suspect there's a missing check somewhere but not sure where it should be or what form it should take.
This comes up in the R2R version resilience tests. |
Seems like maybe these stubs are not safely inlinable. Perhaps they have some residual state that does not get reset? The jit doesn't seem to have a way to detect that an inlinee is one of these stubs, it's only told if the root method is one (via jit flags). Perhaps it makes sense to mark these as noinline for now. |
@mihu-bot |
There is still some issue on x86 I can't repro... seemingly requires timing like we see on the underpowered machines in CI. |
Arm64 alpine musl failure happens if we allow EH inlining in this method:
Root method had 3 EH clauses. Root table was
3 inlinees have EH (one with 2, others with 1) that nest inside root region 1, so we end up with 7 clauses and 6 deep nesting. After inlining, the table looks like:
Inline tree here is massive (177 inlines). We finally clone in 3 of these, and end up with
Looks like other methods also have issues. If I flip the enable range around and disable just this method things still fail. |
Do you still need help with this one? |
I've blocked inlining of pinvoke methods with EH, so don't need help right now. And I still have some other issues to sort out... |
I think there is an arm64 codegen bug where we convert mul/neg/jcc into mneg/bne, but mneg does not set flags and so this doesn't work right. Good codegen (one less inline) IN0101: 00045C ldr x2, [x20, #0x08]
IN0102: 000460 ldr w2, [x2, #0x14]
IN0103: 000464 cbz w2, G_M62798_IG43
IN0104: 000468 cmn w2, #1
IN0105: 00046C bne G_M62798_IG21
IN0106: 000470 cmp w1, #1
IN0107: 000474 bvs G_M62798_IG44
G_M62798_IG21: ; offs=0x000478, size=0x0014, bbWeight=0.04, PerfScore 0.64, gcrefRegs=2900000 {x20 x23 x25}, byrefRegs=1000000 {x24}, BB109 [0281], BB111 [0284], byref, isz
IN0108: 000478 sdiv w3, w1, w2
IN0109: 00047C msub w1, w3, w2, w1
IN010a: 000480 cbnz w1, G_M62798_IG42 bad codegen ;; because of inlining we realized the operand to be divided is zero and the divisor is nonzero
;; (we could have optimized this better, the divide/multiply are unnecessary
;; but the bug is we lost flags
IN00ff: 000458 ldr x1, [x20, #0x08]
IN0100: 00045C ldr w1, [x1, #0x14]
IN0101: 000460 mov w2, wzr
IN0102: 000464 cbz w1, G_M62798_IG42
IN0103: 000468 sdiv w2, w2, w1
IN0104: 00046C mneg w1, w2, w1
IN0105: 000470 bne G_M62798_IG41 Trying a hacky fix but doubt it is sufficient as we should be avoiding this... we'll see. I can get an SPMI of this compilation and work it out there tomorrow. |
@AndyAyersMS indeed there is a bug like that, #113320. The fix should be simple – |
This reverts commit 695435b.
@jakobbotsch thanks |
@AndyAyersMS the x86 assert failure during control flow opts is because
I suspect the JIT isn't setting |
I have been trying to repro this with no luck, how did you do it? I am recomputing the handler nesting level, but perhaps not early enough or correctly? See the newly added |
Ah maybe it is missing the +1 bias...no, that's just for the max nesting. |
I was able to using the build from this run. I didn't have to do anything special locally to get a consistent repro on my machine.
Let me dig into my repro and take a look... |
I see that we only call this in the backend at the moment, presumably because codegen relies on the handler nesting level to be accurate. From a quick search, I don't think you ought to refactor |
Yeah let's take #113330... |
Finally past those failures. The x86 failure was also popping up in jitstress, too bad I didn't look there... Am going to run some stress legs but currently jit-stress, jit-experimental and pgostress have a handful of known failures. |
/azp run runtime-coreclr libraries-pgo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.
@dotnet/jit-contrib think this is ready for review. I will run some stress modes but am holding off a bit to see if we can get some of them cleaned up first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Definitely needs some stress runs.
if (enclosingRegion == 0) | ||
{ | ||
// The call site is not in an EH region, so we can put the inlinee EH clauses | ||
// at the end of root method's the EH table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
// at the end of root method's the EH table. | |
// at the end of root method's EH table. |
// inlinee eh1 -> eh2 | ||
// | ||
// root eh0 -> eh0 | ||
// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true, and if so, desirable? What if the root method has:
try { } ... // #1
call()
try { } ... // #2
and call
has EH and is inlined. Don't we want to put the call
EH info in between the EH info for #1
and #2
in the resulting EH table? So, the lexical order is respected?
Or maybe the lexical order doesn't matter for EH regions without nesting relationships, since we already presumably can move them around ("make them out of order") today.
// | ||
EHblkDsc* const outermostEbd = | ||
fgTryAddEHTableEntries(insertBeforeIndex, inlineeRegionCount, /* deferAdding */ false); | ||
assert(outermostEbd != nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert(outermostEbd != nullptr); | |
noway_assert(outermostEbd != nullptr); |
?
{ | ||
return true; | ||
} | ||
|
||
// We do not support setting zero flag for madd/msub. | ||
if (OperIs(GT_NEG) && (!gtGetOp1()->OperIs(GT_MUL) || !gtGetOp1()->isContained())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is from your other PR
Enable inlining of method with EH. Inlinee EH clauses are integrated into the root method EH table at the appropriate point (mid-table if the call site is in an EH region; at table end otherwise).
Contributes to #108900.