-
Notifications
You must be signed in to change notification settings - Fork 5k
Interpreter EH support in the runtime #114649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 11 out of 12 changed files in this pull request and generated 1 comment.
Files not reviewed (1)
- src/coreclr/interpreter/intops.def: Language not supported
src/coreclr/vm/arm64/cgencpu.h
Outdated
inline void SetFirstArgReg(CONTEXT *context, TADDR value) | ||
{ | ||
LIMITED_METHOD_DAC_CONTRACT; | ||
SetReg(context, 0, reg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undefined variable 'reg' is used in SetFirstArgReg; it should use the parameter 'value' instead.
SetReg(context, 0, reg); | |
SetReg(context, 0, value); |
Copilot uses AI. Check for mistakes.
2a3de0d
to
9e28998
Compare
src/coreclr/vm/interpexec.cpp
Outdated
|
||
TADDR resumeSP; | ||
TADDR resumeIP; | ||
pInterpreterFrame->GetAndClearResumeContext(&resumeSP, &resumeIP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have all this throwing logic extracted in a separate method (like mono's interp_throw
) ? There will be tons of places where we will throw exceptions as part of different opcodes, like null checks, ovf checks etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect that exceptions from things like null checks are going to be thrown using COMPlusThrow
, e.g. COMPlusThrow(kNullReferenceException)
, just like exceptions are thrown everywhere else in the VM. C++ exception is then going to caught by try/catch that's wraps the interpreter execution loop and sent for managed exception processing from there. @janvorli Does this sound right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those scenarios are logically identical to INTOP_THROW
so I would say they should be handled the same way. You could implement CEE_LDIND
for example either as a single INTOP_LDIND
opcode that does the null check as part of the opcode and throws the exception or as multiple granular opcodes: ldnull + bne; ldnull + throw; ldind_unsafe;
. Also I'm not sure why we would want to throw an additional c++ exception when we could do the managed exception dispatch directly.
Also resuming the context from the c++ catch handler back inside the try (since execution might resume in those interpreter frames) seems dubious. It also seems unclear whether wasm would work in this configuration, given longjmp is probably implemented via an exception throw.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interpret needs to able to deal with C++ exceptions thrown in the rest of the VM.. For example, https://github.com/dotnet/runtime/pull/114529/files#diff-13492d6b1897666f5322f30e5f2ae4a9b35f084ce84d7f78827541070cc57329R990 can throw C++ exception that needs to be converted to managed exception and handled as managed exception. How is it going to be done? Is there a viable alternative to resuming from the C++ catch handler?
I think that the exceptions like NullReferenceException thrown by the interpret are more like C++ exceptions thrown in the rest of the VM, so we may want to use the same mechanism for them. It follows what we do with regular JIT. For example, calling an interface method on null pointer can end up here:
runtime/src/coreclr/vm/virtualcallstub.cpp
Line 1715 in b951b3d
COMPlusThrow(kNullReferenceException); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I imagined solving this problem would be to wrap calls into the runtime that might throw with a try/catch. For example:
try {
pMD = pMD->GetMethodDescOfVirtualizedCode(pThisArg, pMD->GetMethodTable());
} catch (ex) {
ex_managed = convert_to_managed (ex);
}
if (ex_managed) {
ThrowManagedEx(ex_managed);
// resume to the right interp frame here
}
And for all exceptions produced by interpreter code (that we know exactly), we would directly use ThrowManagedEx
. Seems like this approach would be the simplest, I don't expect a ton of callsites where we would need to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With respect to the current implementation (i.e. not the discussion with C++ try/catch), I have found a bug with that.
By the time we resume into the interpreter to execute the logic after the catch block, the right context to restore into is not the same as the context capture during throw.
In this example, things will fail.
try
{
try
{
throw null;
}
catch
{
throw null;
}
}
catch
{
}
// During the second resume, I am about to resume to here ...
During the second throw, the caller has the following stack (callee on top, caller at the bottom convention)
DispatchException
InterpExecMethod calling funclet // The second throw captured this context
DispatchException
InterpExecMethod calling function
When the second catch resumes, we really shouldn't have the CallFunclet
on the native stack.
Logically, the context to restore should be the context captured on the native frame that hosted with the catch
, not with the throw
.
it would be useful to demonstrate that this works on one callsite as part of this PR. |
I have been working with @janvorli on exception handling for the interpreter. On my side, building on top of this change, we know that resuming from main...cshung:runtime:public/interpreter-exception
|
As Andrew said, without his changes, there is not much we can demonstrate here. With his changes, we can do that. |
This change adds support for exception handling for cases when interpreter frames participate in the process. That means when an exception is thrown from an interpreter frame or when it is propagated over interpreter frames. This doesn't add all the exception handling support to the interpreter itself, there is a follow up PR that contains that code.
ifdef-out setting m_SSP in the InterpreterFrame
We cannot try to get the code manager from a crawl frame when it is not on a frameless frame.
This commit moves the resuming after catch to using native exception handling instead of fragile context capturing, which was not correct anyways. It also adds handling of exceptions comming out of native runtime methods called from the interpreter.
c5e9b79
to
d0a342d
Compare
I've just added a commit that moves the resuming after catch to a helper native exception propagation plan. When resuming in an interpreted frame, we virtually unwind from the current context upto the first native frame that is in the same block of native frames as the InterpExecMethod of the target frame and then throw the special exception from there. |
Why do we need that? I would expect that the core interpreter won't need to call GC protect, and any (throwing) calls to the VM would need to be behind unwind-and-continue wrapper (like Vlad suggested in #114649 (comment)) that that will do natural C++ uwnwind. |
@jkotas this is related to the lengthy comment above related to your question on the |
I would expect WASM exception handling to call the C++ destructor naturally whenever it unwinds a C++ frame. Is it not going to be the case? |
|
||
switch (*ip) | ||
INSTALL_MANAGED_EXCEPTION_DISPATCHER; | ||
INSTALL_UNWIND_AND_CONTINUE_HANDLER; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is INSTALL_UNWIND_AND_CONTINUE_HANDLER
going to break exception filter semantics when we have mix of interpreted and JIIT/AOT frames on the stack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it works fine with filter. I do not understand why we need INSTALL_UNWIND_AND_CONTINUE_HANDLER
here. What is going to break if INSTALL_UNWIND_AND_CONTINUE_HANDLER
is deleted here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's for handling exceptions coming out of calls to native runtime. The INSTALL_MANAGED_EXCEPTION_DISPATCHER does nothing for Windows. Without it, those exceptions would just flow through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. This means that the interpreter should be able to use regular COMPlusThrow to throw exceptions like I originally assumed in #114649 (comment)
Yes and that's why that change was added. Without it, the destructor of the GCFrame would try to pop it out of the frame list and it would assert, because it would find the frame is not the first on the list. The other way to solve that would be to not to pop the GCFrames upfront when resuming after catch and let the destructor pop them. However, that would make it different from when those frames are popped on WASM and non-WASM and I'd prefer this to be uniform. |
We have introduced a dedicated list for GC protect a while ago since popping them out-of-line as part Frame list was a mess with a lot of problems. It sounds like we are recreating the problem again. Is it possible to make this work without touching how GC protect frames work? |
I am not sure I understand. This change ensures that it works the same way with and without the interpreter. The frames with the GC protects are virtually dead. |
Ok, I have missed that we have special handling for GCFrames in PopExplicitFrames. My mental model for Could you please add a comment to |
|
||
switch (*ip) | ||
INSTALL_MANAGED_EXCEPTION_DISPATCHER; | ||
INSTALL_UNWIND_AND_CONTINUE_HANDLER; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. This means that the interpreter should be able to use regular COMPlusThrow to throw exceptions like I originally assumed in #114649 (comment)
Yes, I've verified that it works already. |
I am actually going to do that differently, storing those in the stack walker state instead. The storage in the frame is a remainder of the first version when I was using that during restoring context after catch and I think that keeping it in that state didn't work for some reason. |
* Move the saved registers to the StackFrameIterator * Fix MUSL build * Add {Get/Set}{First/Second}ArgumentRegister to all architectures and use those in ExecuteFunctionBelowContext.
@jkotas I believe I have addressed all of your feedback. |
src/coreclr/vm/codeman.cpp
Outdated
EE_ILEXCEPTION_CLAUSE EHClause; | ||
pJitMan->GetNextEHClause(&EnumState, &EHClause); | ||
|
||
if (EHClause.HandlerStartPC <= relOffset && relOffset < EHClause.HandlerEndPC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work for filters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure whether you wanted to push this into this PR since there is no matching logic in the JIT, so it is hard to tell whether it is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it works for filters too. The interpreter compiler sets the HandlerEndPC
even for filters.
I am not sure whether you wanted to push this into this PR since there is no matching logic in the JIT, so it is hard to tell whether it is correct.
Makes sense, let's make it part of Andrew's part of the change instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, actually, even JIT sets the HandlerEndPC
for filters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem that I see is that the filter funclet is a separate funclet that is not included in the (HandlerStartPC, HandledEndPC) range, with the JIT at least. I assume that you want to use the same encoding rules for interpreter as well.
I have done a quick test.
try
{
Console.WriteLine("Try");
}
catch (Exception e) when (args.Length > 10)
{
Console.WriteLine("Catch");
}
Produces
Notice that FilterOffset is not included in (HandlerStartPC, HandledEndPC).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I am sorry, you are right, the ranges I am scanning contain the catch handler for the filter, not the filter itself. That's the same thing for the interpreter then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need to figure out some other mechanism for getting the end address of filter funclets. The JIT uses unwind info stuff, which we don't have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add a rule that the filter funclet must immediately precede the handler. (This rule would apply to interpreter JIT only.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, that sounds like a reasonable solution.
The failures look related to the change. |
These were interesting. I was missing FEATURE_INTERPRETER definition for the managed code. The offset verification was passing, because that's done in native build, but the managed code was using wrong offsets. I'll push a fix in a minute. |
I've fixed the issue causing the tests failures and reverted the GetFuncletStartAddress change |
This change adds support for exception handling for cases when interpreter frames either throw an exception or the exception is propagated through them. It doesn't add all the exception handling support to the interpreter itself, there is a follow up PR that contains that code.
I had to reorder the code around handling thread abort when resuming after catch for the case when the resume occurs in the interpreted code. The resuming goes to the native context of the
InterpExecMethod
in that case, so we need to update the current context in theREGDISPLAY
to point there before we extract thedwResumePC
. TheInterpreterFrame
stores copies of the registers that we reuse for interpreted frames stack walking and restores them before resuming there and at the same time stores the interpreter PC and SP for the resuming is stored in theInterpreterFrame