Fix x86 runtime async frame pointer mismatch in GetSpForDiagnosticReporting #126717

Open
tommcdon wants to merge 1 commit into dotnet:main from tommcdon:dev/tommcdon/fixx86debuggernotification
Conversation

@tommcdon
Member

@tommcdon tommcdon commented Apr 9, 2026

For certain runtime async frames on x86, a frame pointer mismatch in GetSpForDiagnosticReporting caused ICorDebugManagedCallback2::Exception to receive a null ICorDebugFrame for DEBUG_EXCEPTION_CATCH_HANDLER_FOUND notifications. The fix adjusts GetSpForDiagnosticReporting to account for the stack layout differences of runtime async variant methods on x86.

…orting

Adjust GetSpForDiagnosticReporting to correctly handle runtime async variant method stack layout on x86.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@tommcdon tommcdon added this to the 11.0.0 milestone Apr 9, 2026
@tommcdon tommcdon requested review from janvorli and noahfalk April 9, 2026 16:18
@tommcdon tommcdon self-assigned this Apr 9, 2026
Copilot AI review requested due to automatic review settings April 9, 2026 16:18
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

Adjusts the stack pointer reported to the debugger on x86 for runtime “async call” frames so that ICorDebugManagedCallback2::Exception callbacks (notably DEBUG_EXCEPTION_CATCH_HANDLER_FOUND) can resolve a non-null ICorDebugFrame.

Changes:

  • Extends GetSpForDiagnosticReporting to optionally accept a MethodDesc* and apply an extra x86 adjustment for runtime async methods (IsAsyncMethod()).
  • Updates exception/debugger callback sites to pass the current MethodDesc* into GetSpForDiagnosticReporting.
Comments suppressed due to low confidence (1)

src/coreclr/vm/exceptionhandling.cpp:2968

  • pMD is only referenced under ESTABLISHER_FRAME_ADDRESS_IS_CALLER_SP + TARGET_X86. In other builds (e.g., x64 Unix where -Wall is enabled, often with -Werror), this new parameter can become unused and may trigger an unused-parameter warning. Consider adding UNREFERENCED_PARAMETER(pMD); in the #else path and/or in the non-x86 path to keep all configurations warning-free.
static TADDR GetSpForDiagnosticReporting(REGDISPLAY *pRD, MethodDesc *pMD = NULL)
{
#ifdef ESTABLISHER_FRAME_ADDRESS_IS_CALLER_SP
    TADDR sp = CallerStackFrame::FromRegDisplay(pRD).SP;
#if defined(TARGET_X86)
    sp -= sizeof(TADDR);
    // On x86, runtime async methods have stack parameters that cause CallerSP
    // to sit above the parameter area.  The DBI uses PCTAddr as the frame
    // pointer, which is at the return address (below the parameters).
    // Subtract an extra sizeof(TADDR) to account for the stack parameter.
    if (pMD != NULL && pMD->IsAsyncMethod())
    {
        sp -= sizeof(TADDR);
    }
#endif
    return sp;
#else
    return GetSP(pRD->pCurrentContext);
#endif
}
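
A minimal sketch of the warning-suppression pattern suggested above. All names here are hypothetical stand-ins (`Addr`, `MethodDescSketch`, `SKETCH_UNREFERENCED_PARAMETER`, `SKETCH_TARGET_X86`); the runtime's real types and macro are not reproduced. The idea is simply that the branch which never reads `pMD` marks it as intentionally unused:

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only.
using Addr = std::uintptr_t;
struct MethodDescSketch { bool isAsync; };
#define SKETCH_UNREFERENCED_PARAMETER(p) (void)(p)

// Define SKETCH_TARGET_X86 to compile the branch that reads pMD.
constexpr Addr SpForReportingSketch(Addr callerSp,
                                    const MethodDescSketch* pMD = nullptr)
{
#if defined(SKETCH_TARGET_X86)
    Addr sp = callerSp - sizeof(Addr);
    if (pMD != nullptr && pMD->isAsync)
    {
        sp -= sizeof(Addr);
    }
    return sp;
#else
    // pMD is not read on this path; mark it so unused-parameter
    // warnings stay quiet in builds that treat warnings as errors.
    SKETCH_UNREFERENCED_PARAMETER(pMD);
    return callerSp;
#endif
}
```

With `SKETCH_TARGET_X86` undefined, the function returns `callerSp` unchanged and compiles without an unused-parameter warning.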

Comment on lines +2956 to +2963
// On x86, runtime async methods have stack parameters that cause CallerSP
// to sit above the parameter area. The DBI uses PCTAddr as the frame
// pointer, which is at the return address (below the parameters).
// Subtract an extra sizeof(TADDR) to account for the stack parameter.
if (pMD != NULL && pMD->IsAsyncMethod())
{
    sp -= sizeof(TADDR);
}
Member

Is this because of the continuation? Or what exactly is the difference here?

Member

I am confused here because the continuation is not the only possible extra argument we can have -- we can also have generic context (and vararg cookie), but those do not seem to require handling here.

I am wondering if we instead need a fix on the JIT side in the GC information. Perhaps something like

unsigned GetPushedArgSize(hdrInfo * info, PTR_CBYTE table, DWORD curOffs)
{
    SUPPORTS_DAC;
    unsigned sz;
    if (info->interruptible)
    {
        sz = scanArgRegTableI(skipToArgReg(*info, table),
                              curOffs,
                              curOffs,
                              info);
    }
    else
    {
        sz = scanArgRegTable(skipToArgReg(*info, table),
                             curOffs,
                             info);
    }
    return sz;
}

is not working properly for the continuation.

Member Author

> Is this because of the continuation? Or what exactly is the difference here?

Correct, this is due to the continuation parameter in the async calling convention being passed on the stack.

> I am wondering if we instead need a fix on the JIT side in the GC information

Good suggestion, taking a look.

Member

> Correct - this is due to the continuation parameter in the async calling convention being passed on the stack

The continuation parameter is not always passed on the stack. It is just a regular argument that follows the regular calling convention, which may or may not put it on the stack.
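
To make the concern concrete, here is a rough arithmetic sketch. It assumes a simplified callee-pop model in which the caller's SP sits one return-address slot plus the stack-argument bytes above the return address slot. `Addr32`, `kSlot`, and both function names are hypothetical, not runtime APIs. Under that model, a fixed one-slot adjustment for async methods matches the return address slot only when exactly one argument slot is on the stack:

```cpp
#include <cstdint>

// Hypothetical 32-bit address type standing in for TADDR on an x86 target.
using Addr32 = std::uint32_t;
constexpr Addr32 kSlot = sizeof(Addr32); // one 4-byte stack slot

// Simplified callee-pop model (an assumption of this sketch):
//   callerSp = returnAddressSlot + kSlot + stackArgBytes
constexpr Addr32 ReturnAddressSlot(Addr32 callerSp, Addr32 stackArgBytes)
{
    return callerSp - kSlot - stackArgBytes;
}

// Mirrors the adjustment in this PR: one slot always, one more for async.
constexpr Addr32 SpWithFixedAsyncAdjustment(Addr32 callerSp, bool isAsync)
{
    return callerSp - kSlot - (isAsync ? kSlot : 0u);
}

// Exactly one stack argument: the fixed adjustment hits the return address slot.
static_assert(SpWithFixedAsyncAdjustment(0x1000, true) ==
              ReturnAddressSlot(0x1000, kSlot), "one stack slot matches");

// Continuation passed in a register (no stack arguments): one slot too low.
static_assert(SpWithFixedAsyncAdjustment(0x1000, true) ==
              ReturnAddressSlot(0x1000, 0) - kSlot, "register case is off by one slot");

// Two stack argument slots: one slot too high.
static_assert(SpWithFixedAsyncAdjustment(0x1000, true) ==
              ReturnAddressSlot(0x1000, 2 * kSlot) + kSlot, "two slots is off by one slot");
```

If this model holds, deriving the adjustment from the actual pushed-argument size would cover all three cases, which is the direction the GC-information suggestion in this thread points toward.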

Member

Yeah, I would hope that this can be addressed as part of the GC or unwind information attached to the method rather than custom post-processing for async frames. I'd worry that any discrepancy we try to address here means we've got a leaky abstraction, and this probably isn't the only place it would be leaking.

@janvorli
Member

janvorli commented Apr 9, 2026

@tommcdon I wonder - was this always a problem? I am asking since there was a change in the ifdef in this code in January (#122833) from

#if defined(FEATURE_EH_FUNCLETS) && defined(TARGET_X86)
    sp -= sizeof(TADDR); // For X86 with funclets we want the address 1 pointer into the callee.
#endif // defined(FEATURE_EH_FUNCLETS) && defined(TARGET_X86)

I wonder if the ifdef was really meant to be for Linux x86 only (which was previously the only configuration with defined(FEATURE_EH_FUNCLETS)) and was unrelated to funclets per se.


5 participants