Fix x64 data breakpoint handling after CORINFO_HELP_ARRADDR_ST inlining #127251

Open
tommcdon wants to merge 1 commit into dotnet:main from
tommcdon:dev/tommcdon/fix_data_breakpoints
@tommcdon
Member

After #126547, the WriteBarrier FCall was converted from native (FCall) to managed. This affected the debugger's unwind logic for data breakpoint handling (AdjustContextForJITHelpersForDebugger), causing the debugger to unwind into the JIT helper (CastHelpers.StelemRef) rather than user code.

The fix adds a loop after the initial unwind that checks whether the landed-on frame belongs to the CastHelpers class and continues unwinding until it reaches user code. Only x64 data breakpoints are affected: x86 does a raw single-frame stack pop (restoring EIP from ESP) rather than calling VirtualUnwindToFirstManagedCallFrame, so it never hit this problem, and ARM64 does not support data breakpoints.
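
The loop described above can be illustrated with a small standalone sketch. This is not the runtime code: `Frame`, `SkipHelperFrames`, and the string-based class comparison are hypothetical stand-ins for the real CONTEXT unwinding and MethodTable checks, and the stack is modeled as a plain vector ordered innermost-first.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a managed frame. Only the owning class
// matters for this illustration.
struct Frame {
    std::string owningClass;
    std::string method;
};

// Starting at index `start`, keep "unwinding" (advancing toward the caller)
// while the current frame is owned by `helperClass`. Returns the index of
// the first non-helper frame, or stack.size() if none is found.
std::size_t SkipHelperFrames(const std::vector<Frame>& stack,
                             std::size_t start,
                             const std::string& helperClass)
{
    std::size_t i = start;
    while (i < stack.size() && stack[i].owningClass == helperClass) {
        ++i;  // one more unwind step past a helper-owned frame
    }
    return i;
}
```

With a stack of two helper frames followed by a user frame, the function returns the index of the user frame. The bounds check also guarantees termination when every remaining frame is a helper frame.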

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tommcdon added this to the 11.0.0 milestone Apr 21, 2026
tommcdon requested a review from noahfalk April 21, 2026 22:22
tommcdon self-assigned this Apr 21, 2026
Copilot AI review requested due to automatic review settings April 21, 2026 22:22
Contributor

Copilot AI left a comment

Pull request overview

Fixes x64 data breakpoint unwind behavior after CORINFO_HELP_ARRADDR_ST inlining changed the write barrier call path, so the debugger unwinds back to user code instead of stopping in CastHelpers frames.

Changes:

  • Extends the x64 FEATURE_DATABREAKPOINT unwind logic to keep unwinding past CastHelpers-owned managed frames after unwinding out of the native write barrier.
  • Identifies CastHelpers frames via the owning MethodTable to remain stable across tiered compilation.
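
As a rough illustration of why the second point matters, the sketch below models a method that has been compiled twice. The types `TypeInfo` and `CodeVersion` are hypothetical and only mimic the idea that tiered compilation can produce new code addresses for the same method while the owning type stays the same.

```cpp
#include <cstdint>

// Hypothetical stand-in for a MethodTable: one instance per type.
struct TypeInfo {
    const char* name;
};

// Hypothetical stand-in for one compiled version of a method.
struct CodeVersion {
    std::uintptr_t entryPoint;  // differs between compiled versions
    const TypeInfo* owner;      // identical for every version of the method
};

// Identify a frame by its owning type rather than by code address,
// so the check gives the same answer for every compiled version.
bool IsOwnedBy(const CodeVersion& code, const TypeInfo* type)
{
    return code.owner == type;
}
```

Comparing addresses would require tracking every recompiled body; comparing the owner needs only a single pointer comparison.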

Comment thread src/coreclr/vm/excep.cpp
Comment on lines +5526 to +5527
s_pCastHelpersMT = CoreLibBinder::GetExistingClass(CLASS__CASTHELPERS);

Copilot AI Apr 21, 2026

CoreLibBinder::GetExistingClass(CLASS__CASTHELPERS) will _ASSERTE(pMT != NULL) if CastHelpers hasn't been loaded yet (e.g., very early startup / debugger attaches early). Since this path can run on a data breakpoint, please switch to a non-asserting lookup (e.g., CoreLibBinder::GetClassIfExist(CLASS__CASTHELPERS) or g_pCastHelpers once initialized) and handle the null case by breaking out of the loop.

Suggested change
- s_pCastHelpersMT = CoreLibBinder::GetExistingClass(CLASS__CASTHELPERS);
+ s_pCastHelpersMT = CoreLibBinder::GetClassIfExist(CLASS__CASTHELPERS);
+ if (s_pCastHelpersMT == nullptr)
+     break;
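
The pattern suggested here, a lookup that returns null instead of asserting plus an early exit when the type is not loaded, can be sketched in isolation. `TypeRegistry`, `FindIfLoaded`, and `ShouldKeepUnwinding` are hypothetical names for this illustration and are not CoreLibBinder APIs.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a loaded type.
struct TypeInfo {
    std::string name;
};

// Hypothetical registry of loaded types.
class TypeRegistry {
public:
    void MarkLoaded(const std::string& name) { loaded_[name] = TypeInfo{name}; }

    // Non-asserting lookup: returns nullptr when the type is not loaded yet.
    const TypeInfo* FindIfLoaded(const std::string& name) const {
        auto it = loaded_.find(name);
        return it == loaded_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, TypeInfo> loaded_;
};

// Returns true only if the helper type is loaded and owns the frame.
// When the helper type is missing, no frame can belong to it, so the
// caller should stop unwinding (the equivalent of `break`).
bool ShouldKeepUnwinding(const TypeRegistry& registry,
                         const TypeInfo* frameOwner,
                         const std::string& helperName)
{
    const TypeInfo* helper = registry.FindIfLoaded(helperName);
    if (helper == nullptr) {
        return false;
    }
    return frameOwner == helper;
}
```

Treating "not loaded" as "not a helper frame" is safe under the assumption that code from an unloaded type cannot be on the stack.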

Member

The local caching is useless. CoreLibBinder has similar cache already.

Comment thread src/coreclr/vm/excep.cpp
Comment on lines +5510 to +5512
static MethodTable* s_pCastHelpersMT = nullptr;
while (true)
{

Copilot AI Apr 21, 2026

static MethodTable* s_pCastHelpersMT is updated via an unsynchronized check-then-set. If multiple threads hit data breakpoints concurrently, this becomes a C++ data race. Consider removing the cache (data breakpoints are rare) or switching to an atomic/volatile pattern (e.g., VolatilePtr/VolatileLoad+VolatileStore or InterlockedCompareExchange).
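
For readers unfamiliar with the atomic alternative mentioned here, the following is a minimal sketch using standard C++ `std::atomic` rather than the runtime's VolatilePtr or Interlocked helpers. `TypeInfo`, `SlowLookup`, and `GetCachedHelperType` are hypothetical names, and the lookup is assumed to be idempotent, so it is harmless if two threads both perform it.

```cpp
#include <atomic>

// Hypothetical stand-in for a MethodTable.
struct TypeInfo {
    int id;
};

// Hypothetical slow lookup. Assumed to return the same pointer every time.
const TypeInfo* SlowLookup()
{
    static const TypeInfo helperType{42};
    return &helperType;
}

// Race-free lazy cache: the pointer is read and published atomically.
const TypeInfo* GetCachedHelperType()
{
    static std::atomic<const TypeInfo*> cache{nullptr};

    const TypeInfo* p = cache.load(std::memory_order_acquire);
    if (p == nullptr) {
        const TypeInfo* fresh = SlowLookup();
        // Publish only if the cache is still empty. On failure, `p` is
        // updated to the value another thread already stored.
        if (cache.compare_exchange_strong(p, fresh,
                                          std::memory_order_acq_rel,
                                          std::memory_order_acquire)) {
            p = fresh;
        }
    }
    return p;
}
```

Removing the cache altogether, as the comment also suggests, avoids the question entirely and may be the simpler choice on a rarely executed path.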

Comment thread src/coreclr/vm/excep.cpp
if (IsIPInMarkedJitHelper(ip))
{
Thread::VirtualUnwindToFirstManagedCallFrame(pContext);

Member

We have other helpers that have internal managed code between the user code and a write barrier. For example, BulkMoveWithWriteBarrier. What is the invariant that the debugger expects here?

Filtering here looks fragile. Can this filtering be done in the higher-level debugger instead? The higher-level debugger has a better idea of what counts as user code.

@jkotas
Member

jkotas commented Apr 21, 2026

@EgorBo FYI
