Fix x64 data breakpoint handling after CORINFO_HELP_ARRADDR_ST inlining#127251
Fix x64 data breakpoint handling after CORINFO_HELP_ARRADDR_ST inlining#127251tommcdon wants to merge 1 commit intodotnet:mainfrom
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes x64 data breakpoint unwind behavior after CORINFO_HELP_ARRADDR_ST inlining changed the write barrier call path, so the debugger unwinds back to user code instead of stopping in CastHelpers frames.
Changes:
- Extends the x64
FEATURE_DATABREAKPOINTunwind logic to keep unwinding pastCastHelpers-owned managed frames after unwinding out of the native write barrier. - Identifies
CastHelpersframes via the owningMethodTableto remain stable across tiered compilation.
| s_pCastHelpersMT = CoreLibBinder::GetExistingClass(CLASS__CASTHELPERS); | ||
|
|
There was a problem hiding this comment.
CoreLibBinder::GetExistingClass(CLASS__CASTHELPERS) will _ASSERTE(pMT != NULL) if CastHelpers hasn't been loaded yet (e.g., very early startup / debugger attaches early). Since this path can run on a data breakpoint, please switch to a non-asserting lookup (e.g., CoreLibBinder::GetClassIfExist(CLASS__CASTHELPERS) or g_pCastHelpers once initialized) and handle the null case by breaking out of the loop.
| s_pCastHelpersMT = CoreLibBinder::GetExistingClass(CLASS__CASTHELPERS); | |
| s_pCastHelpersMT = CoreLibBinder::GetClassIfExist(CLASS__CASTHELPERS); | |
| if (s_pCastHelpersMT == nullptr) | |
| break; |
There was a problem hiding this comment.
The local caching is useless. CoreLibBinder has similar cache already.
| static MethodTable* s_pCastHelpersMT = nullptr; | ||
| while (true) | ||
| { |
There was a problem hiding this comment.
static MethodTable* s_pCastHelpersMT is updated via an unsynchronized check-then-set. If multiple threads hit data breakpoints concurrently, this becomes a C++ data race. Consider removing the cache (data breakpoints are rare) or switching to an atomic/volatile pattern (e.g., VolatilePtr/VolatileLoad+VolatileStore or InterlockedCompareExchange).
| if (IsIPInMarkedJitHelper(ip)) | ||
| { | ||
| Thread::VirtualUnwindToFirstManagedCallFrame(pContext); | ||
|
|
There was a problem hiding this comment.
We have other helpers that have internal managed code between the user code and a write barrier. For example, BulkMoveWithWriteBarrier. What is the invariant that the debugger expects here?
This looks fragile to filter it here. Can this filtering be done in the higher-level debugger instead? The higher level debugger has a better idea what' a user code.
|
@EgorBo FYI |
After #126547, the WriteBarrier FCall was converted from native (FCall) to managed. This affected the debugger's unwind logic for data breakpoint handling (AdjustContextForJITHelpersForDebugger) resulting in the debugger to unwind into the JIT helper (CastHelpers.StelemRef) rather than user code.
The fix adds a loop after the initial unwind that checks whether the landed-on frame belongs to the CastHelpers
class and continues unwinding until it reaches user code. This only affects x64 data breakpoints, as x86 does a raw single-frame stack pop (restores EIP from ESP) rather than VirtualUnwindToFirstManagedCallFrame, so it was unaffected. ARM64 does not support data breakpoints.