[cDAC] implement GetRuntimeNameByAddress #127134
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag
Pull request overview
This PR adds cDAC support needed to implement IXCLRDataProcess::GetRuntimeNameByAddress by enabling stub-kind classification in the ExecutionManager contract, exposing additional RangeSection/range-list metadata, and wiring up the corresponding test and native data-descriptor updates.
Changes:
- Implement `GetRuntimeNameByAddress` in the managed legacy SOS DAC implementation using cDAC contracts (ExecutionManager, PrecodeStubs, AuxiliarySymbols).
- Add stub classification (`StubKind`, `GetStubKind`) and the data plumbing needed for stub range lists (`RangeSection.RangeList`, `CodeRangeMapRangeList`).
- Extend tests/mocks and native CDAC descriptors/globals to support the new data contract fields and globals.
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| src/native/managed/cdac/tests/MockDescriptors/MockDescriptors.ExecutionManager.cs | Extends mock RangeSection with RangeList and adds mock CodeRangeMapRangeList layout + builder helpers. |
| src/native/managed/cdac/tests/ExecutionManager/ExecutionManagerTests.cs | Adds tests for stub-kind classification and registers the new mock datatype. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/SOSDacImpl.IXCLRDataProcess.cs | Implements GetRuntimeNameByAddress using cDAC contracts and formats CLRStub names. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/RangeSection.cs | Adds RangeList pointer field to the managed RangeSection data model. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/CodeRangeMapRangeList.cs | Introduces managed data model for CodeRangeMapRangeList. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/PrecodeStubs_Common.cs | Adds GetCandidateEntryPoints enumeration support for precode name resolution. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManager_1.cs | Exposes GetStubKind in v1 contract wrapper. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManager_2.cs | Exposes GetStubKind in v2 contract wrapper. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManagerCore.cs | Implements GetStubKind logic (globals + range-section lookup) and reads new optional globals. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManagerCore.ReadyToRunJitManager.cs | Adds stub-kind classification for R2R thunk regions. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/ExecutionManager/ExecutionManagerCore.EEJitManager.cs | Adds stub-kind classification for range-list stubs and code-header stub code blocks. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Constants.cs | Adds new global names for well-known stubs/helpers. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/DataType.cs | Adds CodeRangeMapRangeList to the DataType enum. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IPrecodeStubs.cs | Adds GetCandidateEntryPoints API to support precode resolution. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IExecutionManager.cs | Adds StubKind enum and GetStubKind API. |
| src/coreclr/vm/loaderallocator.hpp | Exposes CodeRangeMapRangeList::_rangeListType via cdac_data<>. |
| src/coreclr/vm/datadescriptor/datadescriptor.inc | Adds RangeSection.RangeList, introduces CodeRangeMapRangeList type descriptor, and exports new stub globals. |
| src/coreclr/vm/codeman.h | Adds cdac_data<RangeSection>::RangeList offset. |
| src/coreclr/inc/gfunc_list.h | Adds DACGFN entries for new helper globals (platform-conditional). |
| src/coreclr/debug/daccess/daccess.cpp | Adjusts CLRStub address formatting to fixed-width hex. |
| docs/design/datacontracts/PrecodeStubs.md | Documents new GetCandidateEntryPoints API and its intended behavior. |
| docs/design/datacontracts/ExecutionManager.md | Documents GetStubKind, new RangeList/CodeRangeMapRangeList descriptors, and new globals. |
…rTests.cs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
### Stub Kind Classification

`GetStubKind` classifies a code address as a known stub type or managed code. It first checks the address against well-known global stub pointers (`ThePreStub`, `VarargPInvokeStub`, `VarargPInvokeStub_RetBuffArg`, `GenericPInvokeCalliHelper`, `TailCallJitHelper`). If the address matches one of these, it returns the corresponding `StubKind` immediately.
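The first stage described above amounts to a pointer comparison against a small table. The following is a minimal sketch of that idea, not the PR's implementation: the `WellKnownStub` struct, the `StubKind` enumerator names, and `ClassifyWellKnownStub` are hypothetical stand-ins, and using 0 to mark an absent global is an assumption. Only the five global names come from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical enumerators; the PR's actual StubKind values may differ.
enum class StubKind {
    Unknown,
    ThePreStub,
    VarargPInvokeStub,
    VarargPInvokeStubRetBuffArg,
    GenericPInvokeCalliHelper,
    TailCallJitHelper,
};

// One well-known global: its address in the target (assumed to be 0 when the
// global is absent on this platform) and the kind reported on a match.
struct WellKnownStub {
    std::uintptr_t address;
    StubKind kind;
};

// Returns the kind of the first global whose address equals `address`, or
// StubKind::Unknown so the caller can fall through to a range-section lookup.
StubKind ClassifyWellKnownStub(const WellKnownStub* stubs, std::size_t count,
                               std::uintptr_t address)
{
    assert(count == 0 || stubs != nullptr);
    if (address == 0)
        return StubKind::Unknown;  // a null address must not match an absent global
    for (std::size_t i = 0; i < count; ++i) {
        if (stubs[i].address == address)
            return stubs[i].kind;
    }
    return StubKind::Unknown;
}
```

Returning `Unknown` rather than failing keeps this stage a cheap early exit in front of the more expensive lookup.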
Are we doing this classification just to match the name returned by the legacy DAC?
I think we should depend on coreclr.dll symbols for the names of these assembly helpers (similar to what we have done for JIT helpers).
Yes, we are matching the DAC. The DAC uses FindStubManager, which does find these helpers. We can modify the DAC to exclude those, and rely on the debugger to fall back to coreclr symbols.
I assume that this is only about the symbols windbg displays in disassembly. We do not need to match 1:1 what windbg displays in disassembly. We just need to make sure that what it displays is reasonable.
If we do nothing (i.e., stop recognizing these addresses in GetRuntimeNameByAddress), what is going to break?
BTW: the windbg disassembly experience that involves this API is very poor today. When I run !u on a method that just calls Console.ReadLine, I see this:
00007ff9`c0cb4780 55 push rbp
00007ff9`c0cb4781 4883ec20 sub rsp,20h
00007ff9`c0cb4785 488d6c2420 lea rbp,[rsp+20h]
00007ff9`c0cb478a 48894d10 mov qword ptr [rbp+10h],rcx
>>> 00007ff9`c0cb478e ff155c051d00 call qword ptr [CLRStub[MethodDescPrestub]@00007FF9C0E84CF0 (00007ff9`c0e84cf0)]
00007ff9`c0cb4794 90 nop
00007ff9`c0cb4795 4883c420 add rsp,20h
00007ff9`c0cb4799 5d pop rbp
00007ff9`c0cb479a c3 ret
I assume that MethodDescPrestub in the disassembly comes from this API. It is close to useless information. A raw address with no symbol name would be about as useful.
I would like to see something like `call qword ptr [System_Console!System.Console.ReadLine (00007ff9c0e84cf0)]` in the disassembly, similar to what was the case in .NET Framework. The contract of these DAC APIs is poorly defined and windbg makes a lot of assumptions about what they do. These details changed as the runtime evolved and ended up breaking the experience. Instead of trying to match 1:1 what these APIs do currently, we should focus on what they need to do to produce human-friendly disassembly in windbg in common cases.
  // Compute the address as a string safely.
  WCHAR addrString[Max64BitHexString + 1];
- FormatInteger(addrString, ARRAY_SIZE(addrString), "%p", stubAddr);
+ FormatInteger(addrString, ARRAY_SIZE(addrString), sizeof(void*) == 8 ? "%016llX" : "%08X", stubAddr);
FormatInteger is passed stubAddr (a TADDR/uintptr_t) with the format string %016llX on 64-bit. On non-Windows 64-bit platforms, uintptr_t is typically unsigned long, which does not match %llX and can cause undefined behavior in sprintf_s. Use a format specifier that matches uintptr_t (e.g., PRIxPTR) or cast stubAddr to an explicit unsigned long long (and similarly to unsigned int for the 32-bit branch) so the varargs type always matches the format string.
- FormatInteger(addrString, ARRAY_SIZE(addrString), sizeof(void*) == 8 ? "%016llX" : "%08X", stubAddr);
+ #ifdef HOST_64BIT
+ FormatInteger(addrString, ARRAY_SIZE(addrString), "%016llX", static_cast<unsigned long long>(stubAddr));
+ #else
+ FormatInteger(addrString, ARRAY_SIZE(addrString), "%08X", static_cast<unsigned int>(stubAddr));
+ #endif
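The other option the comment mentions, a format macro that always matches `uintptr_t`, can be sketched in standard C++ as follows. This is an illustration only: it uses `std::snprintf` and narrow characters rather than the runtime's `FormatInteger` and `WCHAR` buffer, and `FormatStubAddress` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats an address as fixed-width uppercase hex: 16 digits when uintptr_t
// is 64 bits wide, 8 digits when it is 32 bits wide.
std::string FormatStubAddress(std::uintptr_t addr)
{
    char buf[2 * sizeof(std::uintptr_t) + 1];
    const int width = static_cast<int>(2 * sizeof(std::uintptr_t));
    // PRIXPTR expands to the conversion that matches uintptr_t on this
    // platform, so the varargs type always agrees with the format string.
    const int written = std::snprintf(buf, sizeof(buf), "%0*" PRIXPTR, width, addr);
    assert(written == width);
    (void)written;  // silence the unused-variable warning when NDEBUG is set
    return std::string(buf);
}
```

Deriving the width from `sizeof(std::uintptr_t)` removes the need for a preprocessor branch, at the cost of tying the output width to the host pointer size rather than to an explicit target setting.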
// Enumerates candidate precode entry points near a given code address.
// This is used to resolve a code address that falls within a
// precode stub back to its entry point.
IEnumerable<TargetCodePointer> GetCandidateEntryPoints(TargetCodePointer address);
We should be able to find the precode start address using some address math rather than having to do guess-and-check repeatedly. I think the algorithm would look something like this:
TADDR GetStubStartFromInterior(RangeSection* pRS, TADDR interiorPrecodeAddress)
{
    if (pRS == nullptr || !(pRS->_flags & RangeSection::RANGE_SECTION_RANGELIST))
        return 0;

    size_t stubSize = 0;
    switch (pRS->_pRangeList->GetCodeBlockKind())
    {
        case STUB_CODE_BLOCK_FIXUPPRECODE:
            stubSize = FixupPrecode::CodeSize;
            break;
        case STUB_CODE_BLOCK_STUBPRECODE:
            stubSize = StubPrecode::CodeSize;
            break;
        default:
            return 0;
    }

    const size_t pageMask = GetStubCodePageSize() - 1;
    const TADDR pageBase = interiorPrecodeAddress & ~static_cast<TADDR>(pageMask);
    const size_t offset = static_cast<size_t>(interiorPrecodeAddress - pageBase);
    return pageBase + (offset / stubSize) * stubSize;
}
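To make the arithmetic concrete, here is a standalone version of just the rounding step that can be compiled outside the runtime. The function name and both parameters are hypothetical, and it assumes stubs are packed back to back from the start of each power-of-two-sized code page, which is what the sketch above implies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds an interior address down to the start of the stub that contains it.
std::uintptr_t StubStartFromInterior(std::uintptr_t interior,
                                     std::size_t pageSize,
                                     std::size_t stubSize)
{
    assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0);  // power of two
    assert(stubSize != 0 && stubSize <= pageSize);

    const std::uintptr_t pageBase =
        interior & ~static_cast<std::uintptr_t>(pageSize - 1);
    const std::size_t offset = static_cast<std::size_t>(interior - pageBase);
    return pageBase + (offset / stubSize) * stubSize;
}
```

With an illustrative page size of 0x4000 and stub size of 24 bytes, the interior address 0x10004035 sits at offset 53 in its page, 53 / 24 is 2, and the result is 0x10004000 + 48 = 0x10004030. Those two sizes are made-up values for the example, not the runtime's actual constants.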
// Classify a code address as a known stub kind (precode, jump stub, VSD stub, etc.)
// or as managed code. Returns CodeBlockUnknown if the address is not recognized.
StubKind GetStubKind(TargetCodePointer jittedCodeAddress);
I'm worried that this classification would make the interface easy to break if we add, remove, merge, split, or redefine stubs in the future. How about, instead of returning an explicit enumeration value, we return a string with a looser guarantee? Something like this:
// If the code refers to a dynamically generated runtime stub, return a non-null name
// describing what kind of stub it is. The exact kinds of stubs and their functionality
// might shift across different runtime versions.
string? GetStubSymbol(CodeBlockHandle codeInfoHandle);
Hopefully this, together with GetRelativeOffset(), could implement the GetRuntimeNameByAddress API. I am using a CodeBlockHandle here because we might benefit from caching some CodeBlock data about stubs too (RangeSection, StartAddress, CodeBlockStubKind), even though it currently appears we don't. Even if we continue not to cache anything, it still makes the API a little more consistent.
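As a rough illustration of how a caller might consume a nullable stub name, the sketch below builds a string in the `CLRStub[<name>]@<address>` shape seen in the disassembly earlier in this thread. Everything here is hypothetical: `FormatRuntimeName` is not an existing API, a null `const char*` stands in for the proposed `string?` result, and the use of GetRelativeOffset() is left out because its exact role is not specified above.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical caller-side formatting. `stubSymbol` plays the role of the
// proposed nullable string: nullptr means "not a runtime stub".
std::string FormatRuntimeName(const char* stubSymbol, std::uintptr_t stubStart)
{
    if (stubSymbol == nullptr)
        return std::string();  // caller would fall back to other lookups

    // Fixed-width uppercase hex, sized to the host pointer width.
    char addr[2 * sizeof(std::uintptr_t) + 1];
    const int width = static_cast<int>(2 * sizeof(std::uintptr_t));
    const int written =
        std::snprintf(addr, sizeof(addr), "%0*" PRIXPTR, width, stubStart);
    assert(written == width);
    (void)written;  // silence the unused-variable warning when NDEBUG is set

    return "CLRStub[" + std::string(stubSymbol) + "]@" + addr;
}
```

Because the name is an opaque string, a caller written this way would not need to change if stub kinds were later added, merged, or renamed.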