Skip to content

Make ETW rundown operate more effectively with JIT #116354

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

AaronRobinsonMSFT
Copy link
Member

@AaronRobinsonMSFT AaronRobinsonMSFT commented Jun 5, 2025

Fixes #102858

Make the EECodeGenManager API surface narrower.
Rename some locals removing critical section notion and replace with lock.

Rework the CodeHeapIterator.
Rename members to more accurately reflect their purpose. Generally move to more canonical C++.

EECodeGenManager now has an iterator counter. This is used to track the number of outstanding iterators and can be used to defer deletes. Adds are still valid since the iterator will only operate on the state at the time the iterator was created.

Always emit UnwindInfo for methods. This was previously only enabled when the Runtime provider was enabled and a rundown triggered.

Rename some locals removing critical section notion
and replace with lock.

Rework the CodeHeapIterator.
Rename members to more accurately reflect their purpose.
Generally move to more canonical C++.

EECodeGenManager now has a iterator counter. This is
used to track the number of outstanding iterators and
can be used to defer deletes. Adds are still valid since
the iterator will only operate on the state at the time the
iterator was created.
@AaronRobinsonMSFT AaronRobinsonMSFT marked this pull request as ready for review June 6, 2025 17:43
@AaronRobinsonMSFT AaronRobinsonMSFT requested a review from Copilot June 6, 2025 22:39
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refines the ETW rundown and JIT-related code by narrowing the EECodeGenManager API, renaming variables and functions for improved clarity, and reworking the CodeHeapIterator. It also updates logging formats, modifies type usage for iteration counters, and adjusts several low‐level components to support the new design.

  • Refactored API calls (e.g. AllocCode, AllocFromJitMetaHeap) and adapted braced initialization for RemoveJitData.
  • Renamed members (e.g. m_pChunks → m_pChunksBegin) and updated iteration variable types (USHORT → INT32) for consistency.
  • Improved log formatting and introduced move semantics in CrstHolder for better concurrency safety.

Reviewed Changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/coreclr/vm/jitinterface.cpp Updated API calls and braced initialization for RemoveJitData.
src/coreclr/vm/excep.cpp Changed iteration counter types from USHORT to INT32.
src/coreclr/vm/eventtrace.cpp Modified event logging flow and adjusted the code heap iterator usage.
src/coreclr/vm/dynamicmethod.{h,cpp} Added new setters and updated LOG formatting.
src/coreclr/vm/crst.h Added move semantics to CrstHolder for improved lock management.
src/coreclr/vm/codeman.h Renamed members and updated virtual destructors to default.
src/coreclr/vm/class.{inl,h,cpp} Renamed chunk pointers to improve clarity of the method list.
src/coreclr/utilcode/{pedecoder.cpp,pedecoder.h} Adjusted MethodSectionIterator to use non-const pointers.
src/coreclr/inc/utilcode.h Updated the type for Count() from USHORT to INT32.
src/coreclr/inc/contract.inl Revised error message text for tracking failures.
src/coreclr/debug/daccess/* Updated references to use the new m_pAllCodeHeaps naming.
Comments suppressed due to low confidence (6)

src/coreclr/vm/dynamicmethod.cpp:876

  • The updated LOG formatting now uses '%zx' for size_t values; please verify that this format specifier is supported on all target platforms.
LOG((LF_BCL, LL_INFO1000, "Level3 - CodeHeap released [%p, vt(0x%zx)] - ref count %d\n", this, *(size_t*)this, m_AllocationCount));

src/coreclr/utilcode/pedecoder.cpp:2581

  • The removal of const qualifiers from the MethodSectionIterator parameters changes the method's contract; please confirm that allowing mutation on these pointers is intentional and safe for the overall design.
MethodSectionIterator::MethodSectionIterator(void *code, SIZE_T codeSize, void *codeTable, SIZE_T codeTableSize)

src/coreclr/inc/contract.inl:588

  • The updated error message provides clearer information regarding missing tracking data; please ensure that this change is consistently reflected in related documentation and error handling conventions.
strcat_s(Buf,ARRAY_SIZE(Buf), "Missing tracking information. Look for data structures that manipulate contract state (i.e., CrstHolder).\n");

src/coreclr/vm/jitinterface.cpp:11044

  • The use of braced initialization for RemoveJitData is nontraditional; please double-check that overload resolution works as intended across all build configurations.
jitMgr->RemoveJitData({pCodeHeader, m_GCinfo_len, m_EHinfo_len});

src/coreclr/inc/utilcode.h:1095

  • Changing the return type of Count() from USHORT to INT32 is fine, but please ensure that all dependent functions and loops consuming this value are updated to handle INT32 appropriately.
INT32 Count()

src/coreclr/vm/crst.h:371

  • The introduction of move semantics for CrstHolder is a good improvement; please verify that the new move constructor and assignment operator correctly preserve lock invariants in all multi-threaded scenarios.
CrstHolder(CrstHolder&& other)

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test failures look related:

20:03:09.305 Running test: tracing/runtimeeventsource/nativeruntimeeventsource/nativeruntimeeventsource.cmd
ASSERT FAILED
	Expression: codeLength > 0
	Location:   line 1866 in /Users/runner/work/1/s/src/coreclr/vm/eetwain.cpp

@jkotas jkotas added the tenet-performance Performance related issue label Jun 8, 2025
@AaronRobinsonMSFT
Copy link
Member Author

AaronRobinsonMSFT commented Jun 8, 2025

Test failures look related:

20:03:09.305 Running test: tracing/runtimeeventsource/nativeruntimeeventsource/nativeruntimeeventsource.cmd
ASSERT FAILED
	Expression: codeLength > 0
	Location:   line 1866 in /Users/runner/work/1/s/src/coreclr/vm/eetwain.cpp

Yep. I'm trying to root it out on macOS. I tried on Windows and did find a contract violation that I needed to fix. I pushed up that fix and will see where we are.

possible due to MethodDesc iteration.
Rework how DynamicMethodDesc and LCGMethodResolver
destruction works. This was needed due to Crst lock inversion.
Remove CrstIbcProfile
@AaronRobinsonMSFT AaronRobinsonMSFT requested a review from jkotas June 14, 2025 00:39
Fixup Crst relationship of WrapperTemplate and ExecutableAllocatorLock.
@AaronRobinsonMSFT
Copy link
Member Author

@jkotas I think this should be ready to review now.

Thread* finalizerThread = GetFinalizerThread();
return finalizerThread->RequireSyncBlockCleanup()
|| SystemDomain::System()->RequireAppDomainCleanup()
|| (finalizerThread->m_DetachCount > 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
|| (finalizerThread->m_DetachCount > 0)
|| (Thread::m_DetachCount > 0)

m_DetachCount is static

SystemDomain::System()->ProcessDelayedUnloadLoaderAllocators();
}

if (finalizerThread->m_DetachCount > 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (finalizerThread->m_DetachCount > 0
if (Thread::m_DetachCount > 0

@@ -6790,70 +6790,13 @@ T_CONTEXT *Thread::GetFilterContext(void)

void Thread::ClearContext()
{
CONTRACTL {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not strictly related to this PR) Thread::ClearContext can be deleted. I do not see why we need to set `m_fDisableComObjectEagerCleanup = false; in the two places where it is called: One is Thread destructor - clearly useless there, and the other is dead thread during shutdown - no RCWs should be created in that case either.

// On non X86 platforms, the OS defined UnwindInfo (accessed from RUNTIME_FUNCTION
// structures) to support the ability unwind the stack. Unfortunatey the pre-Win8
typedef DPTR(class UnwindInfoTable) PTR_UnwindInfoTable;
// On Windows non-x64 platforms, the OS defined UnwindInfo (accessed from RUNTIME_FUNCTION
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// On Windows non-x64 platforms, the OS defined UnwindInfo (accessed from RUNTIME_FUNCTION
// On Windows x64, publish OS defined UnwindInfo (accessed from RUNTIME_FUNCTION

STANDARD_VM_CONTRACT;
_ASSERTE(pDMD != NULL);

s_delayDestroyDynamicMethodList.LinkHead(new SimpleList<DynamicMethodDesc*>::Node(pDMD));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allocating to destroy an item tends to be cause trouble in corner-cases.

Would it be better to maintain this linked list without allocations? There is convenient LCGMethodResolver::m_next that we can reuse for this link list.

Comment on lines +58 to +59
// On AMD64 JumpStub is used to call functions that is 2GB away. JumpStubs have a CodeHeader
// with NULL MethodDesc, are stored in code heap and are reported by EEJitManager::EnumCode.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// On AMD64 JumpStub is used to call functions that is 2GB away. JumpStubs have a CodeHeader
// with NULL MethodDesc, are stored in code heap and are reported by EEJitManager::EnumCode.
// Stubs (see StubCodeBlockKind) have no MethodDesc. Skip them.


#ifndef TARGET_UNIX
if (!RtlUnwindFtnsInited)
#ifdef TARGET_WINDOWS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#ifdef TARGET_WINDOWS

Redundant ifdef

}
#endif // TARGET_WINDOWS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#endif // TARGET_WINDOWS

static void Initialize();

#if defined(TARGET_AMD64) && defined(TARGET_WINDOWS)
private:
// These are lower level functions that assume you have found the list of UnwindInfoTable entries
// These are used by the stublinker and the high-level method functions above
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// These are used by the stublinker and the high-level method functions above
// These are used by the high-level method functions above

Stublinker does not use these anymore

EX_CATCH
{
MethodDesc* pMD = arg.GetMethodDesc();
LOG((LF_JIT, LL_ERROR, "Failed to delay RemoveJitData for method, %s::%s\n", pMD->m_pszDebugClassName, pMD->m_pszDebugMethodName));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure whether dealing with RemoveJitData is worth the trouble:

For !IsLCGMethod case, the backout for GetJitMetaHeap looks like overengineering given that we are not backing out the code itself that is by far the largest chunk of memory. I think it would be fine to delete GetJitMetaHeap backout logic to keep this simpler.

It leaves is with the IsLCGMethod case. If we hit this OOM path, we will end up with code allocated for DynamicMethod method, but that DynamicMethod will be eventually destroyed. I am not sure whether all parts of the system will be able to handle this stray piece of code heap memory attached to destroyed DynamicMethod gracefully. It may be better to turn LCGMethodResolver::m_recordCodePointer into a linked list, same way as e.g. LCGMethodResolver::m_DynamicStringLiterals are done, and do cleanup of all code blocks as part of the regular DynamicMethod cleanup. As side effect, it would prepare us for enabling tiered compilation for DynamicMethods. DynamicMethods do not support tiered compilation because of it is not possible to attach multiple pieces of code to a single DynamicMethod.

If we do both of these, the entire RemoveJitData can be deleted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-VM-coreclr tenet-performance Performance related issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Very Long JIT Times During ETW Rundown
2 participants