|
@jkotas PTAL |
| m_movR10[1] = 0xBF; | ||
| #endif | ||
|
|
||
| FlushInstructionCache(GetCurrentProcess(), &m_movR10[0], &m_jmpRAX[3]-&m_movR10[0]); |
Could you please use ClrFlushInstructionCache? It is no-op on Intel platforms that guarantee code cache coherency.
Yes, thank you for the suggestion. Should we use ClrFlushInstructionCache instead of direct calls to FlushInstructionCache in other places as well (for example in UMEntryThunkCode::Encode in arm/stubs.cpp, arm64/stubs.cpp, i386/cgenx86.cpp)?
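For readers following the thread, here is a minimal sketch of the kind of wrapper being discussed: one that skips the flush on x86/x64, where the hardware keeps the instruction cache coherent, and flushes elsewhere. `SketchFlushInstructionCache` and `kNeedsExplicitFlush` are hypothetical names used only for illustration, and the GCC/Clang builtin stands in for the OS call. This is not the actual ClrFlushInstructionCache implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; not the CoreCLR implementation.
#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
constexpr bool kNeedsExplicitFlush = false;  // coherent instruction cache
#else
constexpr bool kNeedsExplicitFlush = true;   // e.g. ARM, ARM64
#endif

// Returns true if this platform needs an explicit flush, false if the call was skipped.
inline bool SketchFlushInstructionCache(void* pCode, std::size_t size)
{
    assert(pCode != nullptr);
    (void)size;

    if (!kNeedsExplicitFlush)
    {
        return false;  // no-op: the hardware keeps code and data views coherent
    }

#if defined(__GNUC__)
    // GCC/Clang builtin; on Windows the OS API FlushInstructionCache would be used instead.
    char* begin = static_cast<char*>(pCode);
    __builtin___clear_cache(begin, begin + size);
#endif
    return true;
}
```

A caller would invoke it right after writing instruction bytes, passing the start of the patched range and its length.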
| MODE_COOPERATIVE; | ||
| PRECONDITION(CheckPointer(pEntryThunk)); | ||
| } | ||
| CONTRACTL_END; |
It would be useful to add a comment here that this diagnostic is best effort: it won't report the problem in 100% of cases, and it may sometimes crash while trying to report the problem.
| *ppHead = pNewBlock; | ||
|
|
||
| MergeBlock(pNewBlock, pHeap); | ||
| if (pHeap->IsFIFO()) |
I am wondering how this would interact with other places that use the executable heap, and whether we can make this generally more reliable.
Would it be better to move the FIFO one level up, to where UMEntryThunks are allocated? I am thinking about something like:
UMEntryThunk::CreateUMEntryThunk
{
    if (number of cached thunks < 100)
        Allocate a new thunk using GetGlobalLoaderAllocator()->GetExecutableHeap()
    else
        Use thunk from the LIFO cache
}
UMEntryThunk::Terminate
{
    Add a thunk to LIFO cache
}
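The pseudocode above can be sketched in portable C++ roughly as follows. `SketchThunk` and `SketchThunkCache` are hypothetical names, `new` stands in for the executable heap, and the sketch reuses the oldest cached entry first. That ordering is an assumption made here to maximize the delay before reuse; the pseudocode itself says LIFO.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a thunk; the real UMEntryThunk holds machine code.
struct SketchThunk
{
    int id;
};

class SketchThunkCache
{
public:
    explicit SketchThunkCache(std::size_t threshold) : m_threshold(threshold), m_nextId(0) {}
    SketchThunkCache(const SketchThunkCache&) = delete;
    SketchThunkCache& operator=(const SketchThunkCache&) = delete;

    // Only thunks still sitting in the cache are owned (and freed) by the cache.
    ~SketchThunkCache()
    {
        for (SketchThunk* p : m_freed)
            delete p;
    }

    // Allocate fresh memory until enough freed thunks have accumulated,
    // then hand out the oldest freed thunk.
    SketchThunk* Create()
    {
        if (m_freed.size() < m_threshold)
            return new SketchThunk{m_nextId++};  // stands in for the executable heap

        SketchThunk* p = m_freed.front();
        m_freed.pop_front();
        return p;
    }

    // Freed thunks are parked in the cache instead of going back to the heap.
    void Terminate(SketchThunk* p)
    {
        m_freed.push_back(p);
    }

private:
    std::size_t m_threshold;
    int m_nextId;
    std::deque<SketchThunk*> m_freed;
};
```

Note that `std::deque::push_back` may allocate, so this version would not satisfy a no-throw contract on the free path.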
|
It looks good to me overall, modulo comments. Thank you for implementing it! |
e025ec5 to c6e61e3
|
Thank you for the review! I've updated the PR. |
|
@dotnet-bot test Tizen armel Cross Checked Innerloop Build and Test |
|
|
||
| if (p == NULL) | ||
| { | ||
| // On the phone, use loader heap to save memory commit of regular executable heap |
Nit: Delete this comment. It is not relevant anymore.
c6e61e3 to 44eb523
| ++m_count; | ||
| } | ||
|
|
||
| m_list.InsertTail(new SListElem<UMEntryThunk*>(pThunk)); |
Allocating here is problematic. This method needs to have a
CONTRACTL
{
    NOTHROW;
}
CONTRACTL_END;
contract, because UMEntryThunk::FreeUMEntryThunk, which calls it, is NOTHROW.
We should just use the UMEntryThunk's memory itself to maintain the list.
It may be better not to use SList for this, and to implement a custom linked list just for UMEntryThunk here.
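A minimal sketch of what such an intrusive list could look like, assuming the link can share storage with a field that is only meaningful while the thunk is live. All names (`SketchThunk`, `SketchThunkFreeList`, `m_pNextFree`) are hypothetical, and locking is left out for brevity; real code would need to guard the list.

```cpp
#include <cstddef>

// Hypothetical thunk layout: the free-list link reuses storage that is
// only needed while the thunk is live.
struct SketchThunk
{
    union
    {
        void*        m_pMarshInfo;  // meaningful while the thunk is live
        SketchThunk* m_pNextFree;   // meaningful while the thunk is on the free list
    };
};

class SketchThunkFreeList
{
public:
    explicit SketchThunkFreeList(std::size_t threshold) noexcept
        : m_threshold(threshold), m_count(0), m_pHead(nullptr), m_pTail(nullptr) {}

    // Never allocates, so it can honor a no-throw contract.
    void Add(SketchThunk* pThunk) noexcept
    {
        pThunk->m_pNextFree = nullptr;
        if (m_pTail == nullptr)
            m_pHead = pThunk;
        else
            m_pTail->m_pNextFree = pThunk;
        m_pTail = pThunk;
        ++m_count;
    }

    // Returns the oldest freed thunk once the threshold is reached, else nullptr.
    SketchThunk* GetOldest() noexcept
    {
        if (m_pHead == nullptr || m_count < m_threshold)
            return nullptr;

        SketchThunk* pThunk = m_pHead;
        m_pHead = pThunk->m_pNextFree;
        if (m_pHead == nullptr)
            m_pTail = nullptr;
        --m_count;
        return pThunk;
    }

private:
    std::size_t  m_threshold;
    std::size_t  m_count;
    SketchThunk* m_pHead;
    SketchThunk* m_pTail;
};
```

Because the nodes are the thunks themselves, the list needs no destructor and no separate element allocation.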
|
|
||
| if (pElem != NULL) | ||
| { | ||
| UMEntryThunkFreeList::FreeThunk(pElem->GetValue()); |
We can always put the thunk on the list. There is no need to ever return it to the LoaderHeap.
If you do this, we can also delete the LHF_ZEROINIT flag on the LoaderHeap that was added by the previous iteration of this change.
Why don't we need to return allocated chunks to the LoaderHeap? I think they can be reused, since the GlobalLoaderAllocator's executable heap is also used in other places.
Thank you!
Outside UMEntryThunks, the executable heap is used in very few, rarely used places. There is a close to zero chance that the returned memory would be reused for anything but UMEntryThunks.
Thank you for the explanation, I've removed returning memory to the LoaderHeap in UMEntryThunkFreeList.
Is the LHF_ZEROINIT flag not useful? I think it can reduce allocation time if we don't need initialized memory.
LoaderHeap allocates memory directly from the OS using mmap. This memory is zero initialized, so the zero initialization is free for normal loader heap use.
!LHF_ZEROINIT can only save something in cases where memory is returned to the LoaderHeap. On mainline paths, this was only done by UMEntryThunks. After this change, it will pretty much never happen. Memory is generally returned to the LoaderHeap on exceptional paths only, for example when a complex operation such as type loading fails in the middle and we need to back out the memory allocated so far, so that we do not leak memory if the operation is repeated again and again. We do not optimize performance on exceptional paths; we prefer simplicity there.
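The point about zero-initialized OS memory can be checked with a small sketch, assuming a POSIX system that supports `MAP_ANONYMOUS`. `FreshAnonymousPagesAreZero` is a hypothetical helper written for this illustration; it is not part of LoaderHeap.

```cpp
#include <sys/mman.h>
#include <cstddef>

// Hypothetical helper: maps fresh anonymous pages and checks that every byte is zero.
// Assumes POSIX mmap with MAP_ANONYMOUS (Linux, macOS, BSDs).
bool FreshAnonymousPagesAreZero(std::size_t size)
{
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return false;

    const unsigned char* bytes = static_cast<const unsigned char*>(p);
    bool allZero = true;
    for (std::size_t i = 0; i < size; ++i)
    {
        if (bytes[i] != 0)
        {
            allZero = false;
            break;
        }
    }

    munmap(p, size);
    return allZero;
}
```

Since the kernel already hands back zero-filled pages, a heap that only ever carves new blocks out of fresh mappings gets zero initialization at no extra cost; only recycled blocks would need an explicit clear.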
Thank you! I've removed this option.
| @@ -33,6 +33,90 @@ struct UM2MThunk_Args | |||
| int argLen; | |||
| }; | |||
You can delete the MDA_SUPPORTED code in dllimportcallback.* because it is superseded by this change.
We can delete the MDA_SUPPORTED code in dllimportcallback, yes?
|
|
||
| ~UMEntryThunkFreeList() | ||
| { | ||
| SListElem<UMEntryThunk*> *pElem = m_list.GetHead(); |
This destructor won't be necessary once we stop allocating our own heap. (You do not need to worry about freeing memory allocated on LoaderHeap.)
44eb523 to 757f7a8
|
cc @parjong |
|
|
||
| #define DEFAULT_THUNK_FREE_LIST_THRESHOLD 64 | ||
|
|
||
| static UMEntryThunkFreeList s_thunkFreeList(DEFAULT_THUNK_FREE_LIST_THRESHOLD); |
This contains a Crst. Crsts in static variables have to be CrstStatic to avoid issues like: https://github.com/dotnet/coreclr/issues/13779#issuecomment-328007409
| union | ||
| { | ||
| // Pointer to the shared structure containing everything else | ||
| PTR_UMThunkMarshInfo m_pUMThunkMarshInfo; |
👍
Thanks for making it explicit where the link lives.
| m_jmp = X86_INSTR_JMP_REL32; | ||
| m_execstub = (BYTE*) ((pTargetCode) - (4+((BYTE*)&m_execstub))); | ||
|
|
||
| FlushInstructionCache(GetCurrentProcess(),GetEntryPoint(),sizeof(UMEntryThunkCode)); |
Could you please keep the full FlushInstructionCache in Encode (on x86 and x64 at least)?
We used to have issues with Time Travel Debugging that required explicit FlushInstructionCache to fix/workaround. I am not sure whether these issues still exist.
Using ClrFlushInstructionCache in Poison should be fine.
| End | ||
|
|
||
| Crst UMEntryThunkFreeList | ||
| AcquiredBefore LoaderHeap |
Ordering this with LoaderHeap should not be needed - you are not calling into LoaderHeap when the lock is taken.
Since your lock is a pretty simple leaf lock, you can just use CrstLeafLock for it and not add to this file at all.
Thank you for the suggestion!
Improve UMEntryThunkCode::Poison to produce a diagnostic message when a collected delegate is called.
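As a rough, portable illustration of the idea behind poisoning: the real thunk rewrites machine code, while this sketch uses a function pointer as a stand-in for the jump target. `SketchThunk`, `Poison`, `Invoke`, the message text, and the `-1` return value are all hypothetical; the sketch makes no claim about how the runtime actually reports or terminates.

```cpp
#include <cstdio>

struct SketchThunk;
using SketchEntry = int (*)(SketchThunk*);

// Hypothetical thunk: m_pEntry stands in for the machine-code jump in a real thunk.
struct SketchThunk
{
    SketchEntry m_pEntry;
    bool        m_reported;
};

// Stands in for the managed delegate target.
int CallManagedTarget(SketchThunk*)
{
    return 42;
}

// Diagnostic path taken when a freed thunk is called.
// The message text is illustrative only.
int ReportCollectedDelegate(SketchThunk* pThunk)
{
    pThunk->m_reported = true;
    std::fprintf(stderr, "callback invoked on a collected delegate (thunk %p)\n",
                 static_cast<void*>(pThunk));
    return -1;
}

// Redirect the thunk to the diagnostic path instead of leaving stale code behind.
void Poison(SketchThunk* pThunk)
{
    pThunk->m_pEntry = &ReportCollectedDelegate;
}

int Invoke(SketchThunk* pThunk)
{
    return pThunk->m_pEntry(pThunk);
}
```

The diagnostic only works while the poisoned thunk's memory has not been reused, which is why delaying reuse of freed thunks makes the report more likely to fire.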
757f7a8 to 8238c90
Use a free list to delay reusing deleted thunks. This improves the diagnostic for calls on collected delegates.
This option was used for UMEntryThunkCode::Poison. Now we use our own free list to store freed thunks and don't return allocated memory to the LoaderHeap, so reused thunks are always uninitialized.
8238c90 to 564db77
Improve the diagnostic for calls on collected delegates (https://github.com/dotnet/coreclr/issues/15465):
UMEntryThunkCode::Poison to produce diagnostic message
Example: