-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Enable unloading of AssemblyLoadContext #18476
Conversation
@@ -236,6 +236,11 @@ static Helper() | |||
GCHANDLE = GCHandle.Alloc(Console.Out); | |||
} | |||
|
|||
public static void Shutdown() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better for this Helper
to create object with finalizer to clean this up or register unload event to clean this up, rather than changing hundreds of tests to call the Shutdown method manually.
Thanks @janvorli for bringing this back! And very happy to see also that "Enable unloading of assemblies with PInvokes" ❤️ |
48915c5
to
4bc3ea1
Compare
@jkotas I've updated the commit that fixes the test unloadability according to your feedback. |
[DllImport(JitHelpers.QCall, CharSet = CharSet.Unicode)] | ||
private static extern void LoadFromPath(IntPtr ptrNativeAssemblyLoadContext, string ilPath, string niPath, ObjectHandleOnStack retAssembly); | ||
[SuppressUnmanagedCodeSecurity] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SuppressUnmanagedCodeSecurity annotation are not necessary in CoreCLR (we have stripped them a while ago).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Multiple places)
src/vm/assemblyspec.cpp
Outdated
|
||
UPTR lookupKey = key; | ||
|
||
#if defined(FEATURE_CORECLR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#if defined(FEATURE_CORECLR)
is not needed
src/vm/loaderallocator.inl
Outdated
{ | ||
LIMITED_METHOD_CONTRACT; | ||
_ASSERTE(m_type == LAT_Assembly); | ||
|
||
// Link domain assembly together |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Indentation
I have written a small test so I can step through some of the code. I am having troubles to make it unload properly. E.g. if I run the following - the memory will just keep growing (I have killed it when the working set was 2GB, on Windows x64 checked): using System;
using System.Threading.Tasks;
using System.Reflection;
using System.Runtime.Loader;
class MyLoadContext : AssemblyLoadContext
{
public MyLoadContext() : base(isCollectible: true)
{
}
protected override Assembly Load(AssemblyName assemblyName)
{
return null;
}
static void Main()
{
for (;;)
{
var alc = new MyLoadContext();
alc.Unload();
}
}
} Edit: It looks like that there is a scalability problem. It does collect when I sprinkle it with sleeps. Still, it does not look right that it is possible to create these faster than the system is able to get rid of them. |
return loadedAssembly; | ||
private void VerifyIsAlive() | ||
{ | ||
if (state != InternalState.Alive) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the unload event handlers are executing, their JITing may trigger loads of more assemblies into the AssemblyLoadContext. But this block will prevent those assemblies from being loaded because the state is not Alive anymore. I think we need to allow loading of more stuff into the assembly load context even when the unload event handlers are executing.
We should have a test for this.
#endif | ||
} | ||
|
||
public void Unload() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot what contract for this method we agreed on: Is it required to call this method to unload the stuff? Or is the ALC going to auto-unload itself too when these are no references to it?
It may be worth a comment.
This isn't particularly surprising, considering that the amount of work that has to be done to verify a context is safe to unload is greater than the amount of work that has to be done to create one, and that (I can only assume, because if not then none of this makes any sense in the first place) it's happening asynchronously on a thread that occasionally has other things to do, while this is simply spinning forever creating contexts and dumping them as quickly as possible. This doesn't feel like something that would happen in real-world usage. We'd definitely never see it with file-based assemblies, and even a script engine that's generating assemblies dynamically wouldn't approximate this level of turnover, because there would be space in between where it's compiling and then executing its generated code. So basically what you've shown is that it's possible for deliberately malicious code to launch a DOS attack against this system, but that's a bit redundant because the hypothetical malicious code could be even more effective by not calling |
ok |
src/vm/assemblynative.cpp
Outdated
@@ -1299,7 +1304,46 @@ INT_PTR QCALLTYPE AssemblyNative::InitializeAssemblyLoadContext(INT_PTR ptrManag | |||
{ | |||
// Initialize a custom Assembly Load Context | |||
CLRPrivBinderAssemblyLoadContext *pBindContext = NULL; | |||
|
|||
// Create a new AssemblyLoaderAllocator for an AssemblyLoadContext | |||
AssemblyLoaderAllocator* loaderAllocator = new AssemblyLoaderAllocator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not need to do this for the existing regular non-unloadable load context. It is unnecessary overhead for them.
CMakeLists.txt
Outdated
@@ -584,6 +584,7 @@ add_subdirectory(src/pal/prebuilt/inc) | |||
|
|||
add_subdirectory(src/debug/debug-pal) | |||
|
|||
add_definitions(-DFEATURE_COLLECTIBLE_ALC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure whether we need this define. It made sense when this started and the same code was still building for desktop. It is not the case anymore.
src/vm/assemblyspec.cpp
Outdated
@@ -1560,7 +1559,12 @@ BOOL AssemblySpecBindingCache::StoreAssembly(AssemblySpec *pSpec, DomainAssembly | |||
UPTR key = (UPTR)pSpec->Hash(); | |||
|
|||
// On CoreCLR, we will use the BinderID as the key | |||
ICLRPrivBinder* pBinderContextForLookup = pAssembly->GetFile()->GetBindingContext(); | |||
ICLRPrivBinder* pBinderContextForLookup = pSpec->GetBindingContext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we just use pAssembly->GetFile()->GetBindingContext()
as before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't remember exactly why I did this but I think I had cases where the binding context from the spec was the one to use instead of the real assembly loaded (this was the original commit c98620f )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem unreasonable that there may be a bug here, but I don't think its related to unloadability. This change is plenty complicated without adding possibly unrelated changes to it. @janvorli could you look into if this is necessary for the scenarios which you are testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverting this change doesn't have any impact on my testing scenarios.
And adding tests... |
LGTM modulo comments. |
Yeah, but I remember the problem was also on the original PR but afair, the ALC are delayed collected via |
How long would those sleeps need to be? Fuzzlyn loads a lot of assemblies, and because of no app domains in .NET core, it launches a new process for every 100 assemblies to load them. With unloadable |
There is something weird I can't remember exactly... but I'm vaguely remembering that there was a code that was actually called on some IDLE event happening on the app domain... but now can't find this code anymore... I also recall to have a brute force simple app, creating a bunch of ALC and unload them directly, it was not increasing in memory (even though there was two cases, one that hasn't loaded anything and one that loaded already something... maybe I tested only one that tested with at least one assembly loaded through it)... That's difficult to recall all the details after more than 1.5 years, I should have put more notes around 😓 |
@dotnet-bot test Windows_NT x64 Checked corefx_baseline |
Looks like one of the System.Reflection.Emit.Tests failed with assert in the runtime that I was not hitting locally. I'm looking into it. |
@janvorli A wild WOMM appears! |
@dotnet-bot test Windows_NT x64 Checked corefx_baseline |
0683d09
to
3153f6b
Compare
The assert that pLoaderAllocator is not NULL in the CLRPrivBinderAssemblyLoadContext::SetupContext is now obsolete. The same in AppDomain::LoadDomainAssemblyInternal, where the pLoaderAllocator can also be NULL for non-collectible AssemblyLoadContext.
For collectible dynamic assemblies, we were not adding the DomainAssembly to the AssemblyLoaderAllocator. After fixing that, another issue surfaced. The AssemblyLoaderAllocator for dynamic assemblies doesn't have the m_binderToRelease set and we were asserting that it is not NULL.
* Let the the default global COMDelegate::m_pShuffleThunkCache use global stub heap instead of C++ heap for stub allocation. * Add separate instance of the shuffle thunk cache into collectible AssemblyLoaderAllocator.
In one of the recent commits, I have removed changing the managed AssemblyLoadContext handle from weak to strong in CLRPrivBinderAssemblyLoadContext::PrepareForLoadContextRelease. However I have not noticed that several tests from the CoreCLR test suite have started to throw NullReferenceException during unload. The issue was that SafeBCryptAlgorithmHandle finalizer was trying to PInvoke into a native library in order to close its handle and needed to get the AssemblyLoadContext in AssemblyLoadContext.ResolveUnmanagedDll.
f299886
to
44ff704
Compare
@davidwrighton, as we have already discussed offline, I've looked into your concerns related to the binder cache. When an AssemblyLoadContext.Load override gets an Assembly by calling another context's LoadFromAssemblyName name, it gets cached only for this other context. And this AssemblyLoadContext is held alive by the RuntimeAssembly's m_syncRoot. I've also reverted back the way we get the binder for lookup in the AssemblySpecBindingCache::StoreAssembly and AssemblySpecBindingCache::StoreFile based on the feedback - my testing didn't show a need for that. |
I've apparently made some mistake when removing my instrumentation logging that I have added for my personal debugging purposes, that's why the tests are failing. I'm looking into it. |
The code in AssemblySpecBindingCache::Store{File,Assembly} and in AppDomain::LoadDomainAssemblyInternal to get LoaderAllocator from a ICLRPrivBinder was previously assuming that the binder was always BINDER_SPACE::Assembly at that point. However, some new recently enabled tests have discovered that this assumption is wrong and that it can be CLRPrivAssemblyWinRT too. To fix that, I've added the GetLoaderAllocator method to ICLRPrivBinder interface and removed the ICollectibleAssemblyLoadContext. The binders that don't have LoaderAllocator return E_FAIL from this method.
Running tests with assembly loading in one AssemblyLoadContext delegated to another one has discovered an issue in selecting the LoaderAllocator for allocating shuffle thunks. The correct way is to get it from the pDelMT instead of the pTargetMeth.
The original set of changes modified the pBinderContextForLookup to be taken from pSpec and only if the spec had none, it would take it from the pFile as before the change. My testing hasn't shown this as necessary, so I am reverting this back based on the PR feedback.
7ded8e6
to
5ad5feb
Compare
Found and fixed the issue in the "Add GetLoaderAllocator to ICLRPrivBinder" commit. |
Woohoo! 🎉 |
Awesome job! So, what does this PR going through and being merged mean for the potential future release of AssemblyLoadContext unloading, exactly? |
@TorVestergaard this is the first step towards that goal. With the merged-in state, I am able to run coreclr tests using a wrapper tool that loads a test and its dependencies into an AssemblyLoadContext and unloads it after the test finishes.
I am currently working on investigating these limitations with the ultimate goal to get rid of them. We also need more testing and also testing more complex scenarios like multiple AssemblyLoadContexts coexisting and interacting, etc. |
We also need:
|
This change is basically the work of @xoofx done some time ago, rebased
to the current master and with added fixes.
My contributions to it were:
AssemblyLoadContext including their dependencies and ensuring
that they work (except the ones that use features not yet supported
for unloading, like thread local members, COM interop).
That uncovered issues listed below.
was preventing the tests from unloading.
I recommend viewing the commits separately.
There is still a lot of work to do to enable e.g. unloading of assemblies
with classes having thread local members and other stuff. But it seems
worth merging these changes in.