Enable unloading of AssemblyLoadContext #18476

Merged
janvorli merged 24 commits into dotnet:master from janvorli:collectible-assemblies on Aug 23, 2018

Conversation

9 participants
@janvorli
Member

janvorli commented Jun 14, 2018

This change is basically the work of @xoofx done some time ago, rebased
to the current master and with added fixes.

My contributions to it were:

  • Extensive testing - running all 11753 coreclr Pri1 tests loaded into
    AssemblyLoadContext including their dependencies and ensuring
    that they work (except the ones that use features not yet supported
    for unloading, like thread local members, COM interop).
    That uncovered issues listed below.
  • Fix a few memory leaks
  • Enable unloading of assemblies with PInvokes
  • Fix virtual stub manager initialization
  • Fix issue with unwind info regions reporting for stub linker
  • Fix misplaced assert related to statics in collectible assemblies
  • Fix issue with SOS domain enumeration.
  • Fix tests that allocated a GCHandle without freeing it, which
    prevented the tests from unloading.

I recommend viewing the commits separately.

There is still a lot of work to do to enable e.g. unloading of assemblies
with classes having thread local members and other stuff. But it seems
worth merging these changes in.

@janvorli janvorli added the area-VM label Jun 14, 2018

@janvorli janvorli self-assigned this Jun 14, 2018

@janvorli janvorli requested review from jkotas and davidwrighton Jun 14, 2018

@janvorli

Member

janvorli commented Jun 14, 2018

@@ -236,6 +236,11 @@ static Helper()
GCHANDLE = GCHandle.Alloc(Console.Out);
}
public static void Shutdown()

@jkotas

jkotas Jun 15, 2018

Member

It would be better for this Helper to create an object with a finalizer, or to register an unload event handler, to clean this up, rather than changing hundreds of tests to call the Shutdown method manually.

@xoofx

Member

xoofx commented Jun 15, 2018

Thanks @janvorli for bringing this back! And very happy to see also that "Enable unloading of assemblies with PInvokes" ❤️

@janvorli

Member

janvorli commented Jun 15, 2018

@jkotas I've updated the commit that fixes the test unloadability according to your feedback.

[DllImport(JitHelpers.QCall, CharSet = CharSet.Unicode)]
private static extern void LoadFromPath(IntPtr ptrNativeAssemblyLoadContext, string ilPath, string niPath, ObjectHandleOnStack retAssembly);
[SuppressUnmanagedCodeSecurity]

@jkotas

jkotas Jun 17, 2018

Member

SuppressUnmanagedCodeSecurity annotations are not necessary in CoreCLR (we stripped them a while ago).

@jkotas

jkotas Jun 17, 2018

Member

(Multiple places)

UPTR lookupKey = key;
#if defined(FEATURE_CORECLR)

@jkotas

jkotas Jun 17, 2018

Member

#if defined(FEATURE_CORECLR) is not needed

{
LIMITED_METHOD_CONTRACT;
_ASSERTE(m_type == LAT_Assembly);
// Link domain assembly together

@jkotas

jkotas Jun 17, 2018

Member

Nit: Indentation

@jkotas

Member

jkotas commented Jun 17, 2018

I have written a small test so I can step through some of the code. I am having trouble making it unload properly. E.g. if I run the following, the memory just keeps growing (I killed it when the working set reached 2GB, on Windows x64 checked):

using System;
using System.Threading.Tasks;
using System.Reflection;
using System.Runtime.Loader;

class MyLoadContext : AssemblyLoadContext
{
    public MyLoadContext() : base(isCollectible: true)
    {
    }

    protected override Assembly Load(AssemblyName assemblyName)
    {
        return null;
    }

    static void Main()
    {
        for (;;)
        {
            var alc = new MyLoadContext();
            alc.Unload();
        }
    }
}

Edit: It looks like there is a scalability problem. It does collect when I sprinkle it with sleeps. Still, it does not look right that it is possible to create these faster than the system is able to get rid of them.

return loadedAssembly;
private void VerifyIsAlive()
{
if (state != InternalState.Alive)

@jkotas

jkotas Jun 17, 2018

Member

When the unload event handlers are executing, their JITing may trigger loads of more assemblies into the AssemblyLoadContext. But this block will prevent those assemblies from being loaded because the state is not Alive anymore. I think we need to allow loading of more stuff into the assembly load context even when the unload event handlers are executing.

We should have a test for this.
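
The rule proposed in this review comment can be sketched as a small state model. This is only an illustration under stated assumptions: `LoadContextModel`, its three states, and its methods are hypothetical names, not the runtime's or the managed AssemblyLoadContext's actual implementation. The point it shows is that loads stay legal while unload handlers run and are rejected only after unloading has completed.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a load context's lifetime; not the runtime's actual type.
class LoadContextModel {
public:
    enum class State { Alive, Unloading, Unloaded };

    // Register a callback to run when Unload() is called.
    void OnUnloading(std::function<void(LoadContextModel&)> handler) {
        handlers_.push_back(std::move(handler));
    }

    // Loads are rejected only once unloading has fully completed.
    void Load(const std::string& assemblyName) {
        if (state_ == State::Unloaded) {
            throw std::logic_error("load context has been unloaded");
        }
        loaded_.push_back(assemblyName);
    }

    void Unload() {
        if (state_ != State::Alive) {
            return;  // already unloading or unloaded
        }
        state_ = State::Unloading;
        // Iterate over a copy so a handler may safely register further handlers.
        auto handlers = handlers_;
        // Handlers run while the state is Unloading, so they may still call Load().
        for (auto& handler : handlers) {
            handler(*this);
        }
        state_ = State::Unloaded;
    }

    State state() const { return state_; }
    const std::vector<std::string>& loaded() const { return loaded_; }

private:
    State state_ = State::Alive;
    std::vector<std::function<void(LoadContextModel&)>> handlers_;
    std::vector<std::string> loaded_;
};
```

A test for the scenario would register a handler that calls `Load`, call `Unload`, and check that the late load succeeded while a load after `Unload` returns is rejected.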

#endif
}
public void Unload()

@jkotas

jkotas Jun 17, 2018

Member

I forgot what contract for this method we agreed on: Is it required to call this method to unload the stuff? Or is the ALC going to auto-unload itself too when there are no references to it?

It may be worth a comment.

@masonwheeler

masonwheeler commented Jun 17, 2018

@jkotas

> Edit: It looks like there is a scalability problem. It does collect when I sprinkle it with sleeps. Still, it does not look right that it is possible to create these faster than the system is able to get rid of them.

This isn't particularly surprising, considering that the amount of work that has to be done to verify a context is safe to unload is greater than the amount of work that has to be done to create one, and that (I can only assume, because if not then none of this makes any sense in the first place) it's happening asynchronously on a thread that occasionally has other things to do, while this is simply spinning forever creating contexts and dumping them as quickly as possible.

This doesn't feel like something that would happen in real-world usage. We'd definitely never see it with file-based assemblies, and even a script engine that's generating assemblies dynamically wouldn't approximate this level of turnover, because there would be space in between where it's compiling and then executing its generated code.

So basically what you've shown is that it's possible for deliberately malicious code to launch a DOS attack against this system, but that's a bit redundant because the hypothetical malicious code could be even more effective by not calling Unload in the first place. That being true, I don't think this scenario is something to worry about in and of itself.

@jkotas

Member

jkotas commented Jun 17, 2018

> This doesn't feel like something that would happen in real-world usage

ok

@@ -1299,7 +1304,46 @@ INT_PTR QCALLTYPE AssemblyNative::InitializeAssemblyLoadContext(INT_PTR ptrManag
{
// Initialize a custom Assembly Load Context
CLRPrivBinderAssemblyLoadContext *pBindContext = NULL;
// Create a new AssemblyLoaderAllocator for an AssemblyLoadContext
AssemblyLoaderAllocator* loaderAllocator = new AssemblyLoaderAllocator();

@jkotas

jkotas Jun 17, 2018

Member

We should not need to do this for the existing regular non-unloadable load context. It is unnecessary overhead for them.
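
A minimal sketch of what this comment asks for, assuming a simplified initialization routine: the allocator is created only when the context is collectible, and regular contexts carry a null allocator. `LoaderAllocatorModel`, `BindContextModel`, and `InitializeLoadContext` are hypothetical stand-ins, not the actual code in `AssemblyNative::InitializeAssemblyLoadContext`.

```cpp
#include <memory>

// Hypothetical stand-in for the runtime's AssemblyLoaderAllocator.
struct LoaderAllocatorModel {};

// Hypothetical stand-in for the native binder behind a managed load context.
struct BindContextModel {
    // Null for regular, non-collectible contexts.
    std::unique_ptr<LoaderAllocatorModel> loaderAllocator;

    bool IsCollectible() const { return loaderAllocator != nullptr; }
};

// Create the allocator only when the context was requested as collectible,
// so ordinary contexts pay no extra cost.
inline BindContextModel InitializeLoadContext(bool isCollectible) {
    BindContextModel context;
    if (isCollectible) {
        context.loaderAllocator.reset(new LoaderAllocatorModel());
    }
    return context;
}
```

With this shape, any code that consumes the allocator has to tolerate a null value for non-collectible contexts.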

@@ -584,6 +584,7 @@ add_subdirectory(src/pal/prebuilt/inc)
add_subdirectory(src/debug/debug-pal)
add_definitions(-DFEATURE_COLLECTIBLE_ALC)

@jkotas

jkotas Jun 17, 2018

Member

I am not sure whether we need this define. It made sense when this started and the same code was still building for desktop. It is not the case anymore.

@@ -1560,7 +1559,12 @@ BOOL AssemblySpecBindingCache::StoreAssembly(AssemblySpec *pSpec, DomainAssembly
UPTR key = (UPTR)pSpec->Hash();
// On CoreCLR, we will use the BinderID as the key
- ICLRPrivBinder* pBinderContextForLookup = pAssembly->GetFile()->GetBindingContext();
+ ICLRPrivBinder* pBinderContextForLookup = pSpec->GetBindingContext();

@jkotas

jkotas Jun 17, 2018

Member

Why can't we just use pAssembly->GetFile()->GetBindingContext() as before?

@xoofx

xoofx Jun 17, 2018

Member

I really don't remember exactly why I did this, but I think I had cases where the binding context from the spec was the one to use instead of the one from the assembly actually loaded (this was the original commit c98620f).

@davidwrighton

davidwrighton Jun 28, 2018

Member

It doesn't seem unreasonable that there may be a bug here, but I don't think it's related to unloadability. This change is plenty complicated without adding possibly unrelated changes to it. @janvorli could you look into whether this is necessary for the scenarios which you are testing?

@janvorli

janvorli Jun 29, 2018

Member

Reverting this change doesn't have any impact on my testing scenarios.

@jkotas

Member

jkotas commented Jun 17, 2018

> There is still a lot of work to do to enable e.g. unloading of assemblies with classes having thread local members and other stuff.

And adding tests...

@jkotas

Member

jkotas commented Jun 17, 2018

LGTM modulo comments.

@xoofx

Member

xoofx commented Jun 17, 2018

> Edit: It looks like there is a scalability problem. It does collect when I sprinkle it with sleeps. Still, it does not look right that it is possible to create these faster than the system is able to get rid of them.

Yeah, I remember the problem was also present in the original PR. But as far as I recall, the ALCs are collected with a delay via SystemDomain::ProcessDelayedUnloadDomains, which is triggered indirectly by the finalizer thread worker (FinalizerThread::FinalizerThreadWorker), so it is unlikely to happen outside the pathological case you described. So I agree with @masonwheeler.
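
The delayed collection described here can be pictured with a toy queue. This is a hedged model, not the runtime's code: `DelayedUnloadQueue` and its methods are hypothetical names. It only illustrates that `Unload()` merely queues work and that nothing is released until the worker gets to run, which is why a tight create/unload loop can outpace reclamation.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical model of delayed unloading; names are illustrative only.
class DelayedUnloadQueue {
public:
    // Called by Unload(): the context is only queued, nothing is freed yet.
    void Enqueue(const std::string& contextName) {
        pending_.push_back(contextName);
    }

    // Called from the worker (the finalizer thread in the discussion above).
    // Returns the number of contexts released in this pass.
    std::size_t ProcessDelayedUnloads() {
        std::size_t released = pending_.size();
        pending_.clear();
        return released;
    }

    // Contexts that were unloaded by the caller but not yet reclaimed.
    std::size_t PendingCount() const { return pending_.size(); }

private:
    std::deque<std::string> pending_;
};
```

In this model, `PendingCount()` grows for as long as the producer keeps calling `Enqueue` without the worker running `ProcessDelayedUnloads`, which mirrors the growing working set reported above.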

@jakobbotsch

Collaborator

jakobbotsch commented Jun 17, 2018

How long would those sleeps need to be? Fuzzlyn loads a lot of assemblies and, because there are no app domains in .NET Core, it launches a new process for every 100 assemblies to load them. With an unloadable AssemblyLoadContext, I assume the new processes could be avoided? I just measured, and it amounts to around 45 assemblies loaded per second.

@xoofx

Member

xoofx commented Jun 18, 2018

> How long would those sleeps need to be?

There is something weird I can't remember exactly... I vaguely remember there was some code that was called on an IDLE event on the app domain, but I can't find it anymore. I also recall having a simple brute-force app that created a bunch of ALCs and unloaded them directly, and its memory did not grow (though there were two cases, one that hadn't loaded anything and one that had already loaded something; maybe I only tested the case with at least one assembly loaded through it). It's difficult to recall all the details after more than 1.5 years. I should have kept more notes 😓

@jkotas

Member

jkotas commented Jun 18, 2018

@dotnet-bot test Windows_NT x64 Checked corefx_baseline
@dotnet-bot test Ubuntu x64 Checked corefx_baseline

@janvorli

Member

janvorli commented Jun 18, 2018

Looks like one of the System.Reflection.Emit.Tests failed with an assert in the runtime that I was not hitting locally. I'm looking into it.

@masonwheeler

masonwheeler commented Jun 18, 2018

@janvorli A wild WOMM appears!

@Apollo3zehn Apollo3zehn referenced this pull request Jun 20, 2018

Open

Nuget plugin manager: Remaining tasks #140

8 of 20 tasks complete
@janvorli

Member

janvorli commented Jun 22, 2018

@dotnet-bot test Windows_NT x64 Checked corefx_baseline
@dotnet-bot test Ubuntu x64 Checked corefx_baseline

janvorli added some commits Jun 21, 2018

Remove obsolete asserts

The assert that pLoaderAllocator is not NULL in
CLRPrivBinderAssemblyLoadContext::SetupContext is now obsolete.

The same applies in AppDomain::LoadDomainAssemblyInternal, where
pLoaderAllocator can also be NULL for a non-collectible
AssemblyLoadContext.

Fix unloading of dynamic assemblies

For collectible dynamic assemblies, we were not adding the DomainAssembly to the
AssemblyLoaderAllocator. After fixing that, another issue surfaced. The
AssemblyLoaderAllocator for dynamic assemblies doesn't have the
m_binderToRelease set and we were asserting that it is not NULL.

Fix shuffle thunk cache for unloadability

* Let the default global COMDelegate::m_pShuffleThunkCache use the
global stub heap instead of the C++ heap for stub allocation.
* Add a separate instance of the shuffle thunk cache to the collectible
AssemblyLoaderAllocator.

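
The split described in this commit message can be sketched as follows. This is an illustration only: the class and function names are hypothetical, the cache is keyed by a plain string rather than a real shuffle descriptor, and the actual COMDelegate code is considerably more involved. It shows the idea that thunks for collectible code live in a cache owned by the collectible allocator, so they go away together with it, while everything else keeps using the process-wide cache.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical cache keyed by a textual shuffle signature.
class ShuffleThunkCacheModel {
public:
    // Returns the cached thunk id for the signature, creating one on first use.
    int GetOrCreate(const std::string& signature) {
        auto it = thunks_.find(signature);
        if (it != thunks_.end()) {
            return it->second;
        }
        int id = static_cast<int>(thunks_.size()) + 1;
        thunks_[signature] = id;
        return id;
    }

    std::size_t Size() const { return thunks_.size(); }

private:
    std::map<std::string, int> thunks_;
};

// Hypothetical collectible allocator that owns its own cache instance.
struct CollectibleAllocatorModel {
    ShuffleThunkCacheModel shuffleThunkCache;  // destroyed with the allocator
};

// Process-wide cache used for non-collectible code.
inline ShuffleThunkCacheModel& GlobalShuffleThunkCache() {
    static ShuffleThunkCacheModel cache;
    return cache;
}

// Use the allocator's own cache when there is one, else the global cache.
inline ShuffleThunkCacheModel& SelectCache(CollectibleAllocatorModel* allocator) {
    return allocator ? allocator->shuffleThunkCache : GlobalShuffleThunkCache();
}
```
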
Put back keeping managed ALC alive until the binder releases it

In one of the recent commits, I removed changing the managed
AssemblyLoadContext handle from weak to strong in
CLRPrivBinderAssemblyLoadContext::PrepareForLoadContextRelease.
However, I had not noticed that several tests from the CoreCLR
test suite started to throw NullReferenceException during
unload. The issue was that the SafeBCryptAlgorithmHandle finalizer
was trying to PInvoke into a native library in order to close
its handle and needed to get the AssemblyLoadContext in
AssemblyLoadContext.ResolveUnmanagedDll.

@janvorli

Member

janvorli commented Aug 22, 2018

@davidwrighton, as we have already discussed offline, I've looked into your concerns related to the binder cache. When an AssemblyLoadContext.Load override gets an Assembly by calling another context's LoadFromAssemblyName, the assembly gets cached only for that other context. And that AssemblyLoadContext is kept alive by the RuntimeAssembly's m_syncRoot.

I've also reverted the way we get the binder for lookup in AssemblySpecBindingCache::StoreAssembly and AssemblySpecBindingCache::StoreFile based on the feedback; my testing didn't show a need for that change.

@janvorli

Member

janvorli commented Aug 22, 2018

I apparently made a mistake when removing the instrumentation logging that I had added for my own debugging, which is why the tests are failing. I'm looking into it.

janvorli added some commits Aug 20, 2018

Add GetLoaderAllocator to ICLRPrivBinder

The code in AssemblySpecBindingCache::Store{File,Assembly} and in
AppDomain::LoadDomainAssemblyInternal that gets the LoaderAllocator from an
ICLRPrivBinder previously assumed that the binder was always a
BINDER_SPACE::Assembly at that point. However, some recently enabled
tests have shown that this assumption is wrong and that it can
be a CLRPrivAssemblyWinRT too.
To fix that, I've added the GetLoaderAllocator method to the ICLRPrivBinder
interface and removed the ICollectibleAssemblyLoadContext. The binders
that don't have a LoaderAllocator return E_FAIL from this method.

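
A rough sketch of the interface change described in this commit message, under stated assumptions: the types below are simplified stand-ins defined only so the example compiles on its own, and the method signature is an assumption rather than a copy of the real ICLRPrivBinder declaration. The idea it shows is that callers ask the binder interface for its allocator instead of casting to a concrete binder type, and treat a failure result as "no allocator".

```cpp
#include <cstdint>

// Minimal stand-ins so the sketch compiles without Windows headers.
using HResultModel = std::int32_t;
constexpr HResultModel kOk = 0;                // plays the role of S_OK
constexpr HResultModel kEFail = -2147467259;   // 0x80004005, the value of E_FAIL

// Hypothetical stand-in for a loader allocator.
struct AllocatorStub {};

// Simplified binder interface; the real ICLRPrivBinder has more members.
struct BinderModel {
    virtual ~BinderModel() = default;
    // On success stores the allocator and returns kOk.
    // Binders without an allocator return kEFail.
    virtual HResultModel GetLoaderAllocator(AllocatorStub** allocator) = 0;
};

// A binder that owns an allocator, like a collectible load context binder.
struct CollectibleBinderModel : BinderModel {
    AllocatorStub ownAllocator;
    HResultModel GetLoaderAllocator(AllocatorStub** allocator) override {
        *allocator = &ownAllocator;
        return kOk;
    }
};

// A binder with no allocator, like the WinRT case mentioned above.
struct WinRTLikeBinderModel : BinderModel {
    HResultModel GetLoaderAllocator(AllocatorStub** allocator) override {
        *allocator = nullptr;
        return kEFail;
    }
};

// Caller-side helper: no cast to a concrete binder type is needed.
inline AllocatorStub* TryGetLoaderAllocator(BinderModel& binder) {
    AllocatorStub* allocator = nullptr;
    return binder.GetLoaderAllocator(&allocator) == kOk ? allocator : nullptr;
}
```
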
Fix shuffle thunk loader allocator

Running tests where assembly loading in one AssemblyLoadContext is delegated
to another one has uncovered an issue in selecting the LoaderAllocator
for allocating shuffle thunks. The correct way is to get it from
pDelMT instead of pTargetMeth.

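
The selection rule in this commit message can be reduced to a few lines. The structs below are hypothetical stand-ins for the runtime's MethodTable and MethodDesc, and the helper name is invented for illustration. It shows the case the tests exposed: the delegate type and the target method belong to different allocators, and the thunk allocator is taken from the delegate type.

```cpp
// Hypothetical stand-in for a loader allocator.
struct ThunkAllocatorModel {
    const char* name;
};

// Plays the role of pDelMT (the delegate's type).
struct DelegateTypeModel {
    ThunkAllocatorModel* allocator;
};

// Plays the role of pTargetMeth (the method the delegate points at).
struct TargetMethodModel {
    ThunkAllocatorModel* allocator;
};

// Per the commit message, the allocator comes from the delegate type,
// even when the target method lives under a different allocator.
inline ThunkAllocatorModel* SelectShuffleThunkAllocator(
        const DelegateTypeModel& delegateType,
        const TargetMethodModel& /*targetMethod*/) {
    return delegateType.allocator;
}
```
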
Revert binder for lookup in StoreAssembly / StoreFile

The original set of changes modified pBinderContextForLookup to be
taken from pSpec and, only if the spec had none, from the pFile as
before the change. My testing hasn't shown this to be
necessary, so I am reverting it based on the PR feedback.

@janvorli

Member

janvorli commented Aug 22, 2018

Found and fixed the issue in the "Add GetLoaderAllocator to ICLRPrivBinder" commit.
@davidwrighton can you please take a look and approve the PR if you are fine with it now?

@janvorli janvorli merged commit 8cd4b39 into dotnet:master Aug 23, 2018

7 checks passed

  • Linux-musl x64 Debug Build Build finished.
  • WIP ready for review
  • Windows_NT x64 full_opt ryujit CoreCLR Perf Tests Correctness Build finished.
  • Windows_NT x64 min_opt ryujit CoreCLR Perf Tests Correctness Build finished.
  • Windows_NT x86 full_opt ryujit CoreCLR Perf Tests Correctness Build finished.
  • Windows_NT x86 min_opt ryujit CoreCLR Perf Tests Correctness Build finished.
  • license/cla All CLA requirements met.

@janvorli janvorli deleted the janvorli:collectible-assemblies branch Aug 23, 2018

@masonwheeler

masonwheeler commented Aug 23, 2018

Woohoo! 🎉

@TorVestergaard

TorVestergaard commented Aug 23, 2018

Awesome job! So, what does this PR going through and being merged mean for the potential future release of AssemblyLoadContext unloading, exactly?

@janvorli

Member

janvorli commented Aug 23, 2018

@TorVestergaard this is the first step towards that goal. With the merged-in state, I am able to run coreclr tests using a wrapper tool that loads a test and its dependencies into an AssemblyLoadContext and unloads it after the test finishes.
There are still some tests that don't pass in such an environment due to four limitations that unloadability currently has, listed here in order of priority:

  • There can be no thread local member or static variables in the assemblies loaded into the unloadable AssemblyLoadContext
  • Delegate marshaling for types within collectible assemblies is not supported.
  • COM Interop is not supported for collectible types
  • FixedAddressValueTypeAttribute is not supported for fields in collectible types

If an attempt to load assemblies with these limitations into an unloadable AssemblyLoadContext is made, an exception is thrown.
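
The behavior described above (loading fails with an exception when one of the four unsupported features is used) can be modeled with a small check. This is a sketch under stated assumptions: the feature flags, the struct, and the function are hypothetical, and the real runtime detects these cases at different points and throws its own exception types.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical feature flags for a type being loaded; the names are illustrative.
struct TypeFeaturesModel {
    std::string name;
    bool usesThreadLocalStatics = false;
    bool usesDelegateMarshaling = false;
    bool usesComInterop = false;
    bool usesFixedAddressValueType = false;
};

// Throws if a type headed for a collectible context uses an unsupported feature.
// The checks follow the priority order of the list above.
inline void ValidateForCollectibleContext(const TypeFeaturesModel& type) {
    const char* unsupported = nullptr;
    if (type.usesThreadLocalStatics) {
        unsupported = "thread-local statics";
    } else if (type.usesDelegateMarshaling) {
        unsupported = "delegate marshaling";
    } else if (type.usesComInterop) {
        unsupported = "COM interop";
    } else if (type.usesFixedAddressValueType) {
        unsupported = "FixedAddressValueTypeAttribute";
    }
    if (unsupported != nullptr) {
        throw std::runtime_error(type.name + ": " + unsupported +
                                 " is not supported in a collectible context");
    }
}
```
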

I am currently working on investigating these limitations with the ultimate goal to get rid of them.

We also need more testing and also testing more complex scenarios like multiple AssemblyLoadContexts coexisting and interacting, etc.

@jkotas

Member

jkotas commented Aug 23, 2018

We also need:

  • Expose the collectible option in the public API surface: dotnet/corefx#14724
  • Add related APIs that make it possible for libraries to do the right thing for collectible assemblies like: dotnet/corefx#25671 .