AssemblyLoadContext.Unload silently fails to unload Assemblies, leaking filehandles #44679

dave-yotta · 2020-11-12T10:18:51Z

Issue Title

The Unload method does not free any assemblies returned from LoadFromStream that are still referenced somewhere.

General

netcoreapp3.1

        public class TestAssemblyLoader : AssemblyLoadContext, IAssemblyLoader
        {
            public TestAssemblyLoader() : base(true)
            {
                LoadedAssemblies = new Dictionary<string, Assembly>();

                Resolving += LoadContext_Resolving;
            }

            private Assembly LoadContext_Resolving(AssemblyLoadContext arg1, AssemblyName arg2)
            {
                return LoadedAssemblies[arg2.FullName];
            }

            private Dictionary<string, Assembly> LoadedAssemblies { get; }

            Assembly IAssemblyLoader.LoadFromStream(Stream stream)
            {
                var assembly = LoadFromStream(stream);
                return LoadedAssemblies[assembly.FullName] = assembly;
            }

            protected sealed override Assembly Load(AssemblyName assemblyName)
            {
                return null;
            }

            public void Dispose()
            {
                LoadedAssemblies.Clear(); // Without this call, the assemblies in this dictionary don't get unloaded.
                Unload();
            }
        }

Symptom is visible in unreleased file handles - example repro here: https://github.com/dave-yotta/roslyn-assemblyunload

It might make sense for this to be expected behaviour, but I certainly didn't expect it. (nor did our production machines 😢)

At least can there be an argument to unload like bool throwIfAnyUnloadFailed = false so that it's self-documenting to callers somehow?

Related issue I'm coming from here: dotnet/roslyn#49282

The text was updated successfully, but these errors were encountered:

PathogenDavid · 2020-11-12T16:08:54Z

Unloading an AssemblyLoadContext is more of a request for future unload than a command for immediate unload. The runtime won't actually unload the assemblies if they're still in use. (And having them be accessible via your LoadedAssemblies constitutes as "in use".)

As noted on the documentation, unloading doesn't happen right away and calling the method only initiates the process:

There's one noteworthy difference between the unloading using AssemblyLoadContext and using AppDomains. With AppDomains, the unloading is forced. At unload time, all threads running in the target AppDomain are aborted, managed COM objects created in the target AppDomain are destroyed, and so on. With AssemblyLoadContext, the unload is "cooperative". Calling the AssemblyLoadContext.Unload method just initiates the unloading.

The documentation lists some examples of things that prevent unloading and implies that keeping references to assemblies in the AssemblyLoadContext implementation is OK, but it's worth noting that it's "OK" only if you also discard all references to the AssemblyLoadContext after initiating unload. (Which you're supposed to do: "Unloading will not occur while there are references to the AssemblyLoadContext.")

At least can there be an argument to unload like bool throwIfAnyUnloadFailed = false so that it's self-documenting to callers somehow?

I don't claim to be an expert on the internals of AssemblyLoadContext's implementation, but based on my understanding I don't think this is really possible without doing an immediate full garbage collection and wouldn't be possible if you're using the pattern of discarding the AssemblyLoadContext reference its self since it couldn't be discarded while you're calling Unload.

(Also dotnet/runtime would've been a more appropriate repo for this issue.)

dave-yotta · 2020-11-13T09:42:04Z

Ok - should I move this issue across?

The documentation does imply that Assembly references within the AssemblyLoadContext will not stop the unload, however that's not what I see here. I don't see any drop in file handles and have tested for a minute, stacking up 2000 file handles.

This could just be waiting for a GC; I could test by calling the GC after the AssemblyLoadContext is out of scope, perhaps this will trigger the unload - perhaps file handles are not tracked by the GC? 2000 is a lot, and in production it simply hits 100% and dies.

I suppose one could add an way to do an unload that makes the caller aware of any assemblies that are not yet up for collection, but the use case for that might be limited - an update to the documentation might be more appropriate - however leaks are scary so it would be nice to somehow get that self documented while coding...

ghost · 2020-11-14T02:40:30Z

Tagging @vitek-karas, @agocke, @CoffeeFlux as subscribed to this area.
See info in area-owners.md if you want to be subscribed.

Issue Details

Description:	Issue Title The `Unload` method does not free any assemblies returned from `LoadFromStream` that are still referenced somewhere. General netcoreapp3.1 public class TestAssemblyLoader : AssemblyLoadContext, IAssemblyLoader { public TestAssemblyLoader() : base(true) { LoadedAssemblies = new Dictionary<string, Assembly>(); Resolving += LoadContext_Resolving; } private Assembly LoadContext_Resolving(AssemblyLoadContext arg1, AssemblyName arg2) { return LoadedAssemblies[arg2.FullName]; } private Dictionary<string, Assembly> LoadedAssemblies { get; } Assembly IAssemblyLoader.LoadFromStream(Stream stream) { var assembly = LoadFromStream(stream); return LoadedAssemblies[assembly.FullName] = assembly; } protected sealed override Assembly Load(AssemblyName assemblyName) { return null; } public void Dispose() { LoadedAssemblies.Clear(); // Without this call, the assemblies in this dictionary don't get unloaded. Unload(); } } Symptom is visible in unreleased file handles - example repro here: https://github.com/dave-yotta/roslyn-assemblyunload It might make sense for this to be expected behaviour, but I certainly didn't expect it. (nor did our production machines 😢) At least can there be an argument to unload like `bool throwIfAnyUnloadFailed = false` so that it's self-documenting to callers somehow? Related issue I'm coming from here: dotnet/roslyn#49282
Author:	dave-yotta
Assignees:	-
Labels:	`area-AssemblyLoader-coreclr`, `untriaged`
Milestone:	-

janvorli · 2020-11-20T00:19:18Z

This is actually by design, although I agree that it is not obvious and can be confusing.
The main point is that an AssemblyLoadContext instance is just a representation of the assembly load context. Your TestAssemblyLoader instance that's in the default context holds the assemblies loaded into the context represented by the TestAssemblyLoader. Unload can complete only if all references from outside of the context to assemblies loaded into an unloadable AssemblyLoadContext, to types from those assemblies and instances of those types are gone.
It may seem that this should still be fine, since once the reference to the TestAssemblyLoader in your code is gone, GC should be able to collect it and thus remove references to the LoadedAssemblies and after that to the instances of Assembly inside of it. The problem though is that the unload process in the runtime needs to keep the TestAssemblyLoader alive by itself until the last phase of the unload when we detect that there are no more outside references to assemblies loaded into the context, types from those assemblies and their instances. So when unloading is initiated, we create a strong GC handle that keeps the TestAssemblyLoader alive. And then the Assemblies stored in the TestAssemblyLoader are a problem, as they prevent the unload process from entering the phase when we destroy that strong GC handle.
That means that clearing the dictionary you have in your example is necessary. Alternatively, you can store weak references to the Assemblies in your dictionary instead. That way, no clearing would be necessary.

dave-yotta · 2020-11-23T10:10:56Z

Ok thanks - makes sense, this isn't a bug.

I wasn't aware the AssemblyLoadContext was under a keepalive there, maybe an extra bullet point could be added to the top of https://docs.microsoft.com/en-us/dotnet/standard/assembly/unloadability#use-collectible-assemblyloadcontext?

It would be really nice to have something in the unload signature that indicates the cooperative nature, for example Task Unload() taking the signal from event Unloading which apparently is fired when unloading is completed. Or a clearer method name like RequestUnload/BeginUnload. I understand these things are difficult to change.

I don't want to take from reading docs and careful design - but this one is so easy to miss.

vitek-karas · 2020-11-23T10:17:58Z

Can you please file an issue for the docs (there's a link at the bottom of the page) - it's better if it comes from you because you know what is confusing/missing for you specifically?

As for the Unload I just wanted to mention that even the event is not the actual unload. Unloading fires when the runtime decides to actually do something about the unload, it's meant as a notification to interested parties that unload is about to happen and that they should "cleanup". At that point the load context is still alive including all the assemblies in it. I assume it would be technically possible to fire a different event when the unload is actually finished, but the tricky bit is what would be passed as a parameter. The load context itself should be collected at that point and thus not available. If we fire a bit earlier and pass the load context (even if it's empty), what would prevent the code to reintroduce references to it or actually load assemblies into it... it just gets rather complicated. Maybe the Task based interface would work nicer (as it doesn't need any context parameter).

dave-yotta · 2020-11-24T11:48:00Z

Yep no problem. Not sure how you might do this - typically unload needs to be called from the dispose method - I guess IAsyncDisposable can be used and would be compatible with a Task based interface. I'll leave it up to you to decide if there's anything that can be done here.

vitek-karas · 2020-11-24T15:54:42Z

The problem with just blind wait on unload is that it can take a really long time - calling Unload itself doesn't trigger GC, and so the actual unload would wait for GC to collect and more importantly finalize some of the objects. So a better approach would be to trigger GC manually, but that's not something a normal API should do, it's probably best to leave to the app to do that.
And even with all that, there's no guarantee the unload will happen, if something somewhere holds on to a reference, it will not unload.

The way I look at it - if unloading in the app is mainly about freeing resources, then there's no reason to actually wait for it (GC should handle that mostly). If the app needs the code to be unloaded (typical case is it wants to update the "plugin" and load it again, in-place), then it will need to manually trigger GC and implement some kind of timeout with error handling if the unload can't happen.

I know this is not an ideal behavior of the runtime for certain scenarios. The opposite approach was implemented in .NET Framework, where runtime basically guaranteed the unload would happen within some reasonable time frame, but that came with many problems, reliability being one (rudely unloading random code could and did lead to state corruption).

jkotas · 2020-11-24T16:25:04Z

The opposite approach was implemented in .NET Framework, where runtime basically guaranteed the unload would happen within some reasonable time frame

FWIW, AppDomain unloading was not guaranteed to succeed. It could timeout out and fail too, e.g. when the AppDomain was stuck in unmanaged code.

dave-yotta · 2020-11-25T09:53:33Z

Yeah I see the problems here. FYI we're compiling code fragments associated with actions in our workflow system (user defined stuff), so easily in the thousands of compilations which is what made it such a problem for us, not that this is your problem :).

More broadly, maybe what's desired here is the information required to gracefully identify and deal with failing unloads. An additional event UnloadPending(UnloadPendingEventArgs args) which can in some way list the refs it's waiting on in those args and maybe maybe UnloadFailed, UnloadCompleted events too could be provided. The presence of these events will speak to the programmer about the non-immediate and non-guaranteed nature of the unload, and can be hooked into to provide some kind of circuit breaker.

On the other hand, one does not start using the collectable assembly load context without already thinking about leaks. Therefore tests/investigations will be underway to confirm the assemblies are being released. It either works at this stage - in which case graceful handling is not necessary, or it does not work at this stage - in which case further investigations are done. It either works or it doesn't. In our investigations, memory usage flattened out but we did not think to check file handles or to look at the heap more closely.

janvorli · 2020-11-25T11:55:30Z

@dave-yotta are you aware of the doc on using and debugging unloadability I have written in the past?
https://docs.microsoft.com/en-us/dotnet/standard/assembly/unloadability

vitek-karas · 2020-11-25T12:05:44Z

FYI we're compiling code fragments associated with actions in our workflow system (user defined stuff), so easily in the thousands of compilations which is what made it such a problem for us, not that this is your problem :).

Thanks for the scenario, it's great to learn what the feature is used for. And I think that at least partially it is our problem, if the feature we built doesn't help solve real-world scenarios, it's no good.

One other option how to improve both debuggability and potentially usability might be to use tracing instead of "callbacks". We could add tracing events around unloading (we really should do that, regardless). When debugging it would be probably much easier to consume the events than to rely on modifying the code. And it's possible to get the events from the code as well (although not as easy as direct callbacks). The other advantage is that tracing is VERY cheap if turned off (likely to be by far the most common case).

dave-yotta · 2020-11-25T12:28:13Z

@janvorli thanks, yes I read this at least a couple of times, once when initially developing the unloadability (which also involved some tricky work figuring out how to get roslyn to give assembly bytecode with zero residual allocations/loads of assemblies), and once when debugging the file handle leak observed in production.

Unfortunately since memory usage flattened out compared to the eventual OOM we were getting previously, it seemed fixed and we didn't use sos to check more deeply. This might have had more to do with changes to the other code changes e.g. using CSharpCompilation.Create instead of CSharpScript.Create in our roslyn stuff. I also believe the file handle leak is only happening on linux (which is not the platform we develop on) - I'd have to confirm all that though.

I've linked further up here a ticket for the docs page to add a note that AssemblyLoadContext has a strong ref and wont be up for collection when you might normally expect - perhaps this is stated further on in the document and I've not read carefully enough - but it does feel worth mentioning at the top there.

dave-yotta · 2020-11-25T12:39:44Z

Tracing might be helpful. The reason I suggest adding something to the code is just to imply something about the nature of the unload to the developer. I agree the way unloading is done is correct - a corrupt stack is a very dangerous thing.

Ideally this stuff shouldn't reach prod - we should try to have tests which assert that resources are freed, which could be done from a load testing point of view just by checking memory and file handle usage under load. However we don't really know what kind of resources the AssemblyLoadContext is going to allocate, so a way, just from a unit testing perspective, to ask the runtime if an AssemblyLoadContext is available to be collected so that we can fail the test if it is not would be really helpful. However it's not something you'd want to advertise in non testing scenarios, since it will likely interact with the GC and other native handles...

vitek-karas · 2020-11-25T12:44:10Z

The way to test this in a controlled environment is to hold a weak reference to the AssemblyLoadContext and then in a loop trigger GC and test if the load context was collected. The sample code for that is in the doc @janvorli mentioned: https://docs.microsoft.com/en-us/dotnet/standard/assembly/unloadability#use-a-custom-collectible-assemblyloadcontext

If the weak ref points to nowhere, the load context was collected and I think it's reasonable then to trust that runtime freed all of the resources associated with it. That said if there's custom code loaded into the context which may have native resource there's nothing the runtime can do to make sure those are released as well. So that would probably require some custom testing after the load context is gone.

janvorli · 2020-11-25T12:51:14Z

You can also use weak refs on the assemblies loaded into an unloadable context and use them to indicate whether a specific assembly was unloaded or not.

dave-yotta · 2020-11-25T13:13:34Z

Yep you're both absolutely correct - I need to read that document more carefully. I can write a test for this now :)

The code fragments are in a whitelisted sandbox of sorts and thus very limited and simple so I do hope the day never comes that they can allocate native resources - but that's a very good point.

If you want to keep this open to provide tracing of unload events, please do so, otherwise feel free to close this ticket. Thanks to all for taking the time to help me with this!

vitek-karas · 2020-11-25T13:17:17Z

For the tracing support - if you can think of things you would find useful, could you please list those. It's always better to have real users tell us what they need than for us to try to guess.

dave-yotta · 2020-11-25T13:24:57Z

I would suggest for the AssemblyLoadContext per assembly if that level of detail is available:

Assembly: loaded/unload initiated/unload failed/unload completed
AssemblyLoadContext: unload initiated/unload completed

And perhaps at a more verbose trace level, if possible:

Assembly: not yet unloadable (each time it is checked following unload initiated)

janvorli · 2020-11-25T13:43:56Z

Please note that there is nothing that we can qualify as unload failure. There is no deadline after which we would consider the unloading as failed in the runtime. So if I take it to the extreme, you can call Unload and hold a reference to something inside of the context. If you release that reference a week later, the unload will complete at that point. The initiated / completed event should not be problematic to add.
Regarding the "not yet unloadable" - there is no periodic checking happening. It is all GC driven. So when the last reference to an assembly is gone and GC collects it, it results in removing a reference to the LoaderAllocator managed instance (that reference is stored in the Assembly's SyncRoot). So there is no explicit walking of a list of assemblies in the AssemblyLoadContext and checking whether they can be unloaded.
Btw, if you are interested in the low level behavior, there is an image of the dependency references that keep AssemblyLoadContext here: https://github.com/dotnet/runtime/blob/master/docs/design/features/unloadability.md#assemblyloadcontext-unloading-process

dave-yotta · 2020-11-25T14:02:12Z

Those two trace events for initiate/complete it is then :)

vitek-karas · 2020-11-26T15:57:11Z

I created #45264 to track the tracing support for unload. (It's cleaner to have it as a separate issue).

dave-yotta · 2020-12-09T12:56:05Z

I guess this can be closed then! :)

mairaw transferred this issue from dotnet/core Nov 14, 2020

Dotnet-GitSync-Bot added area-AssemblyLoader-coreclr untriaged New issue has not been triaged by the area owner labels Nov 14, 2020

dave-yotta mentioned this issue Nov 24, 2020

Add to docs: AssemblyLoadContext is GC.Pinned dotnet/docs#21708

Open

vitek-karas mentioned this issue Nov 26, 2020

Add tracing around assembly unload #45264

Open

dave-yotta closed this as completed Dec 9, 2020

ghost locked as resolved and limited conversation to collaborators Jan 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AssemblyLoadContext.Unload silently fails to unload Assemblies, leaking filehandles #44679

AssemblyLoadContext.Unload silently fails to unload Assemblies, leaking filehandles #44679

dave-yotta commented Nov 12, 2020

PathogenDavid commented Nov 12, 2020

dave-yotta commented Nov 13, 2020 •

edited

Loading

ghost commented Nov 14, 2020

Issue Title

General

janvorli commented Nov 20, 2020

dave-yotta commented Nov 23, 2020 •

edited

Loading

vitek-karas commented Nov 23, 2020

dave-yotta commented Nov 24, 2020

vitek-karas commented Nov 24, 2020

jkotas commented Nov 24, 2020

dave-yotta commented Nov 25, 2020 •

edited

Loading

janvorli commented Nov 25, 2020

vitek-karas commented Nov 25, 2020

dave-yotta commented Nov 25, 2020

dave-yotta commented Nov 25, 2020

vitek-karas commented Nov 25, 2020

janvorli commented Nov 25, 2020

dave-yotta commented Nov 25, 2020

vitek-karas commented Nov 25, 2020

dave-yotta commented Nov 25, 2020 •

edited

Loading

janvorli commented Nov 25, 2020

dave-yotta commented Nov 25, 2020

vitek-karas commented Nov 26, 2020

dave-yotta commented Dec 9, 2020

AssemblyLoadContext.Unload silently fails to unload Assemblies, leaking filehandles #44679

AssemblyLoadContext.Unload silently fails to unload Assemblies, leaking filehandles #44679

Comments

dave-yotta commented Nov 12, 2020

Issue Title

General

PathogenDavid commented Nov 12, 2020

dave-yotta commented Nov 13, 2020 • edited Loading

ghost commented Nov 14, 2020

Issue Title

General

janvorli commented Nov 20, 2020

dave-yotta commented Nov 23, 2020 • edited Loading

vitek-karas commented Nov 23, 2020

dave-yotta commented Nov 24, 2020

vitek-karas commented Nov 24, 2020

jkotas commented Nov 24, 2020

dave-yotta commented Nov 25, 2020 • edited Loading

janvorli commented Nov 25, 2020

vitek-karas commented Nov 25, 2020

dave-yotta commented Nov 25, 2020

dave-yotta commented Nov 25, 2020

vitek-karas commented Nov 25, 2020

janvorli commented Nov 25, 2020

dave-yotta commented Nov 25, 2020

vitek-karas commented Nov 25, 2020

dave-yotta commented Nov 25, 2020 • edited Loading

janvorli commented Nov 25, 2020

dave-yotta commented Nov 25, 2020

vitek-karas commented Nov 26, 2020

dave-yotta commented Dec 9, 2020

dave-yotta commented Nov 13, 2020 •

edited

Loading

dave-yotta commented Nov 23, 2020 •

edited

Loading

dave-yotta commented Nov 25, 2020 •

edited

Loading

dave-yotta commented Nov 25, 2020 •

edited

Loading