Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to isolate script execution with AssemblyLoadContext #631

Merged
merged 19 commits into from
Aug 25, 2021
Merged

Ability to isolate script execution with AssemblyLoadContext #631

merged 19 commits into from
Aug 25, 2021

Conversation

hrumhurum
Copy link
Contributor

.NET Core deprecated the concept of app domains. Only one (main) AppDomain is allowed and it cannot contain the assemblies with different versions when their simple names clash.

When such a clash occurs, an attempt to load the assembly produces a specific exception:

System.IO.FileLoadException: Could not load file or assembly 'sample-ref-asm, Version=1.0.0.1, Culture=neutral, PublicKeyToken=null'. Could not find or load a specific file. (0x80131621)

When the script happens to reference an assembly which is referenced by the host, but with different version, it cannot be executed due to this error.

AssemblyLoadContext is the new way of assembly isolation that was introduced in .NET Core, and it allows to overcome the aforementioned limitation.

This pull request will provide a gradual implementation of AssemblyLoadContext support.


Let's start with a technical refactoring that changes the affected interactions with AppDomain class to AssemblyLoadPal. That name stands for "Platform Abstraction Layer (PAL) for assembly loading functionality of a host environment" and comes from Gapotchenko.FX.Reflection.Loader module. The module contains other useful primitives as well but we leave them out for now.

Everything AssemblyLoadPal does is routes the assembly loading interactions with AppDomain to its own API. In this way, this tiny stable API can be reused when a host environment provides another ways of assembly loading such as AssemblyLoadContext. So AssemblyLoadPal is a stable abstraction that transparently works over app domains, assembly load contexts, and maybe some other future ways of dealing with assemblies.

In the first set of changes, the existing behavior of a script runner remains exactly the same but with the addition of that abstraction.

(To be continued)

@seesharper
Copy link
Collaborator

Just out of curiosity. Where you plan to plug this into Roslyn scripting? Take a look here for some of the details around this
https://github.com/dotnet/roslyn/tree/main/src/Scripting/Core/Hosting/AssemblyLoader

If I remember correctly this is where it stopped for me since everything in this area is either sealed or internal :)

@hrumhurum
Copy link
Contributor Author

Just out of curiosity. Where you plan to plug this into Roslyn scripting?

The current implementation of Roslyn already covers this by using its own AssemblyLoadContext in https://github.com/dotnet/roslyn/blob/96d6017c378ab311544c5521a3e5c6616fea48a5/src/Scripting/Core/Hosting/AssemblyLoader/CoreAssemblyLoaderImpl.cs#L45. This should provide a proper assembly isolation.

@hrumhurum
Copy link
Contributor Author

hrumhurum commented Aug 19, 2021

The second set of changes presented above allows to configure ScriptRunner class with a custom AssemblyLoadContext.

Since we want to use AssemblyLoadContext class, Dotnet.Script.Core project now additionally targets netcoreapp2.1 framework.

The API of ScriptRunner class remains fully compatible with the existing code at source and binary levels. This is achieved by adding ScriptRunner.AssemblyLoadContext property which has a getter and initializer (as opposed to a setter). In this way, ScriptRunner class remains immutable (as by design) and its public API does not impose breaking changes upon its customers.

I've also used nullable notation for that addition as the folks who switched to nullable checks reap the immediate benefits by seeing a question mark hints in IntelliSense dropdown, meaning that corresponding value is optional.

When ScriptRunner.AssemblyLoadContext property is not set, the behavior of ScriptRunner class remains absolutely intact comparing to the baseline version. This provides a behavioral compatibility for the users.

(To be continued)

@seesharper
Copy link
Collaborator

Just out of curiosity. Where you plan to plug this into Roslyn scripting?

The current implementation of Roslyn already covers this by using its own AssemblyLoadContext in https://github.com/dotnet/roslyn/blob/96d6017c378ab311544c5521a3e5c6616fea48a5/src/Scripting/Core/Hosting/AssemblyLoader/CoreAssemblyLoaderImpl.cs#L45. This should provide a proper assembly isolation.

Yes, I know about that one :) I am the one who wrote most of the assembly loading stuff :)

What I meant was how to plug the new custom AssemblyLoadContext into play so that the InteractiveAssemblyLoader actually uses this new AssemblyLoadContext

@hrumhurum
Copy link
Contributor Author

hrumhurum commented Aug 19, 2021

how to plug the new custom AssemblyLoadContext into play so that the InteractiveAssemblyLoader actually uses this new AssemblyLoadContext

AssemblyLoadContext implementations seem to support semi-automatic chaining. When a custom context cannot resolve the assembly, it is automatically delegated to its parent. The parent is either AssemblyLoadContext.Default or AssemblyLoadContext.CurrentContextualReflectionContext for .NET Core 3.0+ hosts.

We will plug using AssemblyLoadContext.EnterContextualReflection method, more on that later.

Design doc: https://github.com/dotnet/coreclr/blob/master/Documentation/design-docs/AssemblyLoadContext.ContextualReflection.md

Specific sample: https://github.com/dotnet/coreclr/blob/master/Documentation/design-docs/AssemblyLoadContext.ContextualReflection.md#maintaining-and-restoring-original-behavior. Note that all Assembly.Load calls in SomeCallbackMethod will be implicitly routed to CurrentContextualReflectionContext.

@seesharper
Copy link
Collaborator

how to plug the new custom AssemblyLoadContext into play so that the InteractiveAssemblyLoader actually uses this new AssemblyLoadContext

AssemblyLoadContext implementations support semi-automatic chaining. When a custom context cannot resolve the assembly, it is automatically delegated to its parent. The parent is either AssemblyLoadContext.Default or AssemblyLoadContext.CurrentContextualReflectionContext for .NET Core 3.0+ hosts.

We will plug using AssemblyLoadContext.EnterContextualReflection method, more on that later.

Ah, I did not know about that one😊 Never a day without learning something new 😊

With the regards to the dependency on Gapotchenko.FX.Reflection.Loader I'd rather see that we inlined the needed bits if any.
Although this assembly is not used by OmniSharp, we would like to keep dependencies down to an absolute minimum.

@hrumhurum
Copy link
Contributor Author

hrumhurum commented Aug 19, 2021

The third set of changes presented above completes the local support of AssemblyLoadContext in ScriptRunner class,

As we know, the script is compiled to a temp folder and its dependencies are laying around at the same folder. The default logic embodied into AssemblyLoadContext does not handle that scenario, so there is a need for a custom assembly loader that would handle that.

In order to achieve that, the AutoAssemblyLoader class is installed for the given assembly load context. It provides a scoped resolution for the just loaded assembly by handling assembly version selection, redirects, and unification of the dependencies. It comes from Gapotchenko.FX.Reflection.Loader module and has a pretty substantial implementation.

Other than providing a functional assembly loading, we should propagate our AssemblyLoadContext instance downstream to the script we run. This is achieved by using AssemblyLoadContext.EnterContextualReflection method for .NET Core 3.0+ targets.

(To be continued)

@hrumhurum
Copy link
Contributor Author

hrumhurum commented Aug 19, 2021

The last set of changes presented above glues and propagates a custom AssemblyLoadContext through the higher level API. The compatibility at source and binary levels is preserved.

Here is an example of how it can be used:

class IsolatedLoadContext : AssemblyLoadContext
{
    protected override Assembly Load(AssemblyName assemblyName) => null;
}

Task<int> ExecuteScript(string filePath, IReadOnlyList<string> arguments)
{
    var command = new ExecuteScriptCommand(ScriptConsole.Default, _LogFactory);

    var scriptFile = new ScriptFile(filePath);
    var commandOptions = new ExecuteScriptCommandOptions(scriptFile, arguments.ToArray(), OptimizationLevel.Release, null, false, false)
    {
        AssemblyLoadContext = new IsolatedLoadContext()
    };

    return command.Run<int, CommandLineScriptGlobals>(commandOptions);
}

The sample demonstrates running the script in an isolated assembly load context. When script is being run in this way, its assemblies do not clash with the host assemblies.

If we do not specify a custom AssemblyLoadContext then the behavior is the same as in baseline. So using a custom assembly load context is opt-in.


The commits above finish the core implementation of the feature.

@hrumhurum hrumhurum marked this pull request as ready for review August 19, 2021 11:28
@hrumhurum
Copy link
Contributor Author

hrumhurum commented Aug 19, 2021

With the regards to the dependency on Gapotchenko.FX.Reflection.Loader I'd rather see that we inlined the needed bits if any.

This module is used to the fullest extent here, so we would need almost all of the bits :)

I would vote against inlining of that particular one. If you dive deeper into the details, there are quite a few intricate aspects that should be handled in a particular and not always straightforward way.

This module encompasses maybe nearly 15 years of our work with .NET and custom assembly loading, but still, there are the days when we say "Aha!", learn something new, and stream those improvements into Gapotchenko.FX.Reflection.Loader.

I would really prefer if the maximum amount of living beings would benefit from that expertise, so really want to leave it upstreamed, not inlined. Reinventing the wheel is not energy-efficient and at some point of life you just want to rely on quality (which is hard to find, BTW).

However, a final decision is always yours and I would follow it.

@seesharper
Copy link
Collaborator

seesharper commented Aug 19, 2021

This module encompasses maybe nearly 15 years of our work with .NET and custom assembly loading, but still, there are the days when we say "Aha!", learn something new, and stream those improvements into Gapotchenko.FX.Reflection.Loader.

We might be able to live with that although it is not my decision alone, @filipw also need to give his stamp of approval 👍

Just a couple of other things to be aware of

  • Debugging - We need to verify that debugging scripts still work in VS Code.
  • Add a test that verifies that this custom assembly loading actually solves the problem with a version conflict between the host and the script. We currently reference Newtonsoft.Json 12.0.3. Maybe a test script which references an older version and verify that the version loaded is actually the one referenced in the script
#!/usr/bin/env dotnet-script
#r "nuget:Newtonsoft.Json, 10.0.1"

using System.Reflection;
using Newtonsoft.Json;
var versionAttribute = typeof(JsonConvert).Assembly.GetCustomAttributes<AssemblyInformationalVersionAttribute>().Single();
Console.WriteLine(versionAttribute.InformationalVersion);

This currently spits out 12.0.3+7c3d7f8da7e35dde8fa74188b0decff70f8f10e3 while it should have been 10.0.1xxxxxx if the script is running in true isolation

@hrumhurum
Copy link
Contributor Author

The changes so far only addressed the API side of the feature, so dotnet-script is unable to isolate assemblies yet as it does not specify a custom assembly load context.

The next commits will gradually address those scenarios.

@hrumhurum
Copy link
Contributor Author

hrumhurum commented Aug 19, 2021

With the commit above, isolation is turned on for script file execution via dotnet-script command-line utility.

It has an interesting behavior. The script that references a newer Newtonsoft.Json is isolated correctly:

#r "nuget:Newtonsoft.Json, 13.0.1"

using System.Reflection;
using Newtonsoft.Json;
var versionAttribute = typeof(JsonConvert).Assembly.GetCustomAttributes<AssemblyInformationalVersionAttribute>().Single();
Console.WriteLine(versionAttribute.InformationalVersion);

It produces the expected output 13.0.1+ae9fe44e1323e91bcbd185ca1a14099fba7c021f.

Without the isolation in place, the script just errors out with Could not load file or assembly 'Newtonsoft.Json, Version=13.0.0.0, Culture=neutral, PublicKeyToken=30ad4fe6b2a6aeed'.

But, the script referencing an older Newtonsoft.Json version 10.0.1 with or without isolation in place gets satisfied with an already loaded Newtonsoft.Json 12.0.3 and produces 12.0.3+7c3d7f8da7e35dde8fa74188b0decff70f8f10e3. It does not lead to the defects as Newtonsoft.Json is mostly binary compatible between the versions.

Should we really aim for the full assembly isolation? The existing behavior assuming assembly backward compatibility and partial isolation looks pretty sane.

@seesharper
Copy link
Collaborator

IMHO I think for this to have real value, we should try to reach for the stars and provide 100% isolation. Otherwise we are possibly left with the same or additional edge cases related to inference from the host.

@hrumhurum
Copy link
Contributor Author

hrumhurum commented Aug 19, 2021

I think for this to have real value, we should try to reach for the stars and provide 100% isolation.

Agree. Meanwhile I was able to achieve 100% isolation in a prototype implementation. Should I continue in this pull request or is it better to split in two? The first one - API support which is safe and already covers some use cases, the second pull request - full isolation, including dotnet-script.

@seesharper
Copy link
Collaborator

I think we should try to keep this in a single PR. The changes so far is not too comprehensive. It would also be easier to review these changes as a whole. Great work btw. Really appreciate your efforts on this 👍👌

@hrumhurum
Copy link
Contributor Author

The current implementation reaches the full assembly isolation at API and dotnet-script levels.

There is a remaining thing: the full isolation of interactive runner. Currently it runs with a partial isolation which is pretty good.

I've tried to make it fully isolated and was blocked by this line in Roslyn: https://github.com/dotnet/roslyn/blob/2f6768efa3f5575bae411524bb11364b2208c2c4/src/Scripting/Core/Hosting/AssemblyLoader/CoreAssemblyLoaderImpl.cs#L35

Unfortunately an explicit AssemblyLoadContext.LoadAssemblyFromPath call is not routed to the context returned by AssemblyLoadContext.CurrentContextualReflectionContext. This means that we cannot intercept it without modifying Roslyn. So for now, only the ScriptRunner class can be fully isolated. InteractiveRunner is at max partially isolated due to the way Roslyn works.


The pull requested is finished and ready for review/merge.

@seesharper
Copy link
Collaborator

Impressive work indeed 👍. Currently in beer-mode, but I will have a look at this tomorrow 👍😊. My greatest concern here is the debugger. Have you tried debugging in VS Code using 100% isolation?

@seesharper seesharper requested a review from filipw August 21, 2021 19:20
@seesharper
Copy link
Collaborator

I testet this just briefly today and it looks very promising indeed. Also ran a quick test to ensure debugging still works. This probably means that we can get rid of a couple of hacks like https://github.com/filipw/dotnet-script/blob/2c21516f1f42ab16712af32b365a4e1966c29ca5/src/Dotnet.Script.Core/Dotnet.Script.Core.csproj#L32

@hrumhurum
Copy link
Contributor Author

Great suggestions regarding issues #166 and #268. I've removed the corresponding workarounds as they are no longer needed.

@seesharper
Copy link
Collaborator

This is awesome 👍 Just one more thing

  • <PackageReference Include="Gapotchenko.FX.Reflection.Loader" Version="2021.2.10-beta" /> Do you know if a stable version of this is available? We'd rather not have beta packages being referenced from a stable package 😀

@hrumhurum
Copy link
Contributor Author

We'd rather not have beta packages being referenced from a stable package

Good point. Fixed.

@hrumhurum
Copy link
Contributor Author

The following example documents the available levels of assembly isolation:

Task<int> ExecuteScript(string filePath, IReadOnlyList<string> arguments)
{
    var command = new ExecuteScriptCommand(ScriptConsole.Default, _LogFactory);

    var scriptFile = new ScriptFile(filePath);
    var commandOptions = new ExecuteScriptCommandOptions(scriptFile, arguments.ToArray(), OptimizationLevel.Release, null, false, false)
    {
        AssemblyLoadContext = CreateAssemblyLoadContext()
    };

    return command.Run<int, CommandLineScriptGlobals>(commandOptions);
}

1. No assembly isolation:

static AssemblyLoadContext? CreateAssemblyLoadContext() => null;

2. Partial assembly isolation:

sealed class IsolatedLoadContext : AssemblyLoadContext
{
    protected override Assembly Load(AssemblyName assemblyName) => null;
}

static AssemblyLoadContext? CreateAssemblyLoadContext() => new IsolatedLoadContext();

3. Full assembly isolation:

static AssemblyLoadContext? CreateAssemblyLoadContext() => new ScriptAssemblyLoadContext();

ScriptAssemblyLoadContext class is provided by Dotnet.Script.Core module and represents a fully isolated assembly load context. If so desired, the end user of the API can inherit that class and build on top of it.

4. Custom assembly isolation:

static AssemblyLoadContext? CreateAssemblyLoadContext() => ...; // custom implementation of assembly load context

@seesharper
Copy link
Collaborator

@hrumhurum We just merged #633 which means that this PR needs to be updated to drop the netcoreapp2.1 target framework 👍

@hrumhurum
Copy link
Contributor Author

The support for .NET Core 2.1 has been dropped.

Copy link
Member

@filipw filipw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, this is great work. We set off in this direction in the past but never arrived anywhere 😂

@filipw filipw merged commit 282963a into dotnet-script:master Aug 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants