Hosting multiple CLR side-by-side #22529

Open
kevingosse opened this issue Feb 11, 2019 · 21 comments

@kevingosse
Contributor

commented Feb 11, 2019

I'm working on a plugin system for a native app, and I was hoping to load multiple CLRs side-by-side (for isolation and other practical reasons).

The hosting documentation states it's supported for different versions of the CLR:

Because .NET Core is able to run side-by-side with itself, it's even possible to create hosts which initialize and start multiple versions of the .NET Core runtime and execute apps on all of them in the same process.

However, when calling coreclr_initialize a second time, I always get an error code 0x80131022. Could it be that side-by-side only works when using different versions of the CLR?

Note: I don't know if it's relevant, but I only tried on Linux.
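For readers unfamiliar with the value, 0x80131022 follows the usual HRESULT layout, and a few lines of C++ are enough to split it into its parts. This is only an illustrative sketch: the names HResultParts and DecodeHResult are made up for this example and are not part of any hosting API.

```cpp
#include <cstdint>

// Pieces of a 32-bit HRESULT, following the standard layout:
// bit 31 = severity (1 means failure), bits 16-26 = facility, bits 0-15 = code.
struct HResultParts {
    bool failed;
    std::uint32_t facility;
    std::uint32_t code;
};

// Hypothetical helper: splits an HRESULT into severity, facility and code.
constexpr HResultParts DecodeHResult(std::uint32_t hr) {
    return HResultParts{
        (hr & 0x80000000u) != 0,  // severity bit
        (hr >> 16) & 0x7FFu,      // 11-bit facility
        hr & 0xFFFFu,             // 16-bit code
    };
}

// The value reported in this issue.
constexpr std::uint32_t kObservedError = 0x80131022u;
```

Decoding kObservedError this way gives a failure with facility 0x13 (the facility the CLR uses for its own errors) and code 0x1022.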

@davidfowl

Contributor

commented Feb 11, 2019

You can't host multiple CLR versions in the same process. AFAIK it's unsupported for various reasons. Why do you need to do this?

@kevingosse

Contributor Author

commented Feb 11, 2019

You can't host multiple CLR versions in the same process. AFAIK it's unsupported for various reasons.

If so, the above documentation should be updated.

Why do you need to do this?

I'm making a plugin for LLDB that can load .NET assemblies to use ClrMD. Hosting multiple CLRs was a nice way of making sure each plugin can work with its own assemblies without conflict (between different versions of ClrMD for instance), and was also a nice way to support multiple probing paths (not the only way, but convenient).

That said, I see one more issue: while investigating the cause of this error, I noticed that libsosplugin can host a CLR under some conditions:

coreclr_initialize_ptr initializeCoreCLR = (coreclr_initialize_ptr)GetProcAddress(coreclrLib, "coreclr_initialize");
(one way to reach that code is to use gcroot on an instance that is held by a thread)

It means that:

  • if libsosplugin is loaded first, my plugin needs a way to retrieve the hostHandle to interact with the hosted CLR
  • if my plugin is loaded first, then the initialization of SOS will fail, as it does not expect coreclr_initialize to return an error code:
    if (FAILED(Status))
    {
        ExtErr("Error: Fail to initialize CoreCLR %08x\n", Status);
        return Status;
    }
@hoyosjs

Member

commented Feb 12, 2019

mikem8361 self-assigned this Feb 12, 2019

mikem8361 added this to the Future milestone Feb 12, 2019

@mikem8361

Member

commented Feb 12, 2019

We may be able to do something in the "new" SOS in the diagnostics repo https://github.com/dotnet/diagnostics.git. My current work in the "soshost" branch, which adds an ISOSHostServices interface, might help, but it is very specific to hosting the native SOS under a managed command-line client using clrmd.

I'll think about this and let you know if I come up with something that might work for you.

@danmosemsft

Member

commented Feb 12, 2019

cc @jeffschwMSFT to confirm docs are wrong.

@jeffschwMSFT

Member

commented Feb 12, 2019

Thanks for the feedback. I filed dotnet/docs#10440 to track updating the docs. It looks like AssemblyLoadContexts are what you really need (https://docs.microsoft.com/en-us/dotnet/api/system.runtime.loader.assemblyloadcontext?view=netcore-2.2). When using an AssemblyLoadContext you can provide complete isolation for the plug-ins. Here is an excellent article to help get you started: https://docs.microsoft.com/en-us/dotnet/core/tutorials/creating-app-with-plugin-support

@kevingosse

Contributor Author

commented Feb 12, 2019

So far I've only considered AssemblyLoadContext for "managed code loading managed code" scenarios, not "native code loading managed code". Still, I'll see if I can work my way with that.

That said, it doesn't solve the issue of multiple components in the process that don't know each other but all want to host the CLR (in this case: my plugin and SOS).

Looking into the code, the error seems to come from there:

coreclr/src/vm/corhost.cpp

Lines 141 to 145 in 5bb7eb6

    if (m_fStarted)
    {
        // This host had already invoked the Start method - return them an error
        hr = HOST_E_INVALIDOPERATION;
    }

(disclaimer: I just tracked down the error code, not actually debugged, so I could be wrong)

The comment above makes me think that side-by-side hosting has been considered at some point:

    // This prevents a host from invoking Stop twice and hitting the refCount to zero, when another
    // host is using the CLR, as CLR instance sharing across hosts is a scenario for CoreCLR.

Anyway, because we have an error code, we directly exit the coreclr_initialize method:

hr = host->Start();
IfFailRet(hr);

I'm thinking, would it make sense to populate the hostHandle parameter before exiting? This way, multiple components could interact with the same hosted CLR even without knowing each other.
I'm not sure it actually makes sense, as it also raises a few issues (like the fact that the CLR may not have been initialized with the right parameters). That said, I'm really surprised to see support for side-by-side dropped like that, as it was really a big deal when it was introduced with .net 4.0.
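A toy model may make the proposal easier to discuss. The sketch below does not use the real coreclr types: ToyHost, ProcessHost and ToyInitialize are hypothetical stand-ins, and only the error value 0x80131022 comes from this thread. It shows an initialize-style entry point that fills in the handle before returning the "already started" error.

```cpp
#include <cstdint>

// Toy stand-in for the runtime host object; NOT the real CorHost2.
struct ToyHost {
    bool started = false;
};

constexpr std::uint32_t kOk = 0;
constexpr std::uint32_t kInvalidOperation = 0x80131022u;  // HOST_E_INVALIDOPERATION

// One host per process, mirroring the "already started" check quoted above.
inline ToyHost& ProcessHost() {
    static ToyHost host;
    return host;
}

// Hypothetical initialize variant: always reports the process-wide handle,
// even when it returns the "already started" error to a second caller.
std::uint32_t ToyInitialize(void** hostHandle) {
    ToyHost& host = ProcessHost();
    *hostHandle = &host;           // populated before any early return
    if (host.started) {
        return kInvalidOperation;  // second caller: error, but the handle is set
    }
    host.started = true;
    return kOk;
}
```

In this model the second caller still gets an error, so it knows the runtime was configured by someone else, but it also receives the same handle as the first caller.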

@jeffschwMSFT

Member

commented Feb 12, 2019

Let us know how the ALC use goes.

It is possible to use an already loaded coreclr that is in the process. If you first check if coreclr is loaded and use that handle, you do not need to reinitialize. Once an instance is initialized, it is definitely locked in to its version and other pertinent settings.

That said, I'm really surprised to see support for side-by-side dropped like that, as it was really a big deal when it was introduced with .net 4.0.

To be honest, we have had very little positive feedback on the use of .NET 4.0 vs. 2.0 in-proc SxS. As a result, maintaining the in-product infrastructure was not worth the cost. Thanks for the feedback that you have found it useful.

@kevingosse

Contributor Author

commented Feb 12, 2019

If you first check if coreclr is loaded and use that handle, you do not need to reinitialize.

But how do you retrieve that handle if the CLR was initialized by another component? Is there another API like coreclr_initialize that allows retrieving it? If not, wouldn't it make sense to have coreclr_initialize return it when the CLR is already initialized?

@mikem8361

Member

commented Feb 12, 2019

What Kevin is proposing isn't SxS of different versions (I hope), but just having two or more independent pieces of native code that want to create delegates to a managed assembly in the same version of the runtime. I ran into something similar in SOS. We do need to think about any side effects or consequences of his change. One thing I did just think about is how this host code knows what version of the runtime to load if it is already loaded by some other host code.

/cc: @janvorli, @jkotas

@jeffschwMSFT

Member

commented Feb 12, 2019

The current pattern is to use GetModuleHandle/dlopen to get an already loaded copy. Then the start entry point is allowed to be called multiple times and will increment the ref count. This becomes a bit more of a challenge on Linux if coreclr is not loaded globally.
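On the dlopen side, the "get an already loaded copy" step might look roughly like the sketch below. It assumes a platform whose dlopen supports RTLD_NOLOAD (a glibc/BSD/macOS extension, not POSIX), which is one way to get GetModuleHandle-like behaviour; the comment above only says "dlopen". FindLoadedLibrary and FindLoadedExport are hypothetical helper names, and whether a bare file name such as "libcoreclr.so" matches may depend on how the library was first loaded.

```cpp
#include <dlfcn.h>

// Returns a handle to `libraryName` only if it is already mapped into the
// process, and nullptr otherwise. RTLD_NOLOAD tells dlopen not to load
// anything new. A non-null result holds a reference on the library.
void* FindLoadedLibrary(const char* libraryName) {
    return dlopen(libraryName, RTLD_NOLOAD | RTLD_LAZY);
}

// Looks up an export such as "coreclr_initialize" in an already loaded
// library. Returns nullptr if the library is not loaded or lacks the symbol.
// The reference taken by FindLoadedLibrary is deliberately kept so that the
// returned pointer stays valid for the rest of the process lifetime.
void* FindLoadedExport(const char* libraryName, const char* symbolName) {
    void* library = FindLoadedLibrary(libraryName);
    if (library == nullptr) {
        return nullptr;
    }
    return dlsym(library, symbolName);
}
```

A caller would cast the result to the matching function pointer type before use. On older glibc versions the program may need to link with -ldl.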

We are right now designing entry points to hopefully make this experience better. We appreciate your feedback.

cc @vitek-karas

@vitek-karas

Member

commented Feb 12, 2019

You can see the "use coreclr multiple times" in this change: dotnet/core-setup#4577.
It's basically a mechanism to instantiate multiple "plugins" (COM objects) from native code.

We're currently thinking about enabling something very similar for straight up native->managed interaction (native loading managed components) without the COM in between.

@mjsabby

Member

commented Feb 13, 2019

Why can't we load multiple runtimes? I remember being able to do this, and while I haven't tried recently, did something change fundamentally that disallows it?

Our process used to have coreclr.dll, clr.dll and .net native in the same process while we did our .NET Core migration. We have .NET Native and coreclr.dll today and we're thinking of having more versions of the runtime loaded into the same process for certain compat scenarios.

I understand that hosting multiple runtimes complicates debugging, picking the right DAC and SOS .. and maybe some ETW stuff too, but is there more?

I would therefore think the docs are correct. Or are we saying this is a lightly trodden path and therefore certain things might not work?

@jkotas

Member

commented Feb 13, 2019

Why can't we load multiple runtimes?

There is nothing fundamental preventing that. It is very likely going to work if you do that. The end-to-end experience is not there - a lot of things won't work (e.g. debugging the multiple runtimes at the same time in Visual Studio is known to not work).

@mjsabby

Member

commented Feb 13, 2019

The end-to-end experience is not there - a lot of things won't work (e.g. debugging the multiple runtimes at the same time in Visual Studio is known to not work).

Thanks for confirming.

Right, the E2E experience is annoying, but I would expect that for the advanced user who is hosting the runtime(s) themselves that a caveated note in the documentation that it is an advanced scenario untested for E2E is more useful than a blanket not supported statement.

So I'd prefer if we were more nuanced in our documentation. I'm happy to take this part of the conversation to the issue @jeffschwMSFT created.

@chrisnas

Contributor

commented Feb 13, 2019

@jeffschwMSFT

The current pattern is to use GetModuleHandle/dlopen to get an already loaded copy.

There should not be any relation with any GetModuleHandle call: the hostHandle is simply the address of the CorHost2 instance that has already been allocated.

As @mikem8361 mentioned,

What Kevin is proposing isn't SxS of different versions (I hope), but just having two or more independent pieces of native code that want to create delegates to a managed assembly in the same version of the runtime. I ran into something similar in SOS

This is exactly the context where @kevingosse is working: if it is not possible to "create" a CLR, it should be possible to attach to the existing one via a new coreclr_attach API. However, as @kevingosse mentioned, ALL plugins should follow this new pattern (including SOS)

BTW, after debugging the CoreClr on Windows with the hosting sample where coreclr_initialize is called twice, the call to SetStartupFlags fails because g_fEEStarted has been set to TRUE by the previous CLR initialization.

@AaronRobinsonMSFT

Member

commented Feb 13, 2019

@chrisnas That is true, but not @jeffschwMSFT's point. What he is saying is the general pattern has been to get access to the already loaded coreclr.dll library then go from there, because there is at present no way to get access to the handle of the running clr instance without turning the crank again. As you pointed out, an API like coreclr_attach would satisfy this, but then back to @jeffschwMSFT's point you would need to get a handle to coreclr.dll to get at that export.
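To make the attach idea concrete, here is a small reference-counted toy model. None of these names exist in coreclr: ToyRuntime, SharedRuntime, StartShared, AttachShared and ReleaseShared are hypothetical, and the sketch only illustrates the shape an attach-style API could take, with each independent component holding its own reference to a single shared runtime.

```cpp
// Toy model only: none of these names exist in coreclr.
struct ToyRuntime {
    int refCount = 0;      // number of native components using the runtime
    bool running = false;
};

inline ToyRuntime& SharedRuntime() {
    static ToyRuntime runtime;
    return runtime;
}

// The first caller starts the runtime; it fails if one is already running.
bool StartShared(void** handle) {
    ToyRuntime& rt = SharedRuntime();
    if (rt.running) return false;
    rt.running = true;
    rt.refCount = 1;
    *handle = &rt;
    return true;
}

// Hypothetical "attach": hands out the existing handle and takes a reference.
bool AttachShared(void** handle) {
    ToyRuntime& rt = SharedRuntime();
    if (!rt.running) return false;
    ++rt.refCount;
    *handle = &rt;
    return true;
}

// Each component drops its own reference; the runtime stops on the last one.
void ReleaseShared(void* handle) {
    ToyRuntime* rt = static_cast<ToyRuntime*>(handle);
    if (rt->refCount > 0 && --rt->refCount == 0) {
        rt->running = false;
    }
}
```

With this shape, a component that loses the race to start the runtime can fall back to AttachShared, and one component shutting down does not stop the runtime while another still holds a reference.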

@govert

commented May 8, 2019

@vitek-karas Yes please - I think I need exactly the native->managed interaction you are suggesting for loading multiple CoreClr versions from native code side-by-side.

I build an add-in framework (Excel-DNA) for creating Excel add-ins with .NET. Since I have no control over the hosting process (Excel) and add-ins developed independently need to work side-by-side, being able to load multiple CLR versions into the same process is important. Currently I have a small native stub that manages the hosting of the (possibly two different) runtimes and load / unload of the managed assemblies into separate AppDomains. With .NET Framework 2.0 and 4.0 runtimes it works fine, and it was quite common a few years ago to have add-ins targeting the different runtimes loaded together.

It sounds like AssemblyLoadContexts will be fine for some isolation between add-ins instead of AppDomains in a single CoreClr version. But this thread indicates there might be problems with multiple CoreClr versions in the same scenario.

  • Are there actual problems in having a native host be able to load multiple CLR instances and versions into the same native host process, or is it just a matter of defining the right hosting API?

  • What are the parts of CoreClr that have to be 'global' in the process?

@vitek-karas

Member

commented May 9, 2019

@govert besides the points made by @jkotas, it is technically possible. That said, I believe that most users would expect to get the already loaded runtime when there is one. So if I call the native->managed hosting APIs multiple times, I expect to get to the same runtime every time.

For this reason the current design of the hosting layer is to support only one runtime in the process. Nothing prevents you from loading coreclr itself multiple times, and there may be some clever ways to use the new hosting APIs to prepare some of the data needed to initialize it. But the overall cost is up to your native code to handle this.

So to answer your question - I don't see any technical reason why the native host could not load multiple runtimes. At the same time, it goes against our current design, so it would require a lot of code changes to make that possible.

As for the overall scenario of having a native host which loads managed plugins: Our current design is to only load one runtime into the process, but make it so that as many plugins as possible can share it. To achieve this the native host should use the new native hosting APIs to load/get the runtime and the plugins should use the new roll-forward options LatestMinor or LatestMajor.
The general idea is that managed components should load the latest compatible runtime/framework available on the machine, and thus make it very likely that other components will also be able to use the same runtime/framework.
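As a rough illustration of how the two roll-forward options differ, the sketch below picks a runtime version from a list of installed ones. It is a simplified model rather than the actual host resolution logic: Version, RollForward and Resolve are hypothetical names, and details such as pre-release versions and the other roll-forward settings are ignored.

```cpp
#include <vector>

struct Version {
    int majorVersion;
    int minorVersion;
    int patchVersion;
};

enum class RollForward { LatestMinor, LatestMajor };

// Orders versions by major, then minor, then patch.
inline bool Less(const Version& a, const Version& b) {
    if (a.majorVersion != b.majorVersion) return a.majorVersion < b.majorVersion;
    if (a.minorVersion != b.minorVersion) return a.minorVersion < b.minorVersion;
    return a.patchVersion < b.patchVersion;
}

// Simplified model: choose the highest installed version that is not lower
// than `requested` and that the policy allows. LatestMinor stays within the
// requested major version; LatestMajor may cross major versions.
// Returns false when no installed version qualifies.
bool Resolve(const Version& requested, const std::vector<Version>& installed,
             RollForward policy, Version* chosen) {
    bool found = false;
    for (const Version& candidate : installed) {
        if (Less(candidate, requested)) continue;
        if (policy == RollForward::LatestMinor &&
            candidate.majorVersion != requested.majorVersion) {
            continue;
        }
        if (!found || Less(*chosen, candidate)) {
            *chosen = candidate;
            found = true;
        }
    }
    return found;
}
```

With 2.2.0, 3.0.0, 3.1.4 and 5.0.1 installed, a component requesting 3.0.0 resolves to 3.1.4 under LatestMinor and to 5.0.1 under LatestMajor in this model, which is why components using these options are more likely to agree on a single shared runtime.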

@govert

commented May 9, 2019

@vitek-karas Thank you for your response.

I think part of my confusion relates to the marketing around versioning in .NET Core, which has clearly been in flux.

Under .NET Framework 4.x backward compatibility (at least to earlier .NET 4.x versions) was a key premise, and so an add-in that was built targeting .NET Framework 4.5 could safely be loaded into the machine's (one and only) 4.x runtime, as long as that version was 4.5+.

A premise of .NET Core thus far has been that it can evolve more quickly, in incompatible ways, and that applications can decide what they target, and bring their own runtime along or at least have strong control over the version targeted. Even with the roll-forward options, the app still has control, and might not roll forward to another major version. This allows the .NET Core runtime to evolve at a faster pace than has been the case with .NET Framework.

For my Excel add-in scenario, this brings a conflict. There might be one add-in that targets .NET Core 3 without roll-forward, and completely independently developed add-ins targeting, say, .NET 5. These need to live safely in the same process (calculating in the same workbook on the same thread), and be able to load in any order. The add-ins have no control over the (basically hostile to .NET) native host, though they might incorporate a native shim to help with loading, calling the runtime hosting APIs, COM support etc.

One promising approach was the CoreRT project and making a native AOT compiled library, but CoreRT seems to be deprecated for now. It's not clear that the new Mono AOT direction will cover creating purely native .dlls where multiple versions can live in the same process this way.

Anyway, I just wanted to highlight this specific scenario for you to consider as the designs evolve.

@weltkante

commented May 9, 2019

How about this if you want to be more robust against "foreign" runtimes: if the runtime is deployed with the addin, and if you register a native shim class as Office addin, the shim class could load the runtime from the addin folder and then pass over to managed addin code; I don't know if there are any potential conflicts with runtimes loaded from different folders, but I'd expect each having their own "global" state (because different native images), unless coreclr is allocating named resources under a shared name nothing should conflict.

PS: side note, WinForms in .NET Core doesn't yet play well with Office, the ComponentManager code has not been ported yet, see issue dotnet/winforms#247
