Please provide an API to locate an assembly file by name without loading #23546

Open
jnm2 opened this issue Sep 14, 2017 · 30 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Runtime
Milestone

Comments

@jnm2
Contributor

jnm2 commented Sep 14, 2017

I'm using the MetadataReader to generate public API surface files such as https://pastebin.com/nLSzWPBA. For the most part, it's a breeze.

When it comes to deserializing enum attribute arguments, the System.Reflection.Metadata API requires me to provide the underlying type and then of course I want to be able to output that as the appropriate field name rather than as an integer cast to the enum's name. This requires me to look up an assembly on disk given (1) the path to the assembly I'm currently examining and (2) an AssemblyName instance which I create from the AssemblyReferenceHandle.

It's easy enough to use Assembly.ReflectionOnlyLoad and non-Metadata reflection API to locate the enum field, but I see even ReflectionOnlyLoad as undesirable:

  • It keeps the assembly in memory
  • It slows the process down by reading and processing more than it absolutely needs to
  • It loads in the context of the generating process, not the assembly being examined which might be anywhere on disk, so it might resolve to the wrong file
  • In the process of examining multiple assemblies, it might try to load different versions of an assembly with the same name and no strong name
  • It forces me to switch from doing everything with the MetadataReader to reimplementing the same logic with ‘normal’ reflection for enums defined outside the assembly being examined

Would you please expose an API which locates the assembly that would be loaded if the running process were the assembly being examined, given the AssemblyName, and passes back a file path which I can then open with the MetadataReader? Is there a better way to achieve what I want without this API, and without the list of issues that I see with Load and ReflectionOnlyLoad?

I'm imagining it could be something like this:

[updated per comment]

class System.Runtime.Loader.AssemblyLoadContext
{
    public static string Locate(AssemblyName assemblyString, string baseDirectory);
}

Or if it could be useful to the location logic to know not only the base path but some metadata in the assembly that the assembly reference came from:

class System.Runtime.Loader.AssemblyLoadContext
{
    public static string Locate(AssemblyName assemblyString, string forAssemblyPath);
}
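
For context, here is a minimal sketch of the step that comes before such a `Locate` call: reading the assembly references of the file being examined with MetadataReader and turning each one into an AssemblyName, without loading anything into the runtime. The `ReferenceLister` class and `ListReferences` method are hypothetical names used only for illustration, and the public key token is left out to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

// Hypothetical helper, not part of any library.
public static class ReferenceLister
{
    // Returns one AssemblyName per assembly reference in the file.
    // Nothing is loaded into the running process; the file is only read.
    public static IReadOnlyList<AssemblyName> ListReferences(string assemblyPath)
    {
        using FileStream stream = File.OpenRead(assemblyPath);
        using var peReader = new PEReader(stream);
        MetadataReader reader = peReader.GetMetadataReader();

        var result = new List<AssemblyName>();
        foreach (AssemblyReferenceHandle handle in reader.AssemblyReferences)
        {
            AssemblyReference reference = reader.GetAssemblyReference(handle);
            var name = new AssemblyName
            {
                Name = reader.GetString(reference.Name),
                Version = reference.Version,
            };

            // An empty culture string means the reference is culture-neutral.
            string culture = reader.GetString(reference.Culture);
            if (culture.Length > 0)
            {
                name.CultureName = culture;
            }

            // Simplification: the public key (token) is not copied over here.
            result.Add(name);
        }
        return result;
    }
}
```

Each returned AssemblyName is what would be passed, together with the path of the examined assembly, to the API proposed above.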
@jnm2
Contributor Author

jnm2 commented Sep 14, 2017

Actually, almost the exact inverse of System.Runtime.Loader.AssemblyLoadContext.GetAssemblyName.

@ghost

ghost commented Sep 14, 2017

(We don't seem to have a label for System.Runtime.Loader so tagging it as System.Runtime, before it gets auto-tagged as System.Reflection)

A few thoughts:

  • I think System.Runtime.Loader.AssemblyLoadContext would be the place for such functionality. We don't want to keep repeating the old mistakes of stuffing loader related stuff into the Reflection api.

  • I think in general, the idea behind System.Reflection.Metadata is that you get to be in charge of the assembly binding policy. You don't need to be tied to the runtime's.

  • That said, there's something to be said for separating the requests for binding results and that of creating a real live running Assembly out of it. I'm not sure this is the best use case for it, but the idea itself has merit.

@jnm2
Contributor Author

jnm2 commented Sep 14, 2017

@atsushikan I'm fine being in charge of the assembly binding policy, so long as I can delegate that burden to the BCL and runtime. You say this may not be the best use case, but what would the ideal alternative then be? That I manually duplicate some subset of the runtime's assembly loading behavior?

@ghost

ghost commented Sep 14, 2017

Well, in general, I'm used to thinking of the tooling binding environment as a separate concept from the runtime binding environment (think cross-compiler scenarios), so mingling them raises red flags in my mind.

But yeah, sometimes, it's just handy to say "the runtime's policy is good enough for the task at hand - just let me at it."

@joperezr
Member

cc: @kouvel

@jnm2
Contributor Author

jnm2 commented Sep 27, 2017

I'm completely blocked here whether my utility targets .NET Core or .NET Framework. My console app is reading a .NET Standard 1.3 assembly which references an enum in System.Runtime, Version=4.0.20.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a.

When my console app is targeting net47, it can resolve System.Runtime, Version=4.0.0.0 using fusion.dll's IAssemblyCache but not 4.0.20.0. I'm missing knowledge of some loading logic here.

When my console app is targeting netcoreapp1.0, I have no idea where to start to locate the System.Runtime which the .NET Standard assembly is specifically referencing. It's not in the same folder. Searching the internet hasn't turned anything up for me yet.

I need to be unblocked on .NET Framework so that this can run as an MSBuild task rather than a console app, and I need to be unblocked on .NET Core because my .NET Core integration tests are failing.

@jnm2
Contributor Author

jnm2 commented Sep 27, 2017

@atsushikan I get your point about cross-compiler scenarios and all, but I'm worried that I'm up against reverse-engineering military grade assembly binding logic on both .NET Framework and .NET Core. I shouldn't have to know about shared frameworks, the GAC, binding redirection, and whatever specialized logic and edge cases are handled on each platform.

I need to be able to resolve assembly references from .NET Core, .NET Framework and .NET Standard assemblies: all three from a .NET Core process, and all three from a .NET Framework process.

@ghost

ghost commented Sep 27, 2017

@joperezr, @kouvel, @karelz, just to set expectations, I consider this a System.Runtime.Loader request, not a Reflection request so other than the 2 cents I've already thrown in, I'm not planning on driving this further as I'm not the area owner for the loader. System.Runtime was the closest area match available.

@kouvel
Member

kouvel commented Sep 27, 2017

When my console app is targeting net47, it can resolve System.Runtime, Version=4.0.0.0 using fusion.dll's IAssemblyCache but not 4.0.20.0. I'm missing knowledge of some loading logic here.

I tried what you described and I'm also seeing the same thing. I had to add this NuGet package:
https://www.nuget.org/packages/NETStandard.Library.NETFramework/2.0.0-preview2-25405-01
to the project to get it to load the netstandard1.3 assembly properly. Then, I believe what happens is that the 4.0.20.0 reference gets redirected (either through a binding redirect or through some unification rules) to the System.Runtime assembly version from that package, which in turn type-forwards types from it to other assemblies (eventually to mscorlib for many types).

However, it looks like those version redirects don't apply to ReflectionOnlyLoad, and an assembly resolver may be necessary for that, which probably doesn't work for your scenario.

So on netfx one thought is to use Assembly.Load, maybe in a separate AppDomain if you need to unload it later.

CC @AlexGhiondea if he has any other guidance.

Regarding using MetadataReader, I'm not familiar with it, but since many types are forwarded to other assemblies, it may be necessary to walk the forwards to find the actual type in, say, mscorlib.

@russellhadley, I understand that your team owns the assembly loader now, is there someone who would be driving that area?

@kouvel
Member

kouvel commented Sep 27, 2017

Another thought is to publish the netstandard1.3 project so that all of the non-runtime dependencies are in a known location. For runtime dependencies, you could just use Assembly.Load.

@jnm2
Contributor Author

jnm2 commented Sep 28, 2017

Assembly.Load

😞 I'm really trying to minimize overhead. Also, what if I'm an MSBuild task analyzing a .NET Core assembly? I don't want to load what .NET Framework would load, I want to follow the logic of whichever .NET Core runtime the assembly was compiled against.

Same again, what if I'm a .NET Core console app analyzing a .NET Framework assembly? I don't want to load what .NET Core would load, I want to follow the logic of whichever .NET Framework version the assembly was compiled against.

See the points in my starting comment for more reasons this is infeasible, not least of which is that I'm in-process with MSBuild and loading multiple versions of system assemblies.

Regarding using MetadataReader, I'm not familiar with it, but since many types are forwarded to other assemblies, it may be necessary to walk the forwards to find the actual type in, say, mscorlib.

That's very easy to do, but I'm still stuck on the part where I can't read the type forwards because I can't locate the file they're in. 😄
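
To illustrate the "walk the forwards" step, here is a rough sketch that lists the type forwarders declared in a single assembly file using MetadataReader. The `ForwarderLister` class and `ListForwards` method are hypothetical names, and the sketch only handles top-level forwarded types. It assumes the file has already been located, which is exactly the part that is still missing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

// Hypothetical helper, not part of any library.
public static class ForwarderLister
{
    // Maps "Namespace.TypeName" to the simple name of the assembly
    // that the type is forwarded to.
    public static Dictionary<string, string> ListForwards(string assemblyPath)
    {
        using FileStream stream = File.OpenRead(assemblyPath);
        using var peReader = new PEReader(stream);
        MetadataReader reader = peReader.GetMetadataReader();

        var forwards = new Dictionary<string, string>();
        foreach (ExportedTypeHandle handle in reader.ExportedTypes)
        {
            ExportedType exported = reader.GetExportedType(handle);

            // Nested forwarded types point at their enclosing exported type
            // rather than at an assembly reference, so they are skipped here.
            if (!exported.IsForwarder ||
                exported.Implementation.Kind != HandleKind.AssemblyReference)
            {
                continue;
            }

            AssemblyReference target =
                reader.GetAssemblyReference((AssemblyReferenceHandle)exported.Implementation);

            string ns = reader.GetString(exported.Namespace);
            string typeName = reader.GetString(exported.Name);
            string fullName = ns.Length == 0 ? typeName : ns + "." + typeName;

            forwards[fullName] = reader.GetString(target.Name);
        }
        return forwards;
    }
}
```

The target assembly name from each entry still has to be resolved to a file on disk before the walk can continue, which is the gap this issue is about.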

@kouvel
Member

kouvel commented Sep 28, 2017

I don't want to load what .NET Framework would load, I want to follow the logic of whichever .NET Core runtime the assembly was compiled against.

Same again, what if I'm a .NET Core console app analyzing a .NET Framework assembly? I don't want to load what .NET Core would load, I want to follow the logic of whichever .NET Framework version the assembly was compiled against.

If it's a netstandard type, does it matter which runtime it's coming from?

But if you need the type from the same runtime as the assembly being inspected, I don't see how the API you suggested adding would provide you with that location. The runtime you're running on doesn't know anything about the runtime the assembly being inspected is targeted for, or where that runtime would be available on the machine, it may not even be installed on the machine. Maybe the CLI tools could have some API surface that provides that info since they deal with runtimes and how to locate them.

@jnm2
Contributor Author

jnm2 commented Sep 28, 2017

With .NET Standard assemblies we should be fine, but I already know I'm going to have to deal with multitargeted projects that target .NET Framework, and I don't want to close off the possibility of handling any platform target.

I don't see how the API you suggested adding would provide you with that location.

This is true, and I'm no longer sure how to solve the problem in a general fashion. Seems it would require a NuGet package shipping the resolution logic for each platform which sounds like a huge endeavor.
Since my purposes (for now) are just to aggregate which types are structs and the names and values of all the enums, maybe I can get away with treating every assembly as a .NET Framework assembly? And just forget about being able to run on .NET Core? If so, I'm back to wanting that API so that I don't have to use Assembly.Load as outlined at the top.

@kouvel
Member

kouvel commented Sep 28, 2017

What about:

Another thought is to publish the netstandard1.3 project so that all of the non-runtime dependencies are in a known location. For runtime dependencies, you could just use Assembly.Load.

At least then, any non-standard types would be covered by published binaries that you can easily inspect, and the only remaining types would (probably) be netstandard types, for which you can use Assembly.Load. The number of these assemblies should be very limited.

@jnm2
Contributor Author

jnm2 commented Sep 28, 2017

That's an okay starting point, but that brings me back to analyzing .NET Standard and .NET Framework dlls (which don't have system DLLs published in the same folder) from MSBuild and also analyzing both from a .NET Core Console app.

Problems with Assembly.Load:

  • It's documented bad practice to load assemblies in an execution context that are only to be used for reflection.
  • .NET Core forces me to keep loaded assemblies in memory.
  • It slows the process down by reading and processing more than it absolutely needs to.
  • It loads in the context of the generating process, not the assembly being examined which might be anywhere on disk, so it might resolve to the wrong file. For example, binding redirects and app.config.
  • In the course of examining assembly references for multiple assemblies, there may be version conflicts.
  • It forces me to switch from doing everything with the MetadataReader to reimplementing the same logic with ‘normal’ reflection for enums defined outside the assembly being examined.

So the API would be preferable.

@jnm2
Contributor Author

jnm2 commented Sep 28, 2017

If I have to use Assembly.Load, the next best thing honestly would be shelling out to a transient .NET Framework exe to locate the assembly out of process. But I'm still not sure how I could respect binding redirects this way.

@karelz
Member

karelz commented Sep 28, 2017

Same again, what if I'm a .NET Core console app analyzing a .NET Framework assembly? I don't want to load what .NET Core would load, I want to follow the logic of whichever .NET Framework version the assembly was compiled against.

Following that logic, either .NET Core has to know all the logic in .NET Framework (that's obviously a non-starter given how complex the logic is and the code duplication it would lead to), or .NET Framework needs to expose hosting APIs that provide the answer (which I think it does).

But I'm still not sure how I could respect binding redirects this way.

You need to know anyway how the app is supposed to be executed - how it is hosted, which config it uses, etc. You should be able to use it as input for the spin-off process.

Overall, I think I understand your scenario.
I don't think the functionality aligns well with AssemblyLoadContext, because it is not tied to the current app/process (the baseDirectory parameter is needed). I think it belongs more in hosting, or in a separate, similar-ish class that is entirely abstracted. We should IMO align it with ASP.NET requirements (e.g. the deps.json reading code in the core-setup repo) and other tools' requirements.
We should keep in mind that this is a rather advanced scenario (not quite mainline) and that the key value is perf, not functionality (there is an 'ugly', slower workaround: shelling out to a separate process).

@jnm2
Contributor Author

jnm2 commented Oct 8, 2017

Perhaps I'll wait and see what comes of this discussion: https://github.com/dotnet/coreclr/issues/14263#issuecomment-335045286

@masonwheeler
Contributor

@atsushikan

I think in general, the idea behind System.Reflection.Metadata is that you get to be in charge of the assembly binding policy. You don't need to be tied to the runtime's.

Here's my use case for being "tied to the runtime's binding policy:"

I've got a compiler that uses System.Reflection for its external type system. I'm trying to make it work as a VS language support plugin, but using it to analyze code under compilation breaks things because it loads the project's referenced assemblies into devenv.exe, and some of those references are likely to be other projects in the same solution, which makes rebuilding impossible.

Trying to put things into a different AppDomain so they can be unloaded turns out to be problematic because of various quirks in how VS works. So what I need is to replace the external type system with something else, while keeping all observable behavior exactly the same. This means I need the method @jnm2 is asking for, which will let me resolve an assembly's dependencies into exactly the same ones that Assembly.Load would have given me.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@joperezr joperezr removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2020
@joperezr joperezr modified the milestones: 5.0.0, Future Jul 6, 2020
@teo-tsirpanis
Contributor

Isn't that solved by the System.Runtime.Loader.AssemblyDependencyResolver class?
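
For readers unfamiliar with the class, here is a small sketch of how AssemblyDependencyResolver is typically used. It resolves names relative to a component assembly (using that component's .deps.json when one sits next to it), and it runs with the rules of the current .NET runtime. The `DependencyLocator` class and `TryLocate` method are hypothetical names, and treating a constructor failure as "not found" is an assumption made for this sketch.

```csharp
using System;
using System.Reflection;
using System.Runtime.Loader;

// Hypothetical helper, not part of any library.
public static class DependencyLocator
{
    // Returns the file path a dependency of the given component would
    // resolve to, or null when it cannot be resolved.
    public static string TryLocate(string componentAssemblyPath, AssemblyName dependency)
    {
        try
        {
            var resolver = new AssemblyDependencyResolver(componentAssemblyPath);
            return resolver.ResolveAssemblyToPath(dependency);
        }
        catch (InvalidOperationException)
        {
            // Assumption for this sketch: treat a failure to initialize
            // the resolver (for example, missing hosting components)
            // the same as "not found".
            return null;
        }
    }
}
```

Nothing is loaded by this call; it only returns a path, which could then be opened with MetadataReader.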

@AaronRobinsonMSFT
Member

@teo-tsirpanis I believe so. @elinor-fung or @vitek-karas Can we close this?

@vitek-karas
Member

AssemblyDependencyResolver is related, but it doesn't provide the functionality requested above. The issue is somewhat convoluted... I see several potential feature requests:

  • Given the current runtime/environment, provide the answer to "What file would this assembly name resolve to?". Basically it would be a companion to LoadFromAssemblyName. I'm not against this, but the value seems limited (see more details below about the overall scenario). Note that this is closely related to another potential API addition we've discussed in the past, which would be something like TryLoadFromAssemblyName: do the load, but don't fail if you can't. Resolving the name to a file path is the first step of such an API.
  • Given a built application, answer the question "What file would an assembly name resolve to in the context of this application?". This is obviously much more difficult. The application can be targeting a different runtime (I'm running on .NET 6, but the app is for .NET 3.1 or even .NET Framework), so the code would have to know the binding rules for each shipped version, which is not really feasible. And that's just half of the problem: assembly resolution is contextual. The runtime will provide different answers based on which AssemblyLoadContext is used, so how would that map to an external API? And to top it off, assembly resolution is extensible via custom ALCs or event handlers, so how would the API run such code?
  • Given an app which is being built by the SDK right now, from a task which runs as part of that build, answer the question "Which file is the assembly with this name?". This is a much more reasonable question, and the SDK already has a mechanism to answer it in the MSBuild world. There are item groups (different ones for different stages of the build/publish) which contain the full list of assemblies that make up the app. This mechanism is completely independent of the current runtime: the SDK running on .NET 6 can do this for pretty much any older version, including .NET Framework. Existing tools like the trimmer or the R2R compiler make use of this to answer this exact question.

Reading the above it seems that the actual ask is for the last item - a way within the context of the SDK to get the full list of assemblies which make up the application. Doing that is an SDK functionality and we should not be adding any API into the runtime for this.

@vitek-karas
Member

To answer @masonwheeler: That is basically what MetadataLoadContext does. It can load any assembly (unrelated to the current runtime, so .NET 6 can load a .NET Framework assembly and vice versa) and provide a System.Reflection-based API to inspect it. It's a purely managed implementation (completely separate from the runtime), meaning that once the assembly is no longer used, the GC will simply collect the relevant memory. The only limitations are:

  • It can't run any code from the assembly (since it doesn't actually load the assembly into the runtime)
  • It needs to be given an "assembly resolver" - which answers the question "What assembly does this name resolve to"

Note that it's a NuGet package which is available for pretty much any version of .NET.
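
As a rough illustration of the above, here is a sketch that uses MetadataLoadContext with a PathAssemblyResolver to read the names and constant values of an enum without loading the assembly for execution. The `EnumDumper` class, the `ReadEnumMembers` method and its parameters are hypothetical names. Feeding the resolver the running runtime's directory is an assumption made so the example is self-contained; a real tool would pass the reference paths that belong to the assembly being inspected.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of any library.
// Requires the System.Reflection.MetadataLoadContext NuGet package.
public static class EnumDumper
{
    public static Dictionary<string, object> ReadEnumMembers(
        string assemblyPath,
        string enumFullName,
        IEnumerable<string> extraReferencePaths)
    {
        // Assumption: core types are resolved from the running runtime's
        // directory. The resolver answers "what file is this name?".
        IEnumerable<string> paths = Directory
            .GetFiles(RuntimeEnvironment.GetRuntimeDirectory(), "*.dll")
            .Concat(extraReferencePaths)
            .Append(assemblyPath)
            .Distinct();

        var resolver = new PathAssemblyResolver(paths);
        using var context = new MetadataLoadContext(resolver);

        Assembly assembly = context.LoadFromAssemblyPath(assemblyPath);
        Type enumType = assembly.GetType(enumFullName, throwOnError: true);

        var members = new Dictionary<string, object>();
        foreach (FieldInfo field in enumType.GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            // No code runs here; the constant is read from metadata.
            members[field.Name] = field.GetRawConstantValue();
        }
        return members;
    }
}
```

Disposing the context releases the inspected files, and the returned dictionary only holds strings and boxed primitive values, so it stays valid afterwards.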

@masonwheeler
Contributor

@vitek-karas Everything looks great right up until the point where it can't run any code from the assembly. And then the compiler comes crashing into a big brick wall because the entire metaprogramming system falls apart.

@vitek-karas
Member

@masonwheeler So you have a tool which runs as part of the build which wants to run at least parts of the app being built?

I see only two ways around this:

  • You require that the app is targeting the same .NET with which it's being built. You can do this, but it goes a bit against the current .NET SDK design (SDK can build apps targeting older versions of .NET) and it will be rather problematic running this from VS (I think VS in some cases runs the SDK using .NET Framework, even if the app is targeting .NET 6 for example)
  • You run the code from the app in a new process. That way it can use a different version of .NET from the tool.

The last alternative would be to have some kind of IL interpreter, but I'm not aware of any readily available (which doesn't mean it doesn't exist). But that would likely come with other limitations anyway.

@teo-tsirpanis
Contributor

So you have a tool which runs as part of the build which wants to run at least parts of the app being built?

FWIW I have the exact same thing and it runs in-process within an MSBuild task if the build runs on modern .NET, and out-of-process if it runs on .NET Framework.

I didn't place any constraints on the framework the compiled app targets; if it targets an incompatible framework, it will make a best-effort try to load the assembly and its dependencies. The dependencies are the same passed to Roslyn, minus the reference assemblies.

@jnm2
Contributor Author

jnm2 commented Apr 15, 2022

Thanks @vitek-karas. Your answer makes sense to me. I don't really want to take the current environment into consideration for my use case, and of course there is no exact way to locate an assembly's dependencies the way the assembly's containing app would do so at runtime. The only path forward would be to build and improve heuristics as needed.

Feel free to close when you consider the other folks' use cases here to be answered.

@masonwheeler
Contributor

masonwheeler commented Apr 15, 2022

and of course there is no exact way to locate an assembly's dependencies the way the assembly's containing app would do so at runtime

Is any "what would happen in some other containing app" scenario actually being requested? My use case isn't for a hypothetical, but a fact about the current moment: "if I tried to load this assembly name right here right now in this process, what file would it resolve to?"

@vitek-karas
Member

@masonwheeler as an API request, "what would a given name resolve to" is perfectly valid. But the above discussion is about the usefulness of such an API. There are not that many scenarios I can think of where the API is helpful.

I thought that your scenario is a compiler-like tool which runs during the app's build and tries to run parts of the app's code. In such a case, loading the app's code into the compiler-like tool is a rather tricky proposition. And even if you decided to do it, you mention that you need to run that code, so why not load it fully, since you'll need to run it anyway?
I'm trying to understand the scenario in which the currently running tool would want to answer a question like "Where would I load AssemblyA from if I were asked to do so?", and what the answer would be used for.

  • The one scenario which I know of is typically using MetadataLoadContext to inspect some random assembly "in the context of the current app" - but that doesn't allow running code, so that doesn't seem to fit your case.

Please understand that for us to do a good job of designing a new API (and also prioritizing the work), it's best if we have a solid understanding of the scenarios where such an API would be used.

@masonwheeler
Contributor

@vitek-karas Basically, there are two related but distinct use cases.

One is to find files to be loaded in a metadata-only context. MetadataLoadContext can deal with this well enough. The other is to load dependencies into an executable, unloadable context, to execute metaprogramming code at compile-time. That requirement breaks with MetadataLoadContext because it's not executable, and it breaks with the normal system when I try to integrate it into Visual Studio because it's not unloadable and VS loads the whole compiler into its process rather than running it externally.


10 participants