Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MetadataImportOptions.All doesn't work in source generators? #72369

Closed
dennis-yemelyanov opened this issue Mar 2, 2024 · 18 comments
Closed

MetadataImportOptions.All doesn't work in source generators? #72369

dennis-yemelyanov opened this issue Mar 2, 2024 · 18 comments
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead

Comments

@dennis-yemelyanov
Copy link

I'm trying to get private members from INamedTypeSymbol using MetadataImportOptions.All. When looking at the generated code in Visual Studio it looks as expected (the generator finds a private field), but when running the app, the private field is not found.

Am I using this option incorrectly? Or what can explain the difference in behavior between the code shown in VS and the actual code that gets executed?

The generator code looks like this:

using System.Linq;
using Microsoft.CodeAnalysis;

namespace MyGenerator
{
    [Generator]
    public class Generator : IIncrementalGenerator
    {
        public void Initialize(IncrementalGeneratorInitializationContext context)
        {
            var memberNamesProvider = context.CompilationProvider
                .Select(static (c, _) =>
                {
                    return c
                        .WithOptions(c.Options.WithMetadataImportOptions(MetadataImportOptions.All))
                        .GetTypeByMetadataName("MyLibrary.MyLibraryClass")!.GetMembers().Select(x => x.Name);
                });

            context.RegisterSourceOutput(memberNamesProvider, (productionContext, memberNames) =>
            {
                var text = 
$@"public class GeneratedResult
{{
    public static void Print()
    {{
        System.Console.WriteLine(""{string.Join(" ", memberNames)}"");
    }}
}}";

                productionContext.AddSource("GeneratedResult.g.cs", text);
            });
        }
    }
}

 
Version Used: Visual Studio 17.9.2

Steps to Reproduce:

  1. Open GeneratorTest.sln from https://github.com/dennis-yemelyanov/MetadataImportOptions/tree/main in Visual Studio.
  2. Check the generated code under the GenertorTest project analyzers. It contains PrivateField (as expected):
    image
  3. Run the GeneratorTest project. The output does not contain PrivateField, and just prints this:

PublicField .ctor

Expected Behavior:

The app output includes PrivateField.

Actual Behavior:

The app output does not include PrivateField.

@dennis-yemelyanov
Copy link
Author

After more digging I found that this seems to be related to reference assemblies. After disabling reference assembly generation in the library project the issue is gone and the private field is found as expected:

<ProduceReferenceAssembly>false</ProduceReferenceAssembly>

However in my real life scenario, it's not feasible to identify all library projects where I need to add this property. Also this needs to be added for any new project, which is even harder to keep track of.

Is there a better solution/workaround?

@dennis-yemelyanov
Copy link
Author

Looks like a better way is to set <CompileUsingReferenceAssemblies>false</CompileUsingReferenceAssemblies> on the consuming project. This way only one project that references the library/generator needs to be updated.

The discrepancy between the generated code shown by Visual Studio and the actual code is a bit confusing though. I guess that's because ServiceHub.RoslynCodeAnalysisService.dll doesn't use ref assemblies? Ideally it should be fixed to be consistent with the actual build process

@jaredpar
Copy link
Member

jaredpar commented Mar 8, 2024

I guess that's because ServiceHub.RoslynCodeAnalysisService.dll doesn't use ref assemblies?

@CyrusNajmabadi can likely say what is different in the IDE here. I think it's because the IDE essentially uses Compilation as references in design time builds instead of real references (if they used real references in design time then they'd have to always be building which is inefficient). These Compilation have private members and hence are showing up in the generator when run in the IDE.

Is there a better solution/workaround?

The better approach is to not depend on private fields in referenced assemblies. Doing so essentially requires that you're using reflection to access members or depending on details that can, and will, be different at runtime.

Moving to IDE as this is more a question of how they handle references at design time. My expectation though is this will be a "By Design" resolution.  

@dennis-yemelyanov
Copy link
Author

Thank you, @jaredpar, for looking into this and providing your inputs!

Just wanted to add a small comment regarding this:

The better approach is to not depend on private fields in referenced assemblies. Doing so essentially requires that you're using reflection to access members or depending on details that can, and will, be different at runtime.

In my case these referenced assemblies are actually just referenced projects (that are even part of the same solution when building in VS) so they are not black boxes. And I just need to inspect their private fields to generate some code (to support some serialization scenarios for those objects).

So it does seem like a valid scenario for source generation. Although I guess one might argue that generators work best when they are scoped to just one project and don't use referenced projects at all (but that clearly limits some usage scenarios when working with existing project code structured in a certain way).

But I guess my main point is that if the API allows to do something (in this case passing MetadataImportOptions.All with the intention of getting private field information) it would be helpful to have some (or all) of:

  • Consistent behavior between VS and running code.
  • Some best practice documentation explaining why this might not be the best idea.
  • Some kind of compilation warning in generator code explaining potential issues and how to avoid them.

@jaredpar
Copy link
Member

jaredpar commented Mar 8, 2024

And I just need to inspect their private fields to generate some code (to support some serialization scenarios for those objects).

At the point you're inserting the serialization logic for a type, that includes private fields, in an assembly other than where it's defined, those fields are effectively public API. The other project is very much using and manipulating them. What is having private there actually buying you? This is a case where having internal fields and granting the project doing the serialization [InternalsVisibleTo] is a better approach.

@lxymahatma
Copy link

I think Private Proxy has the same problem because of this

Types from external dll(for example String)
Probably, by enabling MetadataImportOptions.All in the Source Generator, it should be possible to read it. However, I haven't been able to find a way to do that. I need help.

@jaredpar
Copy link
Member

Closing because at this point there are no plans to allow MetadataImportOptions.All in source generators.

@jaredpar jaredpar closed this as not planned Won't fix, can't repro, duplicate, stale Apr 25, 2024
@dennis-yemelyanov
Copy link
Author

@jaredpar what does it mean that it's not allowed in source generators? It's available in Roslyn API, but not recommended to use (basically deprecated)?

Also if it's not allowed in source generators, I'm curious what is the recommended/intended use for this option? Is it allowed in analyzers? Or is it only intended to be used by the compiler for some internal purposes?

@jaredpar
Copy link
Member

what does it mean that it's not allowed in source generators? It's available in Roslyn API, but not recommended to use (basically deprecated)?

In the source generator scenario the host controls how metadata references are loaded and whether or not they have the MetadataImportImports.All option set. There is no plans at this time to change the way that value is calculated.

In can be done in the code sample in the initial repro steps

                        .WithOptions(c.Options.WithMetadataImportOptions(MetadataImportOptions.All))
                        .GetTypeByMetadataName("MyLibrary.MyLibraryClass")!.GetMembers().Select(x => x.Name);

However that not something you should do in a source generator as it will not be performant.

Lastly though: in the command line compiler the majority of dependencies are reference assemblies. The value of MetadataImportOptions.All on reference assemblies is fairly limited.

Or is it only intended to be used by the compiler for some internal purposes?

This value should not be set by analyzers or generators.

@dennis-yemelyanov
Copy link
Author

I see. I thought I'd provide some details on my scenario. I have some old existing code (in multiple projects) that uses DataContractSerializer for serialization. DataContractSerializer allows to serialize private fields by adding a [DataMember] attribute to them. I'm working on migrating this code to use modern JSON serialization, so for compatibility I need to expose private fields as well, which I'm currently doing by generating custom JSON mapping using IJsonTypeResolver that accesses private fields (through reflection) and exposes them.

The solution seems to work well (aside from maybe performance, but the code base is not big enough for it to be a real issue). And I had to set CompileUsingReferenceAssemblies to false for the consuming projects.

Now I understand this is not really a supported configuration. Although I really liked being able to do that. And changing all private fields to internal is not the best option (setting aside the semantic change for these fields, this would have to be done manually, so it's error prone, and if I miss some field, it will not be serialized, and the issue may not be found easily without extensive testing).

@jaredpar
Copy link
Member

The solution seems to work well (aside from maybe performance, but the code base is not big enough for it to be a real issue). And I had to set CompileUsingReferenceAssemblies to false for the consuming projects.

This though is yet another way that you're making your build slower.

Now I understand this is not really a supported configuration.

It's more than not supported. We are making changes in the IDE such that we can begin to restrict when source generators run. This is going to be used as a mechanism to get problematic source generators out of the inner loop for typing. This type of pattern in your generator, where it is willfully doing scenarios we consider to be performance pitfalls, is the type of action that will possibly result in us pushing it out of the core typing scenarios.

And changing all private fields to internal is not the best option

Then I would recommend that a generator run in the assembly where the private fields are defined. There is simply not a lot of interest, across the entire .NET stack, to making it easier to poke into private data across assemblies like this.

@dennis-yemelyanov
Copy link
Author

dennis-yemelyanov commented Apr 25, 2024

Then I would recommend that a generator run in the assembly where the private fields are defined.

Yeah that was actually my initial design. But then it became too cumbersome to identify all the projects that the generator needs to be added to. So I switched it to adding the generator to just one project containing some root type and discovering all referenced types recursively from all the referenced projects (so the serialization code for all these types can be generated). That looked like a very simple and efficient approach... Until I discovered the issue with private fields

@CyrusNajmabadi
Copy link
Member

and discovering all referenced types recursively from all the referenced projects (so the serialization code for all these types can be generated). That looked like a very simple and efficient approach

Definitely not efficient :)

Also, afaict, cannot be done using incremental generators at all :)

Can you let us know how you setup your incremental pipeline here? I'd def like to see it.

@dennis-yemelyanov
Copy link
Author

dennis-yemelyanov commented Apr 25, 2024

Definitely not efficient :)

I should clarify, efficient from a development time perspective, not execution runtime :)

Also, afaict, cannot be done using incremental generators at all :)

Hmm not sure why that would be the case, I thought the main difference between incremental and non-incremental generators was caching, but otherwise they provided the same functionality.

My code looks like below and probably breaks every single rule of a good performing generator. But I just want to get things to work and to worry about optimizing it later when (and if) it becomes an issue. So far it hasn't been.

Or at least when there is a standard collection type that implements real equality (verifies that all elements in 2 collections are equal to each other). I'm definitely not looking to create such type and implement GetHashCode() for it myself, but until such type is available, caching will be broken anyway for any model that has collections in it (which for me is almost always the case since I like to build entity models that have collections of properties). So until then I feel like doing (or not doing) any other perf optimizations probably won't matter too much.

var entitiesToGenerate = context.CompilationProvider
    .Select(static (compilation, _) => compilation.WithOptions(compilation.Options.WithMetadataImportOptions(MetadataImportOptions.All)))
    .Combine(context.AdditionalTextsProvider.Where(x => x.Path.Contains("RootTypesToGenerate.txt")).Select(static (x, _) => x.GetText()).Collect())
    .Select(static (x, _) => x.Right.First().Lines.SelectMany(line =>
    {
        var symbols = ImmutableArray<INamedTypeSymbol>.Empty;

        // extract top level symbols for types specified in RootTypesToGenerate.txt

        return symbols;
    }))
    .SelectMany(static (x, _) =>
    {
        // Recursively visit all types referenced by a given type, including types of private fields (if they are marked as [DataMember])
        var visitor = new FindReferencedTypesVisitor();

        foreach (var namedTypeSymbol in x)
        {
            visitor.VisitNamedType(namedTypeSymbol);
        }

        return visitor.TypeSymbolsToGenerate;
    })
    .Select(static (x, _) => ModelBuilder.BuildEntityModel(x))
    .Collect()
    .Combine(context.AnalyzerConfigOptionsProvider);

context.RegisterSourceOutput(entitiesToGenerate, static (spc, e) => GenerateJsonCustomization(spc, e.Left, e.Right));

@CyrusNajmabadi
Copy link
Member

You're using symbols in your pipeline . So you will never have anything incremental here :-). And those steps will not be cheap, especially the recursive walking.

@dennis-yemelyanov
Copy link
Author

I'm aware of that. But my goal is to get things done in the least time consuming and error prone way, not necessarily to create the best performing generator, unless performance actually becomes an issue (I don't really care about my build being 1 second longer or key typing in the IDE being 1 millisecond slower)

@CyrusNajmabadi
Copy link
Member

Bad generators slow down ide features by orders of magnitude in real world projects (think seconds to tens of seconds, while also allocating gbs of memory a second). You don't care about that here, but we have to design and plan accordingly :-)

@dennis-yemelyanov
Copy link
Author

dennis-yemelyanov commented Apr 25, 2024

yeah I get that. And I think it's great that there is a framework like incremental generators that makes it easy to do things in a performant way and tries to make it harder to do something in a non-performant way. I also believe there are 2 categories of scenarios:

  1. Generators that are distributed to other users.
  2. Generators that are written for internal consumption only, to improve developer productivity, avoid reflection at runtime, replace a ton of manual work with a single generator etc.

For the first category performance is paramount and following best practices is a must, since the project size where the generator will be used and other parameters are unknown. For the second category though I think there is a tradeoff between performance and convenience, since all the parameters such as project size, acceptable build time etc. are well known, and solving a real problem is often more important than achieving perfect numbers.

So far I've been in the second category for quite some time, and am very happy with the productivity boost I can achieve using generators. I think it's one of the best features in .NET, even though I never wrote a generator for public consumption. So I hope that at least none of the existing generator features will be be taken out, just because they don't meet the bar for the first category of scenarios

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

No branches or pull requests

4 participants