Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: inverse [DynamicDependency] attribute for generated code trimming scenarios #50333

Open
Sergio0694 opened this issue Mar 28, 2021 · 34 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Diagnostics linkable-framework Issues associated with delivering a linker friendly framework
Milestone

Comments

@Sergio0694
Copy link
Contributor

Sergio0694 commented Mar 28, 2021

Background and Motivation

I'm working on the new version of the Microsoft.Toolkit.Mvvm package (aka MVVM Toolkit), and I'm specifically focusing on integrating source generators to make the library more flexible and to improve performance. With respect to this second point, one thing I've worked on is to use source generators to create code at runtime to skip a couple paths in the library where I'm currently using a compiled LINQ expression. The idea is that when the source generator runs, the library will try to grab the generated method and use that instead of a LINQ expression.

The issue I'm having is that with this setup I don't know how to express the dynamic dependency between my code, to help with trimming. As in, the generated code will be in consuming projects, and it will be used via reflection by code in the shipped library. I've seen the [DynamicAttribute] type, but it seems to basically be the inverse of what I need here. As in, according to the info in this blog post, there's two issues I see in this scenario:

  1. The [DynamicAttribute] (according to the blog post) works as follows: "when you include me, also include my friends". I basically need the opposite, as in, "when my friend is included, then include me as well".
  2. The [DynamicAttribute] also wants to know the assembly name of the dependent APIs. But in my case the annotated method called by the user (in the MVVM Toolkit) won't know the assembly name at build-time, as that'll just be whatever assembly name the consuming projects will use. I'd just know the fully qualified name (without assembly) of the types I want to keep. Also note that there could be multiple types with this same fully qualified name across different assemblies.

Basically I'm looking for a way to specify an inverse dependency, so that my generated code (which is not used directly) will be able to inform the linker that when a well known API (from a well known assembly) is included, then this generated code should be included as well, as it might be called via reflection by that code. I don't see a way to express that currently with existing APIs 🤔

Detailed use-case example

There's an ObservableValidator class that has the following method (EntityValidatorMap is a ConditionalWeakTable<K, V>):

protected void ValidateAllProperties()
{
    // Creates a delegate with a compiled LINQ expression
    static Action<object> GetValidationAction(Type type) { }

    EntityValidatorMap.GetValue(GetType(), static t => GetValidationAction(t))(this);
}

This will use a compiled LINQ expression to create and cache a function that will invoke ValidateProperty on all validatable properties on the current instance (the type of the instance is used as key to cache the delegates). Works fine, but avoiding a compiled LINQ expression would be nice. The new version with source generators has one that finds all classes in consuming projects inheriting from ObservableValidator, and for each of them it produces a method like this:

namespace Microsoft.Toolkit.Mvvm.ComponentModel.__Internals
{
    [DebuggerNonUserCode]
    [ExcludeFromCodeCoverage]
    [EditorBrowsable(EditorBrowsableState.Never)]
    [Obsolete("This type is not intended to be used directly by user code")]
    internal static partial class __ObservableValidatorExtensions
    {
        [global::System.ComponentModel.EditorBrowsable(global::System.ComponentModel.EditorBrowsableState.Never)]
        [global::System.Obsolete("This method is not intended to be called directly by user code")]
        public static global::System.Action<object> CreateAllPropertiesValidator(global::MyApp.PersonViewModel _)
        {
            static void ValidateAllProperties(object obj)
            {
                var instance = (global::MyApp.PersonViewModel)obj;
                __ObservableValidatorHelper.ValidateProperty(instance, instance.Name, nameof(instance.Name));
                __ObservableValidatorHelper.ValidateProperty(instance, instance.Age, nameof(instance.Age));
            }

            return ValidateAllProperties;
        }
    }
}

Then the method in ObservableValidator (in the actual library) is updated as follows:

protected void ValidateAllProperties()
{
    // Fast path that tries to create a delegate from a generated type-specific method. This
    // is used to make this method more AOT-friendly and faster, as there is no dynamic code.
    static Action<object> GetValidationAction(Type type)
    {
        if (type.Assembly.GetType("Microsoft.Toolkit.Mvvm.ComponentModel.__Internals.__ObservableValidatorExtensions") is Type extensionsType &&
            extensionsType.GetMethod("CreateAllPropertiesValidator", new[] { type }) is MethodInfo methodInfo)
        {
            return (Action<object>)methodInfo.Invoke(null, new object?[] { null })!;
        }

        return GetValidationActionFallback(type);
    }

    // Fallback method to create the delegate with a compiled LINQ expression
    static Action<object> GetValidationActionFallback(Type type) { }
    
    EntityValidatorMap.GetValue(GetType(), static t => GetValidationAction(t))(this);
}

Proposed API

Not 100% sure on what the best API design would be for this. One possible solutions would be to add a new attribute that'd be the "inverse" of [DynamicDependency], such as a [DynamicallyAccessed] attribute. In this example below I've mostly just adapted the definition from [DynamicDependency] with a few changes, so consider this more of an idea than a refined proposal (it isn't):

namespace System.Diagnostics.CodeAnalysis
{
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Constructor | AttributeTargets.Delegate | AttributeTargets.Enum | AttributeTargets.Event | AttributeTargets.Field | AttributeTargets.Interface | AttributeTargets.Method | AttributeTargets.Property | AttributeTargets.Struct, AllowMultiple = true, Inherited = false)]
    public sealed class DynamicallyAccessedAttribute : Attribute
    {
        public DynamicDependencyAttribute(string memberSignature);
        public DynamicDependencyAttribute(string memberSignature, Type type);
        public DynamicDependencyAttribute(string memberSignature, string typeName, string assemblyName);
        public DynamicDependencyAttribute(DynamicallyAccessedMemberTypes memberTypes, Type type);
        public DynamicDependencyAttribute(DynamicallyAccessedMemberTypes memberTypes, string typeName, string assemblyName);

        public string? MemberSignature { get; }
        public DynamicallyAccessedMemberTypes MemberTypes { get; }
        public Type? Type { get; }
        public string? TypeName { get; }
        public string? AssemblyName { get; }
        public string? Condition { get; set; }
    }
}

Usage example

With an attribute like this, I could essentially generate the following code:

[DynamicallyAccessed("ValidateAllProperties()", typeof(ObservableValidator))]
public static void ValidateAllProperties(global::MyApp.MyViewModel instance)
{
}

So that when ObservableValidator.ValidateAllProperties isn't trimmed out, all these generated methods that could be invoked via reflection by that method will not be trimmed out either.

Alternative Designs

n/a

Risks

None that I can see.

@Sergio0694 Sergio0694 added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Mar 28, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Reflection untriaged New issue has not been triaged by the area owner labels Mar 28, 2021
@ghost ghost added this to Needs triage in Triage POD for Meta, Reflection, etc Mar 28, 2021
@stephentoub
Copy link
Member

cc: @eerhardt

@eerhardt
Copy link
Member

An alternative thought would be to add a Target property to DynamicDependency, which would act as if the attribute was applied to the Target. You could generate the [DynamicDependency] attribute at the assembly level, and point to both sides of the DynamicDependency. The Usage example above would then be written as:

[assembly: DynamicDependency("ValidateAllProperties", typeof(__ObservableValidatorExtensions), Target = "ObservableValidator.ValidateAllProperties()")]

This would state that the ObservableValidator.ValidateAllProperties() method has a "dynamic dependency" on __ObservableValidatorExtensions.ValidateAllProperties. It would act just the same as if the [DynamicDependency] attribute was applied directly to the ObservableValidator.ValidateAllProperties() method.

One issue with the original proposal is that the trimmer would have to look through all the code to find these "inversed DynamicDependency" attributes. But adding an assembly level DynamicDependency attribute would be a lot less places the trimmer would need to look.

@vitek-karas any thoughts here?

@Sergio0694
Copy link
Contributor Author

Hi @eerhardt, thanks for taking a look!
Well first of all then I'm glad to see this is indeed not doable today and the new issue was valid 😄

Just to clarify, in the proposed Target property you mentioned, that should also include the fully qualified name of that type, correct? Also is it not needed to also specify the source assembly as well, just like with the other side of the dependency? As in, wouldn't there need to be a new pair of type/signature (eg. TargetType and TargetSignature), just like for the other side of the dependency, so that the linked would also be able to see where exactly that other member is coming from? 🤔

@vitek-karas
Copy link
Member

You're right that we don't have an attribute to do this today. There is a way to do the "target" behavior via linker attributes XML but it's not very user friendly (and I would recommend to NOT go that route).

All that said I don't think there's a problem here though. The trimming happens on the entire application. The trimmer tool assumes that it sees all of the code in the application (and it's highly unlikely we will come up with a version which doesn't make this assumption anytime soon). As such the trimmer needs to see the source generated code as well. Once it does that all of the dependencies of the generated code will be automatically analyzed and preserved - just like any other code in the app.

Given the code samples above I think the one missing piece is how the library gets to the source generated type. The reflection which looks for a specific namespece/type name in basically "all" assemblies is something trimming will have hard time dealing with. Ideally there would be no reflection in the code paths used by source generators. Reflection is obviously problematic for trimming, but it's also problematic for AOT (good AOT solutions try to avoid keeping metadata to save space and perf, which makes reflection hard).

I don't know what other teams decided to do to solve this problem (runtime discoverability of source generated code). One approach I've seen used in the past is module constructors. Now C# can generate module initializers your source generator can generate a module init method which registers all of the helpers it generated - typically into a static dictionary of the library the helpers are for. At runtime there's no need for reflection then, just go over the dictionary if it contains a helper for the case at hand. (You would also be able to get rid of the delegate type creation and Unsafe usage which is always a bit worrisome).

This would also solve the trimmer issue - module constructors are always preserved if the owning module is preserved, so all the code they refer to will also be preserved (in this case the generated helpers).

Side note: The attribute as proposed has another problem which is that it applies to "all" assemblies (find me my friend in all assemblies) - trimmer really hates this - as it potentially greatly limits what it can do. If an application refers to assembly A but never calls anything from it, trimmer should remove that assembly - and in fact it never even looks at that assembly right now. With the proposed behavior, it would have to look at that assembly and possibly keep something from it. I know that in your case this problem doesn't exist, but the general design of the attribute would lead to this.

@jkotas
Copy link
Member

jkotas commented Mar 29, 2021

Now C# can generate module initializers your source generator can generate a module init method which registers all of the helpers it generated - typically into a static dictionary of the library the helpers are for.

This pattern is not linker friendly. It keeps the helpers around even when the code that they are supporting is trimmed.

@buyaa-n buyaa-n moved this from Needs triage to Buyaa's triage backlog in Triage POD for Meta, Reflection, etc Mar 29, 2021
@Sergio0694
Copy link
Contributor Author

Sergio0694 commented Apr 25, 2021

Posting a small update on this issue about some recent changes I've pushed to the MVVM Toolkit v2, in case it's in any way relevant to the proposal. I've been talking with @GrabYourPitchforks about my current setup here, and following his suggestions I've reworked the code to avoid those Unsafe.As<T>(object) casts for the produced delegates, which as @vitek-karas also mentioned are a bit worrisome (as in, they do work fine but for sure the code gets pretty tricky to handle). The updated code produced by the source generators is now fully type-safe, requires no unsafe hacks and also skips that MakeGenericType call from the library at runtime, which should also make the library more AOT-friendly (or at the very least, faster in 100% AOT scenarios?).

Here's what the updated generator produces for a given user type (let's call it PersonViewModel):

Source-generated code (click to expand):
namespace Microsoft.Toolkit.Mvvm.ComponentModel.__Internals
{
    internal static partial class __ObservableValidatorExtensions
    {
        public static Action<object> CreateAllPropertiesValidator(PersonViewModel _)
        {
            static void ValidateAllProperties(object obj)
            {
                var instance = (PersonViewModel)obj;
                __ObservableValidatorHelper.ValidateProperty(instance, instance.Name, nameof(instance.Name));
                __ObservableValidatorHelper.ValidateProperty(instance, instance.Age, nameof(instance.Age));
            }

            return ValidateAllProperties;
        }
    }
}

NOTE: skipping all the extra attributes and globals:: here to keep the code compact.

And then the MVVM Toolkit does the following lookup at runtime, for the first invocation:

MVVM Toolkit library code (click to expand):
protected void ValidateAllProperties()
{
    static Action<object> GetValidationAction(Type type)
    {
        if (type.Assembly.GetType("Microsoft.Toolkit.Mvvm.ComponentModel.__Internals.__ObservableValidatorExtensions") is Type extensionsType &&
            extensionsType.GetMethod("CreateAllPropertiesValidator", new[] { type }) is MethodInfo methodInfo)
        {
            return (Action<object>)methodInfo.Invoke(null, new object?[] { null })!;
        }

        return GetValidationActionFallback(type);
    }
    
    EntityValidatorMap.GetValue(GetType(), static t => GetValidationAction(t))(this);
}

So this change made the usage safer and also easier to manage, which is already a very welcome improvement 😄

I still have the issue of letting the trimmer know not to removing these methods though, since they're only referenced via reflection by the library. Currently I haven't added a static mapping of sorts since @jkotas mentioned that's not a linker-friendly solution. Looking forwards to hearing whether this proposal makes sense then and/or what a good solution for this issue will be 🙂

@vitek-karas
Copy link
Member

Maybe there's a way to do this without new features (not trying to avoid adding something which makes sense, just exploring options):

I assume the source generator runs for some class defined in the app (typically) which derives from ObservableValidator. Would it be possible to require the classes to be marked partial? (or maybe source generator has a way to force this)
In that case you could generate the code into the class itself. And then the discovery can use a new feature in the trimmer, where you annotate the ObservableValidator with [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.NonPublicMethods)]. After that calling ObservableValidator ov; ov.GetType().GetMethods(... look for the generated method) will make trimmer preserve the generated method (and such code will not generate any trimmer warnings unlike the current design).

@Sergio0694
Copy link
Contributor Author

"Maybe there's a way to do this without new features (not trying to avoid adding something which makes sense, just exploring options)"

Ahah of course, if there's a workaround that works just the same and without requiring new features I'd be just as happy 😄

"Would it be possible to require the classes to be marked partial? (or maybe source generator has a way to force this)"

Unfortunately that's not possible (as far as I know). You'd have to either just have your source generator ignore that, and just end up with a build error due to duplicate type definitions, forcing the user to then mark their class as partial, or you could do something like adding some custom diagnostics in the generator to detect whether partial is missing, and inform the user with a proper warning/error, like "The partial modifier is missing on type Foo, please etc. etc.". I guess this limitation is a result of source generators (at least in v1) being strictly additive, so they can't just "force" partial on a type that isn't.

I can see how this workaround would effectively work with trimming, I mostly just have these reservations about it:

  • I don't really like the idea of asking consumers to mark a class as partial when (from their point of view) there's no need to. What I mean is, compared to cases where the generator is effectively creating members they can use (eg. new properties), in this case the new method would strictly be an implementation detail of the library and they shouldn't ever use that directly (yet it would still show up in IntelliSense for them). Ideally, this should just be an automatic performance improvement the library should do without them having to be aware of it (because as I mentioned, the feature itself already works without the generator too, it's just slower as it uses a compiled LINQ expression).
  • With respect to what Jan said about a static mapping not being ideal due to it preventing trimming from working, wouldn't this workaround effectively do the same? What I mean is, if I added [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.NonPublicMethods)] to ObservableValidator, then would the trimmer not just keep all non-public methods in all inherited classes, regardless of whether they're actually used or not? Which includes both these validation methods produced by the MVVM Toolkit, as well as possibly other non-public methods in inherited types, that final consumers might not actually be using 🤔

Also while we're at it I also have a more general question: regardless of this specific scenario I have in the MVVM Toolkit, do you feel that such a proposed feature (ie. ability to inform the trimmer of "backward" dynamic dependencies via an attribute) would generally be useful, especially considering source generators being more widespread in the future? As in, I was curious to know whether the team thought this proposal had merit on its own, not just to solve this specific issue I used as an example 😄

@jkotas
Copy link
Member

jkotas commented Apr 26, 2021

You may want to look at how this problem was solved for Json source generator: #45448. It has the same problem: The generator produces methods associated with types and it needs to look them up efficiently at runtime.

@vitek-karas
Copy link
Member

Re methods showing up in intellisense:
This is relatively easily fixable by adding [EditorBrowsable(EditorBrowsableState.Never)] onto the generated method - intellisense will then "hide" it.

Re DynamicallyAccessedMembers on the ObservableValidator:
The trimmer will only keep those methods on types which are already used by something else in the app. Just deriving from ObservableValidator is not enough, the type must be used by something as well, and the code which calls observableInstance.GetType() must be used by the app.

Re if this is useful:
I do think we need to come up with a solution for this type of scenario. The discussion here is basically what should the solution look like. There are constraints which we would like to obey if at all possible:

  • (the obvious one) don't invent something new, if it already has a workable solution
  • allow trimmer to be "lazy" - avoid any design which would require the trimmer to scan over all types/assemblies/methods in the application. This is what the original proposal didn't fulfill - but it's probably fixable somehow.
  • the new feature should not force trimmer to include too much unnecessary code around - that is: make it work, but also keep the app small. (This is very likely not a concern for this proposal)

@eerhardt proposal would probably fit all these. I'm a bit hesitant because I think that this is a design flaw in the source generators. Currently they don't seem to allow any way to "attach" the generated code to the code it belongs to using existing IL semantics (direct call, method on the same type, add an attribute to existing method, ...). So this attribute would basically try to workaround that limitation - just like the reflection calls at runtime are a workaround for that same problem. It's kind of weird that source generators which were partially designed to avoid reflection, force usage of reflection (and consequently cause trouble to trimming, AOT and so on).

@eerhardt
Copy link
Member

Source generators can only add new code, they can't change existing code.

I only see 3 high-level options to get the new generated code included in your application:

  1. The developer directly calls/references it
  2. You have a "global registration" mechanism (like a module initializer)
  3. Use reflection to scan for it

Currently, both the JSON (#45448) and Logging (#36022) source generators take approach (1) above.

@Sergio0694
Copy link
Contributor Author

Sergio0694 commented Apr 26, 2021

Correct me if I'm wrong, from what I can see you then basically get the following drawbacks for each approach:

  1. The developer directly calls/references it: this means the API surface needs to be altered and the devs needs to explicitly pass in some magic autogenerated thing produced by the generator to actually opt-in into the new feature/performance boost (eg. that JsonContext.<CLASS_NAME> property for System.Text.Json). So it's clunkier to use and not transparent, so to say.
  2. You have a "global registration" mechanism (like a module initializer): this interferes with trimming as Jan said, and can cause code that's not actually used to still be kept around, because the trimmer will just see it being referenced all the time.
  3. Use reflection to scan for it: this can (will?) break down when trimming is enabled, because those generated methods will not be referenced directly by anyone, and there's currently no way to inform the trimmer that they're being looked up by some other library code (which is the issue I'm currently facing here). EDIT: also, if caching can't be used, there's a cost for the lookup via reflection being done at runtime to grab the generated method(s).

So it's mostly about choosing which of these 3 patterns to go for, and accept the drawbacks it will involve?

As far as I can tell, among these 3 possible approaches, option 3 with the proposed attribute seems to be the only one that could be effectively fixed to result in a solution that has (effectively) no drawbacks in general? Because (1) will always require explicit action by consumers (ie. directly referencing some generated member), and option (2) is inherently not trimmer-friendly, whereas (3) could effectively be both transparent to consumers, as well as being trimmer-friendly with the right annotations. Again please correct me if I'm wrong, I'm just trying to properly understand all this and wrap my head around it 😄

@jkotas
Copy link
Member

jkotas commented Apr 26, 2021

The drawback of solution 3 is that it uses (relatively expensive) reflection at runtime.

@Sergio0694
Copy link
Contributor Author

Right, I should've probably clarified that (at least in my case) I'm just assuming that option (3) would go together with some sort of caching, so that you'd amortize the reflection cost over multiple executions. What I'm doing in the MVVM Toolkit is to just cache the retrieved delegate in a ConditionalWeakTable<Type, Action<...>>, with Type being this.GetType() from ObservableValidator in this case (so the runtime type of whatever consumer type is inheriting from that class and calling the method). This way after the first invocation there's no reflection involved, and using a CWT<,> should still ensure that the type and the associated delegate being retrieved could still be unloaded if needed. In my specific case this method is not a hot path anyway, so especially considering the lookup is just done once per type, the one-time reflection call isn't really a concern for performance.

I can cetainly agree that in cases where caching isn't an option and/or in cases where the generated code is meant to be used in a hot path (or in some path that is called more often, etc.), then doing lookup with reflection all the time might not be ideal 🙂

@jkotas
Copy link
Member

jkotas commented Apr 26, 2021

Even if you cache, you pay for the reflection in startup time and binary footprint of trimmed apps. For example, reducing startup time and size of trimmed apps was the priority for the JSON serializer source generator (#45448).

@eerhardt
Copy link
Member

You have a "global registration" mechanism (like a module initializer): this interferes with trimming as Jan said, and can cause code that's not actually used to still be kept around, because the trimmer will just see it being referenced all the time.

I guess I don't see the difference between this and any of the proposed solutions to "3. Use reflection to scan for it". All the proposed solutions to telling the trimmer to keep the generated code can result in preserving code that's not actually used.

@Sergio0694
Copy link
Contributor Author

"Even if you cache, you pay for the reflection in startup time and binary footprint of trimmed apps. For example, reducing startup time and size of trimmed apps was the priority for the JSON serializer source generator (#45448)."

@jkotas Oh, absolutely. I'm sorry if that came across the wrong way, I wasn't trying to say that approach (3) is just better than (1) nor that there's no valid use cases for (1), for sure. I can understand how that's the best solution for something like the JSON serializer and that it's overall a better solution with respect to things such as startup time, of course 🙂

My thinking here was that eg. with respect to the example use case in the MVVM Toolkit, where such a generated method would not be called at startup (you wouldn't try to validate a user form when opening an app), for me having the consumer experience be transparent (ie. not requiring users to alter their code just to opt-in into the faster code) is much more of a priority. As in, overall it's just a different use case scenario than something like those JSON APIs, from what I can see. Ideally I'd just want users to upgrade to the MVVM Toolkit v2 with the source generator, and just magically get faster code without requiring them to do anything.

"I guess I don't see the difference between this and any of the proposed solutions to "3. Use reflection to scan for it". All the proposed solutions to telling the trimmer to keep the generated code can result in preserving code that's not actually used."

@eerhardt Maybe I misunderstood here, I thought Jan meant to say that if the generated methods were annotated via an attribute, the linker could still technically remove them entirely if it detected that the code from the library looking them up with reflection was never used, whereas with the module initializer solution, they'd always be preserved no matter what. If I got that part wrong or if I'm missing something here, then please let me know 😄

@eerhardt
Copy link
Member

if the generated methods were annotated via an attribute, the linker could still technically remove them entirely if it detected that the code from the library looking them up with reflection was never used

There are other scenarios where unused code code would get kept unnecessarily. For example, if I had 2 classes ClassA and ClassB that had a validation extension generated for them. But I only ever used ClassA in the app (and used the library to look the validation up with reflection). In this case, ClassB's generated code would need to be kept, even though it was unused.

@jkotas
Copy link
Member

jkotas commented Apr 26, 2021

Ideally I'd just want users to upgrade to the MVVM Toolkit v2 with the source generator, and just magically get faster code without requiring them to do anything.

Unfortunately, source generators do not have the right features to make the update seamless and transparent for existing patterns.

The lean way to do transparent solutions like what you are looking for is by combining source generators to generate the bulk of the code and then combine them with something like Fody to hookup the generated code.

@Sergio0694
Copy link
Contributor Author

Replying to @eerhardt:

"[...] if I had 2 classes ClassA and ClassB that had a validation extension generated for them. But I only ever used ClassA in the app (and used the library to look the validation up with reflection). In this case, ClassB's generated code would need to be kept, even though it was unused."

Right, yeah I can see how only deciding what to keep based on the type can be limiting. I would say that given that the validation code is only generated if you specifically inherit from ObservableValidator, it would maybe not be a major issue, but yeah I can see how that would still not be ideal. Also consumers could choose not to use ValidateAllProperties at all and instead only validate individual properties as they're being edited, so yeah that would be a case where the generated code wouldn't be needed 🤔

Replying to @jkotas:

I see. Fody and IL weaving in general is really something we'd like to avoid in this case though 😥
In general, I have two main issues with the pattern used by System.Text.Json with respect to source generators:

  • As I mentioned, the fact it's not transparent, and requires developers to pass this obscure/magic generated type that's only actually needed as an implementation detail to get faster serialization - it doesn't provide some special feature per se. Especially with MVVM being used by many higher level devs that might not be aware/interested in this performance aspect too much (which would also make the API trickier to use in general) I'd personally like to avoid such a thing. This is one of the reasons why I would really like to stick with a more transparent solution for this, if possible. At this point I can see us shipping the MVVM Toolkit vNext without explicit support for trimming (worst case scenario, if code is removed, the fallback path with LINQ expressions will be used, so the whole thing should still work, just a bit slower), and possibly reevaluate this later.
  • Unless I'm missing something, even the approach that S.T.JSON is using would still not work in cases where the callsite doesn't actually know what specific type the current instance is - ie. in pretty much all cases where the type is either not sealed.

As an example for the previous point, consider this case:

public class ValidatorA : ObservableValidator
{
    public void Foo()
    {
        ValidateAllProperties(ValidationContext.Default.ValidatorA);
    }
}

public class ValidatorB : ValidatorA
{
}

new ValidatorB().Foo(); // Incorrect context is used 💣

Whereas the current approach (both in the shipped release of the MVVM Toolkit and in vNext with source generators) would have no problems with scenarios like this, as ValidateAllProperties will always just invoke the correct validation delegate matching this.GetType() for whatever instance the current object is. And it's also much easier to use as you never have to remember to explicitly pass in some magic generated validation context either.

To summarize, the situation is as follows. Consider a user of the MVVM Toolkit defining a type such as:

public class A : ObservableValidator { }

The source generator will then create the following validation method:

public static Action<object> CreateAllPropertiesValidator(A _) => throw null;

So from what I can see we basically need a way to express the following for each such generated method:

"Keep this method for type A if ObservableValidator.ValidateAllProperties() is ever invoked for an instance of type A"

...Which I guess just can't be fully determined statically by the trimmer though? 😟

@vitek-karas
Copy link
Member

...Which I guess just can't be fully determined statically by the trimmer though? 😟

That is the core of the problem - we want to come up with a design which would be statically analyzable. For example in System.Text.Json we would ideally want to be in a place that the trimmer can statically determine that all use cases have the source generated code available and thus the trimmer can remove the reflection based serialization from System.Text.Json (to reduce size). In order to do this, the callchain which leads to the serialization (or the usage in the general case) must be statically analyzable.

My current thinking (very rough):

  • Allow custom DynamicallyAccessedMembers-like attributes (For example SerializedToJsonAttribute which would otherwise act like the DynamicallyAccessedMembersAttribute). The trimmer would recognize these and apply the same propagation rules as for DAM. This would allow the serialization API to be annotated - so for example Serialize<[SerializedToJsonAttribute] T>(T value).
  • The source generator could than use this attribute to detect the types which are used in serialization and generate the required code.
  • Add support for something like DynamicDependency where one could specify both the "source" and "target" as assembly level attributes. This would allow the source generator to inject these attributes which would "tie" the generated code with the user defined type/method it belongs to.
  • Probably add some additional metadata to the new dependency to also tie it to the new DAM-like annotation, so that trimmer can somehow validate that all callsites for a given DAM-like annotation are "correct" (have the matching source generated code) and then potentially use this information to remove the fallback paths in the serializer itself (the reflection based serialization).

(This is very high-level and very hand wavy, but it feels like going in the right direction)

Granted that this might not solve the validation issue you describe above - mainly because that API is not strongly typed in any way and the trimmer has very limited ways to statically analyze it. That said - we recently introduced a new mechanism to allow DAM on types (applies to all derived types in a way) - dotnet/linker#1929 - maybe that could be combined with the above to get it working.

@buyaa-n buyaa-n added this to the Future milestone Jul 7, 2021
@buyaa-n buyaa-n removed the untriaged New issue has not been triaged by the area owner label Jul 7, 2021
@buyaa-n buyaa-n moved this from Buyaa's triage backlog to Future in Triage POD for Meta, Reflection, etc Jul 7, 2021
@Sergio0694
Copy link
Contributor Author

Thank you for the proposed solution idea @vitek-karas, I'm glad to see that at least there's some interest in solving this 😄

"Add support for something like DynamicDependency where one could specify both the "source" and "target" as assembly level attributes. This would allow the source generator to inject these attributes which would "tie" the generated code with the user defined type/method it belongs to."

This bit in particular makes sense and it seems to be in line to what had been suggested before as well, with a "glue" attribute to externally link two separate methods potentially from different assemblies, and placed at the assembly level to make life easier for the linker that wouldn't have to crawl every single type and method to discover these attributes.

"That is the core of the problem - we want to come up with a design which would be statically analyzable."

I don't think I'm entirely clear on why that wouldn't be possible for the requested solution. I mean I'm sure there's a reason but I'm struggling to fully understand it, so I'd appreciate if anyone could take a minute to explain that 🙂

To given an example, consider this scenario, which is basically equivalent to what I'm proposing:

class A
{
    public virtual void Foo() { }
}

class B : A
{
    public override void Foo() { }
}

Now, even though B.Foo() wasn't ever called directly in user code, I'd assume the linker would still know when to safely elide it, eg. depending on whether A.Foo() was ever called (on a non-derived type that's not in B's hierarchy). That is essentially the same scenario I have, with the only difference being that in my case I'm trying to express this same kind of dependency externally, through attributes. But the same behavior would apply: I want my generated method Foo(B) to be kept under the same circumstances that would case the linker to preserve an override for B.Foo() if that was present. So granted, we'd need a new attribute to express this dependency, but then once we had that wouldn't that same linker mechanism already exist? For sure it'd need some changes to also support the attribute, but what I mean is, isn't this same trimming scenario already supported in cases like this, with overridden methods only invoked through polymorphic calls to base instances? 🤔

I think this is the main point that I'm struggling to understand - the way I see it this would at least in theory be doable?
Thanks!

@ghost ghost moved this from Future to Needs triage in Triage POD for Meta, Reflection, etc Aug 25, 2021
@ghost ghost added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Aug 25, 2021
@vitek-karas
Copy link
Member

Sorry for the confusion in terminology. I should have been clearer about the "statically analyzable".

It is definitely statically analyzable to determine what to keep (as you mention, it's no different from quite a few similar mechanics in the trimmer already).

There's another aspect to "statically analyzable" though - verifiability. And that aspect is not about what is kept, but if what is kept is enough to make the app work. So then the question would be "How will the source generated code be discovered at runtime". That is, what mechanism will be used at runtime to find the Foo(B) generated method and how is it invoked. Ideally the trimmer should be able to validate, that such mechanism is in line with the marking decisions it made. This is a much harder problem to statically analyze.

Without the verifiability the tooling is not helpful really. We would have tools for library devs to hint the trimmer what to keep, but there would be no tooling to verify that it's good enough. For example, what if the library author made a mistake in how this new attribute is used. It might work in lot of cases, but could be broken in others. Now I used the library in my app, I turn on trimming and my app breaks. We really want to avoid that. We want to build trimmer features such that if the trimmer doesn't produce any warnings, it's almost guaranteed that the app will keep working the same even after trimming.

@joperezr joperezr added linkable-framework Issues associated with delivering a linker friendly framework and removed needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Aug 31, 2021
@ghost
Copy link

ghost commented Aug 31, 2021

Tagging subscribers to 'linkable-framework': @eerhardt, @vitek-karas, @LakshanF, @sbomer, @joperezr
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and Motivation

I'm working on the new version of the Microsoft.Toolkit.Mvvm package (aka MVVM Toolkit), and I'm specifically focusing on integrating source generators to make the library more flexible and to improve performance. With respect to this second point, one thing I've worked on is to use source generators to create code at runtime to skip a couple paths in the library where I'm currently using a compiled LINQ expression. The idea is that when the source generator runs, the library will try to grab the generated method and use that instead of a LINQ expression.

The issue I'm having is that with this setup I don't know how to express the dynamic dependency between my code, to help with trimming. As in, the generated code will be in consuming projects, and it will be used via reflection by code in the shipped library. I've seen the [DynamicAttribute] type, but it seems to basically be the inverse of what I need here. As in, according to the info in this blog post, there's two issues I see in this scenario:

  1. The [DynamicAttribute] (according to the blog post) works as follows: "when you include me, also include my friends". I basically need the opposite, as in, "when my friend is included, then include me as well".
  2. The [DynamicAttribute] also wants to know the assembly name of the dependent APIs. But in my case the annotated method called by the user (in the MVVM Toolkit) won't know the assembly name at build-time, as that'll just be whatever assembly name the consuming projects will use. I'd just know the fully qualified name (without assembly) of the types I want to keep. Also note that there could be multiple types with this same fully qualified name across different assemblies.

Basically I'm looking for a way to specify an inverse dependency, so that my generated code (which is not used directly) will be able to inform the linker that when a well known API (from a well known assembly) is included, then this generated code should be included as well, as it might be called via reflection by that code. I don't see a way to express that currently with existing APIs 🤔

Detailed use-case example

There's an ObservableValidator class that has the following method (EntityValidatorMap is a ConditionalWeakTable<K, V>):

protected void ValidateAllProperties()
{
    // Creates a delegate with a compiled LINQ expression
    static Action<object> GetValidationAction(Type type) { }

    EntityValidatorMap.GetValue(GetType(), static t => GetValidationAction(t))(this);
}

This will use a compiled LINQ expression to create and cache a function that will invoke ValidateProperty on all validatable properties on the current instance (the type of the instance is used as key to cache the delegates). Works fine, but avoiding a compiled LINQ expression would be nice. The new version with source generators has one that finds all classes in consuming projects inheriting from ObservableValidator, and for each of them it produces a method like this:

namespace Microsoft.Toolkit.Mvvm.ComponentModel.__Internals
{
    [DebuggerNonUserCode]
    [ExcludeFromCodeCoverage]
    [EditorBrowsable(EditorBrowsableState.Never)]
    [Obsolete("This type is not intended to be used directly by user code")]
    internal static partial class __ObservableValidatorExtensions
    {
        [global::System.ComponentModel.EditorBrowsable(global::System.ComponentModel.EditorBrowsableState.Never)]
        [global::System.Obsolete("This method is not intended to be called directly by user code")]
        public static global::System.Action<object> CreateAllPropertiesValidator(global::MyApp.PersonViewModel _)
        {
            static void ValidateAllProperties(object obj)
            {
                var instance = (global::MyApp.PersonViewModel)obj;
                __ObservableValidatorHelper.ValidateProperty(instance, instance.Name, nameof(instance.Name));
                __ObservableValidatorHelper.ValidateProperty(instance, instance.Age, nameof(instance.Age));
            }

            return ValidateAllProperties;
        }
    }
}

Then the method in ObservableValidator (in the actual library) is updated as follows:

protected void ValidateAllProperties()
{
    // Fast path that tries to create a delegate from a generated type-specific method. This
    // is used to make this method more AOT-friendly and faster, as there is no dynamic code.
    static Action<object> GetValidationAction(Type type)
    {
        if (type.Assembly.GetType("Microsoft.Toolkit.Mvvm.ComponentModel.__Internals.__ObservableValidatorExtensions") is Type extensionsType &&
            extensionsType.GetMethod("CreateAllPropertiesValidator", new[] { type }) is MethodInfo methodInfo)
        {
            return (Action<object>)methodInfo.Invoke(null, new object?[] { null })!;
        }

        return GetValidationActionFallback(type);
    }

    // Fallback method to create the delegate with a compiled LINQ expression
    static Action<object> GetValidationActionFallback(Type type) { }
    
    EntityValidatorMap.GetValue(GetType(), static t => GetValidationAction(t))(this);
}

Proposed API

Not 100% sure on what the best API design would be for this. One possible solutions would be to add a new attribute that'd be the "inverse" of [DynamicDependency], such as a [DynamicallyAccessed] attribute. In this example below I've mostly just adapted the definition from [DynamicDependency] with a few changes, so consider this more of an idea than a refined proposal (it isn't):

namespace System.Diagnostics.CodeAnalysis
{
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Constructor | AttributeTargets.Delegate | AttributeTargets.Enum | AttributeTargets.Event | AttributeTargets.Field | AttributeTargets.Interface | AttributeTargets.Method | AttributeTargets.Property | AttributeTargets.Struct, AllowMultiple = true, Inherited = false)]
    public sealed class DynamicallyAccessedAttribute : Attribute
    {
        public DynamicDependencyAttribute(string memberSignature);
        public DynamicDependencyAttribute(string memberSignature, Type type);
        public DynamicDependencyAttribute(string memberSignature, string typeName, string assemblyName);
        public DynamicDependencyAttribute(DynamicallyAccessedMemberTypes memberTypes, Type type);
        public DynamicDependencyAttribute(DynamicallyAccessedMemberTypes memberTypes, string typeName, string assemblyName);

        public string? MemberSignature { get; }
        public DynamicallyAccessedMemberTypes MemberTypes { get; }
        public Type? Type { get; }
        public string? TypeName { get; }
        public string? AssemblyName { get; }
        public string? Condition { get; set; }
    }
}

Usage example

With an attribute like this, I could essentially generate the following code:

[DynamicallyAccessed("ValidateAllProperties()", typeof(ObservableValidator))]
public static void ValidateAllProperties(global::MyApp.MyViewModel instance)
{
}

So that when ObservableValidator.ValidateAllProperties isn't trimmed out, all these generated methods that could be invoked via reflection by that method will not be trimmed out either.

Alternative Designs

n/a

Risks

None that I can see.

Author: Sergio0694
Assignees: -
Labels:

api-suggestion, area-System.Reflection, linkable-framework

Milestone: Future

@ghost
Copy link

ghost commented Oct 27, 2022

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and Motivation

I'm working on the new version of the Microsoft.Toolkit.Mvvm package (aka MVVM Toolkit), and I'm specifically focusing on integrating source generators to make the library more flexible and to improve performance. With respect to this second point, one thing I've worked on is to use source generators to create code at runtime to skip a couple paths in the library where I'm currently using a compiled LINQ expression. The idea is that when the source generator runs, the library will try to grab the generated method and use that instead of a LINQ expression.

The issue I'm having is that with this setup I don't know how to express the dynamic dependency between my code, to help with trimming. As in, the generated code will be in consuming projects, and it will be used via reflection by code in the shipped library. I've seen the [DynamicAttribute] type, but it seems to basically be the inverse of what I need here. As in, according to the info in this blog post, there's two issues I see in this scenario:

  1. The [DynamicAttribute] (according to the blog post) works as follows: "when you include me, also include my friends". I basically need the opposite, as in, "when my friend is included, then include me as well".
  2. The [DynamicAttribute] also wants to know the assembly name of the dependent APIs. But in my case the annotated method called by the user (in the MVVM Toolkit) won't know the assembly name at build-time, as that'll just be whatever assembly name the consuming projects will use. I'd just know the fully qualified name (without assembly) of the types I want to keep. Also note that there could be multiple types with this same fully qualified name across different assemblies.

Basically I'm looking for a way to specify an inverse dependency, so that my generated code (which is not used directly) will be able to inform the linker that when a well known API (from a well known assembly) is included, then this generated code should be included as well, as it might be called via reflection by that code. I don't see a way to express that currently with existing APIs 🤔

Detailed use-case example

There's an ObservableValidator class that has the following method (EntityValidatorMap is a ConditionalWeakTable<K, V>):

protected void ValidateAllProperties()
{
    // Creates a delegate with a compiled LINQ expression
    static Action<object> GetValidationAction(Type type) { }

    EntityValidatorMap.GetValue(GetType(), static t => GetValidationAction(t))(this);
}

This will use a compiled LINQ expression to create and cache a function that will invoke ValidateProperty on all validatable properties on the current instance (the type of the instance is used as key to cache the delegates). Works fine, but avoiding a compiled LINQ expression would be nice. The new version with source generators has one that finds all classes in consuming projects inheriting from ObservableValidator, and for each of them it produces a method like this:

namespace Microsoft.Toolkit.Mvvm.ComponentModel.__Internals
{
    [DebuggerNonUserCode]
    [ExcludeFromCodeCoverage]
    [EditorBrowsable(EditorBrowsableState.Never)]
    [Obsolete("This type is not intended to be used directly by user code")]
    internal static partial class __ObservableValidatorExtensions
    {
        [global::System.ComponentModel.EditorBrowsable(global::System.ComponentModel.EditorBrowsableState.Never)]
        [global::System.Obsolete("This method is not intended to be called directly by user code")]
        public static global::System.Action<object> CreateAllPropertiesValidator(global::MyApp.PersonViewModel _)
        {
            static void ValidateAllProperties(object obj)
            {
                var instance = (global::MyApp.PersonViewModel)obj;
                __ObservableValidatorHelper.ValidateProperty(instance, instance.Name, nameof(instance.Name));
                __ObservableValidatorHelper.ValidateProperty(instance, instance.Age, nameof(instance.Age));
            }

            return ValidateAllProperties;
        }
    }
}

Then the method in ObservableValidator (in the actual library) is updated as follows:

protected void ValidateAllProperties()
{
    // Fast path that tries to create a delegate from a generated type-specific method. This
    // is used to make this method more AOT-friendly and faster, as there is no dynamic code.
    static Action<object> GetValidationAction(Type type)
    {
        if (type.Assembly.GetType("Microsoft.Toolkit.Mvvm.ComponentModel.__Internals.__ObservableValidatorExtensions") is Type extensionsType &&
            extensionsType.GetMethod("CreateAllPropertiesValidator", new[] { type }) is MethodInfo methodInfo)
        {
            return (Action<object>)methodInfo.Invoke(null, new object?[] { null })!;
        }

        return GetValidationActionFallback(type);
    }

    // Fallback method to create the delegate with a compiled LINQ expression
    static Action<object> GetValidationActionFallback(Type type) { }
    
    EntityValidatorMap.GetValue(GetType(), static t => GetValidationAction(t))(this);
}

Proposed API

Not 100% sure on what the best API design would be for this. One possible solutions would be to add a new attribute that'd be the "inverse" of [DynamicDependency], such as a [DynamicallyAccessed] attribute. In this example below I've mostly just adapted the definition from [DynamicDependency] with a few changes, so consider this more of an idea than a refined proposal (it isn't):

namespace System.Diagnostics.CodeAnalysis
{
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Constructor | AttributeTargets.Delegate | AttributeTargets.Enum | AttributeTargets.Event | AttributeTargets.Field | AttributeTargets.Interface | AttributeTargets.Method | AttributeTargets.Property | AttributeTargets.Struct, AllowMultiple = true, Inherited = false)]
    public sealed class DynamicallyAccessedAttribute : Attribute
    {
        public DynamicDependencyAttribute(string memberSignature);
        public DynamicDependencyAttribute(string memberSignature, Type type);
        public DynamicDependencyAttribute(string memberSignature, string typeName, string assemblyName);
        public DynamicDependencyAttribute(DynamicallyAccessedMemberTypes memberTypes, Type type);
        public DynamicDependencyAttribute(DynamicallyAccessedMemberTypes memberTypes, string typeName, string assemblyName);

        public string? MemberSignature { get; }
        public DynamicallyAccessedMemberTypes MemberTypes { get; }
        public Type? Type { get; }
        public string? TypeName { get; }
        public string? AssemblyName { get; }
        public string? Condition { get; set; }
    }
}

Usage example

With an attribute like this, I could essentially generate the following code:

[DynamicallyAccessed("ValidateAllProperties()", typeof(ObservableValidator))]
public static void ValidateAllProperties(global::MyApp.MyViewModel instance)
{
}

So that when ObservableValidator.ValidateAllProperties isn't trimmed out, all these generated methods that could be invoked via reflection by that method will not be trimmed out either.

Alternative Designs

n/a

Risks

None that I can see.

Author: Sergio0694
Assignees: -
Labels:

api-suggestion, area-System.Diagnostics, area-System.Reflection, linkable-framework

Milestone: Future

@hamarb123
Copy link
Contributor

hamarb123 commented Nov 15, 2023

+1 to this feature request. I have done the same thing (source generating attribute types) and am now running into this limitation with .NET 8 (first time I've enabled trimming).

@vitek-karas
Copy link
Member

@hamarb123 could you please describe the scenario in which you think you need this capability?

As noted above, the problem with DynamicDependency (whichever direction it points) is that it's not verifiable. It only tells the tools to keep stuff, but it doesn't tie it in any way to the place where you're going to use the fact that it was kept. Meaning that you will still get a warning somewhere. And if you suppress the warning the code becomes fragile since any future changes to the code would have to keep in mind that there's this unexpressed relationship between the suppression and the DynamicDependency.

This is generally why we're hesitant to add a new feature around this, unless we can figure out a way to make it verifiable - which is pretty diffcult (see discussion above).

@hamarb123
Copy link
Contributor

@vitek-karas

My scenario is that I have metadata attributes I stick on assemblies which track important information (versioning info, licenses, etc) - I need to ensure all of this info is kept. The way it's done is with [module: ...] and then a specific helper method (one for each set of information) is used to extract all the information. If I could annotate the attribute (which is a generated type often) to say it's used by the type, then it would suit my needs. Note that I already have to suppress the warning around the types not being guaranteed to be available, but with this feature I could manually guarantee it at least.

I can see how it's not verifiable, but I believe it can still be quite useful to me, and probably other people.

@vitek-karas
Copy link
Member

Attributes should be kept - and if they take a type as a parameter, that type will also be kept and so on. The trimming doesn't have a way to remove attributes currently.

If possible, could you share an example of the shape of the code which you need to make work? I would be interested in both the information which is currently being trimmed as well as the rough shape of the code which uses reflection to access it.

@hamarb123
Copy link
Contributor

hamarb123 commented Nov 15, 2023

Attributes should be kept - and if they take a type as a parameter, that type will also be kept and so on.

Also, I'm reading the fields of the attribute, will they be kept?

Attributes should be kept - and if they take a type as a parameter, that type will also be kept and so on. The trimming doesn't have a way to remove attributes currently.
If possible, could you share an example of the shape of the code which you need to make work? I would be interested in both the information which is currently being trimmed as well as the rough shape of the code which uses reflection to access it.

I don't know if any part of it is currently being trimmed (as I'm only just going through the process of enabling trimming with .NET 8), but I'm trying to be thorough as I know these sorts of things can seem flaky if they're not done correctly (since it will seem like random things are being removed when an unrelated change is made, etc.).

Shape of code:

All my libraries (generated code - note I'm currently using msbuild generation, not source generation, but that's not really relevant):

[module global::hamarb123.Base.Attributes.AssemblyVersionIdentifierV1Attribute("some git hash", ...)]

If they do not reference the applicable library, they will also generate the attribute like so:

namespace hamarb123.Base.Attributes
{
	[global::System.AttributeUsageAttribute(global::System.AttributeTargets.Module, Inherited = false, AllowMultiple = true)]
	[global::System.ComponentModel.EditorBrowsableAttribute(global::System.ComponentModel.EditorBrowsableState.Never)]
	internal sealed class AssemblyVersionIdentifierV1Attribute : global::System.Attribute
	{
		private readonly string assemblyGitHash;
		//...

		public AssemblyVersionIdentifierV1Attribute(string assemblyGitHash, ...)
		{
			this.assemblyGitHash = assemblyGitHash;
			//...
		}
	}
}

The library then will have the above attribute defined, also with some helper properties & methods though (which are not added to the generated ones, since only the fields are read on them).

My apps ensure that all the libraries are loaded early using Assembly.Load. This may not be necessary on NAOT.

The library uses code like this to read it (simplified, to focus on the reading of the attributes - error handling, etc are omitted):

foreach (var lib in AppDomain.CurrentDomain.GetAssemblies())
{
	foreach (var mod in lib.Modules)
	{
		//check the main type
		foreach (var attribute in mod.GetCustomAttributes<AssemblyVersionIdentifierV1Attribute>())
		{
			//do something with it
		}

		//check for a user-defined type
		var type = mod.GetType("hamarb123.Base.Attributes.AssemblyVersionIdentifierV1Attribute");
		if (type != null)
		{
			//check if all the fields are defined properly
			var f_assemblyGitHash = type.GetField("assemblyGitHash", BindingFlags.NonPublic | BindingFlags.Instance);
			if (f_assemblyGitHash == null) continue;
			//... same with other fields

			//loop through all instances of it
			foreach (var attribute in mod.GetCustomAttributes(type))
			{
				//read the fields from attribute using reflection, convert it to the real attribute type & do the same thing as before
			}
		}
	}
}

Note that this is the only place that should ever be reading these values. If I could add an inverse [DynamicDependency], I'd shift this section to a new internal type so that I could include it, and just expose the relevant APIs from the internal one (which would be the only one to directly access it).

@vitek-karas
Copy link
Member

My apps ensure that all the libraries are loaded early using Assembly.Load. This may not be necessary on NAOT.

I think the runtime is still lazy in populating the runtime structures for assemblies - so if you need to rely things like AppDomain.GetAssemblies you may need to load them by hand before that.

In general, I would advice against relying on AppDomain.GetAssemblies as that list is only returning already loaded assemblies - and if you already have code to preload everything, you might as well store that list yourself and not rely on the runtime. It would feel cleaner (at least to me).

foreach (var mod in lib.Modules)

I assume this came from .NET Framework, on .NET 6+ the runtime doesn't allow multi-module assemblies anymore, so this is not needed.

var type = mod.GetType("hamarb123.Base.Attributes.AssemblyVersionIdentifierV1Attribute");

This is going to be the main problem. There's no way for the static analysis to figure this out and do what you want - which is to preserve that attribute on all loaded assemblies.

In your case, I think the best solution would be to generate a module initializer - so basically code which is called from the module's type static .cctor. Trimming will always keep module constructor, and so anything referenced from it will be kept as well. If you want to do this completely reflection free - you could generate code into the module .ctor which registers something into a global property in your main library. That way you don't need to perform the assembly enumeration and possibly not even the search for the attributes (as the generated code could do that directly). But it depends on how you do the code generation and the setup there.
Alternatively, you could generate an empty method which is called from the module constructor and put DynamicDependency attributes on that method - you would get warnings in your consumption code, but if you do it right it should work.

Also, I'm reading the fields of the attribute, will they be kept?

If you have a direct reference to the attribute type in your code, and read those fields via direct references, they will be kept. If you do it through reflection then it depends - if you get warnings in this code, then those might not be kept. In your sample above the fields would not be kept necessarily - because the tooling won't figure out what type you're talking about.

Note that this is the only place that should ever be reading these values. If I could add an inverse [DynamicDependency], I'd shift this section to a new internal type so that I could include it, and just expose the relevant APIs from the internal one (which would be the only one to directly access it).

I must admit I don't really understand this comment. What would you add DynamicDependency for - the per-assembly generated attribute types can't be referenced directly. You would have to use fully assembly qualified type names, but then you won't know those names up front.

/cc @sbomer (interesting scenario)

@hamarb123
Copy link
Contributor

hamarb123 commented Nov 16, 2023

In general, I would advice against relying on AppDomain.GetAssemblies as that list is only returning already loaded assemblies - and if you already have code to preload everything, you might as well store that list yourself and not rely on the runtime. It would feel cleaner (at least to me).

The way I do it allows it to be swapped out with a different version of the assembly later if needed, and still see relevant information in, e.g., crash reports.

I have actually found my approach very nice to use overall, the only annoyance is the need to pre-load assemblies, but other than that it works very well for me.

foreach (var mod in lib.Modules)

I assume this came from .NET Framework, on .NET 6+ the runtime doesn't allow multi-module assemblies anymore, so this is not needed.

Interesting :)

In your case, I think the best solution would be to generate a module initializer - so basically code which is called from the module's type static .cctor. Trimming will always keep module constructor, and so anything referenced from it will be kept as well. If you want to do this completely reflection free - you could generate code into the module .ctor which registers something into a global property in your main library. That way you don't need to perform the assembly enumeration and possibly not even the search for the attributes (as the generated code could do that directly). But it depends on how you do the code generation and the setup there. Alternatively, you could generate an empty method which is called from the module constructor and put DynamicDependency attributes on that method - you would get warnings in your consumption code, but if you do it right it should work.

This is a workaround I hadn't thought of yet. Note that only the empty method with DynamicDependency approach works for me though, since not all my libraries necessarily reference the library which does this stuff (the ones which do don't bother to generate the type manually, and just use the built-in attribute type from it instead) - therefore they wouldn't be able to just add things to a global list sort of thing (without reflection anyway). It does feel a bit gross to use module initialisers like this though.

Another idea I had which might work is to instead stick the DynamicDependency on the constructor of the attribute, would that work? Since it should then keep it in the scenarios where the constructor is used to construct the attribute I'd think.

Note that this is the only place that should ever be reading these values. If I could add an inverse [DynamicDependency], I'd shift this section to a new internal type so that I could include it, and just expose the relevant APIs from the internal one (which would be the only one to directly access it).

I must admit I don't really understand this comment. What would you add DynamicDependency for - the per-assembly generated attribute types can't be referenced directly. You would have to use fully assembly qualified type names, but then you won't know those names up front.

Inverse [DynamicDependency] would go on the generated attributes - would be referencing the internal helper type by name & assembly name (names would be stable here), so that way whenever a method gets called on it, the required info would be kept since the generated attribute would say "I need my fields to be kept if this type is" basically.

@vitek-karas
Copy link
Member

Another idea I had which might work is to instead stick the DynamicDependency on the constructor of the attribute, would that work? Since it should then keep it in the scenarios where the constructor is used to construct the attribute I'd think.

Yes, that would work.

Honestly I don't think it's necessary to design it such that the attribute can be kept, but not its fields if they're not going to be accessed. The fields are probably small enough (and their dependencies) that this would make little difference. You can't store complex structures in attributes anyway (since attributes only allow primitive type parameters), and your attribute is "data only", it doesn't have any of the logic in it.

@maxkatz6
Copy link
Contributor

maxkatz6 commented Jan 3, 2024

In Avalonia, for each public XAML file we put a switch branch in a generated XamlLoader class:

// pseudocode
public class !XamlLoader
{
     public static object TryLoad(string path)
     {
          if (path == "avares://MyAssembly/MyFile.axaml")
              return CompiledXamlFactories.BuildAndPopulate_MyFile();
          return null;
     }
}

Where XamlLoader is used via reflection from the AvaloniaXamlLoader type. Which is used inside of StyleInclude/ResourceInclude types.

In a typical XAML type this issue is already solved, as we have control over XAML files, and can replace StyleInclude/ResourceInclude nodes with direct references in the IL. But in applications that don't use XAML or require AvaloniaXamlLoader in the C# code for some other reason, reflection is still used.

As right now we don't have a solution for AvaloniaXamlLoader problem, [DynamicallyAccessedAttribute] sounds like a possible solution to keep current code from being trimmed away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Diagnostics linkable-framework Issues associated with delivering a linker friendly framework
Projects
No open projects
Status: Needs Consultation
Development

No branches or pull requests

10 participants