Champion: Module Initializers #2608

Open
gafter opened this issue Jun 18, 2019 · 19 comments

@gafter gafter commented Jun 18, 2019

See also #2486

Although the .NET platform has a feature that directly supports writing initialization code for the assembly (technically, the module), it is not exposed in C#. This is a rather niche scenario, but once you run into it the solutions appear to be pretty painful. I have seen reports of a number of customers (inside and outside Microsoft) struggling with the problem, and there are no doubt more undocumented cases.

I suggest that we add a tiny feature to support this without any explicit syntax, by having the C# compiler recognize a module attribute with a well-known name, like the following:

namespace System.Runtime.CompilerServices
{
    [Obsolete("This attribute is only to be used in C# language version 9.0 or later", true)]
    [AttributeUsage(AttributeTargets.Module, AllowMultiple = false)]
    public class ModuleInitializerAttribute : Attribute
    {
        public ModuleInitializerAttribute(Type type) { }
    }
}

You would use it like this:

[module: System.Runtime.CompilerServices.ModuleInitializerAttribute(typeof(MyModuleInitializer))]

internal static class MyModuleInitializer
{
    static MyModuleInitializer()
    {
        // put your module initializer here
    }
}

and the C# compiler would then emit a module constructor that causes the static constructor of the identified type to be triggered:

void .cctor()
{
    System.Runtime.CompilerServices.RuntimeHelpers.RunClassConstructor(
        typeof(MyModuleInitializer).TypeHandle);
}

Member

@jkotas jkotas commented Jun 18, 2019

Would it be better to just stick the attribute on a method that is meant to be used as module initializer? It would allow the method to be called directly, without going through the overhead of RuntimeHelpers.RunClassConstructor.

The attribute can be potentially allowed on multiple methods in the module and then all of them would be called in some order from the module initializer.
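
As a rough sketch of the method-level shape suggested here (not the type-based attribute proposed at the top of this issue): the example below assumes a parameterless `ModuleInitializerAttribute` that can be applied to methods, which is the form that later shipped with C# 9 on .NET 5. The names `Startup` and `Initialize` are made up for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

internal static class Startup
{
    internal static bool Initialized;

    // Assumption: the attribute targets a static, parameterless, void method
    // (the C# 9 / .NET 5 form, not the Type-based constructor proposed above).
    [ModuleInitializer]
    internal static void Initialize()
    {
        Initialized = true;
    }
}

internal static class Program
{
    private static void Main()
    {
        // The module initializer has already run by the time Main executes.
        Console.WriteLine(Startup.Initialized);
    }
}
```

With a compiler that supports the attribute, the method is called directly from the module constructor, with no `RuntimeHelpers.RunClassConstructor` involved, and the program prints `True`.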

@theunrepentantgeek theunrepentantgeek commented Jun 18, 2019

I'd guess that one factor is the compiler overhead of the attribute search.

If the only place to look is the metadata for the module itself, the check will be very quick (avoiding any overhead for the vast majority of folks who never use this feature). If the search has to check every method of every class in the module, it will take a lot longer, slowing down compilation for everyone.

That said, what about targeting the class itself - like this:

[System.Runtime.CompilerServices.ModuleInitializerAttribute]
internal static class MyModuleInitializer
{
    static MyModuleInitializer()
    {
        // put your module initializer here
    }
}

It's still a far larger search volume than the module metadata (so performance may be a concern), but a far smaller search volume than "every method".

Member Author

@gafter gafter commented Jun 18, 2019

The feature as proposed is intended to be the simplest possible thing that exposes (and maps almost directly to) the underlying .NET feature. No more is needed for the use cases I've seen. If you did need to have multiple bodies of code run initializers, for example, you could implement that in the language-supported one (use reflection to search for types with your own special attribute and initialize them).

I'm not worried about the "overhead" of calling RuntimeHelpers.RunClassConstructor once, as I expect that to be trivial in the overall execution of the program.
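
The reflection-based fan-out mentioned above could look roughly like the following sketch. `InitializeOnLoadAttribute` is a hypothetical marker attribute invented for this example, and because the proposed module attribute is not available, `Main` stands in for the compiler-emitted module constructor.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical marker attribute; not part of the proposal or of .NET.
[AttributeUsage(AttributeTargets.Class)]
internal sealed class InitializeOnLoadAttribute : Attribute
{
}

[InitializeOnLoad]
internal static class Cache
{
    static Cache()
    {
        Console.WriteLine("Cache initialized");
    }
}

[InitializeOnLoad]
internal static class Plugins
{
    static Plugins()
    {
        Console.WriteLine("Plugins initialized");
    }
}

internal static class MyModuleInitializer
{
    static MyModuleInitializer()
    {
        // Find every marked type in this assembly and force its static constructor.
        foreach (Type type in typeof(MyModuleInitializer).Assembly.GetTypes())
        {
            if (type.GetCustomAttribute<InitializeOnLoadAttribute>() != null)
            {
                RuntimeHelpers.RunClassConstructor(type.TypeHandle);
            }
        }
    }
}

internal static class Program
{
    private static void Main()
    {
        // Stand-in for the module constructor the compiler would emit.
        RuntimeHelpers.RunClassConstructor(typeof(MyModuleInitializer).TypeHandle);
    }
}
```

The order in which the marked types are initialized follows whatever order `GetTypes()` returns, so this sketch should not be relied on where ordering matters.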

Member

@ericstj ericstj commented Jun 19, 2019

> all of them would be called in some order

Can that be done in a predictable, stable way? Is there any precedent for the compiler deciding the order of execution?
Maybe this is similar to static field initialization, but there the order is semi-predictable: it is defined as textual and, barring partial classes, the developer defines the order in a single file. Here it spans files, whose ordering can vary with the file-system sort because of the globbing that happens in .NET SDK projects. What about allowing the attribute on a method and making it an error if it appears on more than one method? If folks need more than one, they can call them explicitly in a defined order. That said, I think @gafter's suggestion works just as well if the overhead isn't too high.

Are there any rules about what you can do inside a module constructor? Are you allowed to load other assemblies, make P/Invoke calls, call async code, etc.? If there are lots of rules, I can imagine it being hard to enforce them in the compiler, making this a somewhat dangerous feature that warrants terms like "unsafe" or "dangerous".

@dsaf dsaf commented Jun 19, 2019

So, what is the terminology: module or assembly? Can I have many modules per assembly? Is a C# module the same as an F# module, and the same as a .NET module?

> [Obsolete("This attribute is only to be used in C# language version 9.0 or later", true)]

This sounds like the opposite of Obsolete :).

Contributor

@yaakov-h yaakov-h commented Jun 19, 2019

> This sounds like the opposite of Obsolete :).

@dsaf it's a trick used to stop older versions of the compiler from using something they don't understand. Newer versions ignore that exact obsoletion string.

ref struct uses the same technique.

As for what a module means, that's already defined as an attribute target. I assume this feature won't change the meaning.

@pinkfloydx33 pinkfloydx33 commented Jun 19, 2019

When using Fody, you need to have a method with a special name and signature. Obviously this can't be relied on here and I like the attribute specifying the type with the static constructor (short of special syntax). In that method you can just call whatever other initialization you need. So if for some reason you needed more than one module initializer, you could just refactor that into a single method that explicitly invokes the rest (no need for reflection).

The only thing I dislike about the assembly level attribute specifying the type is that it's hidden. A developer looking at a static constructor later may not realize that this is supposed to be a module initializer. This is the same problem you end up having with Fody. Not that it's a bad thing, but you need to make sure you document that method with warnings about removing code that is being relied on as a module initializer. If we could specify the attribute on the class/initializer directly, then it becomes a bit more obvious. The only problem is that now you could potentially have more than one such method and the compiler would have to search for it. In that case perhaps more than one detected attribute could issue a compiler error, though I'm not sure how that would impact the build process (i.e., slow it down).
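
A minimal sketch of the "single method that explicitly invokes the rest" idea, assuming the type-based proposal from the top of this issue. `Logging` and `Plugins` are hypothetical subsystems, and `Main` stands in for the module constructor the compiler would emit.

```csharp
using System;
using System.Runtime.CompilerServices;

internal static class Logging
{
    internal static void Initialize()
    {
        Console.WriteLine("Logging ready");
    }
}

internal static class Plugins
{
    internal static void Initialize()
    {
        Console.WriteLine("Plugins ready");
    }
}

internal static class MyModuleInitializer
{
    // WARNING: this static constructor is relied on as the module initializer.
    // Do not remove it, even though nothing appears to reference this type.
    static MyModuleInitializer()
    {
        // The order is explicit and fixed: no reflection and no compiler-chosen ordering.
        Logging.Initialize();
        Plugins.Initialize();
    }
}

internal static class Program
{
    private static void Main()
    {
        // Stand-in for the compiler-emitted module constructor.
        RuntimeHelpers.RunClassConstructor(typeof(MyModuleInitializer).TypeHandle);
    }
}
```

Running this prints "Logging ready" followed by "Plugins ready". The warning comment reflects the documentation concern raised above.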

@Joe4evr Joe4evr commented Jun 19, 2019

> The only thing I dislike about the assembly level attribute specifying the type is that it's hidden. A developer looking at a static constructor later may not realize that this is supposed to be a module initializer. This is the same problem you end up having with Fody. Not that it's a bad thing, but you need to make sure you document that method with warnings about removing code that is being relied on as a module initializer.

I see two (complementing) solutions for this:

  1. While really only a style guideline, the new attribute can be placed in the same file as the module initializer you want to call.
  2. More thoroughly, the compiler can gain some extra knowledge about the attribute, verify that a cctor exists in the specified type, and otherwise report a compilation error:

[module: System.Runtime.CompilerServices.ModuleInitializerAttribute(typeof(MyModuleInitializer))]
// Error CS####: No static constructor specified in type 'MyModuleInitializer' to be called as module initializer

internal static class MyModuleInitializer
{
    //oops, someone removed this code, but now it can't build
}

Contributor

@HaloFour HaloFour commented Jun 19, 2019

I'm with @Joe4evr , I think that if this is the way in which module initializers are implemented that the compiler should check and enforce that a static constructor exists on the class.

However, if the compiler is making such a check I think it would be just as easy for the compiler to attempt to emit a static call to a well known static method rather than using a static class constructor:

[module: System.Runtime.CompilerServices.ModuleInitializerAttribute(typeof(MyModuleInitializer))]

internal static class MyModuleInitializer
{
    internal static void Initialize() {
        // do stuff here
    }
}

It would be a compiler error if that method is not resolved by the compiler at compile time.

In this case the compiler would only have to emit a static call to that method:

void .cctor()
{
    MyModuleInitializer.Initialize();
}

In my naive opinion this seems about as difficult as going the static constructor route, assuming that the compiler would check that such a constructor exists.

@masonwheeler masonwheeler commented Jun 19, 2019

This is good to see. One thing I'd really like to see changed, though: make module constructors run eagerly.

Right now, module initializers, like class static constructors, run lazily; at some point after a module has loaded but before any code from that module runs, the module initializer will run. Unfortunately, this adds unnecessary coupling and complication to one of the best scenarios for module initializers: plugins. Ideally, you could load a plugin assembly, the module initializer would run, and it would register the plugins with the plugin system (through a known method in a dependent assembly.) But with lazy initialization, you can't do that; you need the plugin system to "reach into" the module somehow in order to activate it, probably with Reflection, and by that point there's no point in having a module initializer at all; you just use Reflection to search for plugins to register.

With static constructors, lazy initialization is needed because there's no good way to resolve dependency order eagerly. But with CLR assemblies, we have a well-established dependency order already built into the fundamental concept of assemblies, so that limitation doesn't apply. So it seems to me there's no good reason not to make it eager.
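
For illustration, the "reach into the module" workaround described above might look like the sketch below. It uses the existing `RuntimeHelpers.RunModuleConstructor` API to force a module constructor from the host side. `PluginHost` and `Activate` are hypothetical names, and the example activates its own assembly so that it stays self-contained; a real host would pass the assembly it just loaded.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

internal static class PluginHost
{
    // Forces the module constructor of the plugin's manifest module to run now,
    // instead of waiting for the first use of code from that module.
    internal static void Activate(Assembly plugin)
    {
        RuntimeHelpers.RunModuleConstructor(plugin.ManifestModule.ModuleHandle);
    }

    private static void Main()
    {
        // Assumption: a real host would obtain this from an assembly load call.
        Assembly plugin = typeof(PluginHost).Assembly;

        Activate(plugin);
        Console.WriteLine("Module constructor forced for " + plugin.GetName().Name);
    }
}
```

This is the kind of host-side coupling the comment argues eager initialization would make unnecessary.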

Contributor

@HaloFour HaloFour commented Jun 19, 2019

@masonwheeler

Seems like something you'd need to take to CoreCLR as the language currently can't influence how the initializers would behave.

Member

@jkotas jkotas commented Jun 19, 2019

> "overhead" of calling RuntimeHelpers.RunClassConstructor once

The overhead is that RuntimeHelpers.RunClassConstructor introduces an unnecessary dependency on the reflection stack. I agree that it is not a big deal for a lot of programs out there, but it is for some. The dependency on the reflection stack means that this feature won't be usable by folks who want to write lean-and-mean code without reflection dependencies, or in environments where RuntimeHelpers.RunClassConstructor is not available, such as Unity3D mini-profiles.

> Can that be done in a predictable, stable, way?

Partial classes solved this problem.

> what you can do inside a module constructor? Are you allowed to load other assemblies, make pInvoke calls, call async code, etc?

There are no special rules. You can do anything in a module constructor that you could do in a regular static constructor. DllMain != module constructor.

Member

@mjsabby mjsabby commented Aug 1, 2019

Is it possible to prioritize this in the 8.1 release?

Member Author

@gafter gafter commented Aug 2, 2019

@mjsabby There are no current plans for an 8.1 release.

@MadsTorgersen MadsTorgersen moved this from TRIAGE NEEDED to X.X Candidate in Language Version Planning Aug 26, 2019
@gafter gafter added this to the X.X candidate milestone Aug 26, 2019
@gafter gafter moved this from X.X Candidate to 9.0 Candidate in Language Version Planning Aug 28, 2019
@gafter gafter modified the milestones: X.X candidate, 9.0 candidate Aug 28, 2019
Member Author

@gafter gafter commented Aug 28, 2019

@jkotas Rather than using reflection, the compiler could implement this by injecting a static internal method into the type that does nothing, and then calling that method in the generated module initializer.
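
A hand-written approximation of that idea is sketched below. `EnsureInitialized` is a hypothetical name for the injected method, and `ModuleConstructor` stands in for the generated module `.cctor`. The sketch relies on the fact that a C# type with an explicit static constructor has that constructor run before the first call to any of its static methods.

```csharp
using System;

internal static class MyModuleInitializer
{
    static MyModuleInitializer()
    {
        Console.WriteLine("static constructor ran");
    }

    // Hypothetical compiler-injected method: it does nothing itself, but calling it
    // triggers the static constructor above if it has not run yet.
    internal static void EnsureInitialized()
    {
    }
}

internal static class Program
{
    // Stand-in for the generated module constructor: a plain static call,
    // with no dependency on RuntimeHelpers or the reflection stack.
    private static void ModuleConstructor()
    {
        MyModuleInitializer.EnsureInitialized();
    }

    private static void Main()
    {
        ModuleConstructor();
    }
}
```

Running this prints "static constructor ran" once.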

Member

@jkotas jkotas commented Aug 28, 2019

> compiler could implement this by injecting a static internal method into the type that does nothing, and then calling that method in the generated module initializer.

Yes, that would work great and address all my concerns.

@Grauenwolf Grauenwolf commented Dec 30, 2019

> The attribute can be potentially allowed on multiple methods in the module and then all of them would be called in some order from the module initializer.

I would vote against that part.

Just like we only get one Main method, you should only get one assembly initializer. It can call out to other functions if you need more organization, but anything more opens a nasty can of worms.

@chsienki chsienki commented Jan 10, 2020

One possible use for the multi-module initializers approach is for generated code.

Generated code often needs a way to 'register' itself at startup. Having multiple module initializers would allow the generated code to add its own initializer that would perform any specific initialization logic it needs, without the need for the user to manually add an explicit call to it.
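
To make the scenario concrete, here is a small sketch in which two pieces of (imagined) generated code each register themselves. It assumes a method-level, parameterless `ModuleInitializerAttribute` that may appear on more than one method, which is the form that later shipped with C# 9 on .NET 5 rather than the attribute proposed at the top of this issue. `Registry`, `GeneratedSerializers`, and `GeneratedRoutes` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

internal static class Registry
{
    internal static readonly List<string> Names = new List<string>();
}

// Imagine this type lives in a file emitted by a code generator.
internal static class GeneratedSerializers
{
    [ModuleInitializer]
    internal static void Register()
    {
        Registry.Names.Add("serializers");
    }
}

// Imagine this type comes from a second, unrelated generator.
internal static class GeneratedRoutes
{
    [ModuleInitializer]
    internal static void Register()
    {
        Registry.Names.Add("routes");
    }
}

internal static class Program
{
    private static void Main()
    {
        // Both registrations have happened without any explicit call from user code.
        Console.WriteLine(Registry.Names.Count);
    }
}
```

With a compiler that supports the attribute, this prints `2`. Neither generator needs to know about the other or about the user's own startup code.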

@masonwheeler masonwheeler commented Jan 10, 2020

@Grauenwolf If we're going to add this, it makes sense to look elsewhere for similar features and see what works and what doesn't.

Probably the best analogue comes from .NET architect Anders Hejlsberg's previous project, Delphi. It allows you to put an initialization section in any code file, and at compile-time the compiler sets everything up so they will all get executed one after the other and automagically gets the order of execution right so you don't end up with dependency problems.

That last bit (getting dependency order right) is based on a specific, restrictive property of Delphi's compilation that doesn't apply to C#, so we can't expect to be able to copy that successfully. But the basic principle of putting your initialization code together with the code it's initializing, and then having the compiler gather them into a single overall initializer, is a sound one. What are the alternatives? I can only think of two, and both are bad:

  1. You write a bunch of local initialization routines, then you have to keep track of all of them manually and call them all in the module initializer
  2. You don't write local initialization routines at all, and you have to write a module initializer that directly reaches into every piece of your assembly that needs to be initialized.

Both of these are significantly worse than having the compiler set it up for you. As for ordering, most of the time it won't be necessary, especially since static constructors will take care of most of the cases of initialization-time dependencies between classes. But for cases where it is, there should be some way to specify an explicit ordering, and any initializer routines that don't have an ordering set will run (in an undefined order) after all the ones that do.
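
The ordering rule suggested above (explicitly ordered initializers first, unordered ones afterwards) can be sketched as follows. This is only an illustration of the rule, not a proposed compiler design: the `Initializer` type and its nullable `Order` are invented for the example, and the unordered entries happen to keep their list order here only because LINQ's sort is stable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical description of one initializer routine and its optional ordering.
internal sealed class Initializer
{
    public Initializer(string name, int? order)
    {
        Name = name;
        Order = order;
    }

    public string Name { get; }
    public int? Order { get; }
}

internal static class Program
{
    // Initializers with an explicit order run first, sorted by that order;
    // those without one run after all the ordered ones.
    internal static IEnumerable<Initializer> InExecutionOrder(IEnumerable<Initializer> all)
    {
        return all
            .OrderBy(i => i.Order.HasValue ? 0 : 1)
            .ThenBy(i => i.Order ?? 0);
    }

    private static void Main()
    {
        var initializers = new List<Initializer>
        {
            new Initializer("cache", null),
            new Initializer("logging", 1),
            new Initializer("config", 0),
        };

        foreach (Initializer initializer in InExecutionOrder(initializers))
        {
            Console.WriteLine(initializer.Name);
        }
    }
}
```

This prints `config`, `logging`, and `cache`, in that order.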
