Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: modifier (perhaps "static") for local functions, to explicitly disallow captured state semantics #1277

Closed
mgravell opened this issue Jan 25, 2018 · 73 comments

Comments

@mgravell
Copy link
Member

mgravell commented Jan 25, 2018

A recurring problem I've seen a lot since the addition of local methods - and in particular in async optimizations (and not just me) is that it is very easy to accidentally introduce shared state into a local function when you really didn't want to. For example:

void Foo(int a) {
    void Bar(int val) {
         Blah(a); // <== error; this was meant to be val
    }
    // ...
    Bar(a);
}

This is a common intent when using local functions to enclose a method, without actually intending to allow shared capture state.

Currently, the only "fix" here is to move the method out:

static void Bar(int val) {
     Blah(val); // <== fixed, because the compiler told us about it
}
void Foo(int a) {    
    // ...
    Bar(a);
}

This works, but it pollutes the space and loses code locality and expressiveness. I actually find myself routinely copy/pasting local methods out and back in again to check that they still compile - to see if I've leaked state.


Proposal: allow the static modifier on local functions, to express the intent to divorce all state. I'm not 100% sure that static is the right word, but it seems close. The effect would be simply to enforce that variables (parameters and locals) declared outside the static local method are not in scope inside the local method, essentially as though the local method had been declared outside the method. The "local method", then, is simply here for code cleanliness reasons.

This has the effect of avoiding unintended captured state machinery.

Example:

void Foo(int a) {
    static void Bar(int val) {
         Blah(val); // <== fixed, because the compiler told us about it
    }
    // ...
    Bar(a);
}

There is an outstanding question of whether it is acceptable to share the this instance but nothing else; this ambiguity is perhaps the biggest thing against the static keyword.

Additional possibility: because of the lack of state, it allows re-use of the most obvious/correct variable names. I've seen lots of examples essentially like:

void Foo(int a) {
    void Bar(int innerA) {
         Blah(innerA); // <== note awkward naming
    }
    // ...
    Bar(a);
}

but this proposal allows the "obvious" names to be used consistently:

void Foo(int a) {
    static void Bar(int a) {
         Blah(a);
    }
    // ...
    Bar(a);
}
@Drawaes
Copy link

Drawaes commented Jan 25, 2018

This was brought up during a PR review by @stephentoub

dotnet/corefx#25187 (comment)

It is way to easy to make that mistake.

@mgravell
Copy link
Member Author

mgravell commented Jan 25, 2018

side thought: should the same modifier also be allowed for anonymous methods and lambdas, in the same way that async can be used as a modifier? for example:

Action<string> someCallback = static s => { ... };

which would again mean that no implicit state capture is permitted inside the lambda expression or expression body.

(again, not sure that static is quite right, but: you get the point)

@Joe4evr
Copy link
Contributor

Joe4evr commented Jan 25, 2018

See also: dotnet/roslyn#23970, #302

@iam3yal
Copy link
Contributor

iam3yal commented Jan 25, 2018

@mgravell This can also be solved using attributes and analyzers with #794, however, we still can't apply attributes in the context of anonymous methods and lambdas but with this it's going to be possible.

Alternatively, if a language feature is introduced to support this I'd love it to be extended further and have C++ style capture list due to the flexibility it provides as described here.

However, it seems like they already thinking about lambdas and static as a keyword see here #275.

@iam3yal
Copy link
Contributor

iam3yal commented Jan 25, 2018

There is an outstanding question of whether it is acceptable to share the this

In my opinion it shouldn't, this approach should be all or nothing for two reasons:

  1. Consistency.

  2. Control - It can always be passed explicitly.

@mgravell
Copy link
Member Author

mgravell commented Jan 25, 2018

@eyalsk interesting related ideas, thanks; the "capture list" approach looks nice, but is complicated; IMO C# is already bloating fast enough, and an "all or nothing" is probably a very pragmatic reasonable approach - especially since "all" is already the implicit default. In particular, the "capture list" approach is complicated by the fact that capture closures are already scoped and nested by declaration location. It may be nearly impossible to implement such that you're not capturing something extra by accident (where A captures x+y and B captures y+z - either A may also be silently capturing z, or B may be silently capturing x - in terms of fields on the state machine and lifetime)

Likewise, personally I'd be just fine with disallowing this capture, so static means exactly that.

@DavidArno
Copy link

DavidArno commented Jan 25, 2018

Related issue: #275. Though the scope of this proposal is far narrower, and likely far easier to implement, than that other one.

@Joe4evr
Copy link
Contributor

Joe4evr commented Jan 25, 2018

the "capture list" approach looks nice, but is complicated

Capture lists were suggested before and the team has rejected it.

@iam3yal
Copy link
Contributor

iam3yal commented Jan 25, 2018

@Joe4evr Doesn't mean it can't be reevaluated but I understand the likelihood of it to be rejected again.

@TheUnlocked
Copy link

The use of static for this works surprisingly well with #832, as the function could be considered to be cached, and because of that, it couldn't possibly access something outside of its scope.

@MkazemAkhgary
Copy link

MkazemAkhgary commented Jan 29, 2018

Can static local function access instance members of current instance? or will it be in static context?

my suggest is that local function (whether static or not) in instance method can access instance members. but local function in static method cant access instance members (pretty obvious)

int field;

void Foo()
{
    static void local()
    { 
        field = 0;
    }
}

The reason to allow this is because we don't confuse fields with local variables (if you follow good old naming rules), thus there is nothing to worry about.

@Drawaes
Copy link

Drawaes commented Jan 29, 2018

The main problem is with avoiding accidental capture of a variable in the parent method thus causing allocations. I am not sure the above helps with that.

Being able to make a local or annon method static means in the annon case the delegate can be single instance and cached. That the compilier can detect the capturing of variables and you can save a register as you don't need to pass a ref to "this".

It seems like the solution that adds no keywords, uses preexieting language semantics that are well known (static methods) and covers 90% of the situations. Also will allow for a better perf profile... Seems like a win, win, win

@yaakov-h
Copy link
Member

I really don't like the idea of a local function having the static keyword but retaining the ability to access instance fields, properties and methods.

@Drawaes
Copy link

Drawaes commented Jan 29, 2018

Static local methods wouldn't have the ability to access instance fields or properties and methods would they? If they did then they would need the "this" ref passed in and I agree that would be nasty.

@HaloFour
Copy link
Contributor

@Drawaes

The main problem is with avoiding accidental capture of a variable in the parent method thus causing allocations.

Local functions don't incur the same penalty for capture as delegates do. They don't cause any allocations, the enclosed variables are lifted to a struct and the function itself passed that struct with a ref parameter:

public int M(int x, int y) {
    int add() => x + y;
    return add();
}

//  translated into:
private struct DisplayClass {
    public int x;
    public int y;
}

private static int add(ref DisplayClass ptr) {
    return ptr.x + ptr.y;
}

public int M(int x, int y) {
    DisplayClass ptr = default;
    ptr.x = x;
    ptr.y = y;
    return add(ref ptr);
}

@benaadams
Copy link
Member

Local functions don't incur the same penalty for capture as delegates do. They don't cause any allocations,

They do if they are async, even the presence of an async local function causes an allocation even if its not called; if it does any capturing dotnet/roslyn#18946

@Drawaes
Copy link

Drawaes commented Jan 29, 2018

Sorry my fault I should have specified that async was my major use case for them.

@Neme12
Copy link

Neme12 commented Jan 29, 2018

but this proposal allows the "obvious" names to be used consistently:

void Foo(int a) {
    static void Bar(int a) {
         Blah(a);
    }
    // ...
    Bar(a);
}

Why would you pass the same exact value instead of capturing it though? Isn't this what the compiler already does for you? You can be consistent and use the "obvious" name without declaring it twice. And if the parameter has a slightly different meaning than the original one, then it might as well have a slightly different name.

If capturing the int is slower than passing it again then I think the right approach would be to try to optimize this rather than let us duplicate all the parameters with the exact same names.

@Drawaes
Copy link

Drawaes commented Jan 29, 2018

Because it causes allocations.

@HaloFour
Copy link
Contributor

@benaadams

I see. Yeah, there are several scenarios that cause local functions to use the delegate-style capturing rather than a struct, presumably because they all involve the potential lifting of that state into a value that must be able to persist beyond the lifetime of that method invocation.

Another example is when a method happens to also create a delegate that encloses over part of the method state, even if that enclosed state doesn't overlap with the state of the local function.

@sharwell
Copy link
Member

sharwell commented Jan 29, 2018

I am currently investigating ways to leverage the RoslynClrHeapAllocationAnalyzer for practical performance purposes. For example, an analyzer that recognized PerformanceSensitiveAttribute.AllowCaptures could report a warning if the constraint was violated, e.g. by a local function. This is still in the early stages, but I'd be interested in feedback to make this useful in practice.

@Drawaes
Copy link

Drawaes commented Jan 29, 2018

I get that an analyzer is nice but why not just stop it happening at all with the static word? It seems a little like the new get out of jail free card is "add an analyzer".

I tend to agree that sometimes that makes sense to stop language bloat but in this case a well known and understood concept like "static" is really not an addition to the library. Infact I would go so far as to say without static it's odd, any other method can be static (getters/setters, constructors, methods) so why not these two types of methods?

@HaloFour
Copy link
Contributor

@Drawaes

I think the idea there is that an analyzer can be done today whereas any language feature proposal has to endure the LDM process which is infinitely more protracted. It also fits pretty well in this case since the expected behavior of the language feature is a warning/error if you try to capture in one of these local functions.

Regarding static, I agree that it's probably the best existing keyword for the job, but I personally don't think that it does so without some kind of expansion on its current definition. Is the idea that the "method" has it's own "instance" which represents the locals and other state?

@Drawaes
Copy link

Drawaes commented Jan 29, 2018

The definition would surely be as is today... There is no ref to this passed in and the method has access to it's stack as per any static?

@HaloFour
Copy link
Contributor

@Drawaes

The definition would surely be as is today... There is no ref to this passed in and the method has access to it's stack as per any static?

That doesn't imply anything about the locals, though. Static methods can freely capture their own locals and that doesn't involve any this pointer.

@benaadams
Copy link
Member

benaadams commented Jan 29, 2018

Method is just like a regular static method; its just scoped to the parent method; so only has access to static variables and its own parameters (no access the the locals of the parent method)

@HaloFour
Copy link
Contributor

@benaadams Agreed, I just don't think that the current definition per the spec implies that relationship.

@Neme12
Copy link

Neme12 commented Jan 29, 2018

in this case a well known and understood concept like "static"

but this is not the well known meaning of "static"

  1. it implies that the method is actually static and this doesn't get passed either - and it very well could be, but what if you wanted a method that doesn't capture but isn't static? that would be just as efficient
  2. static could also be understood to mean that the function is static (no this) but does capture locals - if you told me about static local functions, this is what I would probably think that might mean

static never meant "doesn't capture" - I think we're mixing up 2 different meanings because you could definitely imagine both A without B and B without A

@Joe4evr
Copy link
Contributor

Joe4evr commented Jan 29, 2018

I agree that static might not be the best choice of keyword. I pointed out in the proposal for "static variables" (as linked earlier) that I'm not a fan of using the keyword static in that particular context, because it could still be lowered into an instance field.

@MkazemAkhgary
Copy link

what about private? still not accurate but maybe not as confusing as static

@jnm2
Copy link
Contributor

jnm2 commented May 25, 2018

@stakx

I agree with the proposal (disallowing capture) but I am also one of those unhappy about the choice of keyword (static). This will just make the meaning of "static" unnecessarily ambiguous.

Sorry, I'm slow. Where's the ambiguity in the term static? A static local function would be static in exactly the same sense as a static method, right? #1565

@stakx
Copy link
Contributor

stakx commented May 25, 2018

@jnm2:

A static local function would be static in exactly the same sense as a static method, right?

Perhaps I am slow to understand, but no, I read the proposal differently. It suggests that static on local methods "divorces all state" (i.e. it would supposedly disallow variable capture from the enclosing method and possibly further).

So for example:

static int staticField;

static void Outer()
{
    var local = 7;
    staticField = 42;

    Inner();

    static void Inner()
    {
        local = 8; // disallowed?
        staticField = 43; // disallowed?
    }
}

My understanding of present-day static is "belongs to the type, not to an instance of that type". If static had this meaning in all circumstances (including when placed on local methods), then both local = 8 should be allowed (because we're talking local variables here, which have nothing to do with static/instance members) as should be staticField = 43 (because both the field and the method are static).

However, according to the proposal, at least one if not both assignments become illegal because they access outer state. That's a rather different outcome.

I've seen the proposed re-phrasing of the definition of static, but quite frankly, at this relatively late point in C#'s history, I think doing that would be a bad decision.

(It would also mean that C# diverges even further from the underlying VM's terminology, which isn't forbidden but will become painfully visible e.g. when doing reflection).

If we're talking about "divorcing all state" then I'd rather see a new, dedicated keyword pure (or similar) added to the language.

@HaloFour
Copy link
Contributor

@stakx

public class C {
    static int staticField;
    int instanceField;

    public void Outer() {
        int local = 0;

        static void Inner() {
            local = 123; // fail, can't capture locals
            instanceField = 456; // fail, can't capture this
            staticField = 789; // fine, doesn't require capturing anything
        }
    }
}

@jnm2
Copy link
Contributor

jnm2 commented May 25, 2018

@stakx Then it sounds like you'd be happy with static being used the way they are planning in #1565 (as in HaloFour's example)?

@Drawaes
Copy link

Drawaes commented May 25, 2018

I am not convinced you shouldn't be able to capture this. As that isn't a "capture" in the normal sense of the word and very unlikely to be the perf hit most worry about and is driving the feature? I could also be wrong ;)

@TheUnlocked
Copy link

To add to the above, there is a case to be made that static in this context would mean static to the enclosing function but not to the instance. However, I'm not going to be the one making that case. ;)

@stakx
Copy link
Contributor

stakx commented May 25, 2018

@HaloFour, @jnm2: I'd actually be fine with that if that were the end result BUT please note that this doesn't require a redefinition of what static means. The fact that you'd no longer be able to catch locals and instance fields would simply be a useful side effect, not a new meaning. static would keep the meaning it's always had (and that's what I'm arguing for here). But that's not what this proposal wants:

Proposal: modifier (perhaps "static") for local functions, to explicitly disallow captured state semantics

The meaning of the modifier (static) would be to "disallow capture". This is not what present-day static is for, and neither should it assume that new, additional meaning. (Even though the end result would be that it would in effect disallow capture of all but static fields.)

Proposal: allow the static modifier on local functions, to express the intent to divorce all state.

Again, static doesn't mean "divorce all state" and neither should it assume that new meaning.

By all means, go ahead with the proposal 👍, but please don't specify this new feature in a way that gives static a new meaning. Make it clear that static simply has a beneficial side effect in this particular situation.

@jnm2
Copy link
Contributor

jnm2 commented May 25, 2018

@stakx

But that's not what this proposal wants:

Proposal: modifier (perhaps "static") for local functions, to explicitly disallow captured state semantics

By disallowing captured state, you disallow capture of this (and thereby, access to instance fields), but no capture is necessary to access static fields and therefore this isn't enforcing purity. I didn't see anyone ask for disallowing access to static state, though maybe I missed it.

@stakx
Copy link
Contributor

stakx commented May 25, 2018

@jnm2: I agree with what you're saying. What I am objecting against are alternate definitions such as "divorce all state" (which would include static state btw.) or "disallow capture" being affixed to static because that'd simply make static have at least two meanings.

If you want a keyword that divorces all (captured) state, call it pure. If you want a keyword to disallow capture, call it whatever. But static already has a meaning, which IMHO should be preserved.

Note I've come to agree that static could be used on local methods to disallow capture. But that's not what it should mean in this usage scenario.

@jnm2
Copy link
Contributor

jnm2 commented May 25, 2018

I follow, though I'm not sure if you're right. To get to the point: how would you define the current meaning of static?

@stakx
Copy link
Contributor

stakx commented May 25, 2018

@jnm2: See above. I don't have the spec at hand but my understanding of static has always been, "this member does not belong to an instance of this type, but to the type itself".

(What this does not answer is whether a local method is logically a member of the enclosing type at all. If not, my definition would no longer apply precisely.)

@jnm2
Copy link
Contributor

jnm2 commented May 25, 2018

@stakx You could expand that definition to, "this member does not capture instance or local state." Then static wouldn't have two meanings; it would have a single meaning, clarified to apply in more contexts.

Like variables being added to Newtonian physics that had been negligible in previous contexts, and so were not talked about until relativity came around.

@stakx
Copy link
Contributor

stakx commented May 25, 2018

@jnm2: In my eyes, that wouldn't be an expansion of the existing definition, but a replacement, because I simply don't perceive static as having anything to do with capture.

Sure, I can get used to a new, different, but also more general meaning for static -- I just can't say that I like the idea of it very much, especially if such a radical redefinition could potentially be avoided.

I guess I've made my argument sufficiently clear, if it's at all taken into account in any form during the design of this new feature, then I'm more than happy. Either way, I won't raise any further objections. :)

@benaadams
Copy link
Member

because I simply don't perceive static as having anything to do with capture.

Adding that it "doesn't capture instance or local state"; means it still doesn't have anything to do with capture :)

@stakx
Copy link
Contributor

stakx commented May 25, 2018

@benaadams: No. But you've now associated it with that term, with previously was an altogether separate thing. If I added to the definition of static, "but it isn't orange" then you'd probably wonder too why that definition needs to mention colour at all. (A definition for X would ideally say what X is, not what it isn't.)

PS: That being said, I don't object to that small addition. (As long as you don't define static as "[...] static [...] hides its enclosing scope" which I'd find rather inaccurate. 😄)

@jnm2
Copy link
Contributor

jnm2 commented May 25, 2018

A static method is a method which can only access by name:

  • its own declared parameters and locals
  • other static members

If you consider this and base to be names.
That definition seems pretty traditional but works for local functions as well.

@stakx
Copy link
Contributor

stakx commented May 25, 2018

@jnm2: 👍 Totally happy with that except for one tiny nitpick: "other static members defined in the same scope, or in an enclosing scope at the type level". (Otherwise a local method defined in method A could access a local method defined in method B.)

@jnm2
Copy link
Contributor

jnm2 commented May 25, 2018

@stakx Is that necessary though? Local functions aren't considered members, are they?
Item 2 could be "accessible static type members."

Sounds like the adjective static is modifying type rather than members as I intended. Fail...
There's a bit more to it than this aspect, anyway 😁 https://github.com/dotnet/csharplang/blob/master/spec/classes.md#static-and-instance-members

@iam3yal
Copy link
Contributor

iam3yal commented May 26, 2018

It's not like we don't have keywords that are used in different contexts in the language, e.g., using so why this is any different? I mean static isn't clear enough? or I'm missing something? but really context is important so I don't know why there's a discussion about "redefinition" when it's a different context.

Just my 2c. 😄

p.s. I tend to miss things at times so if I do please enlighten me. 😉

@jnm2
Copy link
Contributor

jnm2 commented May 26, 2018

@eyalsk My gut feeling is that it's not necessarily a good thing when that happens, if that's happening. But you could be right.

@AustinBryan
Copy link

So if we're not agreeing on static, does anyone have any better keyword ideas for it? One that would better reflect it's lack of capturing?

@CyrusNajmabadi
Copy link
Member

Honestly... the more i use c++ the more i just like capture lists. I think it's good that C# defaulted to 'allow capture'. But i think it would be fine to be able to just say (a, b, c)[] => ... (no capture for you!). It's incredibly brief and simple and gives exactly the behavior you want. And, it easily extends to being able ot say "actually... i do want to be able to capture something, but specify what it is".

The fact that it's only two chars to prevent any capturing at all is really nice for me...

@gafter
Copy link
Member

gafter commented Dec 11, 2018

See #1565

@gafter gafter removed their assignment Dec 11, 2018
@mgravell
Copy link
Member Author

Woohoo, this appears to be in the 8.0 beta in VS 16 preview 2, and it works great.

@jcouv
Copy link
Member

jcouv commented May 6, 2019

I'll go ahead this issue since it is a duplicate (and already implemented too ;-) ).
#1565

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests