Proposal: Lambda Capture Lists #117

Open
stephentoub opened this Issue Jan 28, 2015 · 25 comments
@stephentoub
Member

(Note: this proposal was briefly discussed in #98, the C# design notes for Jan 21, 2015. It has not been updated further based on the discussion that's already occurred on that thread.)

Background

Today, lambdas automatically capture any referenced state into a closure object. These captures happen by reference, in that the "local" variable that’s closed over is compiled as a field onto the "display class," with an instance of that class used both by the containing method and by the lambda defined in it.

// Original code
public static void ContainingMethod()
{
    int i = 42;
    Action a = () => {
       Console.WriteLine(i);
    };
}

// Approximate compiled equivalent
public static void ContainingMethod()
{
    var class2 = new <>c__DisplayClass1();
    class2.i = 42;
    Action action = new Action(class2.<ContainingMethod>b__0);
}
private sealed class <>c__DisplayClass1
{
    public int i;
    public void <ContainingMethod>b__0()
    {
        Console.WriteLine(this.i);
    }
}

The ability to write such concise code and have the compiler generate all of the necessary boilerplate is a huge productivity win.

Problem

While this is a productivity win, it also hides some key aspects of how the mechanism works, in particular how the data makes its way into the lambda and where that data is stored, namely in an allocated object.

Solution: Explicitly Specifying What Gets Captured

When C++11 introduced lambda support, it explicitly disabled the implicit automatic capturing of any referenced state. Instead, developers are forced to state their intentions for what they want captured by using a "capture list." C# today behaves as if all captured state is captured by reference, and we would retain that by default, but we could also allow developers (optionally) to use capture lists to be more explicit about what they want captured. Using a capture list, the previously explored ContainingMethod example could be written:

public static void ContainingMethod()
{
    int i = 42;
    Action a = [i]() => {
        Console.WriteLine(i);
    };
}

This states that the lambda captures the ‘i’ variable and nothing else. As such, this code is the exact equivalent of the previous example and will result in exactly the same code being generated by the compiler. However, now that we’ve specified a capture list, any attempt by the method to use a value not included in the capture list will be met with an error. This verification helps the developer not only better understand what state is being used, it also helps to enable compiler-verification that no allocations are involved in a closure. If a closure is instead written with an empty capture list, the developer can be assured that the lambda won’t capture any state, won’t require a display class, and thus won’t result in an allocation (in most situations, the compiler will then also be able to statically cache the delegate, so that the delegate will only be allocated once for the program rather than once per method call):

public static void ContainingMethod()
{
    int i = 42;
    Action a = []() => {
        Console.WriteLine(i); // Error: can’t access non-captured ‘i'
    };
}

Additional Support: Capturing By Value

Today if a developer wants the equivalent of capturing by value instead of capturing by reference, they must first make a copy outside of the closure and reference that copy, e.g

public static void ContainingMethod()
{
    int i = 42; // variable to be captured by value
    ...
    int iCopy = i;
    Action a = () => {
        Console.WriteLine(iCopy); 
    };
}

In this example, since the lambda closes over iCopy rather than i, it's effectively capturing a copy by reference, and thus has the same semantics as if capturing i by value. This, however, is verbose and error prone, in that a developer must ensure that iCopy is only used inside the lambda and not elsewhere, and in particular not inside of another lambda that might close over the same value. Instead, we could support assignment inside of a capture list:

public static void ContainingMethod()
{
    int i = 42; // variable to be captured by value
    ...
    Action a = [int iCopy = i]() => {
        Console.WriteLine(iCopy); 
    };
}

Now, only iCopy and not i can be used inside of the lambda, and iCopy is not available outside of the scope of the lambda.

// Compiled equivalent
public static void ContainingMethod()
{
    int i = 42;
    var class2 = new <>c__DisplayClass1();
    class2.iCopy = i;
    Action action = new Action(class2.<ContainingMethod>b__0);
}
private sealed class <>c__DisplayClass1
{
    public int iCopy;
    public void <ContainingMethod>b__0()
    {
       Console.WriteLine(this.iCopy);
    }
}

With the developer explicitly specifying what to capture and how to capture it, the effects of the lambda capture are made much clearer for both the developer writing the code and someone else reading the code, improving the ability to catch errors in code reviews.

Alternate Approach

Instead of or in addition to support for lambda capture lists, we should consider adding support for placing attributes on lambdas. This would allow for a wide-range of features, but in particular would allow for static analysis tools and diagnostics to separately implement features like what lambda capture lists are trying to enable.

For example, if the "[]" support for specifying an empty capture list is unavailable but a developer was able to apply attributes to lambdas, they could create a "NoClosure" attribute and an associated Roslyn diagnostic that would flag cases where a lambda annotated with [NoClosure] actually captured something:

public static void ContainingMethod()
{
    int i = 42;
    Action a = [NoClosure]() => { // diagnostic error: lambda isn't allowed to capture
       Console.WriteLine(i);
    };
}
@giggio
giggio commented Jan 28, 2015

I like it and I prefer the main approach over the alternate. It provides more control.

@Romoku
Romoku commented Jan 28, 2015

Capturing all enclosed references by value may be common enough to warrant a shorthand syntax. This can be described using "skinny arrow" notation.

int value = 0;
Func<int> getValue = () -> value;
value++;
Console.WriteLine(getValue()); // Output is 0
Console.WriteLine(value); // Output is 1
@giggio
giggio commented Jan 28, 2015

@Romoku I like it, but it would present 2 different syntaxes for the problem, and that might be confusing. One for copying, and another for which variables you would be closing upon.

@Romoku
Romoku commented Jan 28, 2015

@giggio I hesitate to suggest making pass by value semantics the default for lambda expressions. However, considering there is code already to workaround the pass by reference semantics it may turn out to be a minor issue.

@ryancerium

I'm inventing new keywords here, and ref may not work in this context since it should be legal to have a parameter by ref anyway:

int i = 42;
int j = 13;
Func<int> getValue1 = () => 42;
Func<int> getValue2 = (noscope) => 42;
// Func<int> getValue2 = (noscope) => i; // Error capture disallowed
Func<int> getValue3 = (ref i) => i++;
Func<int> getValue4 = (value i) => i--;
// Func<int> getValue4 = (value i) => i + j; // Error, j isn't captured

getValue1(); // 42
getValue2(); // 42
getValue3(); // 42 and increments global i
var is43 = i == 43; // true
getValue4(); // 42 and decrements getValue4.i
getValue4(); // 41 and decrements getValue4.i
@sharwell
Member

You could always go the way of C++ and capture by value or &reference:

int x = 0;
int y = 0;
Action action = [x, &y]() =>
  {
    Console.WriteLine($"x={x}, y={y}");
    x = 1;
    y = 2;
    Console.WriteLine($"x={x}, y={y}");
  };
action();
Console.WriteLine($"x={x}, y={y}");

Outputs:

x=0, y=0
x=1, y=2
x=0, y=2
@ryancerium

@sharwell

You could always go the way of C++ and capture by value or &reference:

The original discussion mentioned a strong distaste for the C++ syntax: "One problem is that it frankly looks horrible."

@theoy theoy added the Language-C# label Jan 28, 2015
@sharwell
Member

@ryancerium I was going to add a 😆 emoji, but then I figured for better or worse it actually is a valid option for consideration. Perhaps the merits of this approach could influence the final feature, even if it's not taken as-is.

@HaloFour

I do kind of like the C++ capture list syntax although I agree that it doesn't quite jive with C#. Perhaps something like the following:

int i = 1;
int j = 2;
Action a = [i, ref j]() => {
    i = 3;
    j = 4;
};
a();
Debug.Assert(i == 1);
Debug.Assert(j == 4);

This doesn't really address the functionality in C++ that allows for the capture list to specify that the lambda can close any variables either by value or by reference via [=] and [&] respectively. Perhaps [*] or [ref *]? The latter is kind of redundant since that is the behavior given no capture list.

Of course if there ever is a consideration of adding support for attributes to the lambda method then using [] for the capture list might also be a problem.

@migueldeicaza
Member

Two bits of feedback:

= This =

It would be very useful to also have the ability to explicitly restrict capturing "this" (or alternatively, to require "this" to be specified).

This would help in scenarios where lambdas are used in managed libraries that interoperate with unmanaged libraries that can cause strong cycles to be kept in memory. We have struggled with those from the Linux/Gtk+ days to this day.

= Weak Captures =
Additionally, it would be nice if there was a way to annotate captured variables as being done with a WeakReference, automatically surfacing a captured variable as a WeakReference, also for preventing strong cycles.

@HaloFour

Good point. If specifying a capture list then specifying this should be compulsory (except if capture lists support wildcards as mentioned in my previous post.)

Action a0 = () => {
    this.DoSomething(); // good, supported today
};

Action a1 = []() => {
    this.DoSomething(); // compiler error, with no capture list method treated as static
};

int i = 123;
Action a2 = [ref i]() => {
    this.DoSomething(); // compiler error, this not defined, doesn't even reference generated display class
};

Action a3 = [this]() => {
    this.DoSomething(); // good
};

Action a4 = [*]() => { // equiv to C++ [=]
    this.DoSomething(); // good
};

Action a5 = [ref *]() => { // equiv to C++ [&], equiv to no capture list
    this.DoSomething(); // good
};

Action a6 = [ref this]() => { // compiler error, this can't be ref
    this.DoSomething(); 
};
@stephentoub
Member

@migueldeicaza, thanks!

Regarding 'this', without introducing new syntax for lambdas, I don't know that much could be done to restrict the use of 'this' by default, as that'd be a breaking change; however, with this proposal, once you specify a capture list, you would have to include 'this' in it if you wanted to be able to access it.

Regarding capturing by weak reference, with the proposal as written you could do something like:

object o = ...;
Action a = [var wro = new WeakReference(o)]() => {
    ... // can use wro here, but not o
    object local;
    if (wro.TryGetTarget(out local)) { ... }
};

If the language itself had a keyword "weak" or something that could be used on variable declarations of reference types, then that would naturally translate to this proposal as well, e.g.

object o = ...;
Action a = [weak o]() => {
    object local = o;
    if (o != null) { ... }
};
@mikedn
mikedn commented Jan 28, 2015

+1 for the C++ syntax, I don't see anything wrong with it except that ref should be used instead of & as HaloFour already pointed out.

@tomasr
tomasr commented Jan 29, 2015

I don't mind the C++ syntax, though, honestly, I'm not sure I like the proposed syntax for By-Value Capture. I can't quite say what, but just doesn't seem very right to me (probably because it's way too verbose for more than a single variable).

I think the option of allowing attributes on lambdas is intriguing, but I don't see it necessarily as a replacement or alternative for explicit captures (as you mentioned, it could be used for other things). Based on that, I wonder if perhaps choosing a syntax for explicit captures that doesn't clash with the regular attribute syntax wouldn't be a better option, so that, if desired, lambda attributes could be introduced later on without causing conflicts (or making parsing uglier).

Thought experiment: Could the syntax for generics (covariance/contravariance) be reused to accomplish the same thing, using in/ref as hints for by-value/by-ref capture? Something like:

public static void ContainingMethod()
{
    int i = 42;
    int j = 50;
    Action a = <in i, ref j>() =>  // i captured by-value, j captured by-ref
    {
        Console.WriteLine(i);
        Console.WriteLine(j);
    };
}
@ryancerium

Action a = [var wro = new WeakReference(o)]() => {

@stephentoub That syntax has the strong advantage of looking like default parameters. I like it. You could probably elide the var if you wanted to closer match the current lambda parameter declarations.

@gafter gafter added the 1 - Planning label Feb 2, 2015
@MadsTorgersen MadsTorgersen was assigned by gafter Feb 2, 2015
@ryancerium

Tossing out wild syntax ideas this morning. Yes, I realize that b and c are unused in the lambda, I added them for idea-flinging.

// Doesn't capture anything from the enclosing scope
[](a) => a.Frob;
<>(a) => a.Frob;
{}(a) => a.Frob;

(a)[] => a.Frob;
(a)<> => a.Frob;
(a){} => a.Frob;

(a, noscope) => a.Frob;

// Captures a bunch of stuff from the enclosing scope in a variety of ways
[wro = new WeakReference(o), ref b, value c](a) => a.frob(wro); // C++ style
<wro = new WeakReference(o), ref b, value c>(a) => a.frob(wro); 
{wro = new WeakReference(o), ref b, value c}(a) => a.frob(wro); // Looks like an anonymous object with a delegate(?) call

(a)[wro = new WeakReference(o), ref b, value c] => a.frob(wro);
(a)<wro = new WeakReference(o), ref b, value c> => a.frob(wro); // Looks like a generic
(a){wro = new WeakReference(o), ref b, value c} => a.frob(wro); // Looks like an anonymous type

(a, wro = new WeakReference(o), ref b, value c) => a.frob(wro); // Looks like a lambda function
@MgSam
MgSam commented Feb 4, 2015

I don't like the idea of lambda capture lists or adding a value-capture syntax at all. There is a good reason C# didn't have this to begin with- it adds a lot of complexity without adding a lot of benefit.

Developers have been using implicit capture in C# lambdas successfully for many years now- I see absolutely no reason to suddenly add a bunch of ugly syntax into the language to make it explicit. Tools like ReSharper already warn you when you're doing an implicit capture- presumably a Roslyn analyzer will be able to do the same thing to bring this capability to everyone.

If there's really a demand, I could live with an optional attribute but even then it seems like language design work for a largely unnecessary feature. If you want to ensure there are no captures, make a separate method, the same as you have to do today.

Right now lambdas are simple, easy to use, and beautiful. Please don't turn this language into C++. If it ain't broke, don't fix it.

@mikedn
mikedn commented Feb 4, 2015

Developers have been using implicit capture in C# lambdas successfully for many years now

It seems that you forgot the foreach scope hack that was added because people just couldn't get how things work or just wished they work differently. And the problem still persists for for loops and can't be solved by hacking the for loop variable scope. Value capture solves this cleanly.

Please don't turn this language into C++.

That's a dubious request. C++ may have its (perceived) shortcomings but it also did a lot of things right and this is one of them. Copying a feature from C++ doesn't mean that C# suddenly transforms in C++.

@axel-habermaier
Contributor

@MgSam, @mikedn: Also, this feature would be completely optional for backwards compatibility reasons. If you don't like it, don't use it. I for one would find it useful, occasionally.

@HaloFour
HaloFour commented Feb 4, 2015

This was one of those features that when I saw the lambda specs for C++ I thought, "damn, that's a really nice idea." But I do agree that most of the time what the compiler does today (even with the foreach scope trick) is generally what people would intuitively expect so allowing it to remain optional (which it would have to be in order to not break existing code) is also a good thing.

As for the syntax being tossed around, most of it seems fine but I don't know how I feel about the inline copy-variable declarations. That just seems really verbose to me and could cause someone to lose track of exactly what they're reading since it feels like normal variable declarations. Does that buy anything aside from scoping the declared name to the lambda?

@giggio
giggio commented Feb 7, 2015

I agree with it being optional, and I don't think they have an option here, they have to make it optional so they don't break everyone's code. I like the idea of having more control and this is a very welcome feature.

@gafter gafter added 0 - Backlog and removed 1 - Planning labels Nov 20, 2015
@gafter gafter assigned stephentoub and unassigned MadsTorgersen Nov 20, 2015
@gafter gafter added 2 - Ready and removed 0 - Backlog labels Nov 21, 2015
@alrz
Contributor
alrz commented Mar 18, 2016 edited

I'd like to suggest using lambda capture list syntax to omit parameter list, repeating my comment from #9834:

// currently
Action      action = delegate() {};
Action<int> action = delegate(int arg) {};

Action      action = delegate {};
Action<int> action = delegate {};

// capture lists
Action      action = []() => {};
Action<int> action = [](arg) => {};

// omitted
Action      action = [] => {};
Action<int> action = [] => {};

void F(Action      action) {}
void F(Action<int> action) {}
void G(Action<int> action) {}

F(delegate {}); // ERROR
F([] => {});    // ERROR

G(delegate {}); // OK
G([] => {});    // OK
@fsacer
fsacer commented May 4, 2016

I would really like this feature, because it allows to limit scope of a function which is hard to achieve otherwise in C#. I would like to know if this is planned for any release of C#.

@alrz
Contributor
alrz commented Jan 19, 2017 edited

Can this be applied to local functions?

@eyalsk
eyalsk commented Jan 19, 2017

@alrz I think that it would be better to write a proposal so people could discuss it in the context of local functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment