Lambdas that modify their closure: &$* #833

wolfseifert · 2023-11-15T06:47:24Z

When porting lambdas that modify their closure from Cpp1

#include <iostream>
using namespace ::std;
int main() {

  auto x = 5;
  auto y = 7;

  auto v = [&] {
    auto z = x;
    x = y;
    y = z;
  };

  cout << "x = " << x << ", y = " << y << endl;
  v();
  cout << "x = " << x << ", y = " << y << endl;
}

to Cpp2 the syntax gets pretty ugly

main: () = {
  using namespace ::std;

  x := 5;
  y := 7;

  v := :() = {
    z := x$;
    x&$* = y$;
    y&$* = z;
  };

  cout << "x = (x)$, y = (y)$" << endl;
  v();
  cout << "x = (x)$, y = (y)$" << endl;
}

This &$* was the only way I found to make it work.

When doing the same for nested lambdas

#include <iostream>
using namespace ::std;
int main() {

  auto x = 5;
  auto y = 7;

  auto v = [&] {
    auto w = [&] {
      auto z = x;
      x = y;
      y = z;
    };
    w();
  };

  cout << "x = " << x << ", y = " << y << endl;
  v();
  cout << "x = " << x << ", y = " << y << endl;
}

it gets even worse

main: () = {
  using namespace ::std;

  x := 5;
  y := 7;

  v := :() = {
    x2 := x&$;
    y2 := y&$;
    w := :() = {
      z := x2$*;
      x2$* = y2$*;
      y2$* = z;
    };
    w();
  };

  cout << "x = (x)$, y = (y)$" << endl;
  v();
  cout << "x = (x)$, y = (y)$" << endl;
}

This nested lambda example may seem artificial, but it appeared when porting a real Cpp1 program over to Cpp2.

Is there any better syntax available for this? Am I missing something?

msadeqhe · 2023-11-15T06:59:46Z

That's it. It's the syntax of capture by pointer in Cpp2.

This topic is related to issue #247.
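For readers mapping this back to Cpp1, here is a minimal sketch of what "capture by pointer" amounts to, assuming the lowering shown later in this thread (x&$ becomes an init-capture of &x, and the trailing * dereferences it). The function and capture names are hypothetical, chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical helper, not from the thread: swaps x and y through a lambda
// that captures their addresses by value, roughly what x&$* spells in Cpp2.
inline void swap_through_pointer_captures(int& x, int& y) {
    auto v = [px = &x, py = &y] {  // px ~ x&$, py ~ y&$
        int z = *px;               // read x through the captured pointer
        *px = *py;                 // ~ x&$* = ...
        *py = z;                   // ~ y&$* = z
    };
    v();
}
```

The closure itself is never modified, only the objects it points to, which is why the lambda does not need to be mutable.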

AbhinavK00 · 2023-11-15T07:48:11Z

Cpp2 should have a feature for reference capture; maybe captures could even be defined in terms of parameter passing conventions. I don't know why Herb didn't want to support capture by reference. Maybe it was because of safety concerns, but the current idiom isn't safer in any way.

SebastianTroy · 2023-11-15T10:40:16Z

I've noticed the &$* syntax mentioned a couple of other times and really, really hated it, and just hoped it would not need to make an appearance in production code.

I don't think the additional capture block in Cpp1 was really all that bad. Could we add an optional capture block (before or after the parameter block) and then, as was mentioned before, use normal parameter passing syntax to define whether it is a copy or a reference?

myLambda : (copy y)$ (x) x + y;

The capture block makes it clearer, IMO, that values are being captured in a way that might leave them dangling unless you pay attention. Also, in the case where you use the same captured variable multiple times, you can't accidentally capture by copy in one place and by reference/pointer in another, unless you really meant to.

There's no reason to prevent normal functions from having a capture block. Honestly, I'd prefer to have to manually capture global variables in the few cases I use them, to prevent accidental modifications with wide-ranging side effects, and to make wide-ranging side effects really visible in the code.

I know a lot of people are pushing for terser and terser syntax at the moment, but for the most part, C++ is a complex language, not a scripting language, and I'd rather type a few more characters if it added nice signposts to my code, allowing me to parse it visually more easily, and flag important things where they are happening.

JohelEGP · 2023-11-15T13:42:42Z

I know a lot of people are pushing for terser and terser syntax at the moment, but for the most part, C++ is a complex language, not a scripting language, and I'd rather type a few more characters if it added nice signposts to my code, allowing me to parse it visually more easily, and flag important things where they are happening.

We can have the best of both worlds.
The lack of ceremony also means that what is relevant stands out more.
I'm a fan of :(x) x + y$, as the parameterized nature of the expression stands out more.

Also, post(size() == size()$ + 1) is fantastic.
You don't need to give size() a name and then use it.
And the o.type()$ in sfml_types.std::ranges::any_of(:(x) o.type()$.starts_with(x)) (#789 (reply in thread))
is performant; it is captured once and used N times.
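A rough Cpp1 sketch of the "captured once, used N times" point, assuming o.type()$ behaves like an init-capture evaluated when the function expression is created. The Object type, its call counter, and the helper name are hypothetical stand-ins, not taken from the linked discussion.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the object in the example; counts calls to type().
struct Object {
    mutable int type_calls = 0;
    std::string type() const {
        ++type_calls;
        return "sf::Vector2";
    }
};

// Returns true if o.type() starts with any of the given prefixes.
inline bool type_starts_with_any(const Object& o,
                                 const std::vector<std::string>& prefixes) {
    return std::any_of(
        prefixes.begin(), prefixes.end(),
        // t is initialized once, when the lambda is created (like o.type()$).
        [t = o.type()](const std::string& prefix) {
            return t.compare(0, prefix.size(), prefix) == 0;
        });
}
```

However many prefixes are tested, type() runs a single time; the algorithm may copy the closure, but that copies the stored string rather than calling type() again.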

SebastianTroy · 2023-11-15T14:51:49Z

:(x) x + y$ is certainly terse, but does it scan easily in a large codebase as a lambda? Also, this limits capture to copy only, or at least to whatever the author defines, rather than allowing the compiler to decide / optimise.

Take this made-up syntax, for example. Suppose we allowed adding captures to the parameter list to signal that they follow the same rules (in is the default, but is included to emphasise what would be possible):

:(x, in y$) x + y

It might require double naming, but it would give the author the ability to signal intent rather than implement specifics. Here the compiler can decide whether it is better to capture by reference or by copy. If I need to make sure the captured variables stay in scope, I can specify move or copy as required. Changing the name can also sometimes be beneficial:

:(x, in y$ = someReallyLongName) x + y / y ^ y

As to your mention of performance, I don't see how postfix $ is changing performance in your example. It is identical to Cpp1 [t = o.type()], and so would perform the same, right?

JohelEGP · 2023-11-15T14:55:24Z

As to your mention of performance, I don't see how postfix $ is changing performance in your example, it is identical to cpp1 [t = o.type()], and so would perform the same right?

That's right.
But you have to give it a name far from its use,
which isn't necessarily an improvement.

JohelEGP · 2023-11-15T15:22:31Z

  v := :() = {
    x2 := x&$;
    y2 := y&$;
    w := :() = {
      z := x2$*;
      x2$* = y2$*;
      y2$* = z;
    };
    w();
  };

Capture the function expression instead (https://cpp2.godbolt.org/z/zne45Pb6o):

main: () = {
  using namespace ::std;

  x := 5;
  y := 7;

  v := :() = {
    w := :() = {
      z := x&$*;
      x&$* = y&$*;
      y&$* = z;
    }$;
    w();
  };

  cout << "x = (x)$, y = (y)$" << endl;
  v();
  cout << "x = (x)$, y = (y)$" << endl;
}

Program returned: 0
x = 5, y = 7
x = 7, y = 5
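In Cpp1 terms, capturing the inner function expression looks roughly like the sketch below: the inner lambda takes its pointer captures where x and y are still in scope, and the outer lambda stores the finished inner closure by value. This is an illustration under that assumption, with hypothetical names; it is not the exact code cppfront emits.

```cpp
#include <cassert>

// Hypothetical helper: the outer lambda captures the inner closure (like }$),
// and the inner closure holds pointers to x and y (like x&$ and y&$).
inline void swap_through_nested_lambdas(int& x, int& y) {
    auto v = [w = [px = &x, py = &y] {
                  int z = *px;
                  *px = *py;
                  *py = z;
              }] {
        w();  // a const call is fine: w only writes through its pointers
    };
    v();
}
```

No intermediate x2/y2 variables are needed, because the pointers are taken once, at the point where the inner closure is built.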

JohelEGP · 2023-11-15T15:37:56Z

Also this limits capture to copy only, or at least be defined by the author, rather than allow the compiler to decide / optimise.

I recall reading the same comment on Cpp1 lambdas.
As Cpp2 demonstrates, explicit is better than implicit.
There are examples in Cpp1 where "leaving it to the optimizer" has not proven itself for ranges of users.
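I recall TDEH (https://quuxplusone.github.io/blog/2019/08/02/the-tough-guide-to-cpp-acronyms/#eh-tdeh) and HALO (https://quuxplusone.github.io/blog/2019/08/02/the-tough-guide-to-cpp-acronyms/#halo).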
Please, share the others.

SebastianTroy · 2023-11-15T16:15:39Z

You've just referenced two cases where the optimizer handles something, as proof that the optimizer isn't always useful?

I'm making the same argument for capture that has been made for parameter passing. I, the author, want to program my intent, not fill my code with symbols to handle the exact type shenanigans required to incant it...

JohelEGP · 2023-11-15T16:30:02Z

You've just referenced two cases where the optimizer handles something, as proof that the optimizer isn't always useful?

As proof that relying on optimizations to make a feature viable in contexts where performance matters isn't a recipe for success.
Another example is std::ranges with its runtime performance and debugging experience (https://github.com/tcbrindle/flux improves on this area).

I'm making the same argument for capture that has been made for parameter passing. I the author want to program my intent, not fill my code with symbols to handle the exact type shenanigans required to incant it...

OK, that's much better.
In that case, I can get behind the idea of :(x, in y$ = someReallyLongName) x + y / y ^ y.

JohelEGP · 2023-11-15T17:21:21Z

I'm making the same argument for capture that has been made for parameter passing. I the author want to program my intent, not fill my code with symbols to handle the exact type shenanigans required to incant it...

OK, that's much better.
In that case, I can get behind the idea of :(x, in y$ = someReallyLongName) x + y / y ^ y.

We already have the syntax in Cpp2.
We just need to make the terse forms the default.
See https://cpp2.godbolt.org/z/7aPcooxjG:

main: () = {
  i := 1;
  {  f := :(x) -> _ = { (j := i$) return x + j; };  assert(f(2) == 3);  }
//{  f := :(x)          (j := i$)        x + j;     assert(f(2) == 3);  }
//{  f := :(x) (j := i$) x + j;                     assert(f(2) == 3);  }
  {  f := :() = { (inout j := i&$*) { j++; j++; } };  f();  assert(i == 3);  i = 1;  }
//{  f := :()     (inout j := i&$*) { j++; j++; };    f();  assert(i == 3);  i = 1;  }
//{  f := :() (inout j := i&$*) {
//          j++;
//          j++;
//        };                                          f();  assert(i == 3);  i = 1;  }
  {  f := :() -> forward _ = { (inout j := i&$*) { j++; return j++; } };  assert(f()& == i&);  assert(i == 3);  i = 1;  }
//{  f := :() -> forward _     (inout j := i&$*) { j++; return j++; };    assert(f()& == i&);  assert(i == 3);  i = 1;  }
//{  f := :() -> forward _ (inout j := i&$*) {
//          j++;
//          return j++;
//        };                                                              assert(f()& == i&);  assert(i == 3);  i = 1;  }
}

:(x) -> _ = { (j := i$) return x + j; } can be defaulted to
:(x) (j := i$) x + j;, and formatted it is
:(x) (j := i$) x + j;.
:() = { (inout j := i&$*) { j++; j++; } } can be defaulted to
:() (inout j := i&$*) { j++; j++; }, and formatted it is
:() (inout j := i&$*) {
  j++;
  j++;
};.
:() -> forward _ = { (inout j := i&$*) { j++; return j++; } } can be defaulted to
:() -> forward _ (inout j := i&$*) { j++; return j++; }, and formatted it is
:() -> forward _ (inout j := i&$*) {
  j++;
  return j++;
};.

JohelEGP · 2023-11-15T18:30:57Z

That formulation doesn't really work for copy parameters.
You make a copy of the actual capture each invocation.
That means that you can actually modify it (i.e., it isn't a member of the function expression object).
See https://cpp2.godbolt.org/z/Y7a3d5boE:

  {  f := :(x) -> _ = { (copy j := i$) return x + j++; };  assert(f(2) == 4);  assert(f(2) == 4);  }
  {  f := :(x)                                x + i$++;    _ = f;                                  }
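The per-invocation copy can be sketched in Cpp1 as follows, assuming (copy j := i$) behaves like a fresh local initialized from the capture on every call. The helper name is hypothetical. Note that Cpp2's postfix j++ lowers to Cpp1's prefix ++j, as the generated code later in this thread shows.

```cpp
#include <cassert>

// Hypothetical helper: returns a closure that copies its capture into a
// fresh local on every call, so modifications never persist between calls.
inline auto make_adder_with_per_call_copy(int i) {
    return [i](int x) {
        int j = i;       // new copy of the capture on each invocation
        return x + ++j;  // modifies only the local copy
    };
}
```

Because j is rebuilt from the untouched capture each time, two calls with the same argument give the same result, matching the two identical assertions above.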

JohelEGP · 2023-11-15T18:43:29Z

Perhaps it'd be better to just allow function expressions to be parameterized just like blocks.
But those parameters appertain to the function expression object.
Note that "function" already implies a function parameter list.
A function expression is a value, and its parameters are part of that value.
So the parameters of a function (which might be an expression)
are different from the parameters of a function expression.
See https://cpp2.godbolt.org/z/PG4s7GxYY:

main: () = {
  i := 1;

  {  f := (copy j := i) :(x) x + j;  assert(f(2) == 3);  }
  {  f := (     j := i) :(x) x + j;  assert(f(2) == 3);  }

  {  f := (inout j := i) :()              = { j++;        j++; };         f();          assert(i == 3);  i = 1;  }
  {  f := (inout j := i) :() -> forward _ = { j++; return j++; };  assert(f()& == i&);  assert(i == 3);  i = 1;  }
}
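As a point of comparison, Cpp1 init-captures already attach named values to the closure object itself. The sketch below is one reading of how the proposed (copy j := i) and (inout j := i) forms might map, assuming copy corresponds to [j = i] and inout to [&j = i]; that mapping is an assumption, and the function names are hypothetical.

```cpp
#include <cassert>

// Assumed analogue of: (copy j := i) :(x) x + j
inline int add_with_value_capture(int i, int x) {
    auto f = [j = i](auto v) { return v + j; };
    return f(x);
}

// Assumed analogue of: (inout j := i) :() -> forward _ = { j++; return j++; }
inline int& bump_twice_by_reference(int& i) {
    auto f = [&j = i]() -> int& {
        ++j;
        return ++j;  // returns a reference to the original i
    };
    return f();
}
```

The reference form reproduces the assertions in the example: the returned reference has the same address as i, and i ends up at 3 when it starts at 1.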

AbhinavK00 · 2023-11-15T18:52:50Z

This one is a big change, so I'd really like to hear what Herb has to say, what his rationale was behind the original decision, and whether he thinks the changes proposed here are worth it.

But since it's the syntax talk, I'll just propose something.
This is something from Hylo. To mark mutation in Hylo, they use a prefixed ampersand like &var.
So, a simple way to signal an inout capture could be something like var&$ or var$& to signify mutation or address. I know this isn't very far off from the current status quo but it can be explained in a different way.

Also, this idea was originally for marking inout call-sites but now the keyword inout is just used there, so I'll just throw it out there.
To mark inout arguments, use &

x := 9;
f(x&, 8);  // first argument is inout
// works very well with UFCS
x&.f(3);
// instead of the current
(inout x).f(7);

Making mutation explicit helps and I just think it's a great thing that Hylo has.

Note: As for the symbol used, either another symbol could be used (~ comes to mind) or cpp2 can have something different for taking address of a variable (maybe just recommend using std::addressof) or have something similar to sizeof (like addressof or addr) that just lowers to (preferably) std::addressof.

Sorry, this got off-topic, but I just wanted to throw it out there.
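On the std::addressof suggestion in the note above: one well-known reason to prefer it over the built-in & in Cpp1 is that a class can overload unary operator&. A minimal sketch, with a deliberately hostile hypothetical type:

```cpp
#include <cassert>
#include <memory>

// Hypothetical type whose overloaded operator& hides the real address.
struct Weird {
    int value = 0;
    Weird* operator&() { return nullptr; }
};

// Returns true if the overload lies while std::addressof still finds the object.
inline bool addressof_sees_through_overload() {
    Weird w;
    return &w == nullptr && std::addressof(w) != nullptr;
}
```

An addressof-style operation that lowers to std::addressof would therefore behave consistently even for types like this one.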

JohelEGP · 2023-11-16T00:59:41Z

Or just take the hit of having one declaration per capture, with pointers for inout (https://cpp2.godbolt.org/z/qTbe5fEer):

main: () = {
  i := 1;
  _ = :() -> _ = {
    j := i&$;
    k := i$;
    return j*++ + k;
  };
}

But that is sub-optimal, as you make copies of the captures:

auto main() -> int{
  auto i {1};
  static_cast<void>([_0 = (&i), _1 = std::move(i)]() -> auto{
    auto j {_0};
    auto k {_1};
    return ++*cpp2::assert_not_null(std::move(j)) + std::move(k);
  });
}

A block parameter isn't better (https://cpp2.godbolt.org/z/36Esj66s9):

main: () = {
  i := 1;
  (j := i&)
    _ = :() -> _ = {
      k := i$;
      return j$*++ + k;
    };
}

auto main() -> int{
  auto i {1}; 
{
auto const& j = &i;

    static_cast<void>([_0 = std::move(i), _1 = j]() -> auto{
      auto k {_0}; 
      return ++*cpp2::assert_not_null(_1) + std::move(k); 
    });
}
}

I think this argues in favor of #833 (comment)
to just do the right, efficient thing.

msadeqhe · 2023-11-16T04:12:32Z

@JohelEGP I think having one declaration per capture as you have suggested, is the right way.

Cpp2 compiler can optimize and avoid the copy by making them a real alias.

In a short lambda body, the programmer can directly write &$* or $ in any expression:

var1: = 10;

call(: () = {
    print(var1&$* + var1&$*);
});

But in a long lambda body, the programmer can optionally make aliases to captures:

var1: = 10;

call(: () = {
    // We can use the same variable name `var1`, because it's in a different scope.
    var1: = var1&$*;

    print(var1 + var1);
});

Every var1 in the lambda body will be replaced (similar to a macro) with var1&$*, hence they are aliases.

If this optimization is not acceptable because the programmer expects var1 to have its own memory, a new syntax can be used to create aliases. But a variable declaration with == is constexpr, therefore it cannot be used in this case. Is that right?
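For comparison, Cpp1 can already express this kind of alias inside a lambda body with a local reference bound to the pointee. The sketch below assumes the pointer-capture lowering discussed in this thread; the function name is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: binds a local reference to the captured pointer's
// target, so later uses read the original variable without copying it.
inline int sum_twice_via_alias(int& var1) {
    auto f = [p = &var1] {
        int& alias = *p;       // a true alias, no copy of var1
        return alias + alias;  // ~ var1&$* + var1&$*
    };
    return f();
}
```

The reference costs nothing beyond the pointer that is already captured, which is the effect the proposed alias optimization is after.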

msadeqhe · 2023-11-16T04:54:35Z

For the alias syntax (in a way that the new name will be replaced with the old name, similar to a macro):

abc: namespace  alias = some::long::name;
v32: type       alias = std::vector<i32>;
fnc: (x) -> i32 alias = /* any expression */;
var: i32        alias = /* any expression */;

alias is a contextual keyword. alias declarations are similar to namespace, type, function and variable declarations, except that when we use them, they are replaced with their definition, similar to a macro.

For example:

var1: = 10;
// The type is optional in variable alias declarations.
var2: alias = var1 + 10;

print(var2);
// It generates the following Cpp1 code:
// print(var1 + 10);

Or something similar to this approach, to avoid the copy when we declare a variable to capture.

msadeqhe · 2023-11-16T07:07:20Z

abc: namespace  alias = some::long::name;
v32: type       alias = std::vector<i32>;
fnc: (x) -> i32 alias = /* any expression */;
var: i32        alias = /* any expression */;

In this example, function aliases are simply Forced Inline Functions in terms of Cpp1.

And variable aliases are like function aliases without parameters.
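One way to picture "variable aliases are like function aliases without parameters" in today's Cpp1 is a zero-argument lambda that re-evaluates its expression at every use. This is only an analogy for the proposed semantics, with hypothetical names; the alias keyword itself does not exist.

```cpp
#include <cassert>

// Hypothetical demonstration: var2 stands in for `var2: alias = var1 + 10;`
// and is re-evaluated on each use instead of storing a copy.
inline int alias_reevaluates() {
    int var1 = 10;
    auto var2 = [&var1] { return var1 + 10; };
    int first = var2();   // uses var1 == 10
    var1 = 30;
    int second = var2();  // uses var1 == 30
    return second - first;
}
```

The difference between the two uses tracks the change to var1, which a copied variable would not do.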

JohelEGP · 2023-11-16T14:23:20Z

@JohelEGP I think having one declaration per capture as you have suggested, is the right way.

Cpp2 compiler can optimize and avoid the copy by making them a real alias.

For some context, we're talking about these declarations j and k:

  i := 1;
  _ = :() -> _ = {
    j := i&$;
    k := i$;
    return j*++ + k;
  };

I did consider this, and thought of two things.

First, the need to prove that it doesn't make a performance difference.
It certainly does for k, which is initialized from the capture i$ each call.

Second, the compiler could recognize that up-front variables are captures.
But that's a semantic change without a change in syntax.
Moving the declaration of k after some non-declaration statement still copies each call.

That said, your comment made me think that the status-quo might be fine.
So today you can continue to write (https://cpp2.godbolt.org/z/aMKnz8PPE):

main: () = {
  i := 1;
  (k := i)
    _ = :() -> _ = {
      j := i&$;
      return j*++ + k$;
    };
  (k := i, j := i&)
    _ = :() -> _ = {
      return j$*++ + k$;
    };
}

Alias the variables before capturing them.
Rely on compiler optimizations for variable declarations in the body.

However, I still think (copy k := i, j := i&) :() is strictly superior (#833 (comment)):

- SWYM.
- No awkward distance between the block parameters and the function expression, which might be necessary due to the shape of the code.
- Only this one handles copies of complex expressions well. Unless you're OK with writing k := :() long*.expression()$; outside and k$() inside the function expression.

It looks to me that not having a dedicated place for declaring captures
can lead you to look for workarounds that might not work as you expect.
Especially for copies of complex expressions.
You might want to use those more than once in the body.
The only way to avoid copying twice is as with k$() above.

Copies become more important once you can have this parameters on function expressions (https://cpp1.godbolt.org/z/sPhTb9ahr).
So I think we might really need a solution like #833 (comment).

I certainly like the current status of capturing at the point of use.

filipsajdak · 2023-11-16T18:20:37Z

My favorite grawlix idiom

wolfseifert · 2023-11-17T06:38:57Z

Closing in favour of #247.

JohelEGP · 2023-12-07T02:06:38Z

That formulation doesn't really work for copy parameters.
You make a copy of the actual capture each invocation.
-- #833 (comment).

This is no longer the case since commit 4bd0c04.
Though I have yet to update to use it, it seems like a promising change in conceptual model.

my conceptual model of captures is actually closer to them being additional local variables that are "static" (persistent across calls), and local variables are non-const by default.
-- 4bd0c04#commitcomment-134056070.
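The "persistent across calls" model quoted above matches what a Cpp1 mutable lambda does with its captures. A minimal sketch, assuming that is the lowering now used for function expressions that modify a copy capture; the helper name is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: the capture n lives in the closure object, so each
// call sees the value left behind by the previous call.
inline auto make_counter(int start) {
    return [n = start]() mutable { return ++n; };
}
```

Each closure carries its own state: calling one counter twice yields consecutive values, while a second counter made from the same start value begins again independently.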

See comment thread for previous commit: 4bd0c04

wolfseifert added the suggestion label Nov 15, 2023

wolfseifert closed this as completed Nov 17, 2023

JohelEGP referenced this issue Dec 7, 2023

Emit anonymous functions as constexpr or mutable lambdas

b9e73e0

See comment thread for previous commit: 4bd0c04

JohelEGP mentioned this issue Dec 8, 2023

[BUG] Can't capture outer variable in nested function expression #880

Open

JohelEGP mentioned this issue Jan 5, 2024

[BUG] Can't interpolate captured function call #838

Open
