Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Split the `||` unboxed closure form into two forms and remove the capture clauses and self type specifiers #231

Merged
merged 4 commits into from Sep 18, 2014

Conversation

Projects
None yet
@pcwalton
Copy link
Contributor

pcwalton commented Sep 9, 2014

No description provided.

@bgamari

This comment has been minimized.

Copy link

bgamari commented Sep 10, 2014

@reem

This comment has been minimized.

Copy link

reem commented Sep 10, 2014

Is there a reason this distinction can't be made with || and ref ||? that seems much clearer than proc() , which is very random.

@pcwalton

This comment has been minimized.

Copy link
Contributor Author

pcwalton commented Sep 10, 2014

The problem with ref || is that it's much more common than ||, so Huffman coding suggests that by-reference captures should be shorter.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Sep 10, 2014

@reem Also, ref isn't quite right -- it's not that all captures are by reference, but rather that the closure is claimed not to escape, and therefore the capture mode is inferred based on the body.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Sep 10, 2014

@reem To clarify: the new semantics for || closures here is not the same as ref || as previously proposed.


Having to specify `ref` and the capture mode for each unboxed closure is inconvenient (see Rust PR rust-lang/rust#16610). It would be more convenient for the programmer if, the type of the closure and the modes of the upvars could be inferred. This also eliminates the "line-noise" syntaxes like `|&:|`, which are arguably unsightly.

Not all knobs can be removed, however: the programmer must manually specify whether each closure is escaping or nonescaping. To see this, observe that no sensible default for the closure `|| (*x).clone()` exists: if the function is nonescaping, it's a closure that can be called multiple times and returns a copy of `x` every time; if the function is escaping, it's a closure that can be called once and returns a copy of `x`.

This comment has been minimized.

@huonw

huonw Sep 10, 2014

Member

I don't understand this? Why is it only once-callable if it escapes? It seems that the following works fine with our current closure scheme:

fn foo<'a, T: Clone>(x: &'a T) -> impl Fn<(), T> + 'a {
     |&:| (*x).clone()
}

(Using abstract return types & capture by value.)

This comment has been minimized.

@huonw

huonw Sep 10, 2014

Member

And, even something like

fn foo<T: Clone>(x: T) -> impl Fn<(), T> {
    |&:| x.clone()
}

can be called multiple times, despite moving x into the closure.

This comment has been minimized.

@aturon

aturon Sep 10, 2014

Member

@huonw I think this was a mistake in the RFC writeup. The intent was to allow escaping closures to infer the closure type.

In particular, although proc always moves in its upvars, it may still only use them by-ref, and hence can be a Fn.

@reem

This comment has been minimized.

Copy link

reem commented Sep 10, 2014

I'm mostly just against using proc(), since it's so different than || and that just leads to confusion.


1. Any variable which is closed over and borrowed mutably is by-reference and mutably borrowed.

2. Any variable of a type that does not implement `Copy` which is moved within the closure is captured by value.

This comment has been minimized.

@huonw

huonw Sep 10, 2014

Member

So, the only distinction between Copy and non-Copy types that are used by-value inside the closure is the size of the closure itself? (i.e. non-Copy types are stored inline in the closure environment but Copy types only have a reference.)

Is this a special case that's actually worth having? It seems like "anything used by-value is captured by value" isn't a crazy rule?


The `ref` prefix for unboxed closures is removed, since it is now essentially implied.

The `proc()` syntax is repurposed for unboxed closures. The value returned by a `proc()` implements `FnOnce`, `FnMut`, or `Fn`, as determined above; thus, for example, `proc(x: int, y) x + y` produces an unboxed closure that implements the `FnOnce(int, int) -> int` trait. Free variables referenced by a `proc` are always captured by value.

This comment has been minimized.

@huonw

huonw Sep 10, 2014

Member

Why is this FnOnce? Shouldn't that proc be Fn?

This comment has been minimized.

@pcwalton

pcwalton Sep 10, 2014

Author Contributor

All Fn implicitly also implements FnOnce (since if you can call something multiple times you can call it once).

This comment has been minimized.

@huonw

huonw Sep 10, 2014

Member

Right, so it seems like it's more correct to state "produces an unboxed that implements the Fn(int, int) -> int trait" since this is the most specific statement: every closure implements FnOnce, but only a limited subset implement FnMut and an even smaller subset of this implement Fn.

@pcwalton

This comment has been minimized.

Copy link
Contributor Author

pcwalton commented Sep 10, 2014

@huonw @aturon Corrected.

@pcwalton

This comment has been minimized.

Copy link
Contributor Author

pcwalton commented Sep 10, 2014

@huonw Corrected example.

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Sep 10, 2014

I'd prefer different syntax for proc (e.g. own |x| x). proc is and has always been a terrible keyword/name for these things imo. I also would prefer the same rule simplification suggested by huonw.

I don't understand how the trait bound syntax sugar can possibly work. Is it going to be <ident>(a, b, c) -> d going to <ident><(a, b, c), d> for any <ident>? That seems like it would enable some very interesting syntax sugar abuse. Or are Fn etc going to become keywords?

To that end, I'd prefer to keep the current sugar for trait bounds.

@pcwalton

This comment has been minimized.

Copy link
Contributor Author

pcwalton commented Sep 10, 2014

@SiegeLord The grammar will be path '(' args ')' ('->' ret). The parser will accept any path, but the typechecker will restrict the traits to the built-in lang items.

I strongly prefer the path syntax, because |&:| is far too line noisy.

@pcwalton

This comment has been minimized.

Copy link
Contributor Author

pcwalton commented Sep 10, 2014

Not only that, |&: is really confusing when |&: will not be allowed as closure construction syntax and when proc is also a valid way to create closures. Honestly I see keeping the current trait sugar syntax as a non-starter.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Sep 10, 2014

My first reaction was that || and proc() overlap a bit too much and having to switch between them depending on how you use your captures may be confusing.

I am aware that by-value by default and by-ref with sugar was attempted and later rejected.
But given that || with forced moving out of captures can be used to emulate proc(), was "by-ref by default, move out of captures for by-value" considered at all?

It would be interesting to know how many of the old proc() uses move out of their captures, and would work out of the box with this new ||.
Even if they don't work, it doesn't have to be this bad:

let (tx, rx) = channel();
let data = arc.clone();
spawn(|| {
    let tx = tx;
    let data = data;
    tx.send(data.crunch());
});

What about

macro_rules! take { ($($x:ident),+) => { $(let $x = $x;)* } }
let (tx, rx) = channel();
let data = arc.clone();
spawn(|| {
    take!(tx, data);
    tx.send(data.crunch());
});

Or

let (tx, rx) = channel();
let data = arc.clone();
spawn(|| { take!(tx, data);
    tx.send(data.crunch());
});
@huonw

This comment has been minimized.

Copy link
Member

huonw commented Sep 10, 2014

FWIW, it's already a lot of effort to capture (shared) things in a task-spawning closure, e.g.

let data = Arc::new(...);
let flag = Arc::new(Mutex::new(...));
let (tx, rx) = channel();

for i in range(...) {
    let data = data.clone();
    let flag = flag.clone();
    let tx = tx.clone();
    spawn(|| {
        // use data, flag, tx
    })
}

The let lines inside the loop are unnecessary in many/most other languages, and adding yet-more boilerplate to that seems sad. But maybe a few nice macros means the advantages are worth it?

(Also, fwiw, the macro would have to expand to a single let, via a tuple match like let ( $( $x, )* ) = ( $( $x, )* );, since multiple statements are not allowed.)

@pcwalton

This comment has been minimized.

Copy link
Contributor Author

pcwalton commented Sep 10, 2014

BTW, own || is not a bad syntax in lieu of proc, IMO.

@netvl

This comment has been minimized.

Copy link

netvl commented Sep 10, 2014

Looks really great, though proc() is bad syntax IMO. Maybe just repurpose |:| instead? So

|x, y| x + y

is non-escaping closure and

|:x, y| x + y

is escaping one. This syntax is neat and not alien like proc(). The possible downside is that it is too similar to non-escaping closures, however, I think this is a good thing because these kinds of closures are much more similar than closures and procs are today.


2. Any variable of a type that does not implement `Copy` which is moved within the closure is captured by value.

3. Any other variable which is closed over is by-reference and immutably borrowed.

This comment has been minimized.

@netvl

netvl Sep 10, 2014

Does this mean (taking 1st and 2nd items into account) that Copy-types are always borrowed by reference? I think this is strange restriction, I certainly wouldn't expect such behavior. Everywhere else Copy types are always implicitly copied, as far as I know.

@Valloric

This comment has been minimized.

Copy link

Valloric commented Sep 10, 2014

proc() looks too different from || for no good reason. I don't care much what's the final design as long as the two forms look similar (since they are similar). own || sounds decent enough.


The `ref` prefix for unboxed closures is removed, since it is now essentially implied.

The `proc()` syntax is repurposed for unboxed closures. The value returned by a `proc()` implements `FnOnce`, `FnMut`, or `Fn`, as determined above; thus, for example, `proc(x: int, y) x + y` produces an unboxed closure that implements the `Fn(int, int) -> int` trait (and thus the `FnOnce(int, int) -> int` trait by inheritance). Free variables referenced by a `proc` are always captured by value.

This comment has been minimized.

@P1start

P1start Sep 10, 2014

Contributor

and thus the FnOnce(int, int) -> int trait by inheritance

The traits don’t inherit each other at the moment, but it was discussed as an alternative in the unboxed closures RFC. Should the inheritance between traits be added to this RFC?

@pcwalton

This comment has been minimized.

Copy link
Contributor Author

pcwalton commented Sep 11, 2014

Another suggestion by @aturon is move ||.

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Sep 11, 2014

Not only that, |&: is really confusing when |&: will not be allowed as closure construction syntax

The RFC could be amended to allow that in the closure construction syntax, as a way to hint/override inference. This is not unprecedented (e.g. with number literals suffixes). Line-noise is the only self-consistent argument here, I think. Then again, I don't feel super-strongly as I despise the || syntax in every location.

Another suggestion by @aturon is move ||.

That crossed my mind, but moving Copy types is nonsensical.

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Sep 11, 2014

A move and copy are exactly the same thing under the hood (all by-value uses of a value are semantically a shallow byte-copy); the only difference is the source is disallowed from further use for non-Copy types. Hence, it's not complete nonsense to talk about by-value uses as "moves" with a few special cases where you can continue to use the value you moved from.

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Sep 11, 2014

The very reason why 'move' vs 'copy' distinction exists is to be able to use both terms without introducing special cases.

few special cases where you can continue to use the value you moved from

In writing that sentence you completely devalued the word 'move' as a term. If I can't use 'move' to unambiguously mean the move of ownership without qualifying whether I use a move move vs a copy move, why even introduce the distinction?

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Sep 11, 2014

I personally think we emphasise the difference far too much (and end up with unnecessary complications the other way), and people end up thinking at least one of an "implicit copy" or "move" is magical, but neither are and they're actually essentially identical. I try to use "by-value use" for the umbrella term but that is not appropriate for this closure syntax (e.g. by_value || is quite wordy, and value || is reserving a keyword that would be very upsetting to lose as an identifier).

In most cases, people don't care that a by-value use transfers ownership, other than the fact that the compiler complains. That is, transferring ownership is rarely an ends in itself, just a necessary property required for safety, hence there isn't actually that much value in having strict rules about not connecting "implicit copies" and ownership transfers (other than pedagogy, but we don't have good pedagogy around this right now anyway IMO).

why even introduce the distinction?

Yes, that's my point: we make too big a deal out of the distinction.

@alexcrichton alexcrichton force-pushed the rust-lang:master branch from 6357402 to e0acdf4 Sep 11, 2014

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Sep 13, 2014

I like most of the ideas in here (which is not surprising...). After thinking about it a bit, the proposed capture mode rules for non-escaping closures also make sense to me. Would it be fair to summarize these as "every free variable is captured by reference, except when this would prevent the body of the closure from being legal, in which case the variable is captured by value instead"?

(Like others, I'm not sure if it makes sense to consider Copy here; but I also don't see much harm. When does it make a difference? If it doesn't make much difference either way (as I suspect), then it would probably be preferable to keep it simple and not consider it.)

On to important matters: bikeshedding the syntax of escaping closures!

As there aren't many hard constraints in the current language, both of my considerations relate to compatibility and consistency with potential future language constructs.

First of all, I think it would be nice at some point to grow support for unboxed trait object literals of any trait, not just Fn*. Because the exact same considerations with respect to capture modes are going to recur here as are discussed in this RFC, this argues for having some kind of modifier (e.g. move) on the existing closure syntax to denote escaping closures, instead of having two entirely distinct syntaxes (proc). The same modifier could then be applied to trait object literals of other traits to mean the same thing.

Second, I think move would be a fine choice of name for this modifier, and I don't mind move becoming a keyword because I think we should also add &move references at some point. Unfortunately this is also a problem, because it leads to ambiguity: is &move |x| x a shared reference to an escaping closure or an &move reference to a non-escaping closure? (I don't have any ideas yet about the best way to move forward here.)

(Incidentally, if we already had &move references, for consistency we would presumably have non-escaping closures capture their moved free variables by &move reference rather than by value; but I'm not sure if this would make much difference in practice.)

@Gankro

This comment has been minimized.

Copy link
Contributor

Gankro commented Sep 16, 2014

I have a strong feeling this would lead to ambiguous/hard parsing issues, but I'll mention it anyway just in case:

proc(x, y, z) could be changed to ||x, y, z||. This makes it very similar to, but easily distinguishable from |x, y, z|, and provides a sense of it just being a "heavy duty" closure. The fact that || is a valid by-ref closure seems problematic though.

Edit: maybe if you require writing || as | |? Could see that leading to some frustrating errors, though.

@aturon aturon force-pushed the rust-lang:master branch from 4c0bebf to b1d1bfd Sep 16, 2014

@netvl

This comment has been minimized.

Copy link

netvl commented Sep 17, 2014

Once again I want to suggest re-purposing |:| syntax for escaping closures. Its advantage is that it does not require a new keyword and it is very lightweight. This syntax won't be needed anyway when this change lands.


Having to specify `ref` and the capture mode for each unboxed closure is inconvenient (see Rust PR rust-lang/rust#16610). It would be more convenient for the programmer if, the type of the closure and the modes of the upvars could be inferred. This also eliminates the "line-noise" syntaxes like `|&:|`, which are arguably unsightly.

Not all knobs can be removed, however: the programmer must manually specify whether each closure is escaping or nonescaping. To see this, observe that no sensible default for the closure `|| (*x).clone()` exists: if the function is nonescaping, it's a closure that returns a copy of `x` every time but does not move `x` into it; if the function is escaping, it's a that returns a copy of `x` and takes ownership of `x`.

This comment has been minimized.

@P1start

P1start Sep 17, 2014

Contributor

Can this really not be inferred? With enough complexity in the compiler, it could default to always moving into closures unless the value moved into it is used after the closure is defined, in which case it could default to by-reference. Is the problem with this approach just that it would require an incredibly complex (and therefore probably buggy) implementation in the compiler, or is there another reason why this can’t be inferred?

This comment has been minimized.

@reem

reem Sep 17, 2014

It can probably be inferred, but the complexity means that this would be a very complex source of bugs for users refactoring code - I think it's similar to why function types are not inferred.

This comment has been minimized.

@P1start

P1start Sep 17, 2014

Contributor

I thought about that, but I don’t think it’d be a source of bugs at all—it’d be almost entirely invisible to the user. From what I can tell, the non-escaping closures in the RFC are accepted in every place the escaping ones are. The only difference is how they do things. I am merely proposing that the non-escaping closures move whenever possible, and that the now-redundant escaping closures be removed.

I suppose the problem with this method is that a lot (most?) of the time you don’t want to move the values into the closure. Perhaps the proposed non-escaping closures could stay as-is, and the escaping ones could be removed completely. The only advantage I see in the escaping closures is that they move things by value more often, which often isn’t even an advantage. Choosing by-ref by default is the most extensible option, as one can still opt in to the old behaviour; e.g., || { let foo = x.clone(); drop(x); foo }. Perhaps I should just rethink the RFC’s proposed escaping closures as sugar for the above for when that functionality is needed.

@pcwalton pcwalton merged commit 2e7023d into rust-lang:master Sep 18, 2014

@reem

This comment has been minimized.

Copy link

reem commented Sep 18, 2014

@pcwalton Are there meeting notes for the decision to merge this? Curious what was discussed.

@japaric

This comment has been minimized.

Copy link
Member

japaric commented Sep 18, 2014

@reem I think these are the last meeting notes about this topic.

@P1start

This comment has been minimized.

Copy link
Contributor

P1start commented Sep 18, 2014

  • pnkfelix: You mentioned the move and owned vars syntax, but hte RFC still mentions the proc syntax...
  • pcwalton: That was universally disliked. I will update the RFC before I merge it. The biggest thing right now is that I want to make sure we can land the Trait change.

The RFC was not updated to reflect this decision.

wycats pushed a commit to wycats/rust-rfcs that referenced this pull request Mar 5, 2019

Merge pull request rust-lang#231 from mike-lang/patch-1
Update issue link to go directly to where progress was tracked
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.