RFC: new lifetime elision rules #141

Merged
merged 4 commits into from Jul 9, 2014

@aturon
Member

aturon commented Jun 26, 2014

Rendered (draft)

text/

tracking issue: rust-lang/rust#15552

Note: the core idea for this RFC and the initial survey both came from @wycats.

* If there is exactly one input lifetime position (elided or not), that lifetime
is assigned to _all_ elided output lifetimes.
* If there are multiple input lifetime positions, but one of them is `&self` or

@nrc

nrc Jun 26, 2014

Member

I find this rule a bit surprising (the others make perfect sense). I can intuitively see the motivation that self ought to be privileged, but I can't really justify why that is so. Looking at the examples below, the ones using this rule took me a lot longer to grok.

@wycats

wycats Jun 26, 2014

Contributor

The rationale is that in several usage surveys, this was essentially the only pattern we saw when &self was involved.

I believe that the reason for this is that when you're borrowing something out of self, it makes sense to involve another ref for computation. In contrast, it's a very unusual pattern to borrow something out of a value as a method of some other object. It's just not really how people think about using methods and objects in general, so it doesn't happen (almost at all).

I suspect that in cases where this pattern could occur, people use standalone functions instead of methods.
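For illustration, here is a minimal sketch of the pattern described above, written in present-day Rust syntax rather than the 2014 dialect used in this thread. `Config`, `lookup`, and `lookup_with_temp_key` are hypothetical names, not taken from the survey. The method borrows out of `self` and uses a second reference only for computation, so the elided output lifetime is tied to `self`:

```rust
/// Hypothetical type, used only to illustrate the `&self` rule.
struct Config {
    entries: Vec<(String, String)>,
}

impl Config {
    // Elided form. Under the `&self` rule this is read as:
    // fn lookup<'a, 'b>(&'a self, key: &'b str) -> Option<&'a str>
    fn lookup(&self, key: &str) -> Option<&str> {
        self.entries
            .iter()
            .find(|(k, _)| k.as_str() == key)
            .map(|(_, v)| v.as_str())
    }
}

/// The returned borrow depends only on `config`, so the temporary
/// key can be dropped before the result is used by the caller.
fn lookup_with_temp_key(config: &Config) -> Option<&str> {
    let key = String::from("name"); // dropped at the end of this function
    config.lookup(&key)
}
```

Because the result is not tied to `key`, `lookup_with_temp_key` can return it even though `key` is a local.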

@bstrie

bstrie Jun 26, 2014

Contributor

@wycats, what proportion of the cited 87% would be lost if this rule were not accepted? I don't personally object to it, but I can see how it's a bit more flimsy than the others, and I would be willing to live without it if the statistics bore it out.

@bill-myers

bill-myers Jun 26, 2014

I think that it should use the lifetime of the first input parameter, regardless of whether it is self or not, and only if it is an elided lifetime.

This avoids issues with UFCS and makes method and non-method functions work the same.

Supporting elision of lifetimes only in the return value when they are explicit on self seems a bad idea, since it is counterintuitive. Also, it doesn't work for multiple explicit lifetimes (e.g. &'a Block<'b>).

https://gist.github.com/aturon/da49a6d00099fdb0e861
# Drawbacks

@nrc

nrc Jun 26, 2014

Member

Another drawback: I find full specification of lifetime parameters makes it easier to understand what is going on. Even today, I often write the lifetimes where they could be elided because I think it makes code easier to reason about if you can name things. If I have a lifetime error, the first thing I do is add explicit lifetimes wherever they are missing.

I get the impression I'm in the minority with this though.

To me, these extra rules trade easier reading (and writing) when you don't need to think about lifetimes too much against greater cognitive overhead when you do have to think about them. I guess that since reading code is more common than debugging lifetime errors, this trade-off is worthwhile. I certainly like the idea of reducing lifetime noise.

@huonw

huonw Jun 26, 2014

Member

👍 to full specs making things clearer.

@wycats

wycats Jun 26, 2014

Contributor

@nick29581 It might be worth considering having the compiler optionally show you all of the inferred lifetimes when there are error messages that involve lifetimes: rustc --errors=expanded or something.

That said, I think the error message improvements in this proposal go a long way to making it obvious what has happened when you inappropriately elided a lifetime. Similar error message work around other lifetime errors would go a long way to improving the general ergonomics of explicit lifetimes as well, and we should work on that!

@bstrie

bstrie Jun 26, 2014

Contributor

@nick29581, that same argument can be made for type inference. Just like type inference, nothing is stopping you from being fully explicit with lifetimes if you deem it's better for readability.

active/0000-lifetime-elision.md
fn get_str<'a>() -> &'a str;
```
The ellision rules work well for functions that consume references, but not for

@chris-morgan

chris-morgan Jun 26, 2014

Member

s/ellision/elision/

@kballard

kballard Jun 26, 2014

Contributor

👍

* For `impl` headers, input refers to the lifetimes appears in the type
receiving the `impl`, while output refers to the trait, if any. So `impl<'a>
Foo<'a>` has `'a` in input position, while `impl<'a> SomeTrait<'a> Foo<'a>`
has `'a` in both input and output positions.

@chris-morgan

chris-morgan Jun 26, 2014

Member

I think the word “for” is missing from the second example. It’s not an obvious example of where the lifetimes are, either; it could be rewritten as the probably-fairly-nonsensical “impl<'a, 'b> SomeTrait<'a> for Foo<'b> has 'a in [the] output position and 'b in [the] input position”.

(As for the “the”, I think that should be there in all these cases, or “an” as the case may be in some places. This affects much of the document.)

@huonw

huonw Jun 26, 2014

Member

I'm nervous about adding elision for output parameters since I'm slightly concerned that may make things less clear (a minor adjustment to a signature that otherwise compiles would make the compiler spew weird errors), but I am in favour of elision in input position in impl, that is:

impl BufReader { ... }
impl Reader for BufReader { ... }
impl Reader for (&str, &str) { ... }

@chris-morgan

chris-morgan Jun 26, 2014

Member

@huonw Do you mean output parameters in general, or just in impls?

@glaebhoerl

glaebhoerl Jun 26, 2014

Contributor

  • If there are multiple input lifetime positions, but one of them is &self or &mut self, the lifetime of self is assigned to all elided output lifetimes.

I don't like this rule. The other rules have the property that there's no other way the signature could possibly make sense: i.e., the desugaring is unambiguous. Here we're making an arbitrary choice. I don't think we should do that.

Subtlety for non-& types

There's an additional subtlety: lifetime parameters of & types are covariant. For other types, they may not be. For instance:

struct Callback<'s> {
    callback: fn(&'s str) -> int,
}

fn some_fn(cb: Callback) -> &str;

// Under proposed rules desugars to:
fn some_fn<'s>(cb: Callback<'s>) -> &'s str;

Here Callback has a contravariant lifetime parameter. And the desugaring doesn't make sense, because there's no way you can get something with a lifetime of 's out of a Callback<'s>; you can only "put one in". In other words, Callback's lifetime parameter is in an output position.

If you just take that into account when applying the rules, then I think they would keep working. But I'm not sure what the situation is with invariant or bivariant lifetime parameters, because I haven't thought about it yet.
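The variance point can be checked with a small sketch in present-day Rust. This is an illustration under assumptions, not part of the RFC: `usize` replaces the 2014 `int`, and `byte_len`, `widen`, and `shorten` are hypothetical helpers. It shows the lifetime parameter of `Callback` behaving in the opposite direction to that of `&str`:

```rust
/// Present-day spelling of the `Callback` struct from the comment above.
struct Callback<'s> {
    callback: fn(&'s str) -> usize,
}

/// Hypothetical callback body: works for strings of any lifetime.
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Accepted because `Callback<'s>` is contravariant in `'s`: a callback
/// that can take short-lived strings can also take `'static` ones.
fn widen<'s>(cb: Callback<'s>) -> Callback<'static> {
    cb
}

/// By contrast, `&'s str` is covariant: a long-lived reference can be
/// used where a shorter-lived one is expected.
fn shorten<'s>(s: &'static str) -> &'s str {
    s
}
```

Nothing of lifetime `'s` can be read out of a `Callback<'s>`; values of that lifetime can only be passed in, which is the sense in which the parameter sits in an output position.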

@glaebhoerl

glaebhoerl Jun 26, 2014

Contributor

OK, so in plain English, I think the rule should be: If there's exactly one readable lifetime and N writable ones, all the writable lifetimes are assumed to be the same as the readable one. Lifetime parameters in covariant position are readable, in contravariant writable, invariant both, bivariant neither.

@wycats

wycats Jun 26, 2014

Contributor

@huonw I think the proposed error messages will go a long way to avoid "the compiler spewing weird error messages", no?

@pcwalton

pcwalton Jun 26, 2014

Contributor

I was originally a bit nervous about this sort of thing, but now I have no objections.

I'm slightly more nervous about the self thing, but I'm fine with trying it and seeing how it goes. I think that the "suggest-a-lifetime" error messages that we now have make this sort of thing easier to deal with.

can avoid writing any lifetimes in ~87% of the cases where they are currently
required.
Doing so is a clear ergonomic win.

@steveklabnik

steveklabnik Jun 26, 2014

Member

This is the biggest part of this proposal for me. (well, combined with the data that shows that it is)

active/0000-lifetime-elision.md
fn get_str() -> &str;
```
become

@steveklabnik

steveklabnik Jun 26, 2014

Member

s/become/becomes/. And isn't this backwards? To elide is to remove, so the ones with the rules become the ones without the rules.

* When combined with a good tutorial on the borrow/lifetime system (which should
be introduced early in the documentation), the above should provide a
reasonably gentle path toward using and understanding explicit lifetimes.

@steveklabnik

steveklabnik Jun 26, 2014

Member

Yup, I care about this.

@steveklabnik

steveklabnik Jun 26, 2014

Member

A big 👍 from me. If the vast majority of code is doing something a certain way, then it's a good basis for making a rule. This should eliminate a lot of what is effectively boilerplate, and a good lifetimes tutorial / better errors will assist in the pedagogy sense.

Also, if you like the lifetimes, you can keep writing them.

@aturon

aturon Jun 26, 2014

Member

@glaebhoerl Great point about contravariance, which I hadn't thought about. I agree that a contravariant argument should not be considered as an input position.

Just to be clear, is the suggestion that contravariant positions swap the input/output distinction? (Which would be the typical type-theoretical thing to do.) Concretely, are you proposing that

fn some_fn(&self, cb: Callback) -> int;
fn other_fn(n: int) -> (&T, Callback);

expands to

fn some_fn<'a>(&'a self, cb: Callback<'a>) -> int;
fn other_fn<'a>(n: int) -> (&'a T, Callback<'a>);

The first case makes some sense, but the latter case is pretty surprising -- it would happen because the Callback's lifetime is considered an input position, and thus can establish the (sole) output position for &T.

We could also simply disallow eliding contravariant lifetimes, since it may be preferable to be explicit in those (rare) cases.

Finally, see @wycats's comment above re: the &self rule. It's not arbitrary: the &self parameter definitely plays a special role for methods, and the proposed rules are based on the most common patterns in the libstd corpus.

@bachm

bachm Jun 26, 2014

Just posting to express my support for this well-written RFC. With the proposed error messages there should be little confusion when a user first encounters unelidable lifetimes.

@glaebhoerl

glaebhoerl Jun 26, 2014

Contributor

the latter case is pretty surprising -- it would happen because the Callback's lifetime is considered an input position, and thus can establish the (sole) output position for &T.

Even thinking about this example makes my head hurt... I think the "logic" of it, as it were, is that when the caller of other_fn invokes the second component of the returned tuple, which is the Callback, with something of lifetime 'a, other_fn can then use that to "produce" the first component of the tuple, also of lifetime 'a? Obviously that couldn't physically work without a time machine.

One distinction that I noticed, and I'm not sure if it has significance, is that while the return type of a function f, and an argument of a function g which is f's parameter, are both output positions, f is required to return a value, but it's not required to call the callback g. Again, I'm not sure whether this has implications for how inference should work.

I basically agree with you that it seems reasonable-but-not-imperative to desugar your first example, but not so much the second one. I don't have any concrete rules in mind which might accomplish this.

Finally, see @wycats's comment above re: the &self rule. It's not arbitrary

To avoid getting caught up in debating the meaning of the word "arbitrary" (I wasn't assuming that you flipped a coin): For the first and second rules, there's only one way it can make sense. If the user were to explicitly annotate lifetimes, they would annotate the same ones we infer 100% of the time. For the third rule, there's more than one way it can make sense, and we'd be choosing to favor one of them. Even if our favoring rests on a stronger basis than a coin flip, I don't think this kind of "probably what you meant" inference is something we should be doing.

@aturon

aturon Jun 26, 2014

Member

@glaebhoerl Thanks for the thoughtful comments.

My feeling about &self is that these rules are not inference, but rather shorthand: they are a systematic way of filling in what's been left off of a signature without looking at the body.

The rules are simple enough that it's easy to know, given the signature in your head, whether you can elide or not.

Put another way, the debate is whether

fn foo(&self, t: &T) -> &U;

is simply not allowed/usable as a signature, or whether it has a useful meaning based on the most common lifetime patterns. Once you know the rules, you know immediately that the above would expand into

fn foo<'a,'b>(&'a self, t: &'b T) -> &'a U;

and would only write the elided signature if that's what you wanted.

FWIW, I disagree that the other rules give the only sensible expansion. Not even today's rules do. If you write

fn bar(t: &T, u: &U);

you get distinct lifetimes for the two parameters. But it can also make sense for them to share the same lifetime, and some uses would require it. In that situation, you know you can't leave off the lifetimes, and you write an explicit signature. I think the same would be true with the &self rule.
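A small sketch of this "shorthand" reading, in present-day Rust with hypothetical names (`Registry`, `first`, `checked`, `validate`): the elided method gets the `&self` expansion, and lifetimes are written out only when the result is meant to borrow from the other argument instead.

```rust
/// Hypothetical type, used only to contrast elided and explicit forms.
struct Registry {
    names: Vec<String>,
}

impl Registry {
    // Elided: read as
    // fn first<'a, 'b>(&'a self, _hint: &'b str) -> &'a str
    fn first(&self, _hint: &str) -> &str {
        self.names.first().map(|n| n.as_str()).unwrap_or("")
    }

    // Explicit: the result borrows from `candidate`, not from `self`,
    // so the shorthand does not apply and `'b` is written out.
    fn checked<'b>(&self, candidate: &'b str) -> Option<&'b str> {
        if self.names.iter().any(|n| n.as_str() == candidate) {
            Some(candidate)
        } else {
            None
        }
    }
}

/// The result of `checked` may outlive the registry it was checked against.
fn validate(candidate: &str) -> Option<&str> {
    let registry = Registry { names: vec![String::from("alice")] };
    registry.checked(candidate)
}
```

`validate` would be rejected if `checked` used the elided signature, since its result would then be tied to the local `registry`.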

The error case on `impl` is exceedingly rare: it requires (1) that the `impl` is
for a trait with a lifetime argument, which is uncommon, and (2) that the `Self`
type has multiple lifetime arguments.

@bstrie

bstrie Jun 26, 2014

Contributor

Does this example arise today in any known Rust codebase?

@aturon

aturon Jun 26, 2014

Member

@bstrie I don't know of any cases offhand, which is why the error message here is probably not so important.

already elided with today's rules._
The detailed data is available at:
https://gist.github.com/aturon/da49a6d00099fdb0e861

@bstrie

bstrie Jun 26, 2014

Contributor

Of the 13% of functions which still require explicit lifetimes, do any seem particularly notable for their nonconformity to the usual patterns? It would also be really great if you could select one of these real-world functions and use it in the example error message above.

@aturon

aturon Jun 26, 2014

Member

Almost all of the remaining cases are situations like:

impl<'a> AsciiCast<&'a[Ascii]> for &'a [u8] {
    unsafe fn to_ascii_nocheck(&self) -> &'a[Ascii] { ... }
    ...
}

where the impl involves types with lifetimes, and the fns within refer to those lifetimes directly. That counts against us in two ways:

  1. The impl header has to be annotated so that you can name the lifetime, even though it would otherwise follow the standard pattern, and
  2. The fn definitions have to be annotated to use the outer lifetime.

Note that this kind of example does not require an annotation according to the rules (so you wouldn't get an annotation error if you elided the lifetime). Rather, the annotation is needed to go beyond the patterns provided by the rule.
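The same shape can be sketched in present-day Rust. `Trimmed` is a hypothetical stand-in for `AsciiCast`, and `trim_via_temp_ref` is an illustrative helper. The point is that `&self` has type `&&'a [u8]`, and the annotation requests the outer lifetime `'a` rather than the shorter borrow of `self` that elision would select:

```rust
/// Hypothetical trait standing in for `AsciiCast` from the comment above.
trait Trimmed<T> {
    fn trimmed(&self) -> T;
}

impl<'a> Trimmed<&'a [u8]> for &'a [u8] {
    // Eliding the output lifetime would tie the result to the borrow of
    // `self`. Naming `'a` ties it to the underlying slice instead.
    fn trimmed(&self) -> &'a [u8] {
        let bytes: &'a [u8] = *self;
        match bytes.iter().position(|b| *b != b' ') {
            Some(start) => &bytes[start..],
            None => &bytes[bytes.len()..],
        }
    }
}

/// The result outlives the short borrow of the local `local`,
/// which only works because the method returns `&'a [u8]`.
fn trim_via_temp_ref(bytes: &[u8]) -> &[u8] {
    let local = bytes;
    local.trimmed()
}
```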

@aturon

aturon Jun 26, 2014

Member

@bstrie The other predominant case is:

fn difference<'a>(&'a self, other: &'a HashSet<T, H>) -> SetAlgebraItems<'a, T, H>;

where the two input lifetimes are required to match.

@glaebhoerl Take note -- this is a case where even the rules for input positions don't give you what you want.
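A reduced sketch of this case in present-day Rust, using hypothetical types (`Set`, `Diff`) in place of `HashSet` and `SetAlgebraItems`. The returned value stores both borrows, so the two input lifetimes have to be the same named lifetime:

```rust
/// Hypothetical view over two borrowed sets; both borrows share `'a`.
struct Diff<'a> {
    left: &'a [u32],
    right: &'a [u32],
}

impl<'a> Diff<'a> {
    /// Items of `left` that do not occur in `right`.
    fn to_vec(&self) -> Vec<u32> {
        self.left
            .iter()
            .copied()
            .filter(|x| !self.right.contains(x))
            .collect()
    }
}

/// Hypothetical stand-in for `HashSet`.
struct Set {
    items: Vec<u32>,
}

impl Set {
    // The elided form would give `other` its own lifetime and tie the
    // result to `self` only, so storing `other` in the result would fail.
    fn difference<'a>(&'a self, other: &'a Set) -> Diff<'a> {
        Diff { left: &self.items, right: &other.items }
    }
}
```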

of a lifetime parameter for a `struct`. There are also some good reasons to
treat elided lifetimes in `struct`s as `'static`.
Again, since shorthand can be added backwards-compatibly, it seems best to wait.

@bstrie

bstrie Jun 26, 2014

Contributor

Agreed, I'm fine with leaving structs as they are.

@bstrie

bstrie Jun 26, 2014

Contributor

Above I drew a comparison between lifetime elision and type inference, noting that people who prefer to be explicit are still welcome to annotate lifetimes manually. However, there is one thing that would support the people who make such a decision and improve teachability for newcomers: make the --pretty typed compiler flag annotate the elided lifetimes just as it annotates the inferred types (or you could make it an entirely separate flag, I suppose).

@rkjnsn

rkjnsn Jun 26, 2014

Contributor

Quick question:

Would it be feasible to handle the multiple-input case by having something like

fn frob(s: &str, t: &str) -> &'t str;

expand to

fn frob<'a, 'b>(s: &'a str, t: &'b str) -> &'b str;

While such a shorthand would be mostly orthogonal to the elision rules of this RFC, I bring it up because it seems like it could impact whether we want to treat self specially (the third rule of the RFC), since one would be able to write

fn args<T:ToCStr>(&mut self, args: &[T]) -> &'self mut Command

Also, I realize the lookup rules would take some consideration if this were to be implemented, since lifetimes and parameter names are currently in different namespaces.
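As a sketch of what the expanded form buys you: the `&'t str` shorthand itself is only a proposal and is not valid Rust, but the expansion can be written by hand. The body here is a hypothetical one, chosen only so that the result visibly borrows from `t` and not from `s`.

```rust
// Hand-written expansion of the proposed `fn frob(s: &str, t: &str) -> &'t str`.
// The result borrows from `t` only, so `s` may be dropped earlier.
fn frob<'a, 'b>(s: &'a str, t: &'b str) -> &'b str {
    // Hypothetical body: strip `s` from the front of `t` if it is a prefix.
    t.strip_prefix(s).unwrap_or(t)
}
```

Because the output is tied to `'b` alone, a caller can let the first argument go out of scope while still holding the result.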

glaebhoerl Jun 27, 2014

Contributor

@aturon You're right.

Now we have the interesting situation that you've shown that my stated arguments against "the self rule" are invalid, yet, for some reason, this hasn't convinced me to like it. Apparently, my stated arguments were not the real reason why it bothers me. When your subconscious is telling you something is wrong, it doesn't necessarily go into great detail about why, or which part...

I think a large part of it is because of the fact that I don't think we should semantically/syntactically distinguish the self argument in the first place, or even necessarily have a self keyword at all. (I have a proposal to this effect which I might hopefully have time to write down at some point in the next 5,000 years.) And here we're proposing to distinguish it in an additional way. When you write, "If there are multiple input lifetime positions, but one of them is &self", I read, "If there are multiple input lifetime positions, but one of them is the first argument, or is called "self""... I mean, maybe it holds up statistically, but statistically speaking, there are two popes per square kilometer in Vatican City. (Or currently, I suppose, four.)

krdln Jun 27, 2014

Contributor

@bstrie
I think that not only should --pretty typed reveal all lifetimes, but each compiler error involving lifetimes should also print the fully annotated function signature. This is a problem even now, when the compiler reports errors involving unnamed lifetimes.

aturon Jun 27, 2014

Member

@rkjnsn Some proposals along these lines have been made in the comments on #134 and it might be a reasonable design. However, I'd like to separate the question of when you need to write lifetimes from how you write the lifetimes. (I think we can make improvements on both.)

As I mentioned above, I think of the debate around the &self rule as being whether

fn foo(&self, t: &T) -> &U;

should be an error, or work usefully as shorthand based on the most common patterns.
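A small sketch of the "works usefully as shorthand" reading, as it behaves in present-day Rust where the `&self` rule was adopted. `Registry` and `lookup` are hypothetical names; the assumption is the common pattern in which the extra reference is only consulted and the result is borrowed out of the receiver.

```rust
struct Registry {
    default: String,
}

impl Registry {
    // Elided form; by the `&self` rule it reads as
    // `fn lookup<'a, 'b>(&'a self, key: &'b str) -> &'a str`.
    fn lookup(&self, key: &str) -> &str {
        // Hypothetical logic: `key` is only inspected, never returned.
        if key.is_empty() { "" } else { self.default.as_str() }
    }
}
```

Since the output is tied to `self` and not to `key`, the key can be a short-lived temporary.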

steveklabnik Jun 27, 2014

Member

I'd like to leave a 👍 from this Reddit thread, from a Rust newbie: http://www.reddit.com/r/rust/comments/298j3y/question_about_lifetime_parameters/

aturon Jun 27, 2014

Member

@glaebhoerl The counterpoint is that, like it or not, self is special in today's Rust, in conjunction with the special treatment of . for autoborrowing and the like. I think the proposed design fits current Rust idioms well. We could certainly revisit the rules if our general treatment of self changes.

erickt Jun 27, 2014

What about the case where a struct has one lifetime, and a method has another? For example:

struct Foo<'a> {
    x: &'a int,
}

impl<'a> Foo<'a> {
    fn bar<'b>(&'b mut self) -> &'b int {
        self.x
    }
}

fn main() {}

While I think that 'a should unify with 'b, I swear I've seen the case where they aren't identical. Unfortunately I can't think up a good example demonstrating that.

aturon Jun 27, 2014

Member

@erickt Using the proposed rules, you could elide all the lifetimes in that example impl and fn. The rules for methods do not take into account any enclosing impl lifetimes.

But there are examples where the methods in an impl do talk about the enclosing lifetime, e.g. https://github.com/rust-lang/rust/blob/master/src/libstd/ascii.rs#L203-L207 and these are the primary cases where annotation is required under the proposed rules.
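To make the two cases concrete, here is a sketch based on the `Foo` example above, in present-day syntax (`i32` stands in for the 2014 `int`, which no longer exists). `inner` is a hypothetical method added to show a signature that mentions the enclosing `'a`.

```rust
struct Foo<'a> {
    x: &'a i32,
}

impl<'a> Foo<'a> {
    // Fully elided: the result is tied to the borrow of `self`,
    // i.e. `fn bar<'b>(&'b mut self) -> &'b i32`.
    fn bar(&mut self) -> &i32 {
        self.x
    }

    // Mentions the enclosing `'a`: the result may outlive the borrow of
    // `self`, which elision would not express, so the annotation is required.
    fn inner(&self) -> &'a i32 {
        self.x
    }
}
```

The difference shows up at the call site: the reference returned by `inner` stays usable after the `Foo` value itself is gone, while the one from `bar` does not.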

impl StrSlice for &str { ... } // elided
impl<'a> StrSlice<'a> for &'a str { ... } // expanded
```

jfager Jun 30, 2014

A by-value arg is not a lifetime position, so the following is legal?

fn foo(a: &str, b: int) -> &str

That is, it would be possible to use multiple args and still have the lifetimes elided, right? I think the answer is yes, but it's not shown in these examples.
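A sketch of that case in present-day Rust, with `usize` standing in for the 2014 `int` and a hypothetical body. The by-value argument contributes no lifetime position, so the single-input rule applies.

```rust
// One input lifetime position (`a`); the by-value `b` does not count,
// so this elided form reads as `fn foo<'a>(a: &'a str, b: usize) -> &'a str`.
fn foo(a: &str, b: usize) -> &str {
    // Hypothetical body: return the first `b` bytes of `a` when that is a
    // valid slice, and all of `a` otherwise.
    a.get(..b).unwrap_or(a)
}
```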

aturon Jun 30, 2014

Member

@jfager Yes, that's right, and good point about the examples. Will update.

pnkfelix Jul 15, 2014

Member

Please also add examples of methods within impls where the implementing trait and/or type has lifetime parameters itself, just to underline the scenarios I brought up in my comment here

@schmee schmee referenced this pull request Jul 1, 2014

Closed

Switch `<>` back to `[]` #148

zwarich Jul 3, 2014

Contributor

One thing that I don't think has been mentioned here is unsafe code. Since unsafe code depends on properties that are not enforced by the type checker, it would be good to require explicit lifetime parameters for functions using unsafe code.

bstrie Jul 3, 2014

Contributor

@zwarich, it isn't obvious to me how having explicit lifetime names will make unsafe blocks any safer, since the lifetime names cannot be used for any purpose within the unsafe block itself. The function signature will still make it obvious that the arguments are references, which is the important thing.

bstrie Jul 4, 2014

Contributor

Though I'm in favor of it, the only odd thing about requiring explicit lifetimes on unsafe-containing functions is that you might be forced to annotate lifetimes on a trait method whose signature has elided them. I think that's probably okay, though, because if you can't puzzle out the lifetimes yourself then you're probably unqualified to be writing the unsafe code (and we should probably have a lint pass that would explicitly annotate the function for you anyway).

alexcrichton Jul 9, 2014

Member

This was discussed in yesterday's meeting, and the decision was to merge it.

@alexcrichton alexcrichton merged commit 7a459c9 into rust-lang:master Jul 9, 2014

glaebhoerl Jul 13, 2014

Contributor

@aturon To turn this around a bit:

I have the distinct impression that Rule 3 is qualitatively different from Rules 1 and 2 in some meaningful sense, even if I can't seem to put my finger on what, exactly, that is. Do you not have the same feeling? Perhaps you might have a better idea of what the difference could be?

aturon Jul 14, 2014

Member

@zwarich @bstrie My apologies for not responding to your comments about unsafe earlier. We discussed this in the meeting where the RFC was accepted, and the consensus was to not treat unsafe code specially.

This is seen as an improvement over our current situation, in which elision is allowed for signatures on unsafe code yet, when used on output lifetimes, almost always gives the wrong lifetime annotation! In other words, the new rules provide safer defaults.
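A sketch of the "safer defaults" point, under the assumption that the unsafe body fabricates a reference whose lifetime the compiler cannot check. `bytes_of` is a hypothetical helper; `std::slice::from_raw_parts` returns a slice with a caller-chosen lifetime, so the signature is the only thing constraining it.

```rust
// Elided form of `fn bytes_of<'a>(s: &'a str) -> &'a [u8]`.
fn bytes_of(s: &str) -> &[u8] {
    // SAFETY: the pointer and length come from a live `&str`, and the elided
    // signature keeps the returned slice from outliving that borrow.
    unsafe { std::slice::from_raw_parts(s.as_ptr(), s.len()) }
}
```

Under the new rules the elided output is bound to the input borrow, which is the annotation one would most likely have wanted here anyway.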

aturon Jul 14, 2014

Member

@glaebhoerl It's a good question, and others have voiced a similar feeling about the rules.

As we discussed earlier, rules 1 and 2 are not the "most general" or "only possible" ways to make sense of elided lifetimes. You sometimes need signatures like:

fn foo<'a>(arg1: &'a T, arg2: &'a U) -> &'a V

or

impl<'a> MyType<'a> {
    fn method(&self, arg: U) -> &'a V
}

I think a lot of people sensed a "qualitative difference" by initially assuming that rules 1-2 were fully general. But in fact, none of the rules are fully general; they all just cover the (vastly) common case/intuition.

That said, in the end I think it comes down to whether you see a difference between

fn frob(f: &Foo, b: &Bar) -> &Baz { ... }

and

impl Foo {
    fn frob(&self, b: &Bar) -> &Baz { ... }
}

Semantically, these two are identical. But Rust treats the method version as special, for example for auto-borrowing. Why?

Methods are a part of expressing OO idioms in Rust, and perhaps the most basic aspect of those idioms is the notion of a receiver. By making frob a method on Foo, we are saying that the Foo parameter plays a special conceptual role: it is the thing being acted on by the method, while the other parameters fill in the details of what the requested action is.

If you see things in those terms then rule 3 is as natural as the other rules: the method receiver generally provides the "ambient lifetime" that you care about.

@wycats did a further survey of places where rule 3 applies: https://gist.github.com/wycats/2957ea3090349640b417

The most common cases are indexing, or otherwise extracting some information from the method receiver, which then lives as long as the receiver does. This is the simplest and most common OO idiom, as it plays out in Rust.

All that said, if you disagree with the basic idea of self/receivers/methods, then rule 3 would certainly seem arbitrary. But the rule is designed for today's Rust and the OO-ish idioms it employs.
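The function/method contrast can be sketched as follows, as it plays out in present-day Rust. The `Foo`, `Bar`, and `Baz` definitions and both bodies are hypothetical fill-ins for the `{ ... }` in the signatures above; the assumption is that the result is extracted from the `Foo` value in both versions.

```rust
struct Foo { baz: Baz }
struct Bar;
struct Baz(u32);

// Free function: two input lifetime positions and no receiver, so no rule
// applies and the output lifetime has to be named.
fn frob<'a>(f: &'a Foo, _b: &Bar) -> &'a Baz {
    &f.baz
}

impl Foo {
    // Method: rule 3 picks the receiver's lifetime, so nothing is written.
    fn frob(&self, _b: &Bar) -> &Baz {
        &self.baz
    }
}
```

Both versions let the `Bar` argument go away while the result is still in use; only the free function has to say so explicitly.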

glaebhoerl Jul 14, 2014

Contributor

@aturon Thank you. That's a good defense of the rule, and the survey is especially interesting. At this point, however, I really am just trying to figure out:

> if you disagree with the basic idea of self/receivers/methods, then rule 3 would certainly seem arbitrary

In what sense is it "arbitrary" in which the other two rules are not, given that none of them are the "most general" / "only possible" / etc. desugarings? Apparently we agree that, without the OO intuitions about self, there's some kind of difference here, and that Rule 3 is then arbitrary in some sense in which the others aren't. But not in the sense which any of us assumed at first! Which makes it an interesting question. I'd like to try to ferret out the source of this intuition, without any particular purpose in mind, and capture it as something more concrete and precise.

A first stab in the dark is that maybe Rules 1 & 2 are "parametric" in a way that 3 is not, in that Rule 3 singles out a specific argument of the function for special treatment, while the other two treat them all equally. This appears to be true as far as it goes, but it's still an awfully rudimentary theory, and doesn't feel like it would be the whole story.

aturon Jul 14, 2014

Member

@glaebhoerl A more focused version of my comment: rule 3 is arbitrary if, and only if, the distinction between functions and methods is arbitrary.

Put another way, if you buy into methods as having a distinct role from functions, then rule 3 has the same standing as the others.

Put yet another way, it's not rule 3 that's singling out a special argument: it's methods that do that.

glaebhoerl Jul 14, 2014

Contributor

I understand all of that completely. That's not what I'm trying to figure out. My purpose right now is not to try to discredit Rule 3, and hasn't been for a while. My purpose is to try to gain a deeper understanding.

What I want to understand is

> if you buy into methods as having a distinct role from functions, then rule 3 has the same standing as the others

what that standing is.

As you've pointed out, Rules 1 & 2 are not the most general or only possible desugarings, either. Given that, should we just say that all three rules are completely arbitrary? As far as I can tell, both of us feel that they're not. But in what way are they not? What logic do they follow, which we sense, but so far, cannot name?

steveklabnik Jul 14, 2014

Member

> Given that, should we just say that all three rules are completely arbitrary?

No, we cannot say that they are. "Doesn't cover every possible case" != "arbitrary". These rules were chosen with specific thought behind them, making them the opposite of arbitrary.

glaebhoerl Jul 14, 2014

Contributor

@steveklabnik Excellent. A third person who doesn't think they're arbitrary. :)

But why these rules, then, and not others? What underlying logic do they spring from?

aturon Jul 14, 2014

Member

@glaebhoerl I didn't mean to turn this discussion into a defense of rule 3; I'm also trying to understand better the relationships between the rules. I'm sorry I didn't make that more clear. (Text is hard.)

Let me try again. The initial question was whether I see a qualitative difference between rule 3 and the others. I do not, myself. But I can see how someone with a different perspective on methods (which I think you have?) would feel differently.

My general perspective on the rules is that they are simply shorthand, providing carefully-chosen defaults. Defaults are always heuristic and connected to common patterns of thought and code.

As with any defaults, in a purely semantic sense the rules are arbitrary, because there are other valid (and sometimes useful) lifetime assignments that the language allows.

As heuristics, the rules have a clear quantitative basis.

I think what's up for grabs is the qualitative basis -- how do they "feel", how well do they match our intuitions?

The intuitions that @wycats and I were most interested in come from borrowing/ownership, as opposed to lifetimes. If you write

fn foo(x: &Foo) -> &Bar

you know the function takes in borrowed data and produces borrowed data. The simplest intuition is that the output borrow takes its ownership from the input borrow. It's then not a hard conceptual leap to say that the borrowed ownership of the output is only good for as long as the input's was -- we hope that the elided form can build intuitions about borrowing that lead naturally into the mechanics of lifetimes.
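Spelled out as a sketch, with hypothetical `Foo` and `Bar` definitions and a body that hands out a borrow of a field:

```rust
struct Bar { id: u32 }
struct Foo { bar: Bar }

// Elided form of `fn foo<'a>(x: &'a Foo) -> &'a Bar`: the output borrow is
// only good for as long as the input borrow is.
fn foo(x: &Foo) -> &Bar {
    &x.bar
}
```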

I feel similarly about methods. I'm using a method to access or otherwise manipulate the receiver, so all things being equal I expect any output borrows to flow from my borrow of the receiver.

Does that help?

jfager Jul 14, 2014

What was the argument against @bill-myers's suggestion of using the first input lifetime? That covers more cases for regular functions and rule 3 falls out for free. It's not a particularly deep or profound unifying principle, but it's simple and seems less ad-hoc.

kballard Jul 14, 2014

Contributor

First input lifetime seems a bit more ad-hoc, as strange as it sounds.

Methods are special, and self is special in these methods. Rule 3 seems perfectly natural to me given that perspective. But the first input lifetime is not special. It's actually rather arbitrary. There's no reason to believe that in fn foo(a: &str, b: &str) -> &str the output is necessarily more likely to be derived from a than from b.

"First input lifetime" will also cause some possibly surprising behavior in fn foo(self, x: &str) -> &str, where the output is derived from the second parameter x instead of from self. Of course, it usually can't be derived from self (the only way that makes sense is if the type of self contains a lifetime parameter), but that's not a good reason to arbitrarily select the second parameter as the inferred lifetime source.

Overall, "lifetime of self" is a more constrained rule than "first input lifetime", based as it is on the special nature of methods and self, and I believe is much more likely to be a correct heuristic than "first input lifetime".
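A sketch of the two-argument case, using a hypothetical `longer` function whose result can come from either input. With no receiver and two input lifetime positions, none of the proposed rules apply, so the lifetimes are written out. A "first input lifetime" rule would have assigned `a`'s lifetime here and rejected the body.

```rust
// Either input may be returned, so no elision rule applies and both
// parameters share a named lifetime.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    // Ties go to `a`.
    if b.len() > a.len() { b } else { a }
}
```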


@jfager


jfager Jul 14, 2014

Under the currently proposed rules, fn foo(self, x: &str) -> &str's output lifetime would also be derived from x via rule 2, wouldn't it? Rule 3 only states it kicks in for &self or &mut self.
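A small sketch of that reading, assuming a self type with no lifetime parameters of its own (the unit-like type `Trimmer` is hypothetical):

```rust
// Hypothetical unit-like type; it carries no lifetime parameters.
#[derive(Clone, Copy)]
struct Trimmer;

impl Trimmer {
    // `self` is taken by value, so `x` is the only input lifetime position
    // and rule 2 applies. Equivalent to:
    //   fn foo<'a>(self, x: &'a str) -> &'a str
    fn foo(self, x: &str) -> &str {
        x.trim()
    }
}
```

The returned slice stays valid after `self` has been consumed, since it borrows only from `x`.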


@kballard


kballard Jul 14, 2014

Contributor

@jfager Hrm, you're right. I hadn't considered the fn foo(self, x: &str) -> &str case until my previous comment, and there I only considered it in light of rule 3.

I think that, due to rule 3, it may be reasonable to adjust the rules such that fn foo(self, x: &str) -> &str cannot elide the lifetime. This would be a consequence of the fact that self is special, and therefore any method on self reasonably assumes an elided output lifetime is derived from self. My belief is this should be true even for by-value self methods.

That said, this particular case is I think something of an edge case, and I would not consider it a serious problem if the rules are left unchanged.


@jfager


jfager Jul 15, 2014

It's an edge case but now that it's come up I think it gets right at the discomfort of the current set of rules. The justification for rule 3 is 'methods are special', but this interaction with rule 2 says 'but maybe not that special'. They should either be uniform, or they should be different; it's straddling the fence that feels odd.

"You may elide lifetimes; output lifetimes are assigned the first input lifetime" is arbitrary and there's not a great intuitive reason it should be true, but it's uniform between fns and methods, and despite its arbitrariness it's simple and easy to understand, and it ends up giving you the same code and behavior in all but one of the examples given in this RFC, frob being the exception.

"Elided output lifetimes take the lifetime of self for methods, or the lifetime of a sole input lifetime for functions" is similarly straightforward and simple, but treats methods and fns clearly differently.

I could get behind either.

*Edit: sorry, posted early.


@glaebhoerl


glaebhoerl Jul 15, 2014

Contributor

@aturon Yes, that's closer to what I was trying to get at. (Though I was also wondering if there might be some drier, more formal formulation of our intuitions.) How does rule 1 fit into these intuitions about borrowing, i.e. why is it more intuitive for each input lifetime to be different rather than tied together?


result types. So `fn foo(s: &str) -> (&str, &str)` has elided one lifetime in
input position and two lifetimes in output position.
* For `impl` headers, input refers to the lifetimes appears in the type


@pnkfelix

pnkfelix Jul 15, 2014

Member

Trait definitions themselves are also a form that offers lifetime positions. That may or may not be relevant (I'll be posting a question about that soon -- see a few lines up), but should probably be addressed explicitly.


* For `fn` definitions, input refers to argument types while output refers to
result types. So `fn foo(s: &str) -> (&str, &str)` has elided one lifetime in
input position and two lifetimes in output position.


@pnkfelix

pnkfelix Jul 15, 2014

Member

For an fn method definition, i.e. one that occurs in the scope of an impl block or as the default method in a trait item, are the lifetimes that occur in the implementing type (in the former case) or the trait (in the latter case) also considered to be input positions? (Or perhaps all of the lifetimes bound by impl<'a,'b,...> are part of the input positions? Or perhaps none of them are?)

In other words, is a method considered to be in the scope of its impl header for the purposes of lifetime elision?

(I will follow up to this comment with a concrete set of examples elaborating my question in a moment.)



@pnkfelix

pnkfelix Jul 15, 2014

Member

Okay, here is a gist with my attempt to survey the space here: https://gist.github.com/pnkfelix/a4054e51400152c63714

It could well be that the intent is (and has always been) to not consider an impl header in scope for lifetime elision on methods. But if so, this needs to be spelled out explicitly in the RFC itself.



@pnkfelix

pnkfelix Jul 15, 2014

Member

Hmm I guess since this was already merged I should instead open an issue against it.



@pnkfelix

pnkfelix Jul 15, 2014

Member

Ah and now I just saw @aturon 's comment here which explicitly confirms that the intent has been to not consider an impl header in scope for lifetime elision on methods.
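To make that conclusion concrete, here is a sketch in present-day Rust. The type `Holder` and its methods are hypothetical and are not taken from the gist or the RFC; they only show how an elided output behaves when the `impl` header binds a lifetime.

```rust
// Hypothetical type that borrows a string for 'a.
struct Holder<'a> {
    text: &'a str,
}

impl<'a> Holder<'a> {
    // Elided: the impl header's 'a is not an input position, so the output
    // is tied to the borrow of `self`. Equivalent to:
    //   fn get<'s>(&'s self) -> &'s str
    fn get(&self) -> &str {
        self.text
    }

    // To hand back the longer-lived 'a, it has to be written explicitly.
    fn get_long(&self) -> &'a str {
        self.text
    }
}
```

With the explicit form, the returned reference can outlive the `Holder` itself; with the elided form it cannot.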


pnkfelix added a commit to pnkfelix/rfcs that referenced this pull request Jul 15, 2014

Clarify definition of "input positions" in lifetime elision RFC.
Explicitly note that lifetimes from the `impl` (and `trait`/`struct`)
are not considered "input positions" for the purposes of expanded `fn`
definitions.

Added a collection of examples illustrating this.

Drive-by: Addressed a review comment from @chris-morgan
[here](rust-lang#141 (comment)).
@zwarich


zwarich Jul 17, 2014

Contributor

@glaebhoerl In the absence of any lifetime variables in return types, the assignment of distinct lifetime parameters is the most general type that can be given. That is probably the intuition that is at play here. Of course, once return types with lifetime variables get involved, then this no longer applies, but everyone agreed that this case was broken today anyways.


@glaebhoerl


glaebhoerl Jul 26, 2014

Contributor

I was thinking that maybe elided lifetimes in arguments of higher-order function parameters should be desugared to higher-rank lifetimes, because that's usually what you want:

// not legal, I believe?
fn print_with(text: &str, printer: |&str|) { ... }

=>

// you pretty much always want this, I think?
fn print_with<'a>(text: &'a str, printer: <'b> |&'b str|) { ... }

The question is, given that closures are going to be merely trait objects, how could we properly generalize this? (There may or may not be an easy answer; I've spent approximately two minutes thinking about it.)


@zwarich


zwarich Jul 26, 2014

Contributor

@glaebhoerl Why does it matter in that particular case? The only lifetimes that can ever be passed to printer are 'a and 'static, so you can always choose 'a.

I assume you were thinking of another case where it does matter?


@glaebhoerl


glaebhoerl Jul 26, 2014

Contributor

Maybe the example was bad. But:

> The only lifetimes that can ever be passed to printer are 'a and 'static, so you can always choose 'a.

This is not true, because print_with could easily have local variables of type &'x str. (Which, in this case, might be weird, which in turn is why this might've been a bad example; then again, maybe print_with might want to prefix text with a timestamp or something.)

But to amend, imagine this:

fn print_two_with(text1: &str, text2: &str, printer: |&str|) { ... }

Now there are two &str arguments with different lifetimes and we want printer to work for both.

But the point is really that in general, do you ever want the lifetimes of the arguments of an argument function to be pre-determined by lifetime parameters on the outer HOF, instead of the (strictly-)more-general formulation where the argument function itself is parameterized over them?


@zwarich


zwarich Jul 26, 2014

Contributor

@glaebhoerl My (potentially mistaken) assumption is that the legacy closures actually have higher-rank lifetimes, even though it isn't a feature exposed independently in the type system, and rust-lang/rust#15067 is tracking exposing that to the new unboxed closures. This code type-checks:

fn print_two_with<'a, 'b>(text1: &'a str, text2: &'b str, printer: |&str|) {
    if true {
        printer(text1);
    } else {
        printer(text2);
    }
}

whereas this code does not:

fn print_two_with<'a, 'b>(text1: &'a str, text2: &'b str, printer: |&'a str|) {
    if true {
        printer(text1);
    } else {
        printer(text2);
    }
}
@aturon


aturon Jul 27, 2014

Member

@glaebhoerl The current plan is that the elision rules apply recursively for the sugared form of unboxed closure types (i.e., the |x: T| -> U notation), as they have in the past. There's not currently a plan to generalize this to uses of traits directly, although that might actually be the right answer to covariant lifetime positions (see rust-lang/rust#15699 about _co_variance being the odd case, and rust-lang/rust#15907 about its relation to lifetime elision).
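For readers on a current toolchain, the behaviour discussed in this exchange can be sketched with today's `Fn`-family sugar. This is an assumption-laden translation: the `|&str|` closure type syntax used in the thread no longer exists, so the example uses `impl FnMut(&str)` in its place and reuses the name `print_two_with` from the comments above.

```rust
// The elided lifetime inside `FnMut(&str)` is higher-ranked; the bound is
// equivalent to `for<'x> FnMut(&'x str)`, so `printer` accepts any borrow.
fn print_two_with(text1: &str, text2: &str, mut printer: impl FnMut(&str)) {
    printer(text1);
    printer(text2);
    // A borrow of a local also works; its lifetime is shorter than either
    // argument's and cannot be named by a parameter of `print_two_with`.
    let joined = format!("{}{}", text1, text2);
    printer(&joined);
}
```

Writing `printer: impl FnMut(&'a str)` with `'a` bound on `print_two_with` would instead fix the argument lifetime up front, and the call with `&joined` would be rejected.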


glaebhoerl added a commit to glaebhoerl/rfcs that referenced this pull request Aug 8, 2014

Clarify definition of "input positions" in lifetime elision RFC.
Explicitly note that lifetimes from the `impl` (and `trait`/`struct`)
are not considered "input positions" for the purposes of expanded `fn`
definitions.

Added a collection of examples illustrating this.

Drive-by: Addressed a review comment from @chris-morgan
[here](rust-lang#141 (comment)).
@ticki


ticki Aug 7, 2015

Contributor

👍

