RFC: new lifetime elision rules #141
nrc
reviewed
Jun 26, 2014
* If there is exactly one input lifetime position (elided or not), that lifetime
  is assigned to _all_ elided output lifetimes.

* If there are multiple input lifetime positions, but one of them is `&self` or
nrc
Jun 26, 2014
Member
I find this rule a bit surprising (the others make perfect sense). I can intuitively see the motivation that self ought to be privileged, but I can't really justify why that is so. Looking at the examples below, the ones using this rule took me a lot longer to grok.
wycats
Jun 26, 2014
Contributor
The rationale is that in several usage surveys, this was essentially the only pattern we saw when &self was involved.
I believe that the reason for this is that when you're borrowing something out of self, it makes sense to involve another ref for computation. In contrast, it's a very unusual pattern to borrow something out of a value as a method of some other object. It's just not really how people think about using methods and objects in general, so it doesn't happen (almost at all).
I suspect that in cases where this pattern could occur, people use standalone functions instead of methods.
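A minimal sketch of the pattern described above, written in present-day Rust. `Config`, `lookup`, and `lookup_explicit` are hypothetical names invented for illustration; they are not taken from the RFC or from the usage surveys. The second reference is used only for computation, while the result is borrowed out of `self`:

```rust
/// Hypothetical key/value store, used only to illustrate the `&self` rule.
struct Config {
    entries: Vec<(String, String)>,
}

impl Config {
    /// Elided form: the output lifetime is taken from `&self`, not from `key`.
    fn lookup(&self, key: &str) -> Option<&str> {
        self.entries
            .iter()
            .find(|(k, _)| k.as_str() == key)
            .map(|(_, v)| v.as_str())
    }

    /// The same signature with every lifetime written out.
    fn lookup_explicit<'a, 'b>(&'a self, key: &'b str) -> Option<&'a str> {
        self.lookup(key)
    }
}
```

Because the result is tied to `self` alone, the key may be a short-lived temporary that is dropped before the result is used.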
bstrie
Jun 26, 2014
Contributor
@wycats, what proportion of the cited 87% would be lost if this rule were not accepted? I don't personally object to it, but I can see how it's a bit more flimsy than the others, and I would be willing to live without it if the statistics bore it out.
bill-myers
Jun 26, 2014
I think that it should use the lifetime of the first input parameter, regardless of whether it is self or not, and only if it is an elided lifetime.
This avoids issues with UFC and makes method and non-method functions work the same.
Supporting elision of lifetimes only in the return value when they are explicit on self seems a bad idea, since it is counterintuitive. Also, it doesn't work for multiple explicit lifetimes (e.g. &'a Block<'b>).
nrc
reviewed
Jun 26, 2014
https://gist.github.com/aturon/da49a6d00099fdb0e861

# Drawbacks
nrc
Jun 26, 2014
Member
Another drawback: I find full specification of lifetime parameters makes it easier to understand what is going on. Even today, I often write the lifetimes where they could be elided because I think it makes code easier to reason about if you can name things. If I have a lifetime error, the first thing I do is add explicit lifetimes wherever they are missing.
I get the impression I'm in the minority with this though.
To me, these extra rules trade off easier reading (and writing) when you don't need to think about lifetimes too much against greater cognitive overhead when you do have to think about them. I guess that since reading code is more common than debugging lifetime errors, this trade off is worthwhile. I certainly like the idea of reducing lifetime noise.
wycats
Jun 26, 2014
Contributor
@nick29581 It might be worth considering having the compiler optionally show you all of the inferred lifetimes when there are error messages that involve lifetimes: rustc --errors=expanded or something.
That said, I think the error message improvements in this proposal go a long way to making it obvious what has happened when you inappropriately elided a lifetime. Similar error message work around other lifetime errors would go a long way to improving the general ergonomics of explicit lifetimes as well, and we should work on that!
bstrie
Jun 26, 2014
Contributor
@nick29581, that same argument can be made for type inference. Just like type inference, nothing is stopping you from being fully explicit with lifetimes if you deem it's better for readability.
chris-morgan
reviewed
Jun 26, 2014
fn get_str<'a>() -> &'a str;

The ellision rules work well for functions that consume references, but not for
chris-morgan
reviewed
Jun 26, 2014
* For `impl` headers, input refers to the lifetimes appears in the type
  receiving the `impl`, while output refers to the trait, if any. So `impl<'a>
  Foo<'a>` has `'a` in input position, while `impl<'a> SomeTrait<'a> Foo<'a>`
  has `'a` in both input and output positions.
chris-morgan
Jun 26, 2014
Member
I think the word for is lacking from the second example. It’s not an obvious example of where the lifetimes are, either—it could be rewritten as the probably-fairly-nonsensical “impl<'a, 'b> SomeTrait<'a> for Foo<'b> has 'a in [the] output position and 'b in [the] input position”.
(As for the “the”, I think that should be there in all these cases, or “an” as the case may be in some places. This affects much of the document.)
I'm nervous about adding elision for output parameters since I'm slightly concerned that may make things less clear (a minor adjustment to a signature that otherwise compiles would make the compiler spew weird errors), but I am in favour of elision in input position in
    impl BufReader { ... }
    impl Reader for BufReader { ... }
    impl Reader for (&str, &str) { ... }
@huonw Do you mean output parameters in general, or just in impls?
I don't like this rule. The other rules have the property that there's no other way the signature could possibly make sense: i.e., the desugaring is unambiguous. Here we're making an arbitrary choice. I don't think we should do that.
There's an additional subtlety: lifetime parameters of
Here If you just take that into account when applying the rules, then I think they would keep working. But I'm not sure what the situation is with invariant or bivariant lifetime parameters, because I haven't thought about it yet.
OK, so in plain English, I think the rule should be: If there's exactly one readable lifetime and N writable ones, all the writable lifetimes are assumed to be the same as the readable one. Lifetime parameters in covariant position are readable, in contravariant writable, invariant both, bivariant neither.
@huonw I think the proposed error messages will go a long way to avoid "the compiler spewing weird error messages", no?
I was originally a bit nervous about this sort of thing, but now I have no objections. I'm slightly more nervous about the
steveklabnik
reviewed
Jun 26, 2014
can avoid writing any lifetimes in ~87% of the cases where they are currently
required.

Doing so is a clear ergonomic win.
steveklabnik
Jun 26, 2014
Member
This is the biggest part of this proposal for me. (Well, combined with the data that shows that it is.)
steveklabnik
reviewed
Jun 26, 2014
fn get_str() -> &str;

become
steveklabnik
Jun 26, 2014
Member
"becomes". And isn't this backwards? To elide is to remove, so the ones with the rules become the ones without the rules.
steveklabnik
reviewed
Jun 26, 2014
* When combined with a good tutorial on the borrow/lifetime system (which should
  be introduced early in the documentation), the above should provide a
  reasonably gentle path toward using and understanding explicit lifetimes.
A big Also, if you like the lifetimes, you can keep writing them.
@glaebhoerl Great point about contravariance, which I hadn't thought about. I agree that a contravariant argument should not be considered as an input position. Just to be clear, is the suggestion that contravariant positions swap the input/output distinction? (Which would be the typical type-theoretical thing to do.) Concretely, are you proposing that
    fn some_fn(&self, cb: Callback) -> int;
    fn other_fn(n: int) -> (&T, cb: Callback);
expands to
    fn some_fn<'a>(&'a self, cb: Callback<'a>) -> int;
    fn other_fn<'a>(n: int) -> (&'a T, cb: Callback<'a>)
The first case makes some sense, but the latter case is pretty surprising -- it would happen because the
We could also simply disallow eliding contravariant lifetimes, since it may be preferable to be explicit in those (rare) cases. Finally, see @wycats's comment above re: the
bachm
commented
Jun 26, 2014
Just posting to express my support for this well-written RFC. With the proposed error messages there should be little confusion when a user first encounters unelidable lifetimes.
Even thinking about this example makes my head hurt... I think the "logic" of it, as it were, is that when the caller of One distinction that I noticed, and I'm not sure if it has significance, is that while the return type of a function I basically agree with you that it seems reasonable-but-not-imperative to desugar your first example, but not so much the second one. I don't have any concrete rules in mind which might accomplish this.
To avoid getting caught up in debating the meaning of the word "arbitrary" (I wasn't assuming that you flipped a coin): For the first and second rules, there's only one way it can make sense. If the user were to explicitly annotate lifetimes, they would annotate the same ones we infer 100% of the time. For the third rule, there's more than one way it can make sense, and we'd be choosing to favor one of them. Even if our favoring rests on a stronger basis than a coin flip, I don't think this kind of "probably what you meant" inference is something we should be doing.
@glaebhoerl Thanks for the thoughtful comments. My feeling about The rules are simple enough that it's easy to know, given the signature in your head, whether you can elide or not. Put another way, the debate is whether
    fn foo(&self, t: &T) -> &U;
is simply not allowed/usable as a signature, or whether it has a useful meaning based on the most common lifetime patterns. Once you know the rules, you know immediately that the above would expand into
    fn foo<'a,'b>(&'a self, t: &'b T) -> &'a U;
and would only write the elided signature if that's what you wanted.
FWIW, I disagree that the other rules give the only sensible expansion. Not even today's rules do. If you write
    fn bar(t: &T, u: &U);
you get distinct lifetimes for the two parameters. But it can also make sense for them to share the same lifetime, and some uses would require it. In that situation, you know you can't leave off the lifetimes, and you write an explicit signature. I think the same would be true with the
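The `fn bar(t: &T, u: &U)` case can be sketched in present-day Rust as follows. The function names are hypothetical and `&str` stands in for the `&T` and `&U` of the comment; the point is only that the elided form and the form with two distinct lifetimes are the same signature:

```rust
/// Elided form: `t` and `u` each get their own, unrelated lifetime.
fn total_len(t: &str, u: &str) -> usize {
    t.len() + u.len()
}

/// What the elided form above stands for.
fn total_len_explicit<'a, 'b>(t: &'a str, u: &'b str) -> usize {
    t.len() + u.len()
}

/// Both forms coerce to the same function-pointer type, so either can be
/// returned here.
fn pick(explicit: bool) -> fn(&str, &str) -> usize {
    if explicit {
        total_len_explicit
    } else {
        total_len
    }
}
```

A signature that needs the two inputs to share one lifetime cannot be obtained by elision and has to be written out.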
bstrie
reviewed
Jun 26, 2014
The error case on `impl` is exceedingly rare: it requires (1) that the `impl` is
for a trait with a lifetime argument, which is uncommon, and (2) that the `Self`
type has multiple lifetime arguments.
aturon
Jun 26, 2014
Author
Member
@bstrie I don't know of any cases offhand, which is why the error message here is probably not so important.
bstrie
reviewed
Jun 26, 2014
already elided with today's rules._

The detailed data is available at:
https://gist.github.com/aturon/da49a6d00099fdb0e861
bstrie
Jun 26, 2014
Contributor
Of the 13% of functions which still require explicit lifetimes, do any seem particularly notable for their nonconformity to the usual patterns? It would also be really great if you could select one of these real-world functions and use it in the example error message above.
aturon
Jun 26, 2014
Author
Member
Almost all of the remaining cases are situations like:
    impl<'a> AsciiCast<&'a[Ascii]> for &'a [u8] {
        unsafe fn to_ascii_nocheck(&self) -> &'a[Ascii] { ... }
        ...
    }
where the impl involves types with lifetimes, and the fns within refer to those lifetimes directly. That counts against us in two ways:
- The `impl` header has to be annotated so that you can name the lifetime, even though it would otherwise follow the standard pattern, and
- The `fn` definitions have to be annotated to use the outer lifetime.
Note that this kind of example does not require an annotation according to the rules (so you wouldn't get an annotation error if you elided the lifetime). Rather, the annotation is needed to go beyond the patterns provided by the rule.
aturon
Jun 26, 2014
Author
Member
@bstrie The other predominant case is:
    fn difference<'a>(&'a self, other: &'a HashSet<T, H>) -> SetAlgebraItems<'a, T, H>;
where the two input lifetimes are required to match.
@glaebhoerl Take note -- this is a case where even the rules for input positions don't give you what you want.
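A small sketch of why such a signature cannot be elided, in present-day Rust. `Name` and `longer` are hypothetical stand-ins, not the `HashSet` API quoted above; the assumption is only that the result may borrow from either argument:

```rust
/// Hypothetical type, used only for illustration.
struct Name {
    text: String,
}

impl Name {
    /// The result may borrow from either `self` or `other`, so both inputs
    /// must share the lifetime `'a`. The elided form
    /// `fn longer(&self, other: &Name) -> &str` would tie the result to `self`
    /// alone, and returning `other.text` would then be rejected.
    fn longer<'a>(&'a self, other: &'a Name) -> &'a str {
        if self.text.len() >= other.text.len() {
            self.text.as_str()
        } else {
            other.text.as_str()
        }
    }
}
```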
bstrie
reviewed
Jun 26, 2014
of a lifetime parameter for a `struct`. There are also some good reasons to
treat elided lifetimes in `struct`s as `'static`.
Again, since shorthand can be added backwards-compatibly, it seems best to wait.
Above I draw a comparison between lifetime elision and type inference, and how the great thing is that people who choose to be explicit are still welcome to manually annotate lifetimes. However, there is one thing that would support the people who make such a decision and improve teachability for newcomers: make the
steveklabnik
referenced this pull request
Jun 26, 2014
Closed
RFC: Remove the `'` from lifetime parameters #134
aturon
added some commits
Jun 26, 2014
Quick question: Would it be feasible to handle the multiple-input case by having something like
expand to
While such a shorthand would be mostly orthogonal to the elision rules of this RFC, I bring it up because it seems like it could impact whether we want to treat
Also, I realize the lookup rules would take some consideration if this were to be implemented, since lifetimes and parameter names are currently in different namespaces.
@aturon You're right. Now we have the interesting situation that you've shown that my stated arguments against "the I think a large part of it is because of the fact that I don't think we should semantically/syntactically distinguish the
@bstrie
jfager
commented
Jul 14, 2014
What was the argument against @bill-myers's suggestion of using the first input lifetime? That covers more cases for regular functions and rule 3 falls out for free. It's not a particularly deep or profound unifying principle, but it's simple and seems less ad-hoc.
First input lifetime seems a bit more ad-hoc, as strange as it sounds. Methods are special, and "First input lifetime" will also cause some possibly surprising behavior in Overall, "lifetime of
jfager
commented
Jul 14, 2014
Under the currently proposed rules,
@jfager Hrm, you're right. I hadn't considered the I think that, due to rule 3, it may be reasonable to adjust the rules such that That said, this particular case is I think something of an edge case, and I would not consider it a serious problem if the rules are left unchanged.
jfager
commented
Jul 15, 2014
It's an edge case, but now that it's come up I think it gets right at the discomfort of the current set of rules. The justification for rule 3 is 'methods are special', but this interaction with rule 2 says 'but maybe not that special'. They should either be uniform, or they should be different; it's straddling the fence that feels odd.
"You may elide lifetimes; output lifetimes are assigned the first input lifetime" is arbitrary and there's not a great intuitive reason it should be true, but it's uniform between fns and methods, and despite its arbitrariness it's simple and easy to understand, and it ends up giving you the same code and behavior in all but one of the examples given in this RFC.
"Elided output lifetimes take the lifetime of self for methods, or the lifetime of a sole input lifetime for functions" is similarly straightforward and simple, but treats methods and fns clearly differently.
I could get behind either.
*Edit: sorry, posted early.
@aturon Yes, that's closer to what I was trying to get at. (Though I was also wondering if there might be some drier, more formal formulation of our intuitions.) How does rule 1 fit into these intuitions about borrowing, i.e. why is it more intuitive for each input lifetime to be different rather than tied together?
pnkfelix
reviewed
Jul 15, 2014
  result types. So `fn foo(s: &str) -> (&str, &str)` has elided one lifetime in
  input position and two lifetimes in output position.

* For `impl` headers, input refers to the lifetimes appears in the type
pnkfelix
Jul 15, 2014
Member
Trait definitions themselves are also a form that offers lifetime positions. That may or may not be relevant (I'll be posting a question about that soon -- see a few lines up), but should probably be addressed explicitly.
pnkfelix
reviewed
Jul 15, 2014
* For `fn` definitions, input refers to argument types while output refers to
  result types. So `fn foo(s: &str) -> (&str, &str)` has elided one lifetime in
  input position and two lifetimes in output position.
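The `fn foo(s: &str) -> (&str, &str)` example in the quoted text can be sketched in present-day Rust as below. The function names and bodies are hypothetical; only the shape of the signature comes from the RFC text. With a single input lifetime, both elided output lifetimes receive it:

```rust
/// Elided form: one input lifetime, so both output references receive it.
/// Splitting at `len / 2` assumes ASCII input; `split_at` panics if the
/// index is not on a character boundary.
fn halves(s: &str) -> (&str, &str) {
    s.split_at(s.len() / 2)
}

/// The same signature with the lifetime written out.
fn halves_explicit<'a>(s: &'a str) -> (&'a str, &'a str) {
    halves(s)
}
```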
pnkfelix
Jul 15, 2014
Member
For an fn method definition, i.e. one that occurs in the scope of an impl block or as the default method in a trait item, are the lifetimes that occur in the implementing type (in the former case) or the trait (in the latter case) also considered to be input positions? (Or perhaps all of the lifetimes bound by impl<'a,'b,...> are part of the input positions? Or perhaps none of them are?)
In other words, is a method considered to be in the scope of its impl header for the purposes of lifetime elision?
(I will follow up to this comment with a concrete set of examples elaborating my question in a moment.)
pnkfelix
Jul 15, 2014
Member
Okay, here is a gist with my attempt to survey the space here: https://gist.github.com/pnkfelix/a4054e51400152c63714
It could well be that the intent is (and has always been) to not consider an impl header in scope for lifetime elision on methods. But if so, this needs to be spelled out explicitly in the RFC itself.
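For reference, a sketch of how present-day Rust treats this case: lifetimes from the `impl` header are not input positions for elision in a method, so an elided output is tied to `&self`. `Holder`, `text_short`, and `text_long` are hypothetical names, and the sketch makes no claim about what the RFC text specified when this question was asked:

```rust
/// Hypothetical wrapper around a borrowed string.
struct Holder<'a> {
    text: &'a str,
}

impl<'a> Holder<'a> {
    /// Elided: only the method's own signature is consulted, so the result is
    /// tied to the borrow of `self`, not to `'a` from the `impl` header.
    fn text_short(&self) -> &str {
        self.text
    }

    /// Explicit: naming `'a` lets the result outlive the `Holder` itself.
    fn text_long(&self) -> &'a str {
        self.text
    }
}
```

The difference shows up when the `Holder` is dropped before the result is used: only the value returned by `text_long` may still be read at that point.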
pnkfelix
Jul 15, 2014
Member
Hmm, I guess since this was already merged I should instead open an issue against it.
pnkfelix
referenced this pull request
Jul 15, 2014
Closed
unclear definition of lifetime input positions in RFC 39 #165
pnkfelix
added a commit
to pnkfelix/rfcs
that referenced
this pull request
Jul 15, 2014
pnkfelix
referenced this pull request
Jul 15, 2014
Merged
Clarify definition of "input positions" in lifetime elision RFC. #166
@glaebhoerl In the absence of any lifetime variables in return types, the assignment of distinct lifetime parameters is the most general type that can be given. That is probably the intuition that is at play here. Of course, once return types with lifetime variables get involved, then this no longer applies, but everyone agreed that this case was broken today anyways.
This was referenced Jul 20, 2014
I was thinking that maybe elided lifetimes in arguments of higher-order function parameters should be desugared to higher-rank lifetimes, because that's usually what you want:
=>
The question is, given that closures are going to be merely trait objects, how could we properly generalize this? (There may or may not be an easy answer; I've spent approximately two minutes thinking about it.)
@glaebhoerl Why does it matter in that particular case? The only lifetimes that can ever be passed to I assume you were thinking of another case where it does matter?
Maybe the example was bad. But:
This is not true, because But to amend, imagine this:
Now there are two But the point is really that in general, do you ever want the lifetimes of the arguments of an argument function to be pre-determined by lifetime parameters on the outer HOF, instead of the (strictly-)more-general formulation where the argument function itself is parameterized over them?
@glaebhoerl My (potentially mistaken) assumption is that the legacy closures actually have higher-rank lifetimes, even though it isn't a feature exposed independently in the type system, and rust-lang/rust#15067 is tracking exposing that to the new unboxed closures. This code type-checks:
    fn print_two_with<'a, 'b>(text1: &'a str, text2: &'b str, printer: |&str|) {
        if true {
            printer(text1);
        } else {
            printer(text2);
        }
    }
whereas this code does not:
    fn print_two_with<'a, 'b>(text1: &'a str, text2: &'b str, printer: |&'a str|) {
        if true {
            printer(text1);
        } else {
            printer(text2);
        }
    }
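The `|&str|` closure types above predate Rust 1.0 and no longer compile. As a rough present-day counterpart, assuming the `Fn` traits that replaced them, an elided lifetime inside `Fn(&str)` is higher-ranked, which is what lets the callback accept both `'a` and `'b` data. The `_explicit` variant is a hypothetical name added for comparison:

```rust
/// `Fn(&str)` is sugar for `for<'c> Fn(&'c str)`: the callback must accept a
/// reference of any lifetime, so it can be called with both `'a` and `'b` data.
fn print_two_with<'a, 'b>(text1: &'a str, text2: &'b str, printer: impl Fn(&str)) {
    printer(text1);
    printer(text2);
}

/// The same bound with the higher-rank lifetime written out.
fn print_two_with_explicit<'a, 'b, F>(text1: &'a str, text2: &'b str, printer: F)
where
    F: for<'c> Fn(&'c str),
{
    printer(text1);
    printer(text2);
}
```

Writing the bound as `Fn(&'a str)` instead would fix the callback to a single lifetime, and the call with `text2` would be rejected, which mirrors the second snippet above.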
@glaebhoerl The current plan is that the elision rules apply recursively for the sugared form of unboxed closure types (i.e., the
glaebhoerl
added a commit
to glaebhoerl/rfcs
that referenced
this pull request
Aug 8, 2014
kosciej
referenced this pull request
Oct 25, 2014
Closed
23.2 Lifetimes: Functions | mut_one() appears to just work? #256
mahkoh
referenced this pull request
Feb 16, 2015
Closed
Remove lifetime elision in type parameter position #869
wycats
referenced this pull request
Aug 10, 2016
Closed
Explicitly answer the question "Is Rust object oriented?" #467
b-jonas0
commented
Sep 12, 2018
After this change, would there be any lifetime elision rules for lifetimes that appear in the head of a lambda expression (anonymous function)?
aturon commented Jun 26, 2014
Rendered (draft)
text/tracking issue: rust-lang/rust#15552
Note: the core idea for this RFC and the initial survey both came from @wycats.