
Change "at least" to "at most" #1875

Closed
wants to merge 1 commit into from

Conversation

mulkieran
Contributor

@mulkieran mulkieran commented Mar 20, 2019

Note that this is related somehow to #1901.

@mulkieran
Contributor Author

Closed, since this is a duplicate of #1782.

@mulkieran mulkieran closed this Mar 20, 2019
@mulkieran mulkieran deleted the master-lifetimes branch March 20, 2019 17:32
@mulkieran mulkieran restored the master-lifetimes branch April 8, 2019 14:56
@mulkieran
Contributor Author

Reopening, since #1782 doesn't seem to be moving.

@mulkieran mulkieran reopened this Apr 8, 2019
The returned value must not last longer than the arguments.

If they are both bounded from below, using "at least", then the implication
is that the returned value can last longer than the arguments, which is
precisely what must not happen.

Signed-off-by: mulhern <amulhern@redhat.com>
@mulkieran
Contributor Author

rebased.

@cuviper
Member

cuviper commented Apr 8, 2019

No, I don't think this is correct. An "at most" lifetime would be useless, because it wouldn't tell you anything about when it is safe to use. Something which was immediately dead would satisfy any "at most" lifetime.

It really is "at least" -- that example fn longest<'a>(x: &'a str, y: &'a str) -> &'a str describes two input &str that are guaranteed to be valid for the lifetime 'a, returning a &str that is also valid for at least the lifetime 'a. That's easily implemented if I return one of the input values, which one would probably assume is happening, but note that I could also satisfy this by returning an arbitrary &'static str which lives forever!

From the perspective of the borrow checker, both x and y will be borrowed for the lifetime 'a, regardless of what we actually choose to return.
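A minimal sketch of the point above, assuming a hypothetical second function, `always_static`, with the same signature as the book's `longest`. One returns an input and the other ignores its inputs and returns a string literal; both are accepted because each returned reference is valid for at least 'a. The `demo` helper is also an assumption added for illustration.

```rust
// Returns one of the inputs, so the result is tied to the borrows the caller passed in.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

// Hypothetical variant with the same signature: it ignores its inputs and
// returns a &'static str, which is accepted wherever a &'a str is expected.
fn always_static<'a>(_x: &'a str, _y: &'a str) -> &'a str {
    "static"
}

// Hypothetical helper: calls both functions with the same borrowed inputs.
fn demo() -> (String, String) {
    let a = String::from("short");
    let b = String::from("longer one");
    let first = longest(&a, &b).to_string();
    let second = always_static(&a, &b).to_string();
    (first, second)
}
```

From the caller's side the two functions are indistinguishable: in both cases `a` and `b` count as borrowed for as long as the result is in use.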

@mulkieran
Contributor Author

> No, I don't think this is correct. An "at most" lifetime would be useless, because it wouldn't tell you anything about when it is safe to use. Something which was immediately dead would satisfy any "at most" lifetime.

@cuviper You're right that, if the only constraint were that the lifetime must bound it from above, the value could be immediately dead. But the other constraint is that its use bounds it from below. They both must be satisfied, and that's where errors can come in.

> It really is "at least" -- that example fn longest<'a>(x: &'a str, y: &'a str) -> &'a str describes two input &str that are guaranteed to be valid for the lifetime 'a, returning a &str that is also valid for at least the lifetime 'a. That's easily implemented if I return one of the input values, which one would probably assume is happening, but note that I could also satisfy this by returning an arbitrary &'static str which lives forever!

I think that if the return type is 'static, that means that the lifetime constraint is simply being ignored and does not enter here. It's only if the parameter and the return type are both bound by the lifetime, either explicitly or implicitly, that there is a constraint. Try the following:

// Function parameter and return value have the implicit lifetime 'a; the implementation returns a static str.
fn junk(_imp: &str) -> &str {
    "abc"
}

fn main() {
    let d;
    {
        let z = String::from("abc");
        d = junk(&z); // lifetime 'a (z borrowed here)
        print!("{}", z); // lifetime 'a
        print!("{}", d); // lifetime 'a
    } // lifetime 'a (z goes out of scope here)
    print!("{}", d); // This use means that lifetime must extend here, but it can't.
}

There are lots of choices for lifetime 'a. One choice is just the line d = junk(&z). That satisfies the "at least" requirement for the parameter, but not the "at most" requirement for the return value. The largest possible lifetime is the annotated one. That satisfies the "at least" requirement for the parameter, but a larger lifetime would not, since z goes out of scope there and can't be borrowed after that. But d's lifetime must extend as far as the last print call, so even this largest lifetime is not large enough, because the lifetime of the return value is bounded from above by 'a.

Compiling playground v0.0.1 (/playground)
error[E0597]: `z` does not live long enough
  --> src/main.rs:42:18
   |
42 |         d = junk(&z);
   |                  ^^ borrowed value does not live long enough
...
45 |     }
   |     - `z` dropped here while still borrowed
46 |     print!("{}", d);
   |                  - borrow later used here

If I change the return type of junk to &'static str, there is no compiler error. But that is not because the rule is "at least"; it is because the lifetime 'a no longer constrains the return value.
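A minimal sketch of that change, assuming a hypothetical variant named `junk_static` and a helper `demo_static` that mirrors the structure of main above. With an explicit &'static str return type, the result is not tied to the borrow of the argument, so it can be used after `z` is dropped.

```rust
// Hypothetical variant of `junk`: the return type is now &'static str,
// so the result is no longer connected to the lifetime of `_imp`.
fn junk_static(_imp: &str) -> &'static str {
    "abc"
}

// Hypothetical helper mirroring the structure of `main` above.
fn demo_static() -> &'static str {
    let d;
    {
        let z = String::from("abc");
        d = junk_static(&z); // `z` is borrowed only for the duration of the call
    } // `z` is dropped here, but `d` does not borrow from it
    d
}
```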

> From the perspective of the borrow checker, both x and y will be borrowed for the lifetime 'a, regardless of what we actually choose to return.

That sounds right.

@cuviper
Member

cuviper commented Apr 8, 2019

> print!("{}", d); // This use means that lifetime must extend here, but it can't.

No, you don't have the power to "extend" the lifetime like this.

Saying "at least" when we return &'a T does not mean that the implementor has to return something that lives as long as the caller wants beyond 'a. Rather, it means that the implementor can choose to return anything it wants that lives for 'a or longer. That could be something that lives for exactly 'a, like returning the input references back, or it could be something 'static. Either way is fine, but the caller only gets to use it for 'a regardless, as that's the only contract.

@cuviper
Member

cuviper commented Apr 8, 2019

> @cuviper You're right that, if the only constraint were that the lifetime must bound it from above, the value could be immediately dead. But the other constraint is that its use bounds it from below. They both must be satisfied, and that's where errors can come in.

Consider a different function then, with no inputs:

fn get<'a>() -> &'a str { "foo" }

This function has to return something that lives for at least 'a, with no input bounds apart from the type parameter.

In fact, the only thing we can logically return in this case is a &'static str that is cast by variance to &'a str, since an arbitrary 'a includes 'static itself. That input lifetime might make a difference though if we were returning some other invariant type that needed a lifetime.
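As a rough illustration of why only a &'static str can be returned here, the sketch below calls `get` with two different choices of 'a. The helper `demo_get` is an assumption added for this example; the caller picks 'a each time, and one valid pick is 'static itself.

```rust
fn get<'a>() -> &'a str {
    "foo"
}

// Hypothetical helper: the caller chooses 'a at each call site.
fn demo_get() -> (usize, &'static str) {
    // Here 'a can be a short region inside this function ...
    let short: &str = get();
    let n = short.len();
    // ... and here the caller explicitly asks for 'a = 'static.
    let forever: &'static str = get();
    (n, forever)
}
```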

@cuviper
Member

cuviper commented Apr 8, 2019

One more:

fn choose<'a, 'b: 'a, T>(a: &'a T, b: &'b T) -> &'a T {
    if flip_a_coin() { a } else { b }
}

We can return either reference, because 'b outlives 'a, so it lives at least as long as the required 'a. The caller can only use the return value for 'a, since that's the minimum guaranteed extent.

But if we had written it the other way, 'a: 'b such that 'b is at most 'a, then we wouldn't be allowed to return b from this function.
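A compilable sketch of `choose`, assuming a hypothetical `pick_first` flag in place of `flip_a_coin()`, which the thread never defines. The helper `demo_choose` passes a short-lived value for `a` and a longer-lived one for `b`, which satisfies 'b: 'a.

```rust
// `pick_first` is a hypothetical stand-in for `flip_a_coin()`.
fn choose<'a, 'b: 'a, T>(a: &'a T, b: &'b T, pick_first: bool) -> &'a T {
    // Both branches are fine: `b` lives for 'b, which outlives 'a.
    if pick_first { a } else { b }
}

// Hypothetical helper: `long_lived` outlives `short_lived`.
fn demo_choose() -> (i32, i32) {
    let long_lived = 1;
    let x;
    let y;
    {
        let short_lived = 2;
        // The results are copied out, so nothing borrowed escapes this block.
        x = *choose(&short_lived, &long_lived, true);
        y = *choose(&short_lived, &long_lived, false);
    }
    (x, y)
}
```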

@mulkieran
Contributor Author

> > print!("{}", d); // This use means that lifetime must extend here, but it can't.
>
> No, you don't have the power to "extend" the lifetime like this.
>
> Saying "at least" when we return &'a T does not mean that the implementor has to return something that lives as long as the caller wants beyond 'a. Rather, it means that the implementor can choose to return anything it wants that lives for 'a or longer. That could be something that lives for exactly 'a, like returning the input references back, or it could be something 'static. Either way is fine, but the caller only gets to use it for 'a regardless, as that's the only contract.

@cuviper You're definitely getting my meaning wrong here. I thought of a new formulation that I like better: " 'a can be any lifetime you want, so long as it does not exceed the scope of the parameter and it encompasses all the uses of the return value." And if you can't find a lifetime that simultaneously satisfies those two requirements, then the borrow checker will complain (or else the compiler is broken).

@cuviper
Member

cuviper commented Apr 8, 2019

> 'a can be any lifetime you want, so long as it does not exceed the scope of the parameter and it encompasses all the uses of the return value.

Who is "you" in this scenario, the caller or the callee?

When the caller provides a parameter with lifetime 'a, they must guarantee that the parameter will in fact live at least as long as the region defined by 'a. When the caller receives a return value with a lifetime of 'a, they may also assume that this value will survive for at least the region 'a, but can't use it any longer than that. Since 'a is an input type parameter, the caller does get to define the scope of that region 'a, which may be really complicated with NLL.

When the callee (the implementor of the function) receives a parameter with lifetime 'a, they may assume that the parameter lives at least as long as 'a, but can't do anything with it longer than that. When the callee returns a value with lifetime 'a, the value provided must be something known to live at least as long as 'a. The callee has no control over 'a, but it can choose to return something based on an input 'a to match, or something known to live longer ('b: 'a or 'static) through variance.

From either perspective, the lifetimes are still always "at least". I suppose the "at most" might come in where I said, "but can't use it any longer than that." It lives at least that long, but it's unknown whether it could actually live longer, so that's the most you can use it.

@mulkieran
Contributor Author

> > @cuviper You're right that, if the only constraint were that the lifetime must bound it from above, the value could be immediately dead. But the other constraint is that its use bounds it from below. They both must be satisfied, and that's where errors can come in.
>
> Consider a different function then, with no inputs:
>
> fn get<'a>() -> &'a str { "foo" }
>
> This function has to return something that lives for at least 'a, with no input bounds apart from the type parameter.
>
> In fact, the only thing we can logically return in this case is a &'static str that is cast by variance to &'a str, since an arbitrary 'a includes 'static itself. That input lifetime might make a difference though if we were returning some other invariant type that needed a lifetime.

Here, I'd say that 'a is not bounded from above, because it is not bound to a parameter, just to the return value. So it must be big enough for all uses of the return value, and that is the only constraint. Such a constraint ought always to be satisfiable.

To implement a function like that you can't just make a String and return a reference, because you'll get a borrowing error in the implementation of the function, but a Box seems to compile: fn get<'a>() -> &'a str { &Box::new("foo")}. I find that odd, shouldn't the Box just be going out of scope? Maybe it just gets compiled away, and the borrow checker sees only a static string.

@cuviper
Member

cuviper commented Apr 8, 2019

> fn get<'a>() -> &'a str { &Box::new("foo")}

That makes &Box<&'static str>, which will auto-deref to &'static str and cast by variance to &'a str. The Box will just drop as you return. &Box::new("foo".to_string()) would be a problem though.
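The same idea can be spelled out step by step. This is a sketch under the assumption that copying the inner reference out of the box by hand is equivalent to what the auto-deref does; the name `get_via_box` is hypothetical.

```rust
// Hypothetical, more explicit version of the Box example above.
fn get_via_box<'a>() -> &'a str {
    let boxed: Box<&'static str> = Box::new("foo");
    // &'static str is Copy, so this copies the reference out of the box.
    let inner: &'static str = *boxed;
    // `boxed` is dropped when the function returns, but `inner` never
    // pointed into the box: it points at the string literal.
    inner
}
```

With `Box::new("foo".to_string())` there is no &'static str inside to copy out, which matches the remark that this case would be a problem.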

@mulkieran
Contributor Author

> > 'a can be any lifetime you want, so long as it does not exceed the scope of the parameter and it encompasses all the uses of the return value.
>
> Who is "you" in this scenario, the caller or the callee?

The caller.

> When the caller provides a parameter with lifetime 'a, they must guarantee that the parameter will in fact live at least as long as the region defined by 'a. When the caller receives a return value with a lifetime of 'a, they may also assume that this value will survive for at least the region 'a, but can't use it any longer than that. Since 'a is an input type parameter, the caller does get to define the scope of that region 'a, which may be really complicated with NLL.

Notice how 'a sneakily became an upper bound on the return value in your paragraph above. "but can't use it any longer than that".

<-- SNIP -->

> From either perspective, the lifetimes are still always "at least". I suppose the "at most" might come in where I said, "but can't use it any longer than that." It lives at least that long, but it's unknown whether it could actually live longer, so that's the most you can use it.

Aah, yeah, you noticed that too.

I really think this is an existence property. When I write the code that calls the function and expect it to pass the borrow checker, I'm saying to the compiler: "There exists some lifetime that does not exceed the scope of the parameter and encompasses all the uses of the return value. Please find any lifetime that fulfills those requirements and trouble me no further."

@cuviper
Member

cuviper commented Apr 8, 2019

OK, I'm not sure where to go from here. I still think the original line is entirely correct:

> The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

Changing that to "at most" is incorrect, because the function really is required to fulfill 'a, no less. Maybe there could be a following statement to say that this becomes an upper bound on how long the returned reference may be used.

@cuviper
Member

cuviper commented Apr 8, 2019

It's kind of like how clamping a minimum value uses the max function. The duality is a bit weird.

@mulkieran
Contributor Author

I don't mind if this is closed.

I got what I wanted, which was a stimulating discussion about lifetimes, which should go some way to solidifying my understanding, and making me more facile with the tricky stuff. So, thanks!

I still think the wording is incorrect and misleading. But I also feel that the Rust book is always going to struggle to find the balance between chatty and friendly on the one hand, and precisely correct on the other. As a former academic, I lean way toward the precisely correct side. So in future I'm going to confine myself to PRs and issues regarding the not-so-chatty-and-friendly Rust docs, like the man pages and the compiler error explanations. And that will be perfectly satisfying to me.

@mulkieran
Contributor Author

> One more:
>
> fn choose<'a, 'b: 'a, T>(p: &'a T, q: &'b T) -> &'a T {
>     if flip_a_coin() { p } else { q }
> }
>
> We can return either reference, because 'b outlives 'a, so it lives at least as long as the required 'a. The caller can only use the return value for 'a, since that's the minimum guaranteed extent.
>
> But if we had written it the other way, 'a: 'b such that 'b is at most 'a, then we wouldn't be allowed to return q from this function.

These are just notes to self, really. I realized that thinking about these things would be easier if the names weren't overloaded, so I'm going to throw in a few names for the arguments to the function, P and Q, as well as changing the names of the parameters to p and q for clarity. Also, two concrete lifetimes, A and B, so as not to confuse them w/ the lifetime variables 'a and 'b.

When I write the code and call the function choose(&P, &Q) and expect it to pass the borrow checker, I'm saying: "There exist two lifetimes B and A such that

  • (A does not exceed the scope of P AND A encompasses all the uses of the return value) AND
  • B is at least as large as A AND
  • B does not exceed the scope of Q".

Also, there is an implicit lower bound on both lifetimes, which is that they must be at least as large as the call site.

If I were the compiler, tasked with finding these two lifetimes, I would find any that satisfy 'a's requirements first. Then I would see if there was any one of these possible 'a lifetimes that was contained within the scope of Q. If I find one, then I've found a valid 'b lifetime, and I'm done.

If the return type were bounded by 'b, then I'd be saying to the compiler:
"There exist two lifetimes B and A such that

  • A does not exceed the scope of P AND
  • B is at least as large as A AND
  • (B does not exceed the scope of Q AND B encompasses all the uses of the return value)".

As the compiler, I would first look for possible lifetimes to satisfy 'b's requirements. Then, I would be done, because the constraints on 'a are both from above, so I could maintain that A was just the call site, and nothing more.

A reasonable and compilable function can be constructed w/ either lifetime signature.

But the lifetimes do constrain the implementation of the function. If the return type is bound by 'a, then either argument can be returned, but if by 'b, only q. This feels like one of those things that on the face of it is so obvious that it is way too hard to explain, so I won't try.
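A small sketch of the second signature, with the return type bound by 'b. The names `choose_q` and `demo_choose_q` are hypothetical. Only `q` can be returned, and in exchange the caller may keep the result after the argument passed as `p` has gone out of scope.

```rust
// Hypothetical variant of `choose` whose return type is bound by 'b.
fn choose_q<'a, 'b: 'a, T>(_p: &'a T, q: &'b T) -> &'b T {
    // Returning `_p` here would be rejected: 'a is not known to outlive 'b.
    q
}

// Hypothetical helper: the result outlives the value passed as `_p`.
fn demo_choose_q() -> i32 {
    let q_val = 10;
    let r;
    {
        let p_val = 20;
        r = choose_q(&p_val, &q_val);
    } // `p_val` is dropped here, but `r` borrows only from `q_val`
    *r
}
```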

I'm more and more confident that this is a correct formulation, but I'm also aware that this is far too technical and mathematical to be appropriate for the book.

Closing...

@olalonde

olalonde commented Jul 14, 2022

FWIW I feel like @mulkieran is correct here. It seems that @cuviper is looking at fn longest<'a>(x: &'a str, y: &'a str) -> &'a str from the perspective of an implementor, which is weird to me... You first write your function then you write the right annotations, not the other way around (unless you already have the function written in your head).

The function signature is mostly of interest to the caller. It tells you that the variable which receives the return value must have a lifetime at most 'a where 'a is the overlap of the lifetimes of the 'a parameters. What happens inside the function really is irrelevant to the caller.

I believe the sentence is confusing because it is talking in terms of constraints that the implementer of the function has:

> The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

Again, this is a weird thing to explain and doesn't help understand what the borrow checker does to verify that the return value is not used beyond where it is allowed to.

@cuviper
Member

cuviper commented Jul 14, 2022

I stand by my position even from the perspective of the caller. The alternative, if you tell the caller that "this return value lives for at most 'a", leaves the caller in a bad place not knowing how long they can use that lifetime. It implies that something less would be possible -- so when is it safe to use at all?

For the implementor, "at least 'a" is setting a minimum bar. The actual reference could be something that lives longer like 'static (assuming variance is allowed), but it can't be any less than 'a.

For the caller, when we say it lives "at least 'a", that means they can use it at any point up to the end of 'a, but no more. It might actually live longer, but not necessarily, so you can't touch it after that. I guess you could frame this two ways: the return value will live at least 'a, so then you can use it at most 'a.

@olalonde

olalonde commented Jul 14, 2022

> I stand by my position even from the perspective of the caller. The alternative, if you tell the caller that "this return value lives for at most 'a", leaves the caller in a bad place not knowing how long they can use that lifetime. It implies that something less would be possible -- so when is it safe to use at all?

I genuinely have trouble understanding what you are getting at here... As a caller, you are only worried about how long you can hold on to the return value. You can't possibly have a lifetime that is too short, only one that is too long.

let x = shortest("foo", "bazz");
// hmmm is x's lifetime too short?
// let's make it a bit longer...
// and a bit more...
// ok, might be safe to use now, lifetime seems long enough

> For the caller, when we say it lives "at least 'a", that means they can use it at any point up to the end of 'a, but no more.

Well, in that case, all confusion is cleared...

I still feel that at most 'a would convey that meaning better but it might just be me.

@cuviper
Member

cuviper commented Jul 14, 2022

> I genuinely have trouble understanding what you are getting at here... How can the lifetime of the caller be too short?
>
> let x = shortest("foo", "bazz");

The caller isn't really responsible for the lifetime of the reference -- that was shortest's job. The caller makes the request, "shortest, please give me a reference that lives at least 'a". The compiler determines what 'a should be from those input parameters and from how the result is actually used.

The caller also has their own borrow checking, but that's a little different. Consider:

let x;
{
    let foo = String::from("foo");
    let bazz = String::from("bazz");
    x = shortest(&foo, &bazz);
    // 'a is limited by the input parameters
    // x can still be used here
    dbg!(x);
}
dbg!(x); // here is an error that foo and bazz don't live long enough
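For contrast, here is a variant that compiles. The thread never shows `shortest` itself, so the implementation below is an assumption, as is the helper `demo_shortest`. The borrowed result is used only while `foo` and `bazz` are alive, and an owned copy is what leaves the block.

```rust
// Hypothetical implementation; the thread never defines `shortest`.
fn shortest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() <= y.len() { x } else { y }
}

// Hypothetical helper: the reference stays inside the inner block.
fn demo_shortest() -> String {
    let owned;
    {
        let foo = String::from("foo");
        let bazz = String::from("bazz");
        let x = shortest(&foo, &bazz);
        // Copy the data out while `foo` and `bazz` are still alive.
        owned = x.to_string();
    } // `foo` and `bazz` are dropped here; `x` is no longer in use
    owned
}
```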

@olalonde

olalonde commented Jul 14, 2022

I think part of the misunderstanding was that I think about lifetimes in this order:

  1. determine the lifetimes of the variables by looking at their scope
  2. check that the lifetimes don't violate any constraints

For example, here I would think to myself:

  1. Ok the lifetime of x goes from x = ... to the last line.
  2. Are there any constraints on x? Oh yeah look, the return type of that function is 'a, x can't extend beyond 'a (again for me it's more natural to describe this as at most 'a although it appears we mean the same thing)! But what is 'a? Oh look, it's the lifetime where the lifetimes of foo and bazz overlap. Well, oops, the lifetime of x extends beyond that, it is therefore invalid.

@olalonde

After sleeping on it, I think I now better understand what is meant by "the return value lives at least as long as lifetime 'a". What it really means is "the return value lives at least as long as lifetime 'a (but we are not sure it lives beyond, so we can't assume it does)". For me, this was what at most was conveying but I understand how it could be confusing depending on perspective.

@cuviper
Member

cuviper commented Jul 15, 2022

I think I get the disconnect -- you see "the return value lives at least as long" and you're connecting that to x itself, making it a statement about how long x lives. What we really mean is that the "stuff" inside the return value (especially data that it borrows) will live at least as long as 'a, and then yes that means x can only live at most 'a as the contracted lifetime.

The specific quote targeted by this PR does talk about the data rather than the "return value" though -- "The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a."

@fubupc

fubupc commented Aug 31, 2022

> I think I get the disconnect -- you see "the return value lives at least as long" and you're connecting that to x itself, making it a statement about how long x lives. What we really mean is that the "stuff" inside the return value (especially data that it borrows) will live at least as long as 'a, and then yes that means x can only live at most 'a as the contracted lifetime.

Yes, that's exactly what confused me: "lifetime" seems to have two meanings depending on the context:

  1. the scope of the reference itself (more accurately, from the reference's declaration to the point where the reference is last used)
  2. the scope of the referent (from the value's creation to its destruction)
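A small sketch of the two scopes, using a hypothetical helper `demo_scopes`. The reference `r` is live only from its declaration to its last use, while the referent `owner` exists for longer and can be moved once no borrow is in use.

```rust
// Hypothetical helper contrasting the two scopes described above.
fn demo_scopes() -> usize {
    let owner = String::from("hello"); // the referent's scope starts here
    let len;
    {
        let r: &str = &owner; // the reference is declared here
        len = r.len(); // last use of `r`; the borrow is not needed after this
    }
    let moved = owner; // allowed: no borrow of `owner` is still in use
    len + moved.len()
}
```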

> The function signature also tells Rust that the string slice returned from the function will live at least as long as lifetime 'a.

In this sentence, it seems that "the string slice" (just a reference) should be the String referred to by the string slice (the actual data)?
