Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve HRTB error span when -Zno-leak-check is used #63678

Open
wants to merge 1 commit into
base: master
from

Conversation

@Aaron1011
Copy link
Contributor

commented Aug 18, 2019

As described in #57374, NLL currently produces unhelpful higher-ranked
trait bound (HRTB) errors when '-Zno-leak-check' is enabled.

This PR tackles one half of this issue - making the error message point
at the proper span. The error message itself is still the very generic
"higher-ranked subtype error", but this can be improved in a follow-up
PR.

The root cause of the bad spans lies in how NLL attempts to compute the
'blamed' region, for which it will retrieve a span for.
Consider the following code, which (correctly) does not compile:

let my_val: u8 = 25;
let a: &u8 = &my_val;
let b = a;
let c = b;
let d: &'static u8 = c;

This will cause NLL to generate the following subtype constraints:

d :< c
c :< b
b <: a

Since normal Rust lifetimes are covariant, this results in the following
region constraints (I'm using 'd to denote the lifetime of 'd',
'c to denote the lifetime of 'c, etc.):

'c: 'd
'b: 'c
'a: 'b

From this, we can derive that 'a: 'd holds, which implies that 'a: 'static
must hold. However, this is not the case, since 'a refers to 'my_val',
which does not outlive the current function.

When NLL attempts to infer regions for this code, it will see that the
region 'a has grown 'too large' - it will be inferred to outlive
'static, despite the fact that is not declared as outliving 'static
We can find the region responsible, 'd, by starting at the end of
the 'constraint chain' we generated above. This works because for normal
(non-higher-ranked) lifetimes, we generally build up a 'chain' of
lifetime constraints away from the original variable/lifetime.
That is, our original lifetime 'a is required to outlive progressively
more regions. If it ends up living for too long, we can look at the
'end' of this chain to determine the 'most recent' usage that caused
the lifetime to grow too large.

However, this logic does not work correctly when higher-ranked trait
bounds (HRTBs) come into play. This is because HRTBs have
contravariance with respect to their bound regions. For example,
this code snippet compiles:

let a: for<'a> fn(&'a ()) = |_| {};
let b: fn(&'static ()) = a;

Here, we require that 'a' is a subtype of 'b'. Because of
contravariance, we end up with the region constraint 'static: 'a,
not 'a: 'static

This means that our 'constraint chains' grow in the opposite direction
of 'normal lifetime' constraint chains. As we introduce subtypes, our
lifetime ends up being outlived by other lifetimes, rather than
outliving other lifetimes. Therefore, starting at the end of the
'constraint chain' will cause us to 'blame' a lifetime close to the original
definition of a variable, instead of close to where the bad lifetime
constraint is introduced.

This PR improves how we select the region to blame for 'too large'
universal lifetimes, when bound lifetimes are involved. If the region
we're checking is a 'placeholder' region (e.g. the region 'a' in
for<'a>, or the implicit region in fn(&())), we start traversing the
constraint chain from the beginning, rather than the end.

There are two (maybe more) different ways we generate region constraints for NLL:
requirements generated from trait queries, and requirements generated
from MIR subtype constraints. While the former always use explicit
placeholder regions, the latter is more tricky. In order to implement
contravariance for HRTBs, TypeRelating replaces placeholder regions with
existential regions. This requires us to keep track of whether or not an
existential region was originally a placeholder region. When we look for
a region to blame, we check if our starting region is either a
placeholder region or is an existential region created from a
placeholder region. If so, we start iterating from the beginning of the
constraint chain, rather than the end.

Improve HRTB error span when -Zno-leak-check is used
As described in #57374, NLL currently produces unhelpful higher-ranked
trait bound (HRTB) errors when '-Zno-leak-check' is enabled.

This PR tackles one half of this issue - making the error message point
at the proper span. The error message itself is still the very generic
"higher-ranked subtype error", but this can be improved in a follow-up
PR.

The root cause of the bad spans lies in how NLL attempts to compute the
'blamed' region, for which it will retrieve a span for.
Consider the following code, which (correctly) does not compile:

```rust
let my_val: u8 = 25;
let a: &u8 = &my_val;
let b = a;
let c = b;
let d: &'static u8 = c;
```

This will cause NLL to generate the following subtype constraints:

d :< c
c :< b
b <: a

Since normal Rust lifetimes are covariant, this results in the following
region constraints (I'm using 'd to denote the lifetime of 'd',
'c to denote the lifetime of 'c, etc.):

'c: 'd
'b: 'c
'a: 'b

From this, we can derive that 'a: 'd holds, which implies that 'a: 'static
must hold. However, this is not the case, since 'a refers to 'my_val',
which does not outlive the current function.

When NLL attempts to infer regions for this code, it will see that the
region 'a has grown 'too large' - it will be inferred to outlive
'static, despite the fact that is not declared as outliving 'static
We can find the region responsible, 'd, by starting at the *end* of
the 'constraint chain' we generated above. This works because for normal
(non-higher-ranked) lifetimes, we generally build up a 'chain' of
lifetime constraints *away* from the original variable/lifetime.
That is, our original lifetime 'a is required to outlive progressively
more regions. If it ends up living for too long, we can look at the
'end' of this chain to determine the 'most recent' usage that caused
the lifetime to grow too large.

However, this logic does not work correctly when higher-ranked trait
bounds (HRTBs) come into play. This is because HRTBs have
*contravariance* with respect to their bound regions. For example,
this code snippet compiles:

```rust
let a: for<'a> fn(&'a ()) = |_| {};
let b: fn(&'static ()) = a;
```

Here, we require that 'a' is a subtype of 'b'. Because of
contravariance, we end up with the region constraint 'static: 'a,
*not* 'a: 'static

This means that our 'constraint chains' grow in the opposite direction
of 'normal lifetime' constraint chains. As we introduce subtypes, our
lifetime ends up being outlived by other lifetimes, rather than
outliving other lifetimes. Therefore, starting at the end of the
'constraint chain' will cause us to 'blame' a lifetime close to the original
definition of a variable, instead of close to where the bad lifetime
constraint is introduced.

This PR improves how we select the region to blame for 'too large'
universal lifetimes, when bound lifetimes are involved. If the region
we're checking is a 'placeholder' region (e.g. the region 'a' in
for<'a>, or the implicit region in fn(&())), we start traversing the
constraint chain from the beginning, rather than the end.

There are two (maybe more) different ways we generate region constraints for NLL:
requirements generated from trait queries, and requirements generated
from MIR subtype constraints. While the former always use explicit
placeholder regions, the latter is more tricky. In order to implement
contravariance for HRTBs, TypeRelating replaces placeholder regions with
existential regions. This requires us to keep track of whether or not an
existential region was originally a placeholder region. When we look for
a region to blame, we check if our starting region is either a
placeholder region or is an existential region created from a
placeholder region. If so, we start iterating from the beginning of the
constraint chain, rather than the end.
@rust-highfive

This comment has been minimized.

Copy link
Collaborator

commented Aug 18, 2019

r? @eddyb

(rust_highfive has picked a reviewer for you, use r? to override)

@Aaron1011

This comment has been minimized.

Copy link
Contributor Author

commented Aug 18, 2019

@rust-highfive rust-highfive assigned nikomatsakis and unassigned eddyb Aug 18, 2019

@Aaron1011

This comment has been minimized.

Copy link
Contributor Author

commented Aug 21, 2019

When I wrote tis PR description, I incorrectly believed that contravariance applied to all HRTBs - it actually only applys to functions. However, I believe that my reasoning about 'constraint chains' is still correct, due to the way that we generate outlives requirements when HRTBs are involved.

@Centril

This comment has been minimized.

Copy link
Member

commented Aug 22, 2019

@Aaron1011 That's because the terminology is misleading here. for<'a> fn(&'a u8) does not involve so called "Higher Rank Trait Bounds" because there are no trait bounds involved. Rather, for<'a> fn(&'a u8) is rank-1 polymorphic type and fn(for<'a> fn(&'a u8)) is a rank-2 polymorphic type (making it higher ranked).

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Aug 30, 2019

@Aaron1011 thanks for the detailed PR description. Very helpful. I do have a few minor nits or clarifications regarding terminology.

As you noted, variance and higher-ranked aren't really connected. Variance really depends on the structure of a type: so if you have fn(T), it is contravariant with respect to T, even if no higher-ranked things are involved at all.

Second, you use placeholders here to refer to multiple things. I try to use placeholders in one specific way: to refer to a free (not bound) lifetime that is meant to represent "any lifetime". Indeed when you have a type for<'a> fn(&'a u32), we will (at times) use placeholders to represent 'a, but the 'a in that type is not a placeholder -- it's a bound lifetime.

Part of the reason to make that distinction is precisely because, in some cases, we replace bound lifetimes with existential lifetimes (as you noted) and in some cases we replace them with placeholders. Which one we use will depend on where the binder appears. For example here, where the binder is in the subtype:

for<'a> fn(&'a u32) <: fn(&'static u32)

we would replace 'a with an existential region. This is because the function has said it accepts a parameter with any lifetime -- therefore, we can convert it to a function fn(&'b u32) for any single lifetime 'b. So we use an inference variable to mean "find the correct lifetime to replace the bound lifetime 'a with".

In contrast, a binder in the supertype position means something else:

fn(&'static u32) <: for<'a>fn(&'a u32)

Here we are converting into a type with a binder. In that case, we replace 'a with a placeholder, indicating that the type we are converting in must work with any lifetime we could pick (which, in this case, is not true, as it requires specifically 'static).

If you combine those things you get that (for example):

for<'a> fn(&'a u32) <: for<'b> fn(&'b u32)

because the variable we create for 'a is inferred to be equal to the placeholder we created for 'b (and universes, in turn, tell us that the variable for 'a can indeed be equal to that placeholder).

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Aug 30, 2019

One other thing:

d :< c

the inverse of <: is typically written d :> c, though usually I prefer to just write c <: d =)

In any case I think either :> or >: would be ok, but it's important to use >, since the idea is that the "subtype" is "smaller" than the "supertype" (as a subtype represents a narrower range of values).

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Aug 30, 2019

Anyway, I've scheduled this PR review for September 4th, since I don't think I'll get to it today and it looks like it's going to take some concentration. @Aaron1011 I'll try to answer your questions on Zulip then =)

@Aaron1011

This comment has been minimized.

Copy link
Contributor Author

commented Aug 31, 2019

@nikomatsakis: Thanks for the feedback! I still don't have a very strong grasp of higher-ranked lifetimes, so I'm sorry if my terminology is unclear.

When I was referring to 'placeholders', I meant to refer specifically to NLLRegionVariableOrigin::Placeholder. As you mentioned, these can get replaced with existential lifetimes, but (I think) it makes sense for us to treat them like placeholder regions for the purposes of determining which span to blame.

@nikomatsakis
Copy link
Contributor

left a comment

Hey @Aaron1011! This is good stuff. I left some comments and some questions. I'm pretty happy and mostly looking to leave a few comments and try to tweak the names of things. I think I see why this change results in better results, but I realized as I was trying to write it out that I didn't quite see, and hence I'm doing a local build to try and dig a bit through the logs.

Thanks for looking at this!

@@ -410,15 +410,17 @@ pub enum NLLRegionVariableOrigin {
/// from a `for<'a> T` binder). Meant to represent "any region".
Placeholder(ty::PlaceholderRegion),

Existential,
Existential {
was_placeholder: bool

This comment has been minimized.

Copy link
@nikomatsakis

nikomatsakis Sep 4, 2019

Contributor

I don't think this is quite the right name, per my prior comments. This existential didn't come from a placeholder, or at least not placeholder in the (rather narrow) sense that I aim to use the term. I'm trying to decide what name I would prefer, though. Perhaps something like from_hr_binders? I guess that's pretty obscure. Perhaps from_forall or from_for_binder or just from_for? It would then include a comment like:

Existential {
    /// If this is true, then this variable was created to represent a lifetime
    /// bound in a `for` binder. For example, it might have been created to
    /// represent the lifetime `'a` in a type like `for<'a> fn(&'a u32)`.
    /// Such variables are created when we are trying to figure out if there
    /// is any valid instantiation of `'a` that could fit into some scenario.
    ///
    /// This is used to inform error reporting: in the case that we are trying to
    /// determine whether there is any valid instantiation of a `'a` variable that meets
    /// some constraint C, we want to blame the "source" of that `for` type, 
    /// rather than blaming the source of the constraint C.
    from_forall: bool,
}

This comment has been minimized.

Copy link
@nikomatsakis

nikomatsakis Sep 4, 2019

Contributor

The reason for me being persnickety here is that there is a subtle distinction between the region in a for<'a> binder and the placeholder that we (sometimes) instantiate it with, and I'd prefer not to blur those lines, since it can be hard enough to explain as is.

let constraint = path[i];
let mut range = 0..path.len();

let should_reverse = match from_region_origin {

This comment has been minimized.

Copy link
@nikomatsakis

nikomatsakis Sep 4, 2019

Contributor

I think we should call this blame_source, and we should add a comment like this:


As noted above, when reporting an error, there is typically a chain of constraints leading from some "source" region to some "target" region. In most cases, we prefer to "blame" the constraints closer to the target -- but there is one exception. When constraints arise from higher-ranked subtyping, we generally prefer to blame the source value, as the "target" in this case tends to be some type annotation that the user gave. Therefore, if we find that the region origin is some instantiation of a higher-ranked region, we start our search from the "source" point rather than the "target", and we also tweak a few other things.

An example might be this bit of Rust code:

let x: fn(&'static ()) = |_| {};
let y: for<'a> fn(&'a ()) = x;

We will detect the error in this code as a conflict between a placeholder region (representing 'a) and 'static. We would prefer to blame the source of the 'static value (x) than the type annotation.

true
}
NLLRegionVariableOrigin::Placeholder(_)
| NLLRegionVariableOrigin::Existential { was_placeholder: true } => {

This comment has been minimized.

Copy link
@nikomatsakis

nikomatsakis Sep 4, 2019

Contributor

I'm going to test this myself, but maybe you know the answer -- what changes when was_placeholder is ignored here? which tests?

This comment has been minimized.

Copy link
@Aaron1011

Aaron1011 Sep 4, 2019

Author Contributor

Some of the new tests should start failing. I can't remember which of them actually triggered this logic, but any test that causes us to replace a NLLRegionVariableOrigin::Placeholder with a NLLRegionVariableOrigin::Existential will have an incorrect error span.

@JohnCSimon

This comment has been minimized.

Copy link

commented Sep 14, 2019

Ping from triage
@Aaron1011 Just letting you know this hasn't been touched in 10 days.
cc: @nikomatsakis

Thanks!

@bors

This comment has been minimized.

Copy link
Contributor

commented Sep 17, 2019

☔️ The latest upstream changes (presumably #64535) made this pull request unmergeable. Please resolve the merge conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.