Auto merge of #31394 - nikomatsakis:incr-comp-variance, r=pnkfelix
Make the dep. graph edges created by variance just mirror the constraint graph.

Note that this extends <#31304>, so the first few commits are on a different topic.

r? @pnkfelix
bors committed Feb 18, 2016
2 parents f075698 + 01ebc37 commit 8e2a577
Showing 9 changed files with 1,422 additions and 1,261 deletions.
12 changes: 11 additions & 1 deletion src/librustc/front/map/mod.rs
@@ -325,7 +325,17 @@ impl<'ast> Map<'ast> {
                     return DepNode::Krate,
 
                 NotPresent =>
-                    panic!("Walking parents from `{}` led to `NotPresent` at `{}`", id0, id),
+                    // Some nodes, notably struct fields, are not
+                    // present in the map for whatever reason, but
+                    // they *do* have def-ids. So if we encounter an
+                    // empty hole, check for that case.
+                    return self.opt_local_def_id(id)
+                               .map(|def_id| DepNode::Hir(def_id))
+                               .unwrap_or_else(|| {
+                                   panic!("Walking parents from `{}` \
+                                           led to `NotPresent` at `{}`",
+                                          id0, id)
+                               }),
             }
         }
     }
1,250 changes: 0 additions & 1,250 deletions src/librustc_typeck/variance.rs

This file was deleted.

302 changes: 302 additions & 0 deletions src/librustc_typeck/variance/README.md
@@ -0,0 +1,302 @@
This file infers the variance of type and lifetime parameters. The
algorithm is taken from Section 4 of the paper "Taming the Wildcards:
Combining Definition- and Use-Site Variance" published in PLDI'11 and
written by Altidor et al., and hereafter referred to as The Paper.

This inference is explicitly designed *not* to consider the uses of
types within code. To determine the variance of type parameters
defined on type `X`, we only consider the definition of the type `X`
and the definitions of any types it references.

We only infer variance for type parameters found on *data types*
like structs and enums. In these cases, there is a fairly straightforward
explanation for what variance means. The variance of the type
or lifetime parameters defines whether `T<A>` is a subtype of `T<B>`
(resp. `T<'a>` and `T<'b>`) based on the relationship of `A` and `B`
(resp. `'a` and `'b`).
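
As a small illustration of what this means for a lifetime parameter, here is a sketch using a hypothetical `Wrapper` type (the type and the `shorten` function are invented for this example and do not appear in the compiler). Because `'a` is only used inside a shared reference, `Wrapper` should be inferred as covariant in `'a`, so a `Wrapper<'static>` can be used where a shorter-lived `Wrapper<'a>` is expected:

```rust
// Hypothetical data type with a single lifetime parameter. `'a` only
// appears inside a shared reference, so `Wrapper` is covariant in `'a`.
struct Wrapper<'a> {
    text: &'a str,
}

// Covariance in action: since `'static: 'a`, we have
// `Wrapper<'static> <: Wrapper<'a>`, and the value can be returned as-is.
fn shorten<'a>(w: Wrapper<'static>) -> Wrapper<'a> {
    w
}

fn main() {
    let long: Wrapper<'static> = Wrapper { text: "hello" };
    let short = shorten(long);
    println!("{}", short.text);
}
```

If the field were `&'a mut &'a str` instead, the parameter would become invariant and `shorten` would no longer type-check.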

We do not infer variance for type parameters found on traits, fns,
or impls. Variance on trait parameters can indeed make sense
(and we used to compute it) but it is actually rather subtle in
meaning and not that useful in practice, so we removed it. See the
addendum for some details. Variance on fn/impl parameters, on the
other hand, doesn't make sense because these parameters are
instantiated and then forgotten; they don't persist in types or
compiled byproducts.

### The algorithm

The basic idea is quite straightforward. We iterate over the types
defined and, for each use of a type parameter X, accumulate a
constraint indicating that the variance of X must be valid for the
variance of that use site. We then iteratively refine the variance of
X until all constraints are met. There is *always* a solution, because at
the limit we can declare all type parameters to be invariant and all
constraints will be satisfied.

As a simple example, consider:

    enum Option<A> { Some(A), None }
    enum OptionalFn<B> { Some(fn(B)), None }
    enum OptionalMap<C> { Some(fn(C) -> C), None }

Here, we will generate the constraints:

1. V(A) <= +
2. V(B) <= -
3. V(C) <= +
4. V(C) <= -

These indicate that (1) the variance of A must be at most covariant;
(2) the variance of B must be at most contravariant; and (3, 4) the
variance of C must be at most covariant *and* contravariant. All of these
results are based on a variance lattice defined as follows:

       *      Top (bivariant)
    -     +
       o      Bottom (invariant)

Based on this lattice, the solution V(A)=+, V(B)=-, V(C)=o is the
optimal solution. Note that there is always a naive solution which
just declares all variables to be invariant.
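
The lattice and the simple constraints above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `Variance`, `glb`, and `solve` are chosen for this example, parameters are identified by plain indices, and every constraint has a constant right-hand side (so one pass is enough and no fixed-point iteration is needed yet):

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Variance {
    Bivariant,     // `*`, top of the lattice
    Covariant,     // `+`
    Contravariant, // `-`
    Invariant,     // `o`, bottom of the lattice
}

use Variance::*;

/// Greatest lower bound (meet) of two points in the variance lattice.
fn glb(a: Variance, b: Variance) -> Variance {
    match (a, b) {
        (Invariant, _) | (_, Invariant) => Invariant,
        (Bivariant, x) | (x, Bivariant) => x,
        (Covariant, Covariant) => Covariant,
        (Contravariant, Contravariant) => Contravariant,
        (Covariant, Contravariant) | (Contravariant, Covariant) => Invariant,
    }
}

/// Start every parameter at the top of the lattice and lower it just
/// enough to satisfy each constraint `V(param) <= bound`.
fn solve(num_params: usize, constraints: &[(usize, Variance)]) -> Vec<Variance> {
    let mut solution = vec![Bivariant; num_params];
    for &(param, bound) in constraints {
        solution[param] = glb(solution[param], bound);
    }
    solution
}

fn main() {
    // Parameter indices (an assumption of this sketch): 0 = A, 1 = B, 2 = C.
    let constraints = [
        (0, Covariant),     // 1. V(A) <= +
        (1, Contravariant), // 2. V(B) <= -
        (2, Covariant),     // 3. V(C) <= +
        (2, Contravariant), // 4. V(C) <= -
    ];
    println!("{:?}", solve(3, &constraints));
}
```

Starting from the top and only ever moving down is what makes the result the optimal (least restrictive) solution rather than the naive all-invariant one.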

You may be wondering why fixed-point iteration is required. The reason
is that the variance of a use site may itself be a function of the
variance of other type parameters. In full generality, our constraints
take the form:

    V(X) <= Term
    Term := + | - | * | o | V(X) | Term x Term

Here the notation V(X) indicates the variance of a type/region
parameter `X` with respect to its defining class. `Term x Term`
represents the "variance transform" as defined in the paper:

If the variance of a type variable `X` in type expression `E` is `V2`
and the definition-site variance of the [corresponding] type parameter
of a class `C` is `V1`, then the variance of `X` in the type expression
`C<E>` is `V3 = V1.xform(V2)`.
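
To make the general constraint form concrete, here is a minimal sketch of the variance transform together with a fixed-point solver over `Term`s. It is an illustration, not the compiler's implementation: the `Term` enum, the index-based parameters, and the helper `xform_term` are assumptions of this example.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Variance {
    Bivariant,     // *
    Covariant,     // +
    Contravariant, // -
    Invariant,     // o
}
use Variance::*;

impl Variance {
    /// `v1.xform(v2)`: the variance of a use that has variance `v2`
    /// inside a position whose own variance is `v1`.
    fn xform(self, v2: Variance) -> Variance {
        match (self, v2) {
            (Covariant, v) => v,
            (Contravariant, Covariant) => Contravariant,
            (Contravariant, Contravariant) => Covariant,
            (Contravariant, v) => v, // invariant and bivariant are unchanged
            (Invariant, _) => Invariant,
            (Bivariant, _) => Bivariant,
        }
    }

    /// Greatest lower bound in the variance lattice.
    fn glb(self, other: Variance) -> Variance {
        match (self, other) {
            (Invariant, _) | (_, Invariant) => Invariant,
            (Bivariant, v) | (v, Bivariant) => v,
            (a, b) if a == b => a,
            _ => Invariant,
        }
    }
}

/// Term := + | - | * | o | V(X) | Term x Term
enum Term {
    Const(Variance),
    Var(usize), // V(X), where X is identified by an index
    Xform(Box<Term>, Box<Term>),
}

fn xform_term(t1: Term, t2: Term) -> Term {
    Term::Xform(Box::new(t1), Box::new(t2))
}

fn eval(term: &Term, solution: &[Variance]) -> Variance {
    match term {
        Term::Const(v) => *v,
        Term::Var(i) => solution[*i],
        Term::Xform(t1, t2) => eval(t1, solution).xform(eval(t2, solution)),
    }
}

/// Lower each `V(X)` until every constraint `V(X) <= Term` holds.
fn solve(num_params: usize, constraints: &[(usize, Term)]) -> Vec<Variance> {
    let mut solution = vec![Bivariant; num_params];
    let mut changed = true;
    while changed {
        changed = false;
        for (param, term) in constraints {
            let lowered = solution[*param].glb(eval(term, &solution));
            if lowered != solution[*param] {
                solution[*param] = lowered;
                changed = true;
            }
        }
    }
    solution
}

fn main() {
    // struct Inner<A> { f: fn(A) }      gives  V(A) <= -
    // struct Outer<B> { g: Inner<B> }   gives  V(B) <= V(A) x +
    const A: usize = 0;
    const B: usize = 1;
    let constraints = [
        (B, xform_term(Term::Var(A), Term::Const(Covariant))),
        (A, Term::Const(Contravariant)),
    ];
    println!("{:?}", solve(2, &constraints));
}
```

The constraint for `B` is listed first on purpose: on the first pass `V(A)` is still bivariant, so `V(B)` is not lowered until a second pass, which is exactly why iteration to a fixed point is needed.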

### Constraints

Given a struct or enum with where clauses:

    struct Foo<T:Bar> { ... }

you might wonder whether the variance of `T` with respect to `Bar`
affects the variance of `T` with respect to `Foo`. I claim no. The
reason: assume that `T` is invariant w/r/t `Bar` but covariant w/r/t
`Foo`. And then we have a `Foo<X>` that is upcast to `Foo<Y>`, where
`X <: Y`. Suppose, however, that while `X : Bar` holds, `Y : Bar` does
not. In that case, the upcast will be illegal, not because of a
variance failure, but rather because the target type `Foo<Y>` is
itself just not well-formed. Basically we get to assume
well-formedness of all types involved before considering variance.

#### Dependency graph management

Because variance works in two phases, if we are not careful, we wind
up with a muddled mess of a dep-graph. Basically, when gathering up
the constraints, things are fairly well-structured, but then we do a
fixed-point iteration and write the results back where they
belong. You can't give this fixed-point iteration a single task
because it reads from (and writes to) the variance of all types in the
crate. In principle, we *could* switch the "current task" in a very
fine-grained way while propagating constraints in the fixed-point
iteration and everything would be automatically tracked, but that
would add some overhead and isn't really necessary anyway.

Instead what we do is to add edges into the dependency graph as we
construct the constraint set: so, if computing the constraints for
node `X` requires loading the inference variables from node `Y`, then
we can add an edge `Y -> X`, since the variance we ultimately infer
for `Y` will affect the variance we ultimately infer for `X`.

At this point, we've basically mirrored the inference graph in the
dependency graph. This means we can just completely ignore the
fixed-point iteration, since it is just shuffling values along this
graph. In other words, if we added the fine-grained switching of tasks
I described earlier, all it would show is that we repeatedly read the
values described by the constraints, but those edges were already
added when building the constraints in the first place.
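
The idea of mirroring the constraint graph can be sketched with a toy dependency graph. Everything here is hypothetical and greatly simplified: `ItemId`, `DepGraph`, and `add_constraints` are stand-ins invented for this example, not the compiler's `DepNode` machinery.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for an item's identity (not rustc's `DefId`).
type ItemId = &'static str;

#[derive(Default)]
struct DepGraph {
    /// An edge `(source, target)` means that a change to `source`
    /// may invalidate `target`.
    edges: BTreeSet<(ItemId, ItemId)>,
}

/// Gather constraints for `item`, whose definition mentions the types in
/// `referenced`. Each reference to `Y` produces a constraint that will read
/// the variance inferred for `Y`, so the edge `Y -> item` is recorded now,
/// while the constraint is being built.
fn add_constraints(graph: &mut DepGraph, item: ItemId, referenced: &[ItemId]) {
    for &other in referenced {
        graph.edges.insert((other, item));
    }
}

fn main() {
    let mut graph = DepGraph::default();
    add_constraints(&mut graph, "Outer", &["Inner"]);
    add_constraints(&mut graph, "Inner", &[]);
    // The later fixed-point iteration only moves values along these
    // edges, so it has nothing new to record.
    println!("{:?}", graph.edges);
}
```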

Here is how this is implemented (at least as of the time of this
writing). The associated `DepNode` for the variance map is currently
`Signature(DefId)`. This means that, in `constraints.rs`,
when we visit an item to load up its constraints, we set
`Signature(DefId)` as the current task (the "memoization" pattern
described in the `dep-graph` README). Then whenever we find an
embedded type or trait, we add a synthetic read of `Signature(DefId)`,
which covers the variances we will compute for all of its
parameters. This read is synthetic (i.e., we call
`variance_map.read()`) because, in fact, the final variance is not yet
computed -- the read *will* occur (repeatedly) during the fixed-point
iteration phase.

In fact, we don't really *need* this synthetic read. That's because we
do wind up looking up the `TypeScheme` or `TraitDef` for all
referenced types/traits, and those reads add an edge from
`Signature(DefId)` (that is, they share the same dep node as
variance). However, I've kept the synthetic reads in place anyway,
just for future-proofing (in case we change the dep-nodes in the
future), and because I think it makes the intention a bit clearer.

### Addendum: Variance on traits

As mentioned above, we used to permit variance on traits. This was
computed based on the appearance of trait type parameters in
method signatures and was used to represent the compatibility of
vtables in trait objects (and also "virtual" vtables or dictionaries
in trait bounds). One complication was that variance for
associated types is less obvious, since they can be projected out
and put to myriad uses, so it's not clear when it is safe to allow
`X<A>::Bar` to vary (or indeed just what that means). Moreover (as
covered below) all inputs on any trait with an associated type had
to be invariant, limiting the applicability. Finally, the
annotations (`MarkerTrait`, `PhantomFn`) needed to ensure that all
trait type parameters had a variance were confusing and annoying
for little benefit.

Just for historical reference, I am going to preserve some text indicating
how one could interpret variance and trait matching.

#### Variance and object types

Just as with structs and enums, we can decide the subtyping
relationship between two object types `&Trait<A>` and `&Trait<B>`
based on the relationship of `A` and `B`. Note that for object
types we ignore the `Self` type parameter -- it is unknown, and
the nature of dynamic dispatch ensures that we will always call a
function that is expecting the appropriate `Self` type. However, we
must be careful with the other type parameters, or else we could
end up calling a function that is expecting one type but provided
another.

To see what I mean, consider a trait like so:

    trait ConvertTo<A> {
        fn convertTo(&self) -> A;
    }

Intuitively, if we had one object `O=&ConvertTo<Object>` and another
`S=&ConvertTo<String>`, then `S <: O` because `String <: Object`
(presuming Java-like "string" and "object" types, my go-to examples
for subtyping). The actual algorithm would be to compare the
(explicit) type parameters pairwise respecting their variance: here,
the type parameter A is covariant (it appears only in a return
position), and hence we require that `String <: Object`.

You'll note though that we did not consider the binding for the
(implicit) `Self` type parameter: in fact, it is unknown, so that's
good. The reason we can ignore that parameter is precisely because we
don't need to know its value until a call occurs, and at that time (as
noted above) the dynamic nature of virtual dispatch means the code we run
will be correct for whatever value `Self` happens to be bound to for
the particular object whose method we called. `Self` is thus different
from `A`, because the caller requires that `A` be known in order to
know the return type of the method `convertTo()`. (As an aside, we
have rules preventing methods where `Self` appears outside of the
receiver position from being called via an object.)

#### Trait variance and vtable resolution

But traits aren't only used with objects. They're also used when
deciding whether a given impl satisfies a given trait bound. To set the
scene here, imagine I had a function:

    fn convertAll<A,T:ConvertTo<A>>(v: &[T]) {
        ...
    }

Now imagine that I have an implementation of `ConvertTo` for `Object`:

    impl ConvertTo<i32> for Object { ... }

And I want to call `convertAll` on an array of strings. Suppose
further that for whatever reason I specifically supply the value of
`String` for the type parameter `T`:

    let mut vector = vec!["string", ...];
    convertAll::<i32, String>(vector);

Is this legal? To put it another way, can we apply the `impl` for
`Object` to the type `String`? The answer is yes, but to see why
we have to expand out what will happen:

- `convertAll` will create a pointer to one of the entries in the
  vector, which will have type `&String`
- It will then call the impl of `convertTo()` that is intended
  for use with objects. This has the type:

      fn(self: &Object) -> i32

  It is ok to provide a value for `self` of type `&String` because
  `&String <: &Object`.

OK, so intuitively we want this to be legal, so let's bring this back
to variance and see whether we are computing the correct result. We
must first figure out how to phrase the question "is an impl for
`Object,i32` usable where an impl for `String,i32` is expected?"

Maybe it's helpful to think of a dictionary-passing implementation of
type classes. In that case, `convertAll()` takes an implicit parameter
representing the impl. In short, we *have* an impl of type:

    V_O = ConvertTo<i32> for Object

and the function prototype expects an impl of type:

    V_S = ConvertTo<i32> for String

As with any argument, this is legal if the type of the value given
(`V_O`) is a subtype of the type expected (`V_S`). So is `V_O <: V_S`?
The answer will depend on the variance of the various parameters. In
this case, because the `Self` parameter is contravariant and `A` is
covariant, it means that:

    V_O <: V_S iff
        i32 <: i32
        String <: Object

These conditions are satisfied and so we are happy.

#### Variance and associated types

Traits with associated types -- or at minimum projection
expressions -- must be invariant with respect to all of their
inputs. To see why this makes sense, consider what subtyping for a
trait reference means:

    <T as Trait> <: <U as Trait>

means that if I know that `T as Trait`, I also know that `U as
Trait`. Moreover, if you think of it as dictionary passing style,
it means that a dictionary for `<T as Trait>` is safe to use where
a dictionary for `<U as Trait>` is expected.

The problem is that when you can project types out from `<T as
Trait>`, the relationship to types projected out of `<U as Trait>`
is completely unknown unless `T==U` (see #21726 for more
details). Making `Trait` invariant ensures that this is true.

Another related reason is that if we didn't make traits with
associated types invariant, then projection is no longer a
function with a single result. Consider:

```
trait Identity { type Out; fn foo(&self); }
impl<T> Identity for T { type Out = T; ... }
```

Now if I have `<&'static () as Identity>::Out`, this can be
validly derived as `&'a ()` for any `'a`:

    <&'a () as Identity> <: <&'static () as Identity>
    if &'static () <: &'a ()   -- Identity is contravariant in Self
    if 'static : 'a            -- Subtyping rules for relations

This change, on the other hand, means that `<&'static () as Identity>::Out`
is always `&'static ()` (which might then be upcast to `&'a ()`,
separately). This was helpful in solving #21750.
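
The resulting behavior can be checked with a small program. This is a sketch built on the `Identity` trait from above (with the `foo` method dropped for brevity); the helper functions `project` and `shorten` are invented for this example:

```rust
trait Identity {
    type Out;
}

impl<T> Identity for T {
    type Out = T;
}

// The projection normalizes to exactly `&'static ()`, so this function
// can return its argument at the `'static` lifetime.
fn project(x: <&'static () as Identity>::Out) -> &'static () {
    x
}

// Shortening the lifetime is a separate subtyping step applied to the
// already-projected `&'static ()`.
fn shorten<'a>(x: <&'static () as Identity>::Out) -> &'a () {
    x
}

fn main() {
    let projected: &'static () = project(&());
    let _shorter: &() = shorten(projected);
}
```

Both functions type-check because projection yields a single, fixed result and the upcast happens afterwards on that result.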

