Enum variant types #2593

Open
wants to merge 14 commits into
base: master
from

@varkor
Member

commented Nov 10, 2018

Enum variants are to be considered types in their own right. This allows them to be irrefutably matched upon. Where possible, type inference will infer variant types, but since variant types may always be treated as enum types, this does not cause any backwards-compatibility issues.

enum Either<A, B> { L(A), R(B) }

fn all_right<A, B>(b: B) -> Either<A, B>::R {
    Either::R(b)
}

let Either::R(b) = all_right::<(), _>(1729);
println!("b = {}", b);

Rendered

Thanks to @Centril for providing feedback on this RFC!

@varkor varkor referenced this pull request Nov 10, 2018

Closed

Types for enum variants #1450

@Centril Centril added the T-lang label Nov 10, 2018

@alexreg

commented Nov 10, 2018

Great work, @varkor. I've been looking forward to this for a long time. Just as a side-point, I'd love to follow this up with an RFC for the ideas in https://internals.rust-lang.org/t/pre-rfc-using-existing-structs-and-tuple-structs-as-enum-variants/7529 once this gets implemented in nightly (or perhaps even before). Since you've worked on this, would appreciate your thoughts at some point.

@bchallenor

commented Nov 10, 2018

Although sum types are becoming increasingly common in programming languages, most do not choose to allow the variants to be treated as types in their own right (that is, the author has not found any that permit this design pattern).

This is possible in Scala - as in your example, Left and Right are subtypes of Either, and can be referred to independently. Coming from Scala, I miss this feature in Rust, and I am fully in favour of this RFC.

@varkor

Member Author

commented Nov 10, 2018

This is possible in Scala - as in your example, Left and Right are subtypes of Either, and can be referred to independently.

Ah, great, I'll add that in, thanks!

@Ixrec

Contributor

commented Nov 10, 2018

I think I'm in favor of the proposed functionality and semantics here. Where I'm stumbling is the nomenclature/terminology/teachability(?); it's not clear to me that "introducing a new kind of type: variant types" is the best description of this. In particular, precisely because this proposal feels so lightweight compared to previous ones, it doesn't really "feel" like what we're doing is adding a whole new type kind the way structural records or anonymous enums would be doing. It sounds like it could be equally well described as doing the "duplicating a variant as a standalone struct" workaround automagically, so those extra structs are just always there (except they get a specific layout guarantee and different conversion syntax that regular structs wouldn't get). Is there some detail I overlooked that makes this clearly not a sugar?

I'm guessing this is at least partially ignorance on my part because

The loose distinction between the enum type and its variant types could be confusing to those unfamiliar with variant types.

makes it sound like "variant types" are an actual thing with their own special properties that no other kinds of types have, and I just have no idea what that would be (since being autogenerated, having a certain layout guarantee and different conversion syntax seem like "surface level" properties that aren't really part of the type system per se). Maybe I just need to see some more examples of how these types behave?
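For reference, the "duplicating a variant as a standalone struct" workaround can be sketched in today's stable Rust roughly as follows. This is only an illustration: the struct name `Right` and the `From` conversion are assumptions made for the example, not part of the RFC text, which would make the extra struct unnecessary.

```rust
// Hypothetical stand-in for the variant type `Either<A, B>::R`.
#[derive(Debug, PartialEq)]
struct Right<B>(B);

#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Either<A, B> {
    L(A),
    R(Right<B>),
}

// By-value conversion from the "variant struct" to the enum.
impl<A, B> From<Right<B>> for Either<A, B> {
    fn from(r: Right<B>) -> Self {
        Either::R(r)
    }
}

// Returns the standalone struct, so callers can destructure it irrefutably.
fn all_right<B>(b: B) -> Right<B> {
    Right(b)
}

fn main() {
    let Right(b) = all_right(1729); // irrefutable: no `match` needed
    println!("b = {}", b);

    let e: Either<(), i32> = all_right(b).into();
    println!("{:?}", e);
}
```

The cost of the workaround is visible here: an extra type, an extra conversion, and a nested pattern (`Either::R(Right(b))`) whenever the enum itself is matched.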

@leonardo-m

commented Nov 10, 2018

Nice RFC.

In all cases, the most specific type (i.e. the variant type if possible) is chosen by the type inference.

Is code like this still allowed, or is the compiler going to tell me that the Sum::B(b) branch of the match is impossible and needs to be removed?

enum Sum { A(u8), B(u8) }
let x = Sum::A(5); // x: Sum::A
match x {
    Sum::A(a) => {},
    Sum::B(b) => {},
}

Both options have advantages and disadvantages.

@varkor

Member Author

commented Nov 10, 2018

Is code like this still allowed, or is the compiler going to tell me that the Sum::B(b) branch of the match is impossible and needs to be removed?

This is a good question; I'll make a note of it in the RFC. Although matching on variant types permits irrefutable matches, it must also accept arms for any other variants of the same enum; otherwise it would not be backwards compatible with existing code.
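To make the compatibility constraint concrete, here is a small sketch of code that compiles on stable Rust today and would therefore have to keep compiling. The function name `payload_of_a` is made up for the example.

```rust
#[allow(dead_code)]
enum Sum {
    A(u8),
    B(u8),
}

fn payload_of_a() -> u8 {
    // Today `x: Sum`; under the RFC it could be inferred as `Sum::A`.
    let x = Sum::A(5);
    match x {
        Sum::A(a) => a,
        // This arm can never be taken here, but existing code contains
        // arms like it, so it must remain legal.
        Sum::B(b) => b,
    }
}

fn main() {
    println!("{}", payload_of_a());
}
```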

Where I'm stumbling is the nomenclature/terminology/teachability(?); it's not clear to me that "introducing a new kind of type: variant types" is the best description of this.

since being autogenerated, having a certain layout guarantee and different conversion syntax seem like "surface level" properties that aren't really part of the type system per se

It's quite possible there's a better way to explain this. They are essentially as you say, though they act slightly differently from structs (on top of the points you made) in the way they are pattern-matched (as above in this comment) and their discriminant value. I thought it would be clearer to describe them as an entirely new kind of type, but perhaps calling them special kinds of structs would be more intuitive as you say. I'll think about how to reword the relevant sections.

@varkor varkor force-pushed the varkor:enum-variant-types branch from 14a6e83 to 2b00420 Nov 10, 2018

@nrc

Member

commented Nov 10, 2018

This was previously proposed in #1450. That was postponed because we were unsure about the general story around type fallback (e.g, integer types, default generic types, etc). Enum variants would add another case of this and so we wanted to be certain that the current approach is good and there are no weird interactions. IIRC, there was also some very minor backwards incompatibility.

This RFC should address those issues, and summarise how it differs from #1450.

For the sake of completeness, an alternative might be some kind of general refinement type, though I don't think that is a good fit with Rust.

I'm still personally very strongly in favour of this feature! The general mood on #1450 was also positive.

@eddyb

Member

commented Nov 10, 2018

For the sake of completeness, an alternative might be some kind of general refinement type, though I don't think that is a good fit with Rust.

I'm not sure that would need to be at odds with variant types, if Rust ends up with refinement types I expect variant types to be refinements of their enum.

@Centril

Contributor

commented Nov 10, 2018

First, irrespective of what happens with the RFC;
I always appreciate the effort put into well thought out RFCs and this is one of those, so thank you!

I am of two minds and a bit torn about the proposal here.

  1. I think it would help immensely to make an API such as syn::Expr more ergonomic and avoid auxiliary structs such as syn::ExprBox. Here you don't need any implementations on the variant types you'd get because the variant types are for the most part just dumb data.

  2. At the same time, precisely because this RFC does not permit implementations on variant types, as I think is proper to avoid the pitfalls of Java-OOP-inheritance APIs, it will not allow you to refactor an API such as syn::Lit into one where syn::LitStr is a variant type (i.e. Lit::Str) because the implementations there would not be admitted by the type checker.

  3. All in all, I think the benefits of this proposal are well motivated and the costs in terms of understanding are not that great. I think this proposal is something that a subset of users would naturally expect; the user also doesn't have to do much, expressiveness is given for free to them.

  4. The RFC interacts well with #1806 as well as goals and plans for gradual struct initialization; in fact, it provides the "missing link" that makes gradual initialization more generally applicable in the type system. This is a wonderful thing.

  5. Thus on balance I think the RFC is a good idea.

(Feel free to integrate any points that you found relevant into the text of the RFC)


@nrc

This RFC should address those issues, and summarise how it differs from #1450.

👍

For the sake of completeness, an alternative might be some kind of general refinement type, though I don't think that is a good fit with Rust.

(Aside, but let's not go too deeply into this: I personally think that refinement / dependent typing is a good idea and a good fit for Rust's general aim of correctness and type system power for library authors. RFC 2000 is sort of dependent types anyway, so it's sort of a sunk cost with respect to complexity. However, the use cases for dependent/refinement types are somewhat different from the goal here; with dependent types we wish to express things like { x: usize | x < 10 }.)


@eddyb

I'm not sure that would need to be at odds with variant types, if Rust ends up with refinement types I expect variant types to be refinements of their enum.

I agree; I think you can think of variant types in the general framework of refinement / dependent types;
With the notation due to @petrochenkov for pattern matching as a boolean operator, we can think of variant types as:

type FooVar = { x: Foo | x is Foo::Variant(...) };
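As a rough illustration of the refinement reading, the predicate "x is Foo::Variant(...)" can be approximated on stable Rust with the `matches!` macro and a checked newtype. This is a runtime-checked sketch only; `Foo`, `FooVar::new` and `payload` are hypothetical names, and a real variant type would enforce the invariant statically.

```rust
#[allow(dead_code)]
enum Foo {
    Variant(u32),
    Other,
}

// Hypothetical approximation of `{ x: Foo | x is Foo::Variant(..) }`.
struct FooVar(Foo);

impl FooVar {
    // The refinement predicate is checked once, at construction.
    fn new(x: Foo) -> Option<FooVar> {
        if matches!(x, Foo::Variant(..)) {
            Some(FooVar(x))
        } else {
            None
        }
    }

    fn payload(&self) -> u32 {
        match self.0 {
            Foo::Variant(n) => n,
            // The type system cannot see the invariant, so this arm is
            // still required; a variant type would make it unnecessary.
            Foo::Other => unreachable!("ruled out by FooVar::new"),
        }
    }
}

fn main() {
    let v = FooVar::new(Foo::Variant(3)).expect("is Foo::Variant");
    println!("{}", v.payload());
    println!("{}", FooVar::new(Foo::Other).is_none());
}
```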
@burdges

commented Nov 10, 2018

We do want formal verification of rust code eventually, and afaik doing that well requires refinement types. I'm not saying rust itself needs refinement types per se, but rust should eventually have a type system plugin/fork/preprocessor for formal verification features, like refinement types. I do like this feature of course, but ideally the syntax here should avoid conflicts with refinement types.

@Centril

Contributor

commented Nov 10, 2018

@burdges

I do like this feature of course, but ideally the syntax here should avoid conflicts with refinement types.

Are there any such conflicts in your view?

@Centril

Contributor

commented Nov 10, 2018

(Or to elaborate; if there are any conflicts with the RFC as proposed with refinement typing, then stable Rust as is has that conflict since the RFC does not introduce any new syntax...)

@ExpHP

commented Nov 10, 2018

Note that because a variant type, e.g. Sum::A, is not a subtype of the enum type (rather, it can simply be coerced to the enum type), a type like Vec<Sum::A> is not a subtype of Vec<Sum>. (However, this should not pose a problem as it should generally be convenient to coerce Sum::A to Sum upon either formation or use.)

So, if I understand correctly, all existing code that uses enums now have coercions all over the place in order to ensure they continue functioning? I'm really not sure this works...

let mut x;

x = None;

// At this point the compiler knows the type
// of x is Option<?0>::None.

// But Option<_>::Some cannot be coerced to None
x = Some(1); // type error?
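
For comparison, this is how the same shape of code behaves on stable Rust today, where there are no variant types and both assignments simply unify with `Option<_>`. The wrapper function `deferred` is only there to make the sketch testable.

```rust
#[allow(unused_assignments)]
fn deferred() -> Option<i32> {
    let mut x;   // type not yet known
    x = None;    // x: Option<?0>
    x = Some(1); // ?0 is an integer variable, fixed to i32 by the return type
    x
}

fn main() {
    println!("{:?}", deferred());
}
```

Any inference scheme for variant types would have to keep accepting this.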
@leonardo-m

commented Nov 10, 2018

let mut x;
x = None;
// At this point the compiler knows the type
// of x is Option<?0>::None.
// But Option<_>::Some cannot be coerced to None
x = Some(1); // type error?

I think in theory the type system should infer x to be of type Option<i32> because it sees both assignments.

But I think we need a formalization of the involved type system rules, to assure soundness, before implementing this proposal...

@mark-i-m

Contributor

commented Nov 10, 2018

I would rather frame this as follows:

  • the type is Option
  • the compiler tracks the most specific variant the value is if it can be known statically. This could be done with a standard dataflow analysis.
@leonardo-m

commented Nov 10, 2018

We do want formal verification of rust code eventually, and afaik doing that well requires refinement types. I'm not saying rust itself needs refinement types per se, but rust should eventually have a type system plugin/fork/preprocessor for formal verification features, like refinement types.

While I don't dislike LiquidHaskell-like refinement typing, lately for the future of Rust I prefer a style of verification as in the Why3 language ( http://why3.lri.fr/ , that is also related to the Ada-SPARK verification style). We'll need a pre-RFC for this.

@leonardo-m

commented Nov 11, 2018

I hope this syntax is also supported (I suggest adding it to the RFC):

enum Sum { A(u32), B, C }
fn print_a1(Sum::A(x): Sum::A) {}

A question regarding the ABI: does the print_a1() function also receive the Sum discriminant as an argument?

In the future, the more DRY syntax (I think suggested by Centril) could also be supported:

fn print_a2(Sum::A(x)) {}

You could also add a new (silly) example to this RFC that shows the purposes of this type system improvement:

enum List { Nil, Succ(u32, Box<List>) }

fn prepend(x: u32, lst: List) -> List {
    List::Succ(x, box lst)
}

With this improvement you can write instead:

fn prepend(x: u32, lst: List) -> List::Succ {
    List::Succ(x, box lst)
}

Then you can define a list_head_succ() function that returns the head of the result of prepend() without an unwrap or an Option result:

fn list_head(lst: List) -> Option<u32> {
    match lst {
        List::Succ(x, _) => Some(x),
        List::Nil => None,
    }
}

fn list_head_succ(List::Succ(x, _): List::Succ) -> u32 { x }
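
Until something like `List::Succ` exists as a type, the closest stable equivalent seems to be a separate struct for the non-empty case. The sketch below assumes a hypothetical standalone `Succ` struct and uses `Box::new` in place of the unstable `box` syntax; it relies only on the fact that irrefutable patterns are already allowed in parameter position.

```rust
// Hypothetical standalone struct standing in for the variant type `List::Succ`.
struct Succ(u32, Box<List>);

enum List {
    Nil,
    Succ(Succ),
}

fn prepend(x: u32, lst: List) -> Succ {
    Succ(x, Box::new(lst))
}

// Irrefutable pattern in parameter position: no `Option`, no `unwrap`.
fn list_head_succ(Succ(x, _): Succ) -> u32 {
    x
}

fn list_head(lst: List) -> Option<u32> {
    match lst {
        List::Succ(Succ(x, _)) => Some(x),
        List::Nil => None,
    }
}

fn main() {
    println!("{}", list_head_succ(prepend(1, List::Nil)));
    println!("{:?}", list_head(List::Succ(prepend(2, List::Nil))));
}
```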

{ x: usize | x < 10 }

For the common case of integer intervals for Rust I sometimes prefer a shorter and simpler syntax like:

type Small = usize[.. 10];

@shepmaster
Member

left a comment

"Overhead"

I'm mostly interested in this RFC from the point-of-view of "enums of lots of standalone other types". The biggest example I have is the AST expressed in fuzzy-pickles, a Rust parser which uses this pattern extensively:

pub enum Item {
    AttributeContaining(AttributeContaining),
    Const(Const),
    Enum(Enum),
    // ...

Unfortunately, I don't see this as being a large win for such a case due to the "forced overhead" of each enum variant still being the same size as all the other variants. It's an understandable decision, just not one that I see as helping as much as it could.

This is mentioned in the alternatives section, but I want to make sure the point is reiterated.

Multiple variants

I didn't see any mention of whether multiple variants would be supported:

#[derive(Debug)]
enum Count {
    Zero,
    One,
    Many(usize),
}

fn example(c: Count) {
    use Count::*;
    match c {
        x @ Zero | x @ One => println!("{:?}", x), // what is the type of `x` here?
        x => println!("{:?}", x),
    }
} 

It may also be worth explicitly calling out what the type is for those catch-all patterns as well as in cases of match guards.

foo @

This may be swerving into refinement type territory, but I naturally wanted to not type the foo @ in the previous example:

match c {
    Zero | One => println!("{:?}", c),
    // ...

I feel this is a pretty hidden and uncommon aspect of patterns, and it'd be nice to just be able to intuit the type based on the pattern without adding the explicit binding. That might even mean we could do:

if let Count::Many(..) = c {
    println!("{}", c.0);
}
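
For context, here is what both cases look like on stable Rust today, as a sketch with made-up function names (`describe`, `many_payload`). A binding over several variants has the full enum type, and the payload of `Many` is only reachable through an explicit binding.

```rust
#[derive(Debug)]
enum Count {
    Zero,
    One,
    Many(usize),
}

fn describe(c: Count) -> String {
    use Count::*;
    match c {
        // Today `x` has type `Count`, even though only two variants reach this arm.
        x @ (Zero | One) => format!("small: {:?}", x),
        x => format!("other: {:?}", x),
    }
}

fn many_payload(c: &Count) -> Option<usize> {
    // `c.0` does not exist on `Count`; the field needs its own binding.
    if let Count::Many(n) = c {
        Some(*n)
    } else {
        None
    }
}

fn main() {
    println!("{}", describe(Count::Zero));
    println!("{}", describe(Count::Many(3)));
    println!("{:?}", many_payload(&Count::Many(3)));
}
```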
text/0000-enum-variant-types.md
and `impl Trait for Enum::Variant` are forbidden. This dissuades inclinations to implement
abstraction using behaviour-switching on enums (for example, by simulating inheritance-based
subtyping, with the enum type as the parent and each variant as children), rather than using traits
as is natural in Rust.

@shepmaster

shepmaster Nov 11, 2018

Member

I'm a fan of the proposed style, but it might be worth stating why Rust the language wants to dissuade this pattern.

- Passing a known variant to a function, matching on it, and use `unreachable!()` arms for the other
variants.
- Passing individual fields from the variant to a function.
- Duplicating a variant as a standalone `struct`.

@shepmaster

shepmaster Nov 11, 2018

Member

I disagree that this goal is going to be as widely achieved by this RFC as I would like due to the following point:

the variant types proposed here have identical representations to their enums

That means that if I have an enum with large variants:

enum Thing {
    One([u8; 128]),
    Two(u8),
}

Even the "small" variants (e.g. Thing::Two) are still going to take "a lot" of space.
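
The size difference is easy to observe with `std::mem::size_of`. The sketch below compares the enum with a hypothetical standalone struct `Two` that holds only the small payload. Exact enum sizes are not guaranteed for the default representation, so the program only prints them.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Thing {
    One([u8; 128]),
    Two(u8),
}

// Hypothetical standalone struct holding only the payload of `Thing::Two`.
#[allow(dead_code)]
struct Two(u8);

fn main() {
    // The enum must be able to hold its largest variant, so any value of
    // type `Thing` occupies at least 128 bytes.
    println!("size_of::<Thing>() = {}", size_of::<Thing>());
    // A separate struct only pays for its own field.
    println!("size_of::<Two>()   = {}", size_of::<Two>());
}
```

If `Thing::Two` shares the representation of `Thing`, it pays the larger of these two sizes.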

@eddyb

eddyb Nov 11, 2018

Member

If space is a concern then we could have it so variant types only convert to their enum by-value, so e.g. a &Thing::Two wouldn't be a valid &Thing.

That's weaker than something more akin to refinement typing, but maybe it's enough?

@Centril

Centril Nov 11, 2018

Contributor

@eddyb I think that's already the case; the RFC doesn't state anywhere, as far as I can tell, that &Thing::Two is a valid &Thing. Also note that the RFC explicitly states that Thing::Two and Thing having the same layout is not a guarantee so we could change the layout to be more space efficient.

Centril added a commit to Centril/rust that referenced this pull request Jan 17, 2019

Rollup merge of rust-lang#57501 - petrochenkov:highvar, r=alexreg
High priority resolutions for associated variants

In rust-lang#56225 variants were assigned lowest priority during name resolution to avoid crater run and potential breakage.

This PR changes the rules to give variants highest priority instead.
Some motivation:
- If variants (and their constructors) are treated as associated items, then they are obviously *inherent* associated items since they don't come from traits.
- Inherent associated items have higher priority during resolution than associated items from traits.
- The reason is that there is a way to disambiguate in favor of trait items (`<Type as Trait>::Ambiguous`), but there's no way to disambiguate in favor of inherent items, so they become unusable in case of ambiguities if they have low priority.
- It's technically problematic to fall back from associated types to anything until lazy normalization (?) is implemented.

Crater found some regressions from this change, but they are all in type positions, e.g.
```rust
fn f() -> Self::Ambiguous { ... } // Variant `Ambiguous` or associated type `Ambiguous`?
```
, so variants are not usable there right now, but they may become usable in the future if rust-lang/rfcs#2593 is accepted.
This PR keeps code like this successfully resolving, but introduces a future-compatibility lint `ambiguous_associated_items` that recommends rewriting it as `<Self as Trait>::Ambiguous`.
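
A minimal sketch of the rewrite the lint recommends, with made-up names `E` and `Tr`: the fully qualified path names the associated type in type position, while `Self::Ambiguous` in expression position refers to the variant.

```rust
#[derive(Debug, PartialEq)]
enum E {
    Ambiguous,
}

trait Tr {
    type Ambiguous;
    fn f() -> <Self as Tr>::Ambiguous;
}

impl Tr for E {
    type Ambiguous = u8;

    // `<Self as Tr>::Ambiguous` can only mean the associated type.
    fn f() -> <Self as Tr>::Ambiguous {
        7
    }
}

impl E {
    // In expression position, `Self::Ambiguous` is the enum variant.
    fn variant() -> Self {
        Self::Ambiguous
    }
}

fn main() {
    println!("{}", <E as Tr>::f());
    println!("{:?}", E::variant());
}
```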

@linacambridge linacambridge referenced this pull request Jan 20, 2019

Merged

Bookmarks #525

@CAD97

commented Feb 6, 2019

ping from not-triage-but-this-was-mentioned-on-internals: what's the status of this RFC?

It seems that rust-lang/rust#57644 is a future-compat lint for syntax space for this (or an equivalent) feature; what's between this and proposal-merge?

@ExpHP

commented Feb 6, 2019

Okay, let's take a look at the new stuff about type inference.

  • If the value is treated as a single variant (and possibly additionally as the enum), we choose
    the single variant type. For example:
let x = Sum::A(5); // x: Sum::A
println!("x is {}", x.0);
  • If the type is treated as multiple different variants, we choose the enum type.
let mut x = Sum::A(5); // x: Sum
println!("x is {}", x.0); // error: no field `0` on type `Sum`
x = Sum::B;
println!("x is not numeric");
  • In a case where the type variable is unknown, we default to the enum.

I suspect this can't work for the same reason that we get type annotations needed all the time when trying to call methods; the type checker works strictly in order. It wants to be able to resolve the type of x.0 as soon as it sees it, not several statements later.

Without that restriction, we get the following madness:

#[derive(Debug)]
struct Struct {
    field: i8,
}

#[derive(Debug)]
enum Enum {
    A { field: Struct },
    B { field: Struct },
}

impl std::ops::Deref for Enum {
    type Target = Struct;
    
    fn deref(&self) -> &Struct {
        match self {
            Enum::A { field } => field,
            Enum::B { field } => field,
        }
    }
}

fn main() {
    let mut a = Enum::A { field: Struct { field: 0 } };
    
    println!("`{:?}`", a.field);
    
    // Commenting out the following line changes the output of the
    // above println from: `0`
    //                 to: `Struct { field: 0 }`
    a = Enum::B { field: Struct { field: 0 } };
}

(by the way, because the above snippet does currently compile and print 0 when you comment out the second line, this constitutes a breaking change)

@varkor

Member Author

commented Mar 2, 2019

I suspect this can't work for the same reason that we get type annotations needed all the time when trying to call methods; the type checker works strictly in order. It wants to be able to resolve the type of x.0 as soon as it sees it, not several statements later.

(Edited: see @alercah's comment below.)

// Commenting out the following line changes the output of the
// above println from: `0`
//                 to: `Struct { field: 0 }`

Enum::A still implements Deref, so I would expect this to print `0`, regardless of whether that line was commented out or not.

@alercah

Contributor

commented Mar 2, 2019

That doesn't sound right to me; types can be inferred due to unification which may well occur several statements later:

fn main() {
    let x = vec![1, 2, 3].into_iter().collect();
    let y = &x;
    let _z : &Vec<u64> = y;
}
@golddranks

commented Mar 2, 2019

@alercah I think @ExpHP meant cases like this, since they are talking about "trying to call methods":

fn return_vec() -> Vec<u64> {
    let mut x = vec![1, 2, 3].into_iter().collect();
    
    // uncomment to make it compile
    // let mut x: Vec<u64> = x;
    
    x.push(4);
    
    x
}

fn main() {
    println!("{:?}", return_vec());
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=940235f3b5743e92086a19465d5b8607

@ExpHP

commented Mar 3, 2019

@golddranks is right.

It is a common misconception that type inference can use information later in the function body to resolve types earlier in the body. Type checking works almost entirely strictly forwards in a function body. The only thing that type inference actually does is allow the compiler to reason about types generically, using type inference variables to delay filling in the details that it does not yet need to know.

@alercah your example goes like this:

let x = vec![1, 2, 3].into_iter().collect();
let y = &x;
let _z : &Vec<u64> = y;
  1. The temporary vec![1, 2, 3] receives a type of Vec<?int_0>, where ?int_0 is an integer-flavored type inference variable
  2. .into_iter() and .collect() are resolved to IntoIterator::into_iter and Iterator::collect because there exist general enough impls for these traits to apply to Vec<?int_0>.
  3. x ends up with type ?1, with the obligation ?1: FromIterator<?int_0>. (To be checked later)
  4. y gets type &?1.
  5. _z has type &Vec<u64>. When the compiler unifies the type of the expression y with the type of _z, it determines that ?int_0 = u64.

If an x.push() is inserted before let _z = then the compiler rejects it, because no methods are known for ?1 (it has no outermost type constructor, so inherent methods cannot be sought, and it is not known to satisfy any traits with a push method).
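
The point about the outermost type constructor can be seen in a small variation, sketched here under the assumption that supplying only `Vec<_>` is enough for method lookup. The `Vec` part is fixed before the call to push, while the element type is still an inference variable that is resolved later by the return type.

```rust
fn return_vec() -> Vec<u64> {
    // `collect::<Vec<_>>()` supplies the outermost constructor, so `x: Vec<?int_0>`.
    let mut x = vec![1, 2, 3].into_iter().collect::<Vec<_>>();

    // `Vec::push` can be found now, even though `?int_0` is still unknown.
    x.push(4);

    // Returning `x` unifies `Vec<?int_0>` with `Vec<u64>`.
    x
}

fn main() {
    println!("{:?}", return_vec());
}
```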

Here are some more fun examples to help convince you:

Example 1
struct Wrapped<T>(T);

impl Wrapped<Vec<u8>> {
    fn boo(&self) { println!("Vec<u8>") }
}

trait Boo {
    fn boo(&self) { println!("Trait method") }
}

impl<T> Boo for Vec<T> {}

fn main() {
    let mut a = Wrapped(vec![]);

    a.boo();
    a.0.push(());
}

Does this compile? If so, what does it print? If not, what is the error?

(Hint: This would compile if the compiler could determine that a has type Wrapped<Vec<()>>. But... can it?)

Solution
   Compiling playground v0.0.1 (/playground)
error[E0308]: mismatched types
  --> src/main.rs:17:14
   |
17 |     a.0.push(());
   |              ^^ expected u8, found ()
   |
   = note: expected type `u8`
              found type `()`

Somebody who thinks type inference can flow information backwards might think that a will have type Wrapped<Vec<()>> and that this should therefore compile and print "Trait method". Such is not the case.

a initially has type Wrapped<Vec<?0>>. At a.boo(), the compiler needs to know immediately what method is being called. It searches inherent methods of Wrapped<Vec<_>> first because these take precedence over trait methods. There is a single match, <Wrapped<Vec<u8>>>::boo, so it selects this, and determines that ?0 = u8.
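
The same mechanism can be seen from the other side in a variant of Example 1 that does compile. In this sketch `boo` returns a string instead of printing, and the function `demo` is made up so the result can be inspected. The single inherent candidate fixes the element type to u8, so a later push of an integer literal is accepted.

```rust
struct Wrapped<T>(T);

impl Wrapped<Vec<u8>> {
    fn boo(&self) -> &'static str {
        "Vec<u8>"
    }
}

fn demo() -> (&'static str, Vec<u8>) {
    let mut a = Wrapped(vec![]); // a: Wrapped<Vec<?0>>
    let which = a.boo();         // one inherent candidate, so ?0 = u8 from here on
    a.0.push(1);                 // the literal is checked against u8
    (which, a.0)
}

fn main() {
    let (which, v) = demo();
    println!("{} {:?}", which, v);
}
```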

Example 2
struct Wrapped<T>(T);

impl Wrapped<Vec<u8>> {
    fn boo(&self) { println!("Vec<u8>") }
}

// !!!!! This was added to example 1
impl Wrapped<Vec<()>> {
    fn boo(&self) { println!("Vec<())>") }
}

// !!!!!! Trait was removed
// (it was already shown to be irrelevant)

fn main() {
    let mut a = Wrapped(vec![]);

    a.boo();
    a.0.push(());
}

Does this compile? If so, what does it print? If not, what is the error?

Solution
   Compiling playground v0.0.1 (/playground)
error[E0034]: multiple applicable items in scope
  --> src/main.rs:21:7
   |
21 |     a.boo();
   |       ^^^ multiple `boo` found
   |
note: candidate #1 is defined in an impl for the type `Wrapped<std::vec::Vec<u8>>`
  --> src/main.rs:4:5
   |
4  |     fn boo(&self) { println!("Vec<u8>") }
   |     ^^^^^^^^^^^^^
note: candidate #2 is defined in an impl for the type `Wrapped<std::vec::Vec<()>>`
  --> src/main.rs:9:5
   |
9  |     fn boo(&self) { println!("Vec<())>") }
   |     ^^^^^^^^^^^^^

Type checking here begins similarly to Example 1, but when it searches the inherent methods of Wrapped<Vec<_>>, it finds two candidates and refuses to continue.

Why does it give up? Well, let's consider a more sinister example. Above, the signatures just so happen to match, but what if they didn't?

struct Wrapped<T>(T);

impl Wrapped<Vec<u8>> {
    fn boo(&self) -> Self { Wrapped(self.0.to_vec()) }
}

impl Wrapped<Vec<()>> {
    fn boo(&self) -> (u32, u32) { (2, 3) }
}

fn main() {
    let mut a = Wrapped(vec![]);

    println!("{:?}", a.boo().0.rotate_left(3));
    
    // a.0.push(());  // (1)
    // a.0.push(1u8); // (2)
}

If the compiler did not give up when type-checking a.boo(), then when it checks the println there would be two completely different interpretations (notice that one of them even inserts auto-refs and coercions not present in the other):

// if a is Wrapped<Vec<u8>>
let tmp: &Wrapped<Vec<u8>> = &a;
let mut tmp: Wrapped<Vec<u8>> = <Wrapped<Vec<u8>>>::boo(tmp);
let tmp: () = <[u8]>::rotate_left(&mut *tmp.0, 3_usize);
println!("{:?}", tmp);

// if a is Wrapped<Vec<()>>
let tmp: (u32, u32) = <Wrapped<Vec<()>>>::boo(&a);
let tmp: u32 = <u32>::rotate_left(tmp.0, 3_u32);
println!("{:?}", tmp);

Before reaching either (1) or (2), how would it decide which interpretation to use for this line? Scenarios like this are likely why the compiler gives up so willingly.
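For completeness, here is a minimal sketch of the usual way out, assuming the two inherent impls from Example 2 (the `demo` helper and the `&'static str` return values are only for illustration). Annotating the binding resolves ?0 before the method call is checked, so only one inherent candidate applies.

```rust
struct Wrapped<T>(T);

impl Wrapped<Vec<u8>> {
    fn boo(&self) -> &'static str { "Vec<u8>" }
}

impl Wrapped<Vec<()>> {
    fn boo(&self) -> &'static str { "Vec<()>" }
}

fn demo() -> (&'static str, usize) {
    // The annotation fixes the element type up front, so the search for
    // `boo` sees a fully known receiver type and finds exactly one match.
    let mut a: Wrapped<Vec<()>> = Wrapped(vec![]);
    let picked = a.boo();
    a.0.push(());
    (picked, a.0.len())
}

fn main() {
    let (picked, len) = demo();
    println!("{} {}", picked, len);
}
```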

Example 3
struct Wrapped<T>(T);

trait Boo {
    type Assoc;
    fn boo(&self) -> Self::Assoc;
}

// !!!!!! The inherent methods from Example 2
//        have been replaced with trait impls
impl Boo for Wrapped<Vec<u8>> {
    type Assoc = Self;
    fn boo(&self) -> Self { Wrapped(self.0.to_vec()) }
}

impl Boo for Wrapped<Vec<()>> {
    type Assoc = (u32, u32);
    fn boo(&self) -> (u32, u32) { (2, 3) }
}

fn main() {
    let mut a = Wrapped(vec![]);

    println!("{:?}", a.boo());
    // println!("{:?}", a.boo().0.rotate_left(3));
    
    a.0.push(());
}

Does this compile? If so, what does it print? If not, what is the error?

What if we uncomment the second println!?

(note: for now just assume that println!("{:?}", x) expands to code containing a call to fmt::Debug::fmt(&x))

Solution
   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.58s
     Running `target/debug/playground`
(2, 3)

In contrast to the multiple inherent methods in Example 2, the compiler allows multiple impls from the same trait to match. Basically:

  • a has type Wrapped<Vec<?0>>
  • The output of a.boo() can be written generically as <Wrapped<Vec<?0>> as Boo>::Assoc.
  • This is borrowed and fed to Debug::fmt, creating the obligation that <Wrapped<Vec<?0>> as Boo>::Assoc: Debug. Trait obligations are checked lazily, so it's okay that we still don't know this type!
  • a.0.push(()) tells it that ?0 = ().
  • At some point later it verifies that <Wrapped<Vec<()>> as Boo>::Assoc implements Debug.

Of course, if you uncomment the second println, you instead get

error[E0282]: type annotations needed
  --> src/main.rs:25:22
   |
25 |     println!("{:?}", a.boo().0.rotate_left(3));
   |                      ^^^^^^^ cannot infer type
   |
   = note: type must be known at this point

for reasons already discussed in the solution to Example 2.
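As a rough sketch of how the second println can be made to work (the `demo` helper is hypothetical, and the explicit annotation is my addition, not part of Example 3): once the type of `a` is known before `a.boo()` is checked, the associated type normalizes to (u32, u32) and the field access and method call go through.

```rust
struct Wrapped<T>(T);

trait Boo {
    type Assoc;
    fn boo(&self) -> Self::Assoc;
}

impl Boo for Wrapped<Vec<u8>> {
    type Assoc = Self;
    fn boo(&self) -> Self { Wrapped(self.0.to_vec()) }
}

impl Boo for Wrapped<Vec<()>> {
    type Assoc = (u32, u32);
    fn boo(&self) -> (u32, u32) { (2, 3) }
}

fn demo() -> u32 {
    // With the annotation, <Wrapped<Vec<()>> as Boo>::Assoc is known to be
    // (u32, u32) at the point where `.0.rotate_left(3)` is checked.
    let mut a: Wrapped<Vec<()>> = Wrapped(vec![]);
    let rotated = a.boo().0.rotate_left(3);
    a.0.push(());
    rotated
}

fn main() {
    println!("{:?}", demo()); // 2u32 rotated left by 3 bits is 16
}
```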


Enum::A still implements Deref, so I would expect this to print 0, regardless of whether that line was commented out or not.

So auto-deref takes precedence over "inherent" fields of the variant? If so that's quite surprising and should be mentioned in the RFC.

@alercah

Contributor

commented Mar 3, 2019

@ExpHP

commented Mar 3, 2019

Yes, I believe you're correct in general, but unless I'm mistaken, nothing prevents the type checker from letting x.0 be typed as a variable until later.

Even for usage as simple as Debug::fmt(&x.field) (which does not need to know anything more than typeof(x.field): Debug), it would still need to potentially come back and insert a deref to make it Debug::fmt(&(*x).field) after it is already finished checking the expression. If the compiler could do that here, then it could no doubt also solve many of the millions of other paper cuts that all ultimately boil down to these limitations of the type checker.
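To make the inserted deref concrete, here is a small sketch in today's Rust (not the variant types proposed in this RFC; `Inner`, `Outer` and both helper functions are hypothetical names). The compiler can only pick between a direct field and an auto-deref'd field because it knows the receiver type when it checks the field expression.

```rust
use std::ops::Deref;

struct Inner { field: u32 }
struct Outer(Inner);

impl Deref for Outer {
    type Target = Inner;
    fn deref(&self) -> &Inner { &self.0 }
}

fn read_field(x: &Outer) -> u32 {
    // Outer has no field named `field`, so auto-deref walks
    // &Outer -> Outer -> Inner and finds it there.
    x.field
}

fn read_field_explicit(x: &Outer) -> u32 {
    // What the line above amounts to once the derefs are written out.
    (**x).field
}

fn main() {
    let x = Outer(Inner { field: 7 });
    println!("{} {}", read_field(&x), read_field_explicit(&x));
}
```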

I've long lost count of how many times I've responded to github issues or posts on URLO to explain my understanding of what's happening behind the scenes in these apparent "bugs" of type inference.... and I am thoroughly convinced at this point that if there was a reasonable and scalable solution, then it'd already be implemented and I wouldn't be here playing the role of the antagonist!

@varkor

Member Author

commented Mar 27, 2019

I suspect this can't work for the same reason that we get type annotations needed all the time when trying to call methods; the type checker works strictly in order.

I imagine the right solution, then, is to require explicit annotations for variants and fall back on the general enum type. (This is forwards compatible with a solution that doesn't involve explicit annotations, so I don't think this means losing too much.)
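A minimal sketch of the "strictly in order" behaviour in today's Rust, unrelated to variant types themselves (the `demo` helper is hypothetical). The commented-out form is rejected with E0282 because the receiver of `pow` is still an unresolved inference variable when the call is checked; an up-front annotation avoids that.

```rust
fn demo() -> u32 {
    // Rejected with E0282 (type annotations needed): the type of `n` is
    // not yet known when `n.pow(2)` is checked, even though the later
    // line would pin it down.
    //
    //     let n = "5".parse().unwrap();
    //     let squared = n.pow(2);
    //     let _: u32 = n;

    // Accepted: the type is known before the method call is checked.
    let n: u32 = "5".parse().unwrap();
    n.pow(2)
}

fn main() {
    println!("{}", demo());
}
```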

@varkor

Member Author

commented Apr 9, 2019

@rust-lang/lang: what's the next step for this RFC? As far as I can tell, all the comments so far have been addressed (in particular, the type inference is now much more conservative), and there hasn't been much movement on this RFC for a while now.

@aturon aturon added the I-nominated label Apr 10, 2019

@aturon

Member

commented Apr 10, 2019

Nominating for discussion in a future Lang Team meeting.

@eddyb

Member

commented Apr 11, 2019

Btw, has anyone proposed using "bidirectional type-checking" for this?
That is, Some(x): Some<_> producing a value of type Some<T> but Some(x) without that expected type would produce Option<T>?

@Centril Centril removed the I-nominated label Apr 11, 2019

@varkor

Member Author

commented Apr 25, 2019

@eddyb: with or without type ascription? If I understand correctly, that kind of thing is what is enabled by this RFC (i.e. opt-in to specific variants by explicitly type-annotating).

@varkor

Member Author

commented Apr 25, 2019

@rust-lang/lang: I notice this has been unnominated. Did this get discussed?

(Also, as @nrc is no longer on the lang team, should this RFC be reassigned?)

@Centril

Contributor

commented Apr 25, 2019

I think @eddyb's idea to use Expectation / bidirectional type-checking in this way makes sense, both from the POV of making the feature sufficiently expressive to become worthwhile and because it seems to me that it makes the implementation simpler, in that it already fits rustc's pipeline. You will still need fallback to achieve Some(x) with NoExpectation resulting in Option<T>.

As for the nomination: when we discussed it, we said that it's not a roadmap priority. However, in my personal view, if @varkor (when they are free and const generics is in a good state implementation-wise), or someone else, wants to spearhead the implementation work, then I find it acceptable to move ahead now. What we shouldn't do, however, is simply leave the feature in limbo as a tracking issue. In the interest of fairness to the implementor, I think they shouldn't have to do speculative work that may not be accepted later, so ideally we should line up a design, an implementor, and FCP.

@varkor

Member Author

commented Apr 25, 2019

I think they shouldn't have to do speculative work that may not be accepted later, so ideally we should line up a design, an implementor, and FCP.

I would be willing to take on the implementation of this feature if it was accepted (once the const generics work has quietened down), presuming that a feature that is not a roadmap priority could still be considered worthwhile if someone is volunteering to work on it.

@alexreg

commented Apr 25, 2019

@varkor I believe one or two members of the compiler team have previously said that if someone wishes to work on a non-roadmap feature, that's generally no problem, and no one should stand in their way at least... I don't want to put specific words in others' mouths though. :-)

@Centril

Contributor

commented Apr 25, 2019

@alexreg Sure, that's true. :) I believe what @varkor is seeking is a commitment from the language team that if the implementation is done, it will actually eventually be stabilized.

@nikomatsakis

Contributor

commented Apr 26, 2019

Hmm, I do not agree with this. My sense is that I would like to be doing less "off-roadmap" work, not more, and I say that both from the perspective of the compiler team and the lang team. The problem is that work like this is never "contained". There is a need not only to implement the feature, but to document it, to support it, etc.

I agree that some sort of approach to variant types would be great. This RFC might even be advocating for the right approach -- I haven't had time to read it, so that's a bit hard for me to say! But then, maybe that's the point. I feel like if we are going to try to do "variant types", it would probably behoove us to also spend more time digging into the design space. (I feel this touches on a couple of related issues, such as closed traits, the relationship between enums and traits, and the need for nested enums.)

Don't get me wrong, I want this work to be done, and I always feel terrible discouraging people from doing work they want to do. But I would definitely want to have a team-wide conversation where we consider all the possible work to be done and choose carefully where to spend our energy.

@alexreg

commented Apr 26, 2019

Worth nominating for meeting discussion then? I thought the idea of "alternative solutions" was to examine the design space.
