New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enum variant types #2593

Open
wants to merge 13 commits into
base: master
from

Conversation

Projects
None yet
@varkor
Copy link
Member

varkor commented Nov 10, 2018

Enum variants are to be considered types in their own rights. This allows them to be irrefutably matched upon. Where possible, type inference will infer variant types, but as variant types may always be treated as enum types this does not cause any issues with backwards-compatibility.

enum Either<A, B> { L(A), R(B) }

fn all_right<A, B>(b: B) -> Either<A, B>::R {
    Either::R(b)
}

let Either::R(b) = all_right::<(), _>(1729);
println!("b = {}", b);

Rendered

Thanks to @Centril for providing feedback on this RFC!

@varkor varkor referenced this pull request Nov 10, 2018

Closed

Types for enum variants #1450

@Centril Centril added the T-lang label Nov 10, 2018

@alexreg

This comment has been minimized.

Copy link

alexreg commented Nov 10, 2018

Great work, @varkor. I've been looking forward to this for a long time. Just as a side-point, I'd love to follow this up with an RFC for the ideas in https://internals.rust-lang.org/t/pre-rfc-using-existing-structs-and-tuple-structs-as-enum-variants/7529 once this gets implemented in nightly (or perhaps even before). Since you've worked on this, would appreciate your thoughts at some point.

@bchallenor

This comment has been minimized.

Copy link

bchallenor commented Nov 10, 2018

Although sum types are becoming increasingly common in programming languages, most do not choose to allow the variants to be treated as types in their own right (that is, the author has not found any that permit this design pattern).

This is possible in Scala - as in your example, Left and Right are subtypes of Either, and can be referred to independently. Coming from Scala, I miss this feature in Rust, and I am fully in favour of this RFC.

@varkor

This comment has been minimized.

Copy link
Member

varkor commented Nov 10, 2018

This is possible in Scala - as in your example, Left and Right are subtypes of Either, and can be referred to independently.

Ah, great, I'll add that in, thanks!

@Ixrec

This comment has been minimized.

Copy link
Contributor

Ixrec commented Nov 10, 2018

I think I'm in favor of the proposed functionality and semantics here. Where I'm stumbling is the nomenclature/terminology/teachability(?); it's not clear to me that "introducing a new kind of type: variant types" is the best description of this. In particular, precisely because this proposal feels so lightweight compared to previous ones, it doesn't really "feel" like what we're doing is adding a whole new type kind the way structural records or anonymous enums would be doing. It sounds like it could be equally well described as doing the "duplicating a variant as a standalone struct" workaround automagically, so those extra structs are just always there (except they get a specific layout guarantee and different conversion syntax that regular structs wouldn't get). Is there some detail I overlooked that makes this clearly not a sugar?

I'm guessing this is at least partially ignorance on my part because

The loose distinction between the enum type and its variant types could be confusing to those unfamiliar with variant types.

makes it sound like "variant types" are an actual thing with their own special properties that no other kinds of types have, and I just have no idea what that would be (since being autogenerated, having a certain layout guarantee and different conversion syntax seem like "surface level" properties that aren't really part of the type system per se). Maybe I just need to see some more examples of how these types behave?

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Nov 10, 2018

Nice RFC.

In all cases, the most specific type (i.e. the variant type if possible) is chosen by the type inference.

Is code like this still allowed, or is the compiler going to tell me that the Sum::B(b) branch of the match is impossible and needs to be removed?

enum Sum { A(u8), B(u8) }
let x = Sum::A(5); // x: Sum::A
match x {
    Sum::A(a) => {},
    Sum::B(b) => {},
}

Both options have advantages and disadvantages.

@varkor

This comment has been minimized.

Copy link
Member

varkor commented Nov 10, 2018

Is code like this still allowed, or is the compiler going to tell me that the Sum::B(b) branch of the match is impossible and needs to be removed?

This is a good question — I'll make note of it in the RFC. Although matching on variant types permits irrefutable matches, it must also accept the any other variants with the same type — otherwise it's not backwards compatible with existing code.

Where I'm stumbling is the nomenclature/terminology/teachability(?); it's not clear to me that "introducing a new kind of type: variant types" is the best description of this.

since being autogenerated, having a certain layout guarantee and different conversion syntax seem like "surface level" properties that aren't really part of the type system per se

It's quite possible there's a better way to explain this. They are essentially as you say, though they act slightly differently from structs (on top of the points you made) in the way they are pattern-matched (as above in this comment) and their discriminant value. I thought it would be clearer to describe them as an entirely new kind of type, but perhaps calling them special kinds of structs would be more intuitive as you say. I'll think about how to reword the relevant sections.

@varkor varkor force-pushed the varkor:enum-variant-types branch from 14a6e83 to 2b00420 Nov 10, 2018

@nrc

This comment has been minimized.

Copy link
Member

nrc commented Nov 10, 2018

This was previously proposed in #1450. That was postponed because we were unsure about the general story around type fallback (e.g, integer types, default generic types, etc). Enum variants would add another case of this and so we wanted to be certain that the current approach is good and there are no weird interactions. IIRC, there was also some very minor backwards incompatibility.

This RFC should address those and issues, and summarise how this RFC is different to #1450.

For the sake of completeness, an alternative might be some kind of general refinement type, though I don't think that is a good fit with Rust.

I'm still personally very strongly in favour of this feature! The general mood on #1405 was also positive.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Nov 10, 2018

For the sake of completeness, an alternative might be some kind of general refinement type, though I don't think that is a good fit with Rust.

I'm not sure that would need to be at odds with variant types, if Rust ends up with refinement types I expect variant types to be refinements of their enum.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Nov 10, 2018

First, irrespective of what happens with the RFC;
I always appreciate the effort put into well thought out RFCs and this is one of those, so thank you!

I am of two minds and a bit torn about the proposal here.

  1. I think it would help immensely to make an API such as syn::Expr more ergonomic and avoid auxiliary structs such as syn::ExprBox. Here you don't need any implementations on the variant types you'd get because the variant types are for the most part just dumb data.

  2. At the same time, precisely because this RFC does not permit implementations on variant types, as I think is proper to avoid the pitfalls of Java-OOP-inheritance APIs, it will not allow you to refactor an API such as syn::Lit into one where syn::LitStr is a variant type (i.e. Lit::Str) because the implementations there would not be admitted by the type checker.

  3. All in all, I think the benefits of this proposal are well motivated and the costs in terms of understanding are not that great. I think this proposal is something that a subset of users would naturally expect; the user also doesn't have to do much, expressiveness is given for free to them.

  4. The RFC interacts well with #1806 as well as goals and plans for gradual struct initialization; in fact, it provides the "missing link" that makes gradual initialization more generally applicable in the type system. This is a wonderful thing.

  5. Thus on balance I think the RFC is a good idea.

(Feel free to integrate any points that you found relevant into the text of the RFC)


@nrc

This RFC should address those and issues, and summarise how this RFC is different to #1450.

👍

For the sake of completeness, an alternative might be some kind of general refinement type, though I don't think that is a good fit with Rust.

(Aside, but let's not go too deeply into this: I personally think that refinement / dependent typing is both a good idea, a good fit for Rust's general aim for correctness and type system power for library authors -- and RFC 2000 is sort of dependent types anyways so it's sort of sunk cost wrt. complexity -- the use cases for dependent/refinement types are sort of different than the goal here; With dependent types we wish to express things like { x: usize | x < 10 })


@eddyb

I'm not sure that would need to be at odds with variant types, if Rust ends up with refinement types I expect variant types to be refinements of their enum.

I agree; I think you can think of variant types in the general framework of refinement / dependent types;
With the notation due to @petrochenkov for pattern matching as a boolean operator, we can think of variant types as:

type FooVar = { x: Foo | x is Foo::Variant(...) };
@burdges

This comment has been minimized.

Copy link

burdges commented Nov 10, 2018

We do want formal verification of rust code eventually, and afaik doing that well requires refinement types. I'm not saying rust itself needs refinement types per se, but rust should eventually have a type system plugin/fork/preprocessor for formal verification features, like refinement types. I do like this feature of course, but ideally the syntax here should avoid conflicts with refinement types.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Nov 10, 2018

@burdges

I do like this feature of course, but ideally the syntax here should avoid conflicts with refinement types.

Are there any such conflicts in your view?

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Nov 10, 2018

(Or to elaborate; if there are any conflicts with the RFC as proposed with refinement typing, then stable Rust as is has that conflict since the RFC does not introduce any new syntax...)

@ExpHP

This comment has been minimized.

Copy link

ExpHP commented Nov 10, 2018

Note that because a variant type, e.g. Sum::A, is not a subtype of the enum type (rather, it can simply be coerced to the enum type), a type like VecSum::A is not a subtype of Vec. (However, this should not pose a problem as it should generally be convenient to coerce Sum::A to Sum upon either formation or use.)

So, if I understand correctly, all existing code that uses enums now have coercions all over the place in order to ensure they continue functioning? I'm really not sure this works...

let mut x:

x = None;

// At this point the compiler knows the type
// of x is Option<?0>::None.

// But Option<_>::Some cannot be coerced to None
x = Some(1); // type error?
@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Nov 10, 2018

let mut x:
x = None;
// At this point the compiler knows the type
// of x is Option<?0>::None.
// But Option<_>::Some cannot be coerced to None
x = Some(1); // type error?

I think in theory the type system should infer x to be of type Option<i32> because it sees both assignments.

But I think we need a formalization of the involved type system rules, to assure soundness, before implementing this proposal...

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Nov 10, 2018

I would rather frame this as follows:

  • the type is Option
  • the compiler tracks the most specific variant the value is if it can be known statically. This could be done with a standard dataflow analysis.
@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Nov 10, 2018

We do want formal verification of rust code eventually, and afaik doing that well requires refinement types. I'm not saying rust itself needs refinement types per se, but rust should eventually have a type system plugin/fork/preprocessor for formal verification features, like refinement types.

While I don't dislike LiquidHaskell-like refinement typing, lately for the future of Rust I prefer a style of verification as in the Why3 language ( http://why3.lri.fr/ , that is also related to the Ada-SPARK verification style). We'll need a pre-RFC for this.

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Nov 11, 2018

I hope this syntax is also supported (I suggest to add it to the RFC):

enum Sum { A(u32), B, C }
fn print_a1(Sum::A(x): Sum::A) {}

A question regarding the ABI: is the print_a1() function receiving the Sum discriminant too as argument?

And in future it could also be supported the more DRY syntax (I think suggested by Centril):

fn print_a2(Sum::A(x)) {}

You could also add a new (silly) example to this RFC that shows the purposes of this type system improvement:

enum List { Nil, Succ(u32, Box<List>) }

fn prepend(x: u32, lst: List) -> List {
    List::Succ(x, box lst)
}

With this improvement you can write instead:

fn prepend(x: u32, lst: List) -> List::Succ {
    List::Succ(x, box lst)
}

Then you can define a list_head_succ() function that returns the head of the result of prepend() without a unwraps or Option result:

fn list_head(lst: List) -> Option<u32> {
    match lst {
        List::Succ(x, _) => x,
        List::Nil => None,
    }
}

fn list_head_succ(List::Succ(x, _): List::Succ) -> u32 { x }

{ x: usize | x < 10 }

For the common case of integer intervals for Rust I sometimes prefer a shorter and simpler syntax like:

type Small = usize[.. 10];

@shepmaster
Copy link
Member

shepmaster left a comment

"Overhead"

I'm mostly interested in this RFC from the point-of-view of "enums of lots of standalone other types". The biggest example I have is the AST expressed in fuzzy-pickles, a Rust parser which uses this pattern extensively:

pub enum Item {
    AttributeContaining(AttributeContaining),
    Const(Const),
    Enum(Enum),
    // ...

Unfortunately, I don't see this as being a large win for such a case due to the "forced overhead" of each enum variant still being the same size as all the other variants. It's an understandable decision, just not one that I see as helping as much as it could.

This is mentioned in the alternatives section, but I want to make sure the point is reiterated.

Multiple variants

I didn't see any mention of if multiple variants would be supported:

#[derive(Debug)]
enum Count {
    Zero,
    One,
    Many(usize),
}

fn example(c: Count) {
    use Count::*;
    match c {
        x @ Zero | x @ One => println!("{:?}", x), // what is the type of `x` here?
        x => println!("{:?}", x),
    }
} 

It may also be worth explicitly calling out what the type is for those catch-all patterns as well as in cases of match guards.

foo @

This may be swerving into refinement type territory, but I naturally wanted to not type the foo @ in the previous example:

match c {
    Zero | One => println!("{:?}", c),
    // ...

I feel this is a pretty hidden and uncommon aspect of patterns, and it'd be nice to just be able to intuit the type based on the pattern without adding the explicit binding. That might even mean we could do:

if let Count::Many(..) = c {
    println!("{}", c.0);
}
Show resolved Hide resolved text/0000-enum-variant-types.md
and `impl Trait for Enum::Variant` are forbidden. This dissuades inclinations to implement
abstraction using behaviour-switching on enums (for example, by simulating inheritance-based
subtyping, with the enum type as the parent and each variant as children), rather than using traits
as is natural in Rust.

This comment has been minimized.

@shepmaster

shepmaster Nov 11, 2018

Member

I'm a fan of the proposed style, but it might be worth stating why Rust the language wants to dissuade this pattern.

- Passing a known variant to a function, matching on it, and use `unreachable!()` arms for the other
variants.
- Passing individual fields from the variant to a function.
- Duplicating a variant as a standalone `struct`.

This comment has been minimized.

@shepmaster

shepmaster Nov 11, 2018

Member

I disagree that this goal is going to be as widely achieved by this RFC as I would like due to the following point:

the variant types proposed here have identical representations to their enums

That means that if I have an enum with large variants:

enum Thing {
    One([u8; 128]),
    Two(u8),
}

Even the "small" variants (e.g. Thing::Two) are still going to take "a lot" of space.

This comment has been minimized.

@eddyb

eddyb Nov 11, 2018

Member

If space is a concern then we could have it so variant types only convert to their enum by-value, so e.g. a &Thing::Two wouldn't be a valid &Thing.

That's weaker than something more akin to refinement typing, but maybe it's enough?

This comment has been minimized.

@Centril

Centril Nov 11, 2018

Contributor

@eddyb I think that's already the case; the RFC doesn't state anywhere, as far as I can tell, that &Thing::Two is a valid &Thing. Also note that the RFC explicitly states that Thing::Two and Thing having the same layout is not a guarantee so we could change the layout to be more space efficient.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Dec 22, 2018

@thibaultdelor

To take your example you can argue that your translation is what this RFC would lead to, whereas Rust in the current state it would translate to :

data Foo = Alpha Beta | Gamma | ()

No, it would not.

data Foo = Alpha Beta | Gamma | () is not legal Haskell because () is not an identifier. The syntax of Haskell data types as defined in the Haskell2010 report when simplified is roughly:

data UIdent $(LIdent)* = $(UIdent $(Type)*)|*

Furthermore, the translation to Alpha Beta and Gamma is wrong because Alpha and Gamma are types in current Rust whereas in the Haskell definition they are suddenly constructors in the value namespace.

The semantically accurate transformation (forgetting about recursive types) of a Rust enum to a Haskell data type is:

enum TypeName {
    VariantName(Type, ..., Type),
    ...,
    VariantName(Type, ..., Type),
}

into:

data TypeName
  = VariantName Type ... Type
  | ...
  | VariantName Type ... Type

Bar, Baz and Quuz being just name bindings... In Rust, this name are used to do pattern matching and for construction.

In Haskell as well:

data List a = Nil | Cons a (List a)

instance Functor List where
  fmap f list = case list of Nil -> Nil; Cons x xs -> Cons (f x) (fmap f xs)
@thibaultdelor

This comment has been minimized.

Copy link

thibaultdelor commented Dec 22, 2018

Indeed you are right. Doesn't really change my point.

@varkor

This comment has been minimized.

Copy link
Member

varkor commented Dec 22, 2018

Me: Why the hell aren't you just return Gamma?". Baz isn't useful except for pattern matching orconstructing the enum.

This works only when your variant contains at most one type (e.g. Baz(Gamma), rather than Baz(Alpha, Beta, Gamma)) — otherwise you have to constantly repackage them into a tuple, or pull them into a separate struct, which is just extra boilerplate and loses the semantic intent.

One of those 2 things would make me change my mind:

Removing the limitation on Variant types

I'm not strongly opposed to allowing impls on variants; I was intentionally being conservative. If there are strong feelings in favour of permitting impls, I can add it to the RFC. I'll add it as an unresolved question.

@alexreg

This comment has been minimized.

Copy link

alexreg commented Dec 22, 2018

I'm not strongly opposed to allowing impls on variants; I was intentionally being conservative. If there are strong feelings in favour of permitting impls, I can add it to the RFC. I'll add it as an unresolved question.

I personally would like to see it very much. Otherwise variant types feel like "2nd class" types from the start. It can be handled as part of specialisation, right?

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Dec 29, 2018

It can be handled as part of specialisation, right?

This presumes that there will ever be specialization. Last I paid attention, there was still no satisfactory conceptual foundation that people felt was good enough.

@alexreg

This comment has been minimized.

Copy link

alexreg commented Dec 29, 2018

@shepmaster It's been figured out. Just waiting on Chalk foundations I believe, which are almost there. http://aturon.github.io/2018/04/05/sound-specialization/

@alercah

This comment has been minimized.

Copy link
Contributor

alercah commented Jan 2, 2019

I'm really glad to see this proposal.

Coming from someone writing AST types, I definitely would like to see impls allowed on individual variants in order to allow some, but not all variants, to contain some functionality. I think I disagree with the RFC author that we necessarily want to discourage behaviour switching on enums---it's a very standard thing with some kinds of subtypes because you already have the implementations handy.

(This definitely goes up with the idea of a fully general struct/enum unification approach, but assuming that's a little out of reach, I'd still like to leave the door. It may even allow considering using the trait system for the "arbitrary subset" refinement approach in the future, and I'll try to illustrate that idea at the end of the post.)

My musings there eventually lead me to think that we should consider forcing all the variants to have the same size/layout so that they are reference-compatible, because I think that there's a decent chance the differently-sized version has an alternative solution.

Rationale for impls

In my simple (but a little contrived) example, suppose that I'm designing a language where Decl is an AST node type. Some Decls have names, others don't. I might want to have something like NamedDecl for these. One way is to build that into the type structure:

enum Decl {
  Named(NamedDecl),
  Pragma(...),
  ...
}

struct NamedDecl {
  name: String,
  inner: InnerNamedDecl,
}

enum InnerNamedDecl {
  Func(FuncDecl),
  Var(VarDecl),
}

This is really ugly though, because I have a ton of layers of indirection and abstraction here. Additionally, it has a major usability cost: even though name is common to both FuncDecl and VarDecl, once you have a FuncDecl you've lost track of the name!

An alternative could be to simply write partial functions on Decl which return None or Err(_) or panic, as appropriate, if called on a declaration without a name, but that means that if there's a lot of related functionality. NamedDecl perhaps is too small for this, but imagine if some Decls introduce scopes in which name lookup can be performed. But this is also really bad: it means that all of these partial functions have to pattern match to do dispatch. And they're doing that every time.

Instead, traits should be used to factor out common functionality, right? So we want something like this:

enum Decl {
  Func(FuncDecl),
  Var(VarDecl),
  Pragma(PragmaDecl),
  ...
}

struct Func(...);
struct Var(...);
struct Pragma(...);

trait NamedDecl { ... }
impl NamedDecl for Func { ... }
impl NamedDecl for Var { ... }

This is a good approach. You can write functions that take a &dyn NamedDecl for runtime polymorphism (say, because your name lookup returns a Decl that's always going to be named), and if needed you could have NamedDecl contain a method fn as_decl(&self) -> &Decl so that you can pattern match to recover the static type. This isn't uncommon, because you'll want to treat FuncDecl and VarDecl separately in many cases, in a way that simply trying to farm out to traits isn't going to work for.

Banning impls on the individual variant types would effectively force a user to either abandon this pattern (which I think is the best of the options available) or forego the benefits of variant types altogether. Unless there's a better pattern that I'm missing, I think this argues very strongly in favour of impls on individual variant types.

Alternative Matching Nit

This approach does have one small weakness though, which is on alternative-match patterns:

fn foo<D: NamedDecl>(d: D) { ... }

fn bar(d: Decl) {
  match d {
    d @ (Decl::Func(_) | Decl::Var(_)) => foo(d)
    ...
  }
}

Morally, it feels like this "ought" to work---within the branch of the match, even though you don't know which variant d is, it certainly implements NamedDecl and the compiler certainly could desugar it into two separate branches which don't force a unification on d (indeed, I can imagine a macro which did just that!). This isn't a new issue---you can already run into unification-type issues like this with or patterns---but it's worth nothing that this will likely make this a bigger pain point. I think it's still worth it, however.

And Now Alercah Derails the Thread

I generally think that it's worth considering the future direction of things, especially since I see this feature careening on a collision course with sealed traits. #2618 is a new RFC proposal that illustrates how this direction is very much unconsciously in people's minds. Sealed traits, of course, are a thing that keeps coming up and nobody has really bothered with a serious RFC because there's easy-enough ways to simulate them.

But sealed traits actually have a significant implication on the type system, because a sealed trait effectively creates a finite subtype. Consider my example of the trait NamedDecl above. Imagine if I could declare it as trait NamedDecl : Decl to indicate that it can only be implemented for Decl. I could then ask to write this, without an as_decl() helper function:

fn destructure<D: NamedDecl>(d: D) {
  match d {
    FuncDecl(...) => ...,
    VarDecl(...) => ...,
  }
}

In other words, we can just use the NamedDecl trait to act as an arbitrary subset type for Decl! It would work for &dyn NamedDecl

This could work on &dyn NamedDecl even, by having the runtime match actually be against the vtable pointer. (Well, let's assume that we don't have lifetime parameters involved here, because they complexify things somewhat. But if we actually build the language out in this direction we could try to find a nice resolution for them). Once we get to this point, why does it even matter that NamedDecl is bounded by an enum? You could just use it like an arbitrary sum type.

We also arrive at something similar if we come at the original enum problem from the other direction, and consider the idea of allowing enum variants to be nameless. Looping back again, we could imagine something like:

struct FuncDecl { ... }
struct VarDecl { ... }
struct PragmaDecl { ... }

inline enum NamedDecl {
  FuncDecl,
  VarDecl,
}

inline enum Decl : NamedDecl {
  PragmaDecl,
}

where the variants have no actual identity of their own. (This is arguably a more elegant solution to the original problem even, and possibly worth considering as an alternative to the original proposal. If we want to bikeshed syntax, perhaps enum NamedDecl { ...FuncDecl } could be used to declare a FuncDecl variant without its own name. The risk here, of course, is that it wreaks havoc with type inference by overloading the constructor post hoc, so perhaps this is sadly inaccessible.)

I'd have to think through the implications carefully, but I think offhand that the only major difference between the sealed trait approach and the 'inline enum' approach becomes that of whether the variants all share the same layout or not---the inline enum approach forces a representation that can hold any of the variants and distinguish between them easily; the trait approach says that they can have different sizes but forces the variant information to be passed by generic parameter or fat pointer.

In other words, enums and sealed traits share the same basic properties of being sum types. But enums have a shared layout and nominal variants; sealed traits have varied layout and unnamed variants. Maybe aiming for a solution that unifies this all together by allowing selecting between nominality and layout separately is ideal.

@H2CO3

This comment has been minimized.

Copy link

H2CO3 commented Jan 2, 2019

My musings there eventually lead me to think that we should consider forcing all the variants to have the same size/layout so that they are reference-compatible, because I think that there's a decent chance the differently-sized version has an alternative solution.

What exactly do you mean by "reference compatible" here?

it means that all of these partial functions have to pattern match to do dispatch

I might be missing something, but doesn't anything dynamic need to perform some sort of runtime check to get to the correct function/type(/anything associated with them)? A pattern match is a comparison of the enum discriminant against a constant, and a branch, or an indirect branch indexed by the discriminant (depending on how the compiler happens to codegen it), modulo optimizations. But a method call through a trait object isn't free either: it's an offset against the vtable pointer and an indirect call (again, modulo devirtualization).

This is a good approach.

Isn't that particular use case of ASTs solved by just implementing the trait on the member structs, though?

This isn't uncommon, because you'll want to treat FuncDecl and VarDecl separately in many cases, in a way that simply trying to farm out to traits isn't going to work for.

But with newtype variants around structs, you still have the concrete types of the wrapped structs, so you can just call different methods (possibly from different traits) on them. I'll explain this below in more detail, but I think if you want to treat two types differently, you simply shouldn't try to achieve that by means of a common trait. Just use the concrete types, or perhaps different traits.

but it's worth not[h]ing that this will likely make this a bigger pain point.

Going from a trait to a specific type induces feelings of object-oriented "downcasting" in me. (No surprise, the hack that is known as Any uses the same approach on the surface.) This direction always felt just wrong to me. If you are abstracting over a certain property (i.e. you expect a trait), why would you try to get to the guts of the concrete type? If the concrete type cares about its own internals, it should implement the trait accordingly, instead of various callers of the trait method switching on it from the outside.

I've long enjoyed Rust's enums because of the beauty, simplicity, and obvious behavior of pattern matching, and I'd really prefer not to spoil it by having arbitrary downcasts over (partial) subsets of variants. I see how this might result in some writing convenience, but it's awful for readability (which, I argue, should be weighted a lot more in evaluating language features. We spend much more time reading code than power-typing). By the way I've done a lot of AST manipulation in the past too, and this never was a real issue. Yes, it means that when the represented syntax changes, you have to restructure the AST nodes associated with it. I'd consider that to be inevitable by nature, rather than a "problem".

@thibaultdelor

This comment has been minimized.

Copy link

thibaultdelor commented Jan 2, 2019

If you are abstracting over a certain property (i.e. you expect a trait), why would you try to get to the guts of the concrete type?

I see at least two reasons:

  1. Performance. Casting it once means that it becomes a concreet type and you can do static dispatch.
  2. The limitations of Rust traits. Traits in Rust are much harder to deal with than concrete types and there's a lot of thing you can't do. If the the trait is not object safe, you can't return it etc...

I have always advocate for "downcasting is a code smell" in other language. On the Rust side, unlike most language, things are much different and concrete types have very different properties than traits.

@H2CO3

This comment has been minimized.

Copy link

H2CO3 commented Jan 2, 2019

Performance. Casting it once means that it becomes a concreet type and you can do static dispatch.

This can already be achieved by just matching on the enum once and using the contents of its variants directly. By the way, traits as bounds to type parameters result in static dispatch. Unless you specifically ask for a trait object, there's no dynamic dispatch going on by default.

Traits in Rust are much harder to deal with than concrete types and there's a lot of thing you can't do. If the the trait is not object safe, you can't return it etc...

That is not true anymore because there is impl Trait.

Of course traits are different from concrete types – that's kind of their raison d'être. Generic programming is a quite high level of abstraction, indeed – so people not familiar with it might need some time until they get used to programming against abstract interfaces rather than concrete types. I wouldn't say that this means that "traits are harder to deal with" in general, nor that this would warrant conflating certain, very different aspects of the language.

@thibaultdelor

This comment has been minimized.

Copy link

thibaultdelor commented Jan 3, 2019

This can already be achieved by just matching on the enum once and using the contents of its variants directly.

Well then you are not using traits at all, that's a completely different story. You were making a point with abstraction and traits.

That is not true anymore because there is impl Trait.

impl trait is not magic, it's possible only when you know the concrete type at compile time. impl trait is more related to encapsulation than abstraction.

I wouldn't say that this means that "traits are harder to deal with" in general

Traits have additional restriction, can be monomorphised, require knowledge about trait objects, require to be boxed in some scenario...
They are harder to deal with, not because of abstraction, because of technical limitations due to Rust design (which is about zero cost abstraction etc.).

@alercah

This comment has been minimized.

Copy link
Contributor

alercah commented Jan 3, 2019

My musings there eventually lead me to think that we should consider forcing all the variants to have the same size/layout so that they are reference-compatible, because I think that there's a decent chance the differently-sized version has an alternative solution.

What exactly do you mean by "reference compatible" here?

I mean that &E::V can be safely cast to &E, which requires that the layouts match.

it means that all of these partial functions have to pattern match to do dispatch

I might be missing something, but doesn't anything dynamic need to perform some sort of runtime check to get to the correct function/type(/anything associated with them)? A pattern match is a comparison of the enum discriminant against a constant, and a branch, or an indirect branch indexed by the discriminant (depending on how the compiler happens to codegen it), modulo optimizations. But a method call through a trait object isn't free either: it's an offset against the vtable pointer and an indirect call (again, modulo devirtualization).

Yes, it does. But here I meant that you're doing this repeatedly because it's done on every single method, rather than doing it only once: if you pattern match on an enum to destructure it and then access the fields of the variant, you don't need to do the dispatch every time you access a field. Trait objects basically do have to do dynamic dispatch each time, but it's nicely hidden by the compiler.

This is a good approach.

Isn't that particular use case of ASTs solved by just implementing the trait on the member structs, though?

On the variant types you mean? The RFC proposes to ban that.

This isn't uncommon, because you'll want to treat FuncDecl and VarDecl separately in many cases, in a way that simply trying to farm out to traits isn't going to work for.

But with newtype variants around structs, you still have the concrete types of the wrapped structs, so you can just call different methods (possibly from different traits) on them. I'll explain this below in more detail, but I think if you want to treat two types differently, you simply shouldn't try to achieve that by means of a common trait. Just use the concrete types, or perhaps different traits.

Yes, you will. But sometimes you'll have a function which takes a NamedDecl and then wants to behave differently based on the concrete type. I don't think there's any problem with that personally, but it does require some way to destructure the NamedDecl into one of the concrete types.

but it's worth not[h]ing that this will likely make this a bigger pain point.

Going from a trait to a specific type induces feelings of object-oriented "downcasting" in me. (No surprise, the hack that is known as Any uses the same approach on the surface.) This direction always felt just wrong to me. If you are abstracting over a certain property (i.e. you expect a trait), why would you try to get to the guts of the concrete type? If the concrete type cares about its own internals, it should implement the trait accordingly, instead of various callers of the trait method switching on it from the outside.

I've long enjoyed Rust's enums because of the beauty, simplicity, and obvious behavior of pattern matching, and I'd really prefer not to spoil it by having arbitrary downcasts over (partial) subsets of variants. I see how this might result in some writing convenience, but it's awful for readability (which, I argue, should be weighted a lot more in evaluating language features. We spend much more time reading code than power-typing). By the way I've done a lot of AST manipulation in the past too, and this never was a real issue. Yes, it means that when the represented syntax changes, you have to restructure the AST nodes associated with it. I'd consider that to be inevitable by nature, rather than a "problem".

I dislike downcasting too, but destructuring an enum into its variant is not really different from downcasting, especially over a sealed trait (where you know every possible concrete type). I think the moral argument here is about when you expect consumers of a trait to be (morally) parametric. For enums you generally don't, but for unsealed traits you generally do. For a sealed trait, I think it starts to blur the line---and I think that taking advantage of that blurring in language design is valuable.

@H2CO3

This comment has been minimized.

Copy link

H2CO3 commented Jan 3, 2019

@alercah

Yes, it does. But here I meant that you're doing this repeatedly because it's done on every single method, rather than doing it only once:

Can you please provide a more specific example? The quote above seems to go against your former argument. I suspect we might be talking about different things… What I meant is, given an enum:

enum Foo {
    VarOne(ContentOne),
    VarTwo(ContentTwo),
}

you can implement a method on it, which calls into methods on the variants' associated data:

impl Foo {
    fn frobnicate(&self) {
        match *self { // 1 runtime check per method call against discriminant
            VarOne(ref content) => content.method_1(), // no dynamic dispatch here
            VarTwo(ref content) => content.method_2(), // no dynamic dispatch here
        }
    }
}

This isn't any worse, in terms of number of runtime checks, than:

trait Frobnicator { fn frobnicate(&self); }

impl Frobnicator for ContentOne {
    fn frobnicate(&self) { // 1 virtual call per method when used as a trait object
        self.method_1() // no dynamic dispatch here
    }
}

impl Frobnicator for ContentOne {
    fn frobnicate(&self) { // 1 virtual call per method when used as a trait object
        self.method_2() // no dynamic dispatch here
    }
}

let f: &dyn Frobnicator = &ContentOne::new();
f.frobnicate(); // the virtual call happens here

On the variant types you mean? The RFC proposes to ban that.

No, I meant on the types of the associated values wrapped in each newtype variant.

Yes, you will. But sometimes you'll have a function which takes a NamedDecl and then wants to behave differently based on the concrete type. I don't think there's any problem with that personally

Well, then we just disagree here. I think if consumer code wants to behave differently based on concrete types, an enum with public variants is a perfectly fine solution, and such an enum around the different types should be provided. There is no point in consuming a trait if its abstraction capabilities are thrown away because one ends up inspecting the concrete type anyway. That only results in unnecessary complication and an unclear mixture of concepts, which are hard to understand because they belong to neither approach (concrete types or abstract interfaces).

To clarify, I'm not saying that switching on an enum is in itself bad; rather, switching on an enum and taking advantage of the heterogeneous types, while pretending to program against a homogeneous interface is bad.

I dislike downcasting too, but destructuring an enum into its variant is not really different from downcasting

Indeed, it isn't from a technical point of view. However, enums are expected to be used like that, while one of the primary purposes of a trait is the exact opposite: to provide abstraction via information hiding. Therefore…

I think that taking advantage of that blurring in language design is valuable.

…I disagree with this statement too. This is exactly what I'm trying to avoid, as mentioned in the previous point.

@H2CO3

This comment has been minimized.

Copy link

H2CO3 commented Jan 3, 2019

@thibaultdelor

impl trait is not magic, it's possible only when you know the concrete type at compile time.

I am well aware that impl Trait is backed by a concrete type. What's a specific example when you couldn't usefully return an impl Trait because "you don't know the concrete type at compile time"? And how could a proposal around (even more) static dispatch resolve such a problem?

impl trait is more related to encapsulation than abstraction.

That is really vague; encapsulation is a form of/tool for abstraction. I also fail to see why hiding a concrete type under an interface doesn't qualify as "abstraction". But I'm not particularly interested in arguing over definitions; I was merely falsifying your (incorrect) claim that a type implementing a non-object-safe trait can't be returned.

Traits have additional restriction, can be monomorphised, require knowledge about trait objects, require to be boxed in some scenario...
They are harder to deal with, not because of abstraction, because of technical limitations due to Rust design (which is about zero cost abstraction etc.).

Again, these claims are very general and hard to concretize. Yes, generic functions and types with trait bounds can be monomorphized. Is that bad? Yes, traits can and sometimes need to be boxed, just like other types. Is that bad? When you make a trait object, it becomes a concrete type and can for all purposes be treated as one. Also based on this, I'm not sure about the "additional restrictions". You're really making an apples to oranges comparison here. Traits, generics, and trait objects all have their specific purpose, and they are used differently from (concrete, unrelated) types. I don't really see that as a problem. They are distinct language features solving different problems, which is why they all exist and are all useful in different ways.

@thibaultdelor

This comment has been minimized.

Copy link

thibaultdelor commented Jan 3, 2019

What's a specific example when you couldn't usefully return an impl Trait because "you don't know the concrete type at compile time"?

Are you actually suggesting that it rarely occurs?!? Doing something like the following is pretty common IMO :

if something {
  return this;
} else {
  return that;
}

(where this and that both implement a trait)
I have plenty of concrete example in minds, I am just unsure what's your point here...

And how could a proposal around (even more) static dispatch resolve such a problem?

Which problem?!? Which proposal?!?

I was merely falsifying your (incorrect) claim that a type implementing a non-object-safe trait can't be returned.

You are arguing with a straw-man... A non object-safe trait can't be return as such. If you know the concrete type then you can but it's not much different conceptually than returning the concrete type and it's not my point. When talking about abstraction, the case where you abstract over a single type is not really the most relevant one.

Is that bad? [...] Is that bad? [...]

Who said it was? I said harder to deal with, because it has technical implications and limitations and forces you to think (like boxing) about concepts that you don't have to with concrete types.

When you make a trait object [...]

You can't always do that, that was my point.

I am going to stop replying, the discussion isn't constructive.

@H2CO3

This comment has been minimized.

Copy link

H2CO3 commented Jan 3, 2019

Are you actually suggesting that it rarely occurs?!?

No need to yell. I'm not suggesting that this "rarely occurs", it was just unclear what you precisely meant by "knowing the concrete type".

Your example is legitimate. However, it still does need some sort of runtime check. If the returned concrete type depends on a parameter that is only known at runtime, there is no way a runtime check could be avoided.

the discussion isn't constructive.

Agreed.

@varkor varkor force-pushed the varkor:enum-variant-types branch from 03d2402 to 9e34d30 Jan 11, 2019

Centril added a commit to Centril/rust that referenced this pull request Jan 17, 2019

Rollup merge of rust-lang#57501 - petrochenkov:highvar, r=alexreg
High priority resolutions for associated variants

In rust-lang#56225 variants were assigned lowest priority during name resolution to avoid crater run and potential breakage.

This PR changes the rules to give variants highest priority instead.
Some motivation:
- If variants (and their constructors) are treated as associated items, then they are obviously *inherent* associated items since they don't come from traits.
- Inherent associated items have higher priority during resolution than associated items from traits.
- The reason is that there is a way to disambiguate in favor of trait items (`<Type as Trait>::Ambiguous`), but there's no way to disambiguate in favor of inherent items, so they became unusable in case of ambiguities if they have low priority.
- It's technically problematic to fallback from associated types to anything until lazy normalization (?) is implemented.

Crater found some regressions from this change, but they are all in type positions, e.g.
```rust
fn f() -> Self::Ambiguos { ... } // Variant `Ambiguous` or associated type `Ambiguous`?
```
, so variants are not usable there right now, but they may become usable in the future if rust-lang/rfcs#2593 is accepted.
This PR keeps code like this successfully resolving, but introduces a future-compatibility lint `ambiguous_associated_items` that recommends rewriting it as `<Self as Trait>::Ambiguous`.

Centril added a commit to Centril/rust that referenced this pull request Jan 17, 2019

Rollup merge of rust-lang#57501 - petrochenkov:highvar, r=alexreg
High priority resolutions for associated variants

In rust-lang#56225 variants were assigned lowest priority during name resolution to avoid crater run and potential breakage.

This PR changes the rules to give variants highest priority instead.
Some motivation:
- If variants (and their constructors) are treated as associated items, then they are obviously *inherent* associated items since they don't come from traits.
- Inherent associated items have higher priority during resolution than associated items from traits.
- The reason is that there is a way to disambiguate in favor of trait items (`<Type as Trait>::Ambiguous`), but there's no way to disambiguate in favor of inherent items, so they became unusable in case of ambiguities if they have low priority.
- It's technically problematic to fallback from associated types to anything until lazy normalization (?) is implemented.

Crater found some regressions from this change, but they are all in type positions, e.g.
```rust
fn f() -> Self::Ambiguos { ... } // Variant `Ambiguous` or associated type `Ambiguous`?
```
, so variants are not usable there right now, but they may become usable in the future if rust-lang/rfcs#2593 is accepted.
This PR keeps code like this successfully resolving, but introduces a future-compatibility lint `ambiguous_associated_items` that recommends rewriting it as `<Self as Trait>::Ambiguous`.

Centril added a commit to Centril/rust that referenced this pull request Jan 17, 2019

Rollup merge of rust-lang#57501 - petrochenkov:highvar, r=alexreg
High priority resolutions for associated variants

In rust-lang#56225 variants were assigned lowest priority during name resolution to avoid crater run and potential breakage.

This PR changes the rules to give variants highest priority instead.
Some motivation:
- If variants (and their constructors) are treated as associated items, then they are obviously *inherent* associated items since they don't come from traits.
- Inherent associated items have higher priority during resolution than associated items from traits.
- The reason is that there is a way to disambiguate in favor of trait items (`<Type as Trait>::Ambiguous`), but there's no way to disambiguate in favor of inherent items, so they became unusable in case of ambiguities if they have low priority.
- It's technically problematic to fallback from associated types to anything until lazy normalization (?) is implemented.

Crater found some regressions from this change, but they are all in type positions, e.g.
```rust
fn f() -> Self::Ambiguos { ... } // Variant `Ambiguous` or associated type `Ambiguous`?
```
, so variants are not usable there right now, but they may become usable in the future if rust-lang/rfcs#2593 is accepted.
This PR keeps code like this successfully resolving, but introduces a future-compatibility lint `ambiguous_associated_items` that recommends rewriting it as `<Self as Trait>::Ambiguous`.

Centril added a commit to Centril/rust that referenced this pull request Jan 18, 2019

Rollup merge of rust-lang#57501 - petrochenkov:highvar, r=alexreg
High priority resolutions for associated variants

In rust-lang#56225 variants were assigned lowest priority during name resolution to avoid crater run and potential breakage.

This PR changes the rules to give variants highest priority instead.
Some motivation:
- If variants (and their constructors) are treated as associated items, then they are obviously *inherent* associated items since they don't come from traits.
- Inherent associated items have higher priority during resolution than associated items from traits.
- The reason is that there is a way to disambiguate in favor of trait items (`<Type as Trait>::Ambiguous`), but there's no way to disambiguate in favor of inherent items, so they became unusable in case of ambiguities if they have low priority.
- It's technically problematic to fallback from associated types to anything until lazy normalization (?) is implemented.

Crater found some regressions from this change, but they are all in type positions, e.g.
```rust
fn f() -> Self::Ambiguos { ... } // Variant `Ambiguous` or associated type `Ambiguous`?
```
, so variants are not usable there right now, but they may become usable in the future if rust-lang/rfcs#2593 is accepted.
This PR keeps code like this successfully resolving, but introduces a future-compatibility lint `ambiguous_associated_items` that recommends rewriting it as `<Self as Trait>::Ambiguous`.

Centril added a commit to Centril/rust that referenced this pull request Jan 18, 2019

Rollup merge of rust-lang#57501 - petrochenkov:highvar, r=alexreg
High priority resolutions for associated variants

In rust-lang#56225 variants were assigned lowest priority during name resolution to avoid crater run and potential breakage.

This PR changes the rules to give variants highest priority instead.
Some motivation:
- If variants (and their constructors) are treated as associated items, then they are obviously *inherent* associated items since they don't come from traits.
- Inherent associated items have higher priority during resolution than associated items from traits.
- The reason is that there is a way to disambiguate in favor of trait items (`<Type as Trait>::Ambiguous`), but there's no way to disambiguate in favor of inherent items, so they became unusable in case of ambiguities if they have low priority.
- It's technically problematic to fallback from associated types to anything until lazy normalization (?) is implemented.

Crater found some regressions from this change, but they are all in type positions, e.g.
```rust
fn f() -> Self::Ambiguos { ... } // Variant `Ambiguous` or associated type `Ambiguous`?
```
, so variants are not usable there right now, but they may become usable in the future if rust-lang/rfcs#2593 is accepted.
This PR keeps code like this successfully resolving, but introduces a future-compatibility lint `ambiguous_associated_items` that recommends rewriting it as `<Self as Trait>::Ambiguous`.

Centril added a commit to Centril/rust that referenced this pull request Jan 19, 2019

Rollup merge of rust-lang#57501 - petrochenkov:highvar, r=alexreg
High priority resolutions for associated variants

In rust-lang#56225 variants were assigned lowest priority during name resolution to avoid crater run and potential breakage.

This PR changes the rules to give variants highest priority instead.
Some motivation:
- If variants (and their constructors) are treated as associated items, then they are obviously *inherent* associated items since they don't come from traits.
- Inherent associated items have higher priority during resolution than associated items from traits.
- The reason is that there is a way to disambiguate in favor of trait items (`<Type as Trait>::Ambiguous`), but there's no way to disambiguate in favor of inherent items, so they became unusable in case of ambiguities if they have low priority.
- It's technically problematic to fallback from associated types to anything until lazy normalization (?) is implemented.

Crater found some regressions from this change, but they are all in type positions, e.g.
```rust
fn f() -> Self::Ambiguos { ... } // Variant `Ambiguous` or associated type `Ambiguous`?
```
, so variants are not usable there right now, but they may become usable in the future if rust-lang/rfcs#2593 is accepted.
This PR keeps code like this successfully resolving, but introduces a future-compatibility lint `ambiguous_associated_items` that recommends rewriting it as `<Self as Trait>::Ambiguous`.

@linacambridge linacambridge referenced this pull request Jan 20, 2019

Open

Bookmarks #525

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment