Enum variant types #2593

Open: varkor wants to merge 13 commits into master

@varkor
Member

varkor commented Nov 10, 2018

Enum variants are to be considered types in their own right. This allows them to be irrefutably matched upon. Where possible, type inference will infer variant types, but as variant types may always be treated as enum types this does not cause any issues with backwards-compatibility.

enum Either<A, B> { L(A), R(B) }

fn all_right<A, B>(b: B) -> Either<A, B>::R {
    Either::R(b)
}

let Either::R(b) = all_right::<(), _>(1729);
println!("b = {}", b);

Rendered

Thanks to @Centril for providing feedback on this RFC!
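
For contrast, here is a minimal sketch of how the same example has to be written on stable Rust today, where the return type can only be the enum itself. The helper name `unwrap_right` is hypothetical and not part of the RFC; it exists only to show the refutable match the RFC aims to make unnecessary.

```rust
// Stable Rust today: a function can only name the enum, not a variant.
#[allow(dead_code)]
enum Either<A, B> {
    L(A),
    R(B),
}

fn all_right<A, B>(b: B) -> Either<A, B> {
    Either::R(b)
}

// Hypothetical helper: the caller must still handle `L`,
// even though `all_right` never produces it.
fn unwrap_right<A, B>(e: Either<A, B>) -> B {
    match e {
        Either::R(b) => b,
        Either::L(_) => unreachable!("all_right only ever builds R"),
    }
}

fn main() {
    let b = unwrap_right(all_right::<(), _>(1729));
    println!("b = {}", b);
}
```

With variant types as proposed, the `unreachable!()` arm would be replaced by the irrefutable `let Either::R(b) = ...` shown above.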

@varkor varkor referenced this pull request Nov 10, 2018

Closed

Types for enum variants #1450

@Centril Centril added the T-lang label Nov 10, 2018

@alexreg

alexreg commented Nov 10, 2018

Great work, @varkor. I've been looking forward to this for a long time. Just as a side-point, I'd love to follow this up with an RFC for the ideas in https://internals.rust-lang.org/t/pre-rfc-using-existing-structs-and-tuple-structs-as-enum-variants/7529 once this gets implemented in nightly (or perhaps even before). Since you've worked on this, would appreciate your thoughts at some point.

@bchallenor

bchallenor commented Nov 10, 2018

Although sum types are becoming increasingly common in programming languages, most do not choose to allow the variants to be treated as types in their own right (that is, the author has not found any that permit this design pattern).

This is possible in Scala - as in your example, Left and Right are subtypes of Either, and can be referred to independently. Coming from Scala, I miss this feature in Rust, and I am fully in favour of this RFC.

@varkor

Member

varkor commented Nov 10, 2018

This is possible in Scala - as in your example, Left and Right are subtypes of Either, and can be referred to independently.

Ah, great, I'll add that in, thanks!

@Ixrec

Contributor

Ixrec commented Nov 10, 2018

I think I'm in favor of the proposed functionality and semantics here. Where I'm stumbling is the nomenclature/terminology/teachability(?); it's not clear to me that "introducing a new kind of type: variant types" is the best description of this. In particular, precisely because this proposal feels so lightweight compared to previous ones, it doesn't really "feel" like what we're doing is adding a whole new type kind the way structural records or anonymous enums would be doing. It sounds like it could be equally well described as doing the "duplicating a variant as a standalone struct" workaround automagically, so those extra structs are just always there (except they get a specific layout guarantee and different conversion syntax that regular structs wouldn't get). Is there some detail I overlooked that makes this clearly not a sugar?

I'm guessing this is at least partially ignorance on my part because

The loose distinction between the enum type and its variant types could be confusing to those unfamiliar with variant types.

makes it sound like "variant types" are an actual thing with their own special properties that no other kinds of types have, and I just have no idea what that would be (since being autogenerated, having a certain layout guarantee and different conversion syntax seem like "surface level" properties that aren't really part of the type system per se). Maybe I just need to see some more examples of how these types behave?
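
To make the "duplicating a variant as a standalone struct" workaround concrete, here is a rough sketch on stable Rust. The names `Sum`, `SumA`, `SumB` and `double` are hypothetical and chosen only for illustration; the `From` impls stand in for the variant-to-enum coercion the RFC describes.

```rust
// Hypothetical names throughout; this is the manual workaround, not the RFC's design.
#[derive(Debug, PartialEq)]
struct SumA(u8);

#[derive(Debug, PartialEq)]
struct SumB(u8);

#[derive(Debug, PartialEq)]
enum Sum {
    A(SumA),
    B(SumB),
}

// These conversions play the role of the variant-to-enum coercion.
impl From<SumA> for Sum {
    fn from(a: SumA) -> Sum {
        Sum::A(a)
    }
}

impl From<SumB> for Sum {
    fn from(b: SumB) -> Sum {
        Sum::B(b)
    }
}

// A function that statically requires the `A` case.
fn double(a: SumA) -> SumA {
    SumA(a.0 * 2)
}

fn main() {
    let s: Sum = double(SumA(21)).into();
    println!("{:?}", s);
}
```

The difference from the proposal is that these structs are separate declarations with their own layout, whereas variant types would be generated automatically and share the enum's representation.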

@leonardo-m

leonardo-m commented Nov 10, 2018

Nice RFC.

In all cases, the most specific type (i.e. the variant type if possible) is chosen by the type inference.

Is code like this still allowed, or is the compiler going to tell me that the Sum::B(b) branch of the match is impossible and needs to be removed?

enum Sum { A(u8), B(u8) }
let x = Sum::A(5); // x: Sum::A
match x {
    Sum::A(a) => {},
    Sum::B(b) => {},
}

Both options have advantages and disadvantages.

@varkor

Member

varkor commented Nov 10, 2018

Is code like this still allowed, or is the compiler going to tell me that the Sum::B(b) branch of the match is impossible and needs to be removed?

This is a good question; I'll make a note of it in the RFC. Although matching on variant types permits irrefutable matches, it must also accept arms for any other variants of the same enum, since otherwise it would not be backwards compatible with existing code.
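
As a small illustration of the compatibility concern, the following compiles on stable Rust today, where `x` has type `Sum`. Under the answer above, the same code would have to keep compiling even if `x` were inferred as `Sum::A`. The function name `label_of_a` is hypothetical.

```rust
#[allow(dead_code)]
enum Sum {
    A(u8),
    B(u8),
}

// Hypothetical helper wrapping the snippet from the question.
fn label_of_a() -> &'static str {
    let x = Sum::A(5); // on stable Rust today, x: Sum
    match x {
        Sum::A(_) => "A",
        // Required today for exhaustiveness; existing code contains arms
        // like this, so they cannot become errors.
        Sum::B(_) => "B",
    }
}

fn main() {
    println!("{}", label_of_a());
}
```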

Where I'm stumbling is the nomenclature/terminology/teachability(?); it's not clear to me that "introducing a new kind of type: variant types" is the best description of this.

since being autogenerated, having a certain layout guarantee and different conversion syntax seem like "surface level" properties that aren't really part of the type system per se

It's quite possible there's a better way to explain this. They are essentially as you say, though they act slightly differently from structs (on top of the points you made) in the way they are pattern-matched (as above in this comment) and their discriminant value. I thought it would be clearer to describe them as an entirely new kind of type, but perhaps calling them special kinds of structs would be more intuitive as you say. I'll think about how to reword the relevant sections.

@varkor varkor force-pushed the varkor:enum-variant-types branch from 14a6e83 to 2b00420 Nov 10, 2018

@nrc

Member

nrc commented Nov 10, 2018

This was previously proposed in #1450. That was postponed because we were unsure about the general story around type fallback (e.g, integer types, default generic types, etc). Enum variants would add another case of this and so we wanted to be certain that the current approach is good and there are no weird interactions. IIRC, there was also some very minor backwards incompatibility.

This RFC should address those issues, and summarise how this RFC is different to #1450.

For the sake of completeness, an alternative might be some kind of general refinement type, though I don't think that is a good fit with Rust.

I'm still personally very strongly in favour of this feature! The general mood on #1450 was also positive.

@eddyb

Member

eddyb commented Nov 10, 2018

For the sake of completeness, an alternative might be some kind of general refinement type, though I don't think that is a good fit with Rust.

I'm not sure that would need to be at odds with variant types, if Rust ends up with refinement types I expect variant types to be refinements of their enum.

@Centril

Contributor

Centril commented Nov 10, 2018

First, irrespective of what happens with the RFC;
I always appreciate the effort put into well thought out RFCs and this is one of those, so thank you!

I am of two minds and a bit torn about the proposal here.

  1. I think it would help immensely to make an API such as syn::Expr more ergonomic and avoid auxiliary structs such as syn::ExprBox. Here you don't need any implementations on the variant types you'd get because the variant types are for the most part just dumb data.

  2. At the same time, precisely because this RFC does not permit implementations on variant types, as I think is proper to avoid the pitfalls of Java-OOP-inheritance APIs, it will not allow you to refactor an API such as syn::Lit into one where syn::LitStr is a variant type (i.e. Lit::Str) because the implementations there would not be admitted by the type checker.

  3. All in all, I think the benefits of this proposal are well motivated and the costs in terms of understanding are not that great. I think this proposal is something that a subset of users would naturally expect; the user also doesn't have to do much, expressiveness is given for free to them.

  4. The RFC interacts well with #1806 as well as goals and plans for gradual struct initialization; in fact, it provides the "missing link" that makes gradual initialization more generally applicable in the type system. This is a wonderful thing.

  5. Thus on balance I think the RFC is a good idea.

(Feel free to integrate any points that you found relevant into the text of the RFC)


@nrc

This RFC should address those issues, and summarise how this RFC is different to #1450.

👍

For the sake of completeness, an alternative might be some kind of general refinement type, though I don't think that is a good fit with Rust.

(Aside, but let's not go too deeply into this: I personally think that refinement / dependent typing is a good idea and a good fit for Rust's general aim of correctness and of type system power for library authors. RFC 2000 is sort of dependent types anyway, so the complexity is sort of a sunk cost. However, the use cases for dependent / refinement types are somewhat different from the goal here: with dependent types we wish to express things like { x: usize | x < 10 }.)


@eddyb

I'm not sure that would need to be at odds with variant types, if Rust ends up with refinement types I expect variant types to be refinements of their enum.

I agree; I think you can think of variant types in the general framework of refinement / dependent types;
With the notation due to @petrochenkov for pattern matching as a boolean operator, we can think of variant types as:

type FooVar = { x: Foo | x is Foo::Variant(...) };
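
The refinement reading above can be approximated on stable Rust with a wrapper whose constructor checks the predicate at run time. This is only a sketch under assumptions: `Foo`, `FooVar`, `new` and `payload` are hypothetical names, and a real variant type would enforce the invariant statically rather than through a checked constructor.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Foo {
    Variant(u32),
    Other,
}

// Hypothetical wrapper. Invariant: the inner value is always `Foo::Variant`.
#[derive(Debug, Clone, PartialEq)]
struct FooVar(Foo);

impl FooVar {
    // Checks the predicate `x is Foo::Variant(..)` at run time.
    fn new(x: Foo) -> Option<FooVar> {
        if matches!(x, Foo::Variant(..)) {
            Some(FooVar(x))
        } else {
            None
        }
    }

    // The invariant rules out `Other`, but the compiler cannot see that,
    // so an unreachable arm is still needed.
    fn payload(&self) -> u32 {
        match self.0 {
            Foo::Variant(n) => n,
            Foo::Other => unreachable!("FooVar::new rejects other variants"),
        }
    }
}

fn main() {
    let v = FooVar::new(Foo::Variant(7)).expect("is a Variant");
    println!("{}", v.payload());
}
```
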
@burdges

burdges commented Nov 10, 2018

We do want formal verification of rust code eventually, and afaik doing that well requires refinement types. I'm not saying rust itself needs refinement types per se, but rust should eventually have a type system plugin/fork/preprocessor for formal verification features, like refinement types. I do like this feature of course, but ideally the syntax here should avoid conflicts with refinement types.

@Centril

Contributor

Centril commented Nov 10, 2018

@burdges

I do like this feature of course, but ideally the syntax here should avoid conflicts with refinement types.

Are there any such conflicts in your view?

@Centril

Contributor

Centril commented Nov 10, 2018

(Or to elaborate; if there are any conflicts with the RFC as proposed with refinement typing, then stable Rust as is has that conflict since the RFC does not introduce any new syntax...)

@ExpHP

ExpHP commented Nov 10, 2018

Note that because a variant type, e.g. Sum::A, is not a subtype of the enum type (rather, it can simply be coerced to the enum type), a type like Vec<Sum::A> is not a subtype of Vec<Sum>. (However, this should not pose a problem as it should generally be convenient to coerce Sum::A to Sum upon either formation or use.)

So, if I understand correctly, all existing code that uses enums now has coercions all over the place in order to ensure it continues functioning? I'm really not sure this works...

let mut x;

x = None;

// At this point the compiler knows the type
// of x is Option<?0>::None.

// But Option<_>::Some cannot be coerced to None
x = Some(1); // type error?
@leonardo-m

leonardo-m commented Nov 10, 2018

let mut x;
x = None;
// At this point the compiler knows the type
// of x is Option<?0>::None.
// But Option<_>::Some cannot be coerced to None
x = Some(1); // type error?

I think in theory the type system should infer x to be of type Option<i32> because it sees both assignments.

But I think we need a formalization of the involved type system rules, to assure soundness, before implementing this proposal...
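
For reference, this is how the snippet behaves on stable Rust today: inference sees both assignments and settles on a single `Option` type. The sketch below only shows the current behaviour that variant inference would need to preserve; the function name `last_assigned` is hypothetical.

```rust
// Hypothetical wrapper around the snippet under discussion.
#[allow(unused_assignments)]
fn last_assigned() -> Option<i32> {
    let mut x; // type not yet known
    x = None; // x: Option<?0>
    x = Some(1); // unifies ?0 with an integer type
    x
}

fn main() {
    println!("{:?}", last_assigned());
}
```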

@mark-i-m

Contributor

mark-i-m commented Nov 10, 2018

I would rather frame this as follows:

  • the type is Option
  • the compiler tracks the most specific variant a value is known to have, whenever that can be determined statically. This could be done with a standard dataflow analysis.
@leonardo-m

leonardo-m commented Nov 10, 2018

We do want formal verification of rust code eventually, and afaik doing that well requires refinement types. I'm not saying rust itself needs refinement types per se, but rust should eventually have a type system plugin/fork/preprocessor for formal verification features, like refinement types.

While I don't dislike LiquidHaskell-like refinement typing, lately for the future of Rust I prefer a style of verification as in the Why3 language ( http://why3.lri.fr/ , that is also related to the Ada-SPARK verification style). We'll need a pre-RFC for this.

@leonardo-m

leonardo-m commented Nov 11, 2018

I hope this syntax is also supported (I suggest adding it to the RFC):

enum Sum { A(u32), B, C }
fn print_a1(Sum::A(x): Sum::A) {}

A question regarding the ABI: is the print_a1() function receiving the Sum discriminant too as argument?

In future, the more DRY syntax (I think suggested by Centril) could also be supported:

fn print_a2(Sum::A(x)) {}

You could also add a new (silly) example to this RFC that shows the purposes of this type system improvement:

enum List { Nil, Succ(u32, Box<List>) }

fn prepend(x: u32, lst: List) -> List {
    List::Succ(x, Box::new(lst))
}

With this improvement you can write instead:

fn prepend(x: u32, lst: List) -> List::Succ {
    List::Succ(x, Box::new(lst))
}

Then you can define a list_head_succ() function that returns the head of the result of prepend() without an unwrap or an Option result:

fn list_head(lst: List) -> Option<u32> {
    match lst {
        List::Succ(x, _) => Some(x),
        List::Nil => None,
    }
}

fn list_head_succ(List::Succ(x, _): List::Succ) -> u32 { x }

{ x: usize | x < 10 }

For the common case of integer intervals for Rust I sometimes prefer a shorter and simpler syntax like:

type Small = usize[.. 10];

@shepmaster

"Overhead"

I'm mostly interested in this RFC from the point-of-view of "enums of lots of standalone other types". The biggest example I have is the AST expressed in fuzzy-pickles, a Rust parser which uses this pattern extensively:

pub enum Item {
    AttributeContaining(AttributeContaining),
    Const(Const),
    Enum(Enum),
    // ...

Unfortunately, I don't see this as being a large win for such a case due to the "forced overhead" of each enum variant still being the same size as all the other variants. It's an understandable decision, just not one that I see as helping as much as it could.

This is mentioned in the alternatives section, but I want to make sure the point is reiterated.

Multiple variants

I didn't see any mention of whether multiple variants would be supported:

#[derive(Debug)]
enum Count {
    Zero,
    One,
    Many(usize),
}

fn example(c: Count) {
    use Count::*;
    match c {
        x @ Zero | x @ One => println!("{:?}", x), // what is the type of `x` here?
        x => println!("{:?}", x),
    }
} 

It may also be worth explicitly calling out what the type is for those catch-all patterns as well as in cases of match guards.
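
As a baseline for that question, here is what stable Rust does today with a binding over several variants: the binding simply has the enum type. This sketch uses the parenthesised or-pattern form `x @ (Zero | One)` and a hypothetical function `is_small`; what type `x` would get under the RFC is exactly the open question.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
#[allow(dead_code)]
enum Count {
    Zero,
    One,
    Many(usize),
}

// Hypothetical function used only to show the type of the binding.
fn is_small(c: Count) -> bool {
    use Count::*;
    match c {
        x @ (Zero | One) => {
            // On stable Rust, `x` has the enum type `Count` here.
            let _check: Count = x;
            true
        }
        Many(_) => false,
    }
}

fn main() {
    println!("{}", is_small(Count::Zero));
}
```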

foo @

This may be swerving into refinement type territory, but I naturally wanted to not type the foo @ in the previous example:

match c {
    Zero | One => println!("{:?}", c),
    // ...

I feel this is a pretty hidden and uncommon aspect of patterns, and it'd be nice to just be able to intuit the type based on the pattern without adding the explicit binding. That might even mean we could do:

if let Count::Many(..) = c {
    println!("{}", c.0);
}

text/0000-enum-variant-types.md
and `impl Trait for Enum::Variant` are forbidden. This dissuades inclinations to implement
abstraction using behaviour-switching on enums (for example, by simulating inheritance-based
subtyping, with the enum type as the parent and each variant as children), rather than using traits
as is natural in Rust.

@shepmaster

shepmaster Nov 11, 2018

Member

I'm a fan of the proposed style, but it might be worth stating why Rust the language wants to dissuade this pattern.

- Passing a known variant to a function, matching on it, and use `unreachable!()` arms for the other
variants.
- Passing individual fields from the variant to a function.
- Duplicating a variant as a standalone `struct`.

@shepmaster

shepmaster Nov 11, 2018

Member

I disagree that this goal is going to be as widely achieved by this RFC as I would like due to the following point:

the variant types proposed here have identical representations to their enums

That means that if I have an enum with large variants:

enum Thing {
    One([u8; 128]),
    Two(u8),
}

Even the "small" variants (e.g. Thing::Two) are still going to take "a lot" of space.
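
The size concern can be checked on stable Rust with `std::mem::size_of`. In this sketch the standalone struct `Two` is a hypothetical stand-in for what a space-optimised `Thing::Two` might look like; exact enum sizes are not guaranteed by the language, so only the lower bound is asserted in the comments.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Thing {
    One([u8; 128]),
    Two(u8),
}

// Hypothetical standalone struct holding only the payload of `Thing::Two`.
#[allow(dead_code)]
struct Two(u8);

// Returns (size of the whole enum, size of the standalone struct).
fn sizes() -> (usize, usize) {
    (size_of::<Thing>(), size_of::<Two>())
}

fn main() {
    let (whole, two) = sizes();
    // `whole` must be large enough for the 128-byte payload plus a tag;
    // `two` is a single byte.
    println!("Thing: {} bytes, Two: {} bytes", whole, two);
}
```

If variant types share the enum's representation, a value of type `Thing::Two` would occupy the first of these sizes, not the second.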

@eddyb

eddyb Nov 11, 2018

Member

If space is a concern then we could have it so variant types only convert to their enum by-value, so e.g. a &Thing::Two wouldn't be a valid &Thing.

That's weaker than something more akin to refinement typing, but maybe it's enough?

@Centril

Centril Nov 11, 2018

Contributor

@eddyb I think that's already the case; the RFC doesn't state anywhere, as far as I can tell, that &Thing::Two is a valid &Thing. Also note that the RFC explicitly states that Thing::Two and Thing having the same layout is not a guarantee so we could change the layout to be more space efficient.

@petrochenkov

Contributor

petrochenkov commented Dec 10, 2018

Good feature in general, with enough motivation, but the detailed specification is very questionable.
We should rather figure out how to make type inference fallback work in the previous version of the RFC.

@eddyb

Member

eddyb commented Dec 10, 2018

@petrochenkov What are the pain points? Usually the problem with type inference fallbacks is the interaction with the trait system, but we can collapse "variant inference variables" to the enum type on contact with the trait system AFAICT.

@varkor

Member

varkor commented Dec 10, 2018

We should rather figure out how to make type inference fallback work in the previous version of the RFC.

This is definitely the design I'm leaning towards now. However, I couldn't find any concrete description of the problems with inference in the previous RFC other than "we still haven't figured out default type parameters". Are there problems with the existing version of default type parameters and type inference for enum variants now?

@nrc writes:

Enum variants would add another case of this and so we wanted to be certain that the current approach is good and there are no weird interactions. IIRC, there was also some very minor backwards incompatibility.

But "making sure there are no weird interactions" is a difficult goal: it's hard to rule out any at all, save for thinking about it collectively.


I'm going to update the RFC to suggest using type inference, similar to the previous RFC, but with some slight ergonomic adjustments.

@thibaultdelor

thibaultdelor commented Dec 19, 2018

I have been thinking a lot about this RFC lately. I used to think that it was a good idea and would be the first step to sealed trait or sum-enum.

Turns out I now think it goes against Rust design and is fixing the wrong problem.
Let's take Either as an example:

enum Either<A, B> { L(A), R(B) }

Enums are tagged unions. L and R are tags. Just a way for enum to differentiate its content. Not a type...

From all I can read, I don't think people fundamentally want tags to become types. They basically want to avoid the weird pattern where the tag has the same name as your type and you have to wrap the content of your variant in a struct:

struct A(i32);
struct B(i32, A);
struct C(String);
enum MyEnum {
   A(A),
   B(B),
   C(C),
   // ...
}

As mentioned by @Centril, a good example of that is syn::Expr.
They also want an easy way to define a subset of the enum (an enum that contains only A and B, let's say).

What I also want is to implement a trait X for an enum without writing a big pattern match, and instead have the enum dispatch the trait's method calls to the variant (assuming all the variants implement X).

Turns out all of this can be achieved with fairly simple macros. enum_dispatch is a very good example of it.
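
To show what kind of delegation is meant, here is a hand-written sketch of the pattern that a macro would generate. It does not use the enum_dispatch crate itself; the trait `Describe` and the types `If`, `While` and `Expr` are hypothetical and only loosely modelled on the syn::Expr example.

```rust
// Hypothetical trait standing in for "trait X".
trait Describe {
    fn describe(&self) -> String;
}

struct If;
struct While;

impl Describe for If {
    fn describe(&self) -> String {
        String::from("if")
    }
}

impl Describe for While {
    fn describe(&self) -> String {
        String::from("while")
    }
}

// Each tag wraps a standalone struct of the same name.
enum Expr {
    If(If),
    While(While),
}

// The boilerplate a macro would write: forward the call to the payload.
impl Describe for Expr {
    fn describe(&self) -> String {
        match self {
            Expr::If(e) => e.describe(),
            Expr::While(e) => e.describe(),
        }
    }
}

fn main() {
    let exprs = [Expr::If(If), Expr::While(While)];
    for e in &exprs {
        println!("{}", e.describe());
    }
}
```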

If we want that in the language, and I think it would be really nice, then we could just have some macros in the std lib like:

#[union] Expr { Box, If, While, Cast }

Which desugar to pretty much the current syn::Expr.

I am really failing to see any other reason why people would say they want the tag of the enum to become a type.

I am really not in favor of this RFC. I think it focuses on fixing a problem that is not one (tags are not typed) and introduces unreasonable and unintuitive limitations.

@alexreg

alexreg commented Dec 20, 2018

@thibaultdelor That solves one of the issues that motivates this RFC, but not all. For the issue it solves, see my pre-RFC. It would be a lot more elegant to incorporate a solution like that into the language... the enum_dispatch solution is far less ergonomic, especially, when it comes to matching.

Also, there is an argument "tags do not have to be types", but equally tags can be types, and are furthermore a natural fit for types, so I'm not sure it's a strong case...

@thibaultdelor

thibaultdelor commented Dec 20, 2018

That solves one of the issues that motivates this RFC, but not all.

What does it not solve?

I didn't mention all the examples of the RFC, but from what I could see all the problems have as their root cause the lack of a nice syntax to define a simple sum type. To me the RFC fails to identify this root cause.

It would be a lot more elegant to incorporate a solution like that into the language...

Agreed, but it is also far more difficult in terms of backward compatibility... Your RFC introduces some changes to the language. My solution doesn't require any change to the language or stdlib; it can be just a crate.

Also, there is an argument "tags do not have to be types", but equally tags can be types [...]

We are not talking about just adding a type here, we are talking about changing the compiler and introducing concepts in the language. This DOES have an impact on complexity, learnability, etc.

Think about an enum being like

struct MyEnum<T> {
  tag: &'static str,
  content: T,
}

That's what an enum is, with some extra compiler checks. tag is a field. All we are talking about is trying to hide the tag from the user and change the syntax of Rust so instead of defining a tag, with the same syntax, you actually define a type and hide the tag from the user. All about syntax.

The same thing can be achieved with a macro. I like the macro, because it becomes clear to the user that no new concept has been introduced, just a better syntax for an existing thing.
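
The "tag is a field" view can be observed on stable Rust through `std::mem::discriminant`, which compares the tags of two enum values while ignoring their contents. A small sketch, where the enum `MyEnum` and the helper `same_tag` are hypothetical:

```rust
use std::mem::discriminant;

#[allow(dead_code)]
enum MyEnum {
    A(i32),
    B(String),
}

// Hypothetical helper: true when both values carry the same tag,
// regardless of the content stored alongside it.
fn same_tag(x: &MyEnum, y: &MyEnum) -> bool {
    discriminant(x) == discriminant(y)
}

fn main() {
    let a1 = MyEnum::A(1);
    let a2 = MyEnum::A(2);
    let b = MyEnum::B(String::from("hello"));
    println!("{} {}", same_tag(&a1, &a2), same_tag(&a1, &b));
}
```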

far less ergonomic, especially, when it comes to matching

I don't see what's hard or counter-intuitive about:

if let MyVariant(o) = my_enum {
   o.do_something();
}
@alexreg

alexreg commented Dec 20, 2018

What does it not solve?

For one, you can't do things like:

fn foo(t: Enum::Variant) { ... }

Agreed, but it is also far more difficult in terms of backward compatibility... Your RFC introduces some changes to the language. My solution doesn't require any change to the language or stdlib; it can be just a crate.

There are no significant backwards compatibility issues. And yes, this is the issue with all evolution of the language and all language RFCs, but the pros and cons have to be weighed up.

We are not talking about just adding a type here, we are talking about changing the compiler and introducing concepts in the language. This DOES have an impact on complexity, learnability, etc.

Changing the compiler is no big deal. Introducing new concepts and increasing complexity or obstructing learnability can be. I argue that this RFC (and my pre-RFC) actually make things more intuitive and learnable, and reduce the burden on the learner -- at least, for someone learning Rust as a first programming language and having no prior expectations about enums. Being able to use the names of existing types as tags makes a lot of intuitive sense to me at least.

I don't see what's hard or counter-intuitive about:

if let MyVariant(o) = my_enum {
   o.do_something();
}

Okay, "far less" is probably an exaggeration. I would however like to see this sort of thing baked into the language because it's really quite a small addition in surface area, the increase in syntax/keywords is essentially nil, and it avoids the use of proc macros, which slightly obfuscate things from the user. That said, I'm happy a proc-macro solution like this exists for now! It has a lot of utility.

@thibaultdelor

thibaultdelor commented Dec 20, 2018

Anticipating someone mentioning Scala: Scala enums are very, very different from Rust ones, so I am not sure we can relate the two.

@bchallenor mentioned Either. Either is not an enum, it's a sealed abstract class with two subclasses, Left and Right. It makes total sense in OOP to design it like this; in Rust things are much different. Scala really doesn't care about dynamic dispatch...

Now, I have been doing Scala for 5 years now and I have never ever seen a function taking a Left in its signature...

The point being that Left is just a tag that just matters when dealing with deconstructing the "enum" but then you are interested in the content, not the tag itself...

@varkor

Member

varkor commented Dec 20, 2018

That's what an enum is, with some extra compiler checks. tag is a field.

An enum is a sum type: the disjunction of a collection of other types. The variants of the enum are the types over which the enum is disjunctive. This is a well-known concept in programming languages and provides direct motivation for treating the variants as types.

With this attitude, the conceptual solution provided by this RFC is very natural (as evidenced by previous RFCs in similar veins). I think the main concerns are implementation-focused, rather than conceptual concerns. Your concerns seem to be centred around your understanding of what an enum is, but I suggest that this proposal will make a lot more sense if you think of them from the perspective of sum types.

@thibaultdelor

thibaultdelor commented Dec 20, 2018

For one, you can't do things like:

fn foo(t: Enum::Variant) { ... }

That's exactly what I am saying... you don't need that at all. If you need a variant to be a type, wrap a struct in an enum variant. Variants are flags that are only useful to deconstruct the enum. What you want to pass to a function is the content of the variant.

@thibaultdelor

thibaultdelor commented Dec 20, 2018

An enum is a sum type

I was referring to the internal representation. For the user it's a sum type. For the compiler it's something with a tag attached to tell it how to treat it.

The variants of the enum are the types over which the enum is disjunctive. This is a well-known concept in programming languages and provides direct motivation for treating the variants as types.

Nope, if you really want to talk theory and "well-known concept" an enum is a degenerate tagged union of unit types, used only for their flags.

In theory we've got everything we need in Rust and all those concepts are represented.

What matters is the syntax now. In practice, in Rust, an enum is a tagged union where the first part of the variant definition is the variant name (or tag) and the second part is the content (Name(Content)).

I am arguing that we don't need to make the variant Name a type. That doesn't make sense conceptually. There will always be a variant name, whether it's hidden or not.

What we want is to make it easier for the user and have the option of not specifying the variant name.

That's what this macro does:
#[union] Expr { Box, If, While, Cast }

And this doesn't have the weird side effects of having some kind of half-types that you can't implement.

@thibaultdelor

thibaultdelor commented Dec 20, 2018

Ultimately, I think what we want is those two pieces of code to be strictly equivalent:

mod MyEnum{
   struct A(i32);
   struct B(i32, i32);
   struct C(String);
}
enum MyEnum {
   MyEnum::A,
   MyEnum::B,
   MyEnum::C
}

and

enum MyEnum {
   A(i32),
   B(i32, i32),
   C(String)
}

This RFC looks like it goes halfway to that, the second half being "how to allow enum MyEnum { MyExistingType }". That second half is, IMO, the most valuable and also the hardest (because it is not backward compatible). On top of that, this RFC introduces some unintuitive limitations.

From a user perspective, the benefit of this RFC is that you don't need to do

struct A(i32);
enum MyEnum { A(A) }

You can do:

enum MyEnum { A(i32) }

and then you can do pattern matching using if let a @ MyEnum::A(_) = my_enum instead of if let MyEnum::A(a) = my_enum, which to me is not a massive improvement.

So the real improvement is how you declare your enum, which as I say can be done with a macro and won't have the limitations.

@varkor

Member

varkor commented Dec 21, 2018

If you need a variant to be a type, wrap a struct in an enum variant.

You can get around the problems with enum variants at the moment, sure, but the workarounds are not particularly ergonomic or natural. The change proposed here is a natural way to extend the type system minimally to support a common pattern in Rust code.

an enum is a degenerate tagged union of unit types

A tagged union is another name for a sum type.

I am arguing that we don't need to make the variant Name a type. That doesn't make sense conceptually.

I think if you read a bit more about sum types (the Wikipedia article you link to actually has a good explanation), you'll see that this concept precisely matches up with the interpretation of a sum type / tagged union / disjoint union / etc. It's a type C = A + B, where the enum is C and A and B are individual types. They're not just names.

I used to think that it was a good idea and would be the first step to sealed trait or sum-enum.

Note that anonymous enums are an orthogonal feature to this one (though they can complement each other).

Ultimately, I think what we want is those two pieces of code to be strictly equivalent:

This RFC does address one concern with code like that, but the aim of the RFC is not to make them equivalent (in fact, the RFC is specifically written so that they are not: see @shepmaster's comments above).

and then you can do pattern matching using if let a @ MyEnum::A(_) = my_enum instead of if let MyEnum::A(a) = my_enum, which to me is not a massive improvement.

This kind of pattern matching (which is more evident in match rather than if let anyway) is not the only advantage here. You can often avoid the pattern matching entirely. The examples in the RFC give some more motivations.

But regardless, this RFC isn't meant to solve "all enum-related problems". It is focusing on a very particular ergonomics shortcoming that provides cleaner patterns for dealing with enums (with a solid basis from the theory of sum types). I agree that there are other improvements that would be nice to have regarding enums, but I think they're orthogonal to the problem that's being tackled here. It's better to make smaller, incremental improvements than try to tackle all the problems at once.

@varkor

Member

varkor commented Dec 21, 2018

I've updated the RFC to propose a type-inference based approach to determining the type of variants/enums now, which is more in line with the previous RFC.

Enum variants would add another case of this and so we wanted to be certain that the current approach is good and there are no weird interactions. IIRC, there was also some very minor backwards incompatibility.

This RFC should address those issues, and summarise how this RFC is different to #1450.

I've been giving it some thought and I couldn't figure out how variant inference might conflict or have weird interactions with the other kinds of inference (nor did a few others I asked). It could be the case that type inference is simply more well-established now than it was when #1450 was proposed, or I might not be being imaginative enough. Either way, that's one of the points of the RFC process, so I figure I'll leave it as an open question and if anyone can spot issues, we can address them — without anything concrete to go on now, though, it's hard to resolve that concern.

I've also left a little summary of differences to #1450, but GitHub is not being happy sharing the RFC content at the moment, so I may update this with more differences once I can access it again.


Regarding @shepmaster's desire for space-optimised variants, I think this is probably a fundamental tradeoff with performance. It would be nice to enable a space-optimised use case, but I think this could be left for a future RFC. This RFC doesn't commit to a representation, but suggests an initial one. This should leave us open to experiment in the future. (E.g. maybe we could have some kind of attribute to control this.)
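To make the space side of that tradeoff concrete, here is a small sketch with hypothetical types (`Packet` and `Ping` are invented for this example). It only measures sizes in current Rust. The assumption, for illustration, is that a variant type sharing its enum's layout would be as large as the whole enum, whereas a standalone type with the same payload is not:

```rust
use std::mem::size_of;

// Hypothetical enum with one small and one large variant.
#[allow(dead_code)]
enum Packet {
    Ping(u8),
    Data([u8; 64]),
}

// A standalone struct carrying the same payload as `Packet::Ping`.
#[allow(dead_code)]
struct Ping(u8);

fn main() {
    // The enum must have room for its largest variant (plus any tag).
    println!("size_of::<Packet>() = {}", size_of::<Packet>());
    // The standalone struct only needs room for its own field.
    println!("size_of::<Ping>()   = {}", size_of::<Ping>());
}
```

Sharing the enum's layout would make converting a variant value to the enum free, at the cost of padding small variants up to the size of the largest one.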

@varkor

Member

varkor commented Dec 21, 2018

Regarding @shepmaster's desire for space-optimised variants, I think this is probably a fundamental tradeoff with performance. It would be nice to enable a space-optimised use case, but I think this could be left for a future RFC.

I've just had an interesting thought about this, though it may be slightly off-topic.

Anonymous enums/sum types will presumably represent each of their variants as standalone types (as that's what they are), rather than repackaging them somehow. If there was then a built-in way to convert an enum to an anonymous enum (i.e. take all the variants of an enum as types and put them into a new anonymous enum), this new type would have a space-optimised representation. This would naturally facilitate both kinds of representations.

@thibaultdelor

thibaultdelor commented Dec 22, 2018

Meh, I am obviously explaining this badly, because all the counter-arguments are completely missing my point.

Just to make it clear, I am 100% in favor of, and really want to see happening, simple sum types as in U = A + B
The current syntax of Rust, misleadingly named enum, is far from being ideal.

The current RFCs build on that and try to bend the current syntax so it looks a bit more like your usual sum type, except that:

  • you still have no easy way of adding two existing types
  • the types created by each variant are special and follow different rules
  • there's no plan for how we transition to a more long-term solution.

You could argue that it's a step in the right direction, to which I would say that if you don't know what the final state is, you don't know whether it's the right step or whether you are just building and enforcing things that you will later want to get rid of.

On the other hand, a macro would solve most of the problems you are already solving and doesn't introduce new concepts.

Rust is already complex and I am tired of explaining all the little quirks to my colleagues. This would be another quirk, which to me brings much less benefit in practice than people think.

Something like Haskell sum-type, or Scala sealed class, would make much much more difference to the language but that's not what this RFC is proposing.

@thibaultdelor

thibaultdelor commented Dec 22, 2018

I would be really keen on comparing a real world piece of code, one refactored by me, the other with this RFC.

@leonardo-m

leonardo-m commented Dec 22, 2018

I feel that varkor's design ideas here are better: I have had functions that return (or take) just one enum branch unconditionally. For example, here insert() never returns None:

http://rosettacode.org/wiki/Pattern_matching#Rust

But feelings aren't enough to design a language, and that isn't idiomatic Rust code (because it allocates too much), so I agree with thibaultdelor that it could be a good idea to compare how a more sizeable chunk of real-world code would come out when applying varkor's or thibaultdelor's ideas (one example could come from applying the ideas to parts of the rustc source code, even if compiler code is a rather special kind of code).

@varkor

Member

varkor commented Dec 22, 2018

Just to make it clear, I am 100% in favor of, and really want to see happening, simple sum types as in U = A + B
The current syntax of Rust, misleadingly named enum, is far from being ideal.

The current RFCs build on that and try to bend the current syntax so it looks a bit more like your usual sum type, except that:

Right, so you're looking for a more general syntax for algebraic data types that allows you both to define new types (à la enums at the moment) and to use existing types as variants?

Regardless of what an ideal future syntax would look like, the fact is that there is a syntax for enums at the moment and that's not going to go away, even if new ways of dealing with ADTs exist in the future. I think this enhancement to the existing enum syntax removes a papercut in a natural way (rather than being a "quirk"). For those who don't want to take advantage of this, there's no drawback to having this as a feature (existing patterns will continue to work), but it does solve a real problem now in a minimal way. I don't see that having new syntax or features for dealing with ADTs in the future is held back by implementing this proposal.

Do you think that the proposal here is forwards incompatible or leads to confusing behaviour for type system extensions in the future?

On the other hand, a macro would solve most of the problems you are already solving and doesn't introduce new concepts.

I don't think this introduces new concepts, other than "enum variants are types", which is a simple concept to understand. Do you think there are other conceptual difficulties? I think many/most people would expect enum variants to be types (especially those coming from functional backgrounds).

@Centril

Contributor

Centril commented Dec 22, 2018

@thibaultdelor

Something like Haskell sum-type, or Scala sealed class, would make much much more difference to the language but that's not what this RFC is proposing.

I'm confused by this. Rust enums are equivalent to Haskell data types in that both are sums of product types (the variants) which are commonly known as algebraic data types (ADTs). They are also both nominally typed.

For example, we can translate:

enum Foo {
    Bar(Alpha, Beta),
    Baz(Gamma),
    Quux
}

directly into:

data Foo = Bar Alpha Beta | Baz Gamma | Quux

There is no semantic distinction here (aside from a lack of μ in Rust); it's just a difference in surface syntax.

@thibaultdelor

thibaultdelor commented Dec 22, 2018

@Centril As I have written, all the concepts are already there in Rust; it's all about syntax. Depending on how simple the syntax is, you encourage different developer behaviors.

To take your example, you can argue that your translation is what this RFC would lead to, whereas Rust in its current state would translate to:

data Foo = Alpha Beta | Gamma | ()

Bar, Baz and Quux being just name bindings... In Rust, these names are used to do pattern matching and for construction.

This is fundamental to what I am saying. The type of the variant in current Rust can be seen as the type of its content. It already has a type.

This RFC changes the semantics without changing the syntax, which I am fine with.

@varkor
The RFC is not backward compatible but I am questioning what it is trying to solve VS what it actually solves.

All the arguments I hear in favour are things like:

  • "What if my method can only return a Baz(Gamma)?"
    Me: "Why the hell aren't you just returning Gamma?" Baz isn't useful except for pattern matching or constructing the enum.
  • "What if my method can only return / take just a subset of my enum?"
    Me: "Why don't you create an enum which is a subset of your enum?" If that's what your method returns then that's what your method should declare.

The reason why people don't do it is that sum types in Rust have a heavy syntax which discourages it.

This RFC makes it slightly easier by changing the meaning of enum to say "Now enum is a shortcut for creating several types that we are adding together, their id now being their type name", and by changing the behaviour of pattern matching so you can match either on the name with Baz(gamma) => or by type using b @ Baz(_) =>.
If it stopped there I would be fine. But then it adds a limitation on those types, namely "You can't implement anything on those types", which to me is a huge drawback and reduces the only benefit of this RFC to "You don't have to wrap your struct in a variant for types that don't need impl".

Basically, a slightly shorter syntax and being able to pattern match by type. (Arguably, it doesn't make much difference to the user how it's matched.)

With a macro I could do:

struct Bar(Alpha, Beta);
struct Baz(Gamma);
struct Quux;
// `sum!` is a hypothetical macro, not an existing one.
sum!(Foo, [Bar, Baz, Quux]);
// maybe create a subset too: sum!(FooSub, [Bar, Quux]);

// Pattern match
if let Foo::Bar(a) = my_foo {
    println!("{}", a.0)
}

There are many options for what this macro does and for its syntax, but here I guess it would generate:

enum Foo {
    Bar(Bar),
    Baz(Baz),
    Quux(Quux)
}

This solves some problems, can be done right now and doesn't have limitations on the structs since you defined them explicitly. It's also FAR from being ideal and has this weird relation between the variant name and its type.
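For reference, one way such a `sum!` macro might be written with `macro_rules!` is sketched below. This is only an assumption about what the macro could do: the `From` impls, the `bar_number` helper, and the field types chosen for `Bar` and `Baz` are all invented for the example.

```rust
// Hypothetical payload types standing in for Alpha, Beta and Gamma.
#[derive(Debug)]
#[allow(dead_code)]
struct Bar(u32, String);
#[derive(Debug)]
#[allow(dead_code)]
struct Baz(f64);
#[derive(Debug)]
struct Quux;

// One possible `sum!`: builds an enum with one variant per listed type,
// plus a `From` impl for each so values can be lifted with `.into()`.
macro_rules! sum {
    ($name:ident, [$($variant:ident),+ $(,)?]) => {
        #[derive(Debug)]
        #[allow(dead_code)]
        enum $name {
            $($variant($variant)),+
        }
        $(
            impl From<$variant> for $name {
                fn from(value: $variant) -> Self {
                    $name::$variant(value)
                }
            }
        )+
    };
}

sum!(Foo, [Bar, Baz, Quux]);
sum!(FooSub, [Bar, Quux]);

// Returns the first field of the wrapped `Bar`, if `foo` holds one.
fn bar_number(foo: &Foo) -> Option<u32> {
    if let Foo::Bar(bar) = foo {
        Some(bar.0)
    } else {
        None
    }
}

fn main() {
    let my_foo: Foo = Bar(7, "seven".to_string()).into();
    println!("{:?} -> {:?}", my_foo, bar_number(&my_foo));
    let sub: FooSub = Quux.into();
    println!("{:?}", sub);
}
```

Note that matching still goes through the generated variant (`Foo::Bar(bar)`), so the variant name and the struct name are tied together only by convention.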

I dislike how enums have been implemented in Rust, but yes, they're not going away. I am just worried that we are building on top of them with a solution that doesn't fix much, could be fixed in a different way and doesn't make the future any brighter.

One of those 2 things would make me change my mind:

  • Removing the limitation on Variant types
  • Have a clear path to a SumTypeV2
@Centril

Contributor

Centril commented Dec 22, 2018

@thibaultdelor

To take your example, you can argue that your translation is what this RFC would lead to, whereas Rust in its current state would translate to:

data Foo = Alpha Beta | Gamma | ()

No, it would not.

data Foo = Alpha Beta | Gamma | () is not legal Haskell because () is not an identifier. The syntax of Haskell data types as defined in the Haskell 2010 Report, when simplified, is roughly:

data UIdent $(LIdent)* = $(UIdent $(Type)*)|*

Furthermore, the translation to Alpha Beta and Gamma is wrong because Alpha and Gamma are types in current Rust whereas in the Haskell definition they are suddenly constructors in the value namespace.

The semantically accurate transformation (forgetting about recursive types) of a Rust enum to a Haskell data type is:

enum TypeName {
    VariantName(Type, ..., Type),
    ...,
    VariantName(Type, ..., Type),
}

into:

data TypeName
  = VariantName Type ... Type
  | ...
  | VariantName Type ... Type

Bar, Baz and Quux being just name bindings... In Rust, these names are used to do pattern matching and for construction.

In Haskell as well:

data List a = Nil | Cons a (List a)

instance Functor List where
  fmap f list = case list of Nil -> Nil; Cons x xs -> Cons (f x) (fmap f xs)
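
For comparison, a rough Rust counterpart of the same definition is sketched below. The method name `fmap` and the choice of `Box` for the recursive field are assumptions made for this example. As in the Haskell version, the variant names appear both as patterns and as constructors:

```rust
// Rust counterpart of `data List a = Nil | Cons a (List a)`.
#[derive(Debug, PartialEq)]
enum List<A> {
    Nil,
    Cons(A, Box<List<A>>),
}

impl<A> List<A> {
    // Plays the role of `fmap`: match on each variant name, then rebuild
    // the list using the same names as constructors.
    fn fmap<B, F: Fn(A) -> B>(self, f: &F) -> List<B> {
        match self {
            List::Nil => List::Nil,
            List::Cons(x, xs) => List::Cons(f(x), Box::new((*xs).fmap(f))),
        }
    }
}

fn main() {
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    let doubled = list.fmap(&|x| x * 2);
    println!("{:?}", doubled);
}
```
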
@thibaultdelor

thibaultdelor commented Dec 22, 2018

Indeed you are right. Doesn't really change my point.

@varkor

Member

varkor commented Dec 22, 2018

Me: "Why the hell aren't you just returning Gamma?" Baz isn't useful except for pattern matching or constructing the enum.

This works only when your variant contains at most one type (e.g. Baz(Gamma), rather than Baz(Alpha, Beta, Gamma)) — otherwise you have to constantly repackage them into a tuple, or pull them into a separate struct, which is just extra boilerplate and loses the semantic intent.
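A short sketch of that boilerplate in today's Rust, using hypothetical names (`Shape`, `Rect`, `rect_fields` and `rect_struct` are invented for this example): a function that can only ever produce the multi-field variant has to return either a bare tuple or a mirror struct, and the caller repackages it into the enum.

```rust
// Hypothetical enum with a multi-field variant.
#[derive(Debug, PartialEq)]
#[allow(dead_code)]
enum Shape {
    Rect(f64, f64, String),
    Empty,
}

// Option 1: return the fields as a tuple; the name `Rect` is lost.
fn rect_fields() -> (f64, f64, String) {
    (2.0, 3.0, "door".to_string())
}

// Option 2: a separate struct mirroring the variant's fields,
// plus a conversion back into the enum.
struct Rect {
    width: f64,
    height: f64,
    label: String,
}

impl From<Rect> for Shape {
    fn from(r: Rect) -> Self {
        Shape::Rect(r.width, r.height, r.label)
    }
}

fn rect_struct() -> Rect {
    Rect { width: 2.0, height: 3.0, label: "door".to_string() }
}

fn main() {
    // Either way, the caller has to repackage the value into the enum.
    let (w, h, label) = rect_fields();
    let a = Shape::Rect(w, h, label);
    let b: Shape = rect_struct().into();
    println!("{:?} {:?}", a, b);
}
```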

One of those 2 things would make me change my mind:

Removing the limitation on Variant types

I'm not strongly opposed to allowing impls on variants; I was intentionally being conservative. If there are strong feelings in favour of permitting impls, I can add it to the RFC. I'll add it as an unresolved question.

@alexreg

alexreg commented Dec 22, 2018

I'm not strongly opposed to allowing impls on variants; I was intentionally being conservative. If there are strong feelings in favour of permitting impls, I can add it to the RFC. I'll add it as an unresolved question.

I personally would like to see it very much. Otherwise variant types feel like "2nd class" types from the start. It can be handled as part of specialisation, right?
