New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Types for enum variants #1450

Closed
wants to merge 3 commits into
base: master
from

Conversation

Projects
None yet
@nrc
Copy link
Member

nrc commented Jan 7, 2016

Edit: this RFC is now only about variant types, the unions stuff is removed in favour of #1444

This is something of a two-part RFC, it proposes

  • making enum variants first-class types,
  • untagged enums (aka unions).

The latter is part of the motivation for the former and relies on the former to
be ergonomic.

In the service of making variant types work, there is some digression into
default type parameters for functions. However, that needs its own RFC to be
spec'ed properly.

Variant types and untagged enums
This is something of a two-part RFC, it proposes

* making enum variants first-class types,
* untagged enums (aka unions).

The latter is part of the motivation for the former and relies on the former to
be ergonomic.

In the service of making variant types work, there is some digression into
default type parameters for functions. However, that needs its own RFC to be
spec'ed properly.

@nrc nrc added the T-lang label Jan 7, 2016

@nrc nrc self-assigned this Jan 7, 2016

@nrc nrc referenced this pull request Jan 7, 2016

Merged

unions #1444


Importing an enum imports it into both the value and type namespace. Importing
a variant imports it only into the value namespace. To maintain backwards
compatibility, this will remain the default. In order to import an enum variant

This comment has been minimized.

@sgrif

sgrif Jan 7, 2016

Contributor

Is this actually required for backwards compatibility? I don't think there's currently a way you can have a variant in scope and have a type with the same name.

This comment has been minimized.

@nrc

nrc Jan 7, 2016

Member

Huh, looks like you're right, that is good news!

This comment has been minimized.

@petrochenkov

petrochenkov Jan 8, 2016

Contributor

I can't check the code right now, but it looks like variants are already imported into both namespaces (and I do remember that they are defined in both namespaces):

use E::V;

enum E {
    V { field: u8 }
}

type V = u8; // Conflicts as expected

fn main() {
    let e = V{field: 0}; // Compiles as expected 
    match e {
        V{..} => {} // Compiles as expected 
    }
}

This comment has been minimized.

@nikomatsakis

nikomatsakis Jan 8, 2016

Contributor

I can't check the code right now, but it looks like variants are already imported into both namespaces (and I do remember that they are defined in both namespaces):

I was wondering about that, I know we tried to prepare for the possibility that variants would become types...

@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 7, 2016

#[repr(union)]
enum X {
    A {
        alpha: u8,
    },
    B(u8),
}

struct Y {
    x: X,
}

Given y: Y, how do I access alpha?

@nrc

This comment has been minimized.

Copy link
Member

nrc commented Jan 7, 2016

@mahkoh

fn foo(y: Y) {
    let a = unsafe { y.x as X::A };
    println!("alpha: {}", a.alpha);
}
@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 7, 2016

@nrc This means that modifying any field in a union in a struct requires

  1. Copying the union field out of the struct,
  2. Making the modification,
  3. Copying the field back into the struct.

In the presence of nested unions, repeat. This seems to be extremely inefficient and unergonomic since unions are very often used as struct fields.

@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 7, 2016

Furthermore, what happens if the union doesn't implement Copy?

@nrc

This comment has been minimized.

Copy link
Member

nrc commented Jan 7, 2016

Should work by reference too:

fn foo(mut y: Y) {
    let a = unsafe { &mut y.x as &mut X::A };
    *a = 42;
}

I should add casts of the reference types to the RFC too.

@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 7, 2016

Also I believe that one of your arguments against the other RFC was that it requires unsafe code even if you already know the variant. This can be easily solved by taking the idea from this RFC and applying it to the other RFC:

union X {
    f1: T1,
    f2: T2,
}

let x: X = ...
let y: &T1 = unsafe { &x as &T1 };
@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Jan 7, 2016

Let's look at a simple example of a union I already deal with in winapi.

https://github.com/retep998/winapi-rs/blob/master/src/wincon.rs#L17-L25

The original in C looks like:

typedef struct _KEY_EVENT_RECORD {
    BOOL bKeyDown;
    WORD wRepeatCount;
    WORD wVirtualKeyCode;
    WORD wVirtualScanCode;
    union {
        WCHAR UnicodeChar;
        CHAR   AsciiChar;
    } uChar;
    DWORD dwControlKeyState;
} KEY_EVENT_RECORD;

Under this RFC in Rust it would look like:

STRUCT!{struct KEY_EVENT_RECORD {
    bKeyDown: ::BOOL,
    wRepeatCount: ::WORD,
    wVirtualKeyCode: ::WORD,
    wVirtualScanCode: ::WORD,
    uChar: KEY_EVENT_RECORD_uChar,
    dwControlKeyState: ::DWORD,
}}
#[repr(union)] enum KEY_EVENT_RECORD_uChar {
    UnicodeChar(::WCHAR),
    AsciiChar(::CHAR),
}

Since I cannot conveniently import UnicodeChar into the global namespace (union variants are not in the global namespace in C so there may be other identifiers with the same name) that means users would have to fully specify that type, which is kinda ugly:

let x: KEY_EVENT_RECORD = ...;
let y = x.uChar as KEY_EVENT_RECORD_uChar::UnicodeChar;

Meanwhile in C someone could just do x.uChar.UnicodeChar. They can even mutate the variant in place instead of having to do x.uchar = KEY_EVENT_RECORD_uChar::UnicodeChar(foo).

With my current macro solution the user can call inherent methods to get (mutable) references to the variant, which is almost as good as direct field access. Any RFC for unions needs to provide syntax that is better than what I currently have, which means direct field access is really important.

@nrc

This comment has been minimized.

Copy link
Member

nrc commented Jan 7, 2016

@retep998 I don't think that direct field access is a very Rust-y solution - it's an unsafe operation which doesn't indicate why/in what way it is unsafe. I think the casting should extend to (mutable) references, and within a scope you can bring in enum variants as types, so you have a clear two line solution instead of an unclear one solution, which seems not too bad:

fn foo(x: KEY_EVENT_RECORD) {
    use KEY_EVENT_RECORD_uChar::*;

    let y = unsafe { &mut x.uChar as &mut UnicodeChar };
    y.0 = 42; 
}

You could stick that all on one line if you like, but I admit that gets pretty ugly.

expression - both the variant type and the enum type. If there is no further
information to infer one or the other type, then the type checker uses the enum
type by default. This is analogous to the system we use for integer fallback or
default type parameters.

This comment has been minimized.

@eddyb

eddyb Jan 7, 2016

Member

This is great, pretty much what I expected and @nikomatsakis also confirmed it's what he would do (although him and @aturon aren't sure it's enough i.e. compared to full subtyping).

This comment has been minimized.

@aturon

aturon Jan 7, 2016

Member

Note that it gets a bit more complicated if we also support nested enums, though probably the fallback would just be to the root in that case.

This comment has been minimized.

@eddyb

eddyb Jan 7, 2016

Member

Yes, I wanted to add that with nested enums we would have a chain of potential types to infer to, but I realized that nested enums aren't in the scope of this RFC.

let _: Foo::Variant2 = a; // Compile-time error
let _: Foo::Variant2 = b; // Compile-time error
let _ = a as Foo::Variant2; // Compile-time error
let _ = b as Foo::Variant2; // Runtime error

This comment has been minimized.

@eddyb

eddyb Jan 7, 2016

Member

I don't see checked downcasting here. I would really prefer:

match b {
    x: Foo::Variant2 => {...}
    _ => { /*not Variant2*/ }
}

Then the user can do something useful instead of panicking, and I believe this is necessary if we want to experiment with modelling the DOM as an ADT hierarchy.

This comment has been minimized.

@nrc

nrc Jan 8, 2016

Member

Oh yes, I want this, it is super-important, but I forgot to put it in. I expect we can keep the current syntax: x @ Foo::Variant2 => { ... }. I'm not sure if we then apply the same inference approach to x or x always has type Variant2.

This comment has been minimized.

@eddyb

eddyb Jan 8, 2016

Member

I believe the inference approach should work, if we pin the variant type on variant-specific operations (like accessing a field).

This comment has been minimized.

@Stebalien

Stebalien Jan 8, 2016

Contributor

I'm strongly against panicking on cast for anything but debugging.

process_bytes(m as MyBytes)
}
} else {
unsafe { m as MyInt }.0

This comment has been minimized.

@eddyb

eddyb Jan 7, 2016

Member

A block, including unsafe {...} can only be an rvalue which means this copies MyInt.

While here it wouldn't make much of a difference, showing (m as MyInt).0, maybe even taking a reference thereof or assigning to it, would make the lvalue property clearer.

However, as currently only produces rvalues, so I'm not sure what the overall intent is.

@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 7, 2016

I don't think that direct field access is a very Rust-y solution - it's an unsafe operation which doesn't indicate why/in what way it is unsafe

The same applies to calling an unsafe method.

You could stick that all on one line if you like

You have to do either that or introduce another scope due to lexical lifetimes. More realistically, the code would look like this:

fn f(mut x: Struct1) {
    {
        use KEY_EVENT_RECORD_uChar::*;

        let y = unsafe { &mut x.f as &mut Variant1 };
        y.0 = 42; 
    }
    g(&x);
}

And when you have nested enums then it gets really funny.

inheritance: if we allow nested enums, then there are many more possible types
for a variant, and generally more complexity. If we allow data bounds (c.f.,
trait bounds, e.g., a struct is a bound on any structs which inherit from it),
then perhaps enum types should be considered bounds on their variant types.

This comment has been minimized.

@eddyb

eddyb Jan 7, 2016

Member

Perhaps we can retain only trait bounds and use e.g. V: VariantOf<E> - which is what I've used in my version of the examples in @nikomatsakis' blog post introducing many of these ideas.

@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 7, 2016

Tbqh, if this is the union accepted union implementation, then I'll just continue to do what I do right now: Add two methods per variant (variant, variant_mut) and transmute the passed reference. Code like that already works and it will be shorter than this. The only thing gained by this implementation is that one can have a type with the correct size and alignment. But once ctfe becomes better, that will be possible without this implementation too.

@ubsan

This comment has been minimized.

Copy link
Contributor

ubsan commented Jan 7, 2016

I love this proposal, thank you for coming up with it. It still needs a little work, but it is the best thing I've seen so far and works well in the fledgeling memory model that will be released soon.

@LaylConway

This comment has been minimized.

Copy link

LaylConway commented Jan 8, 2016

Here is an alternative proposal for syntax on unions, anything but final but as a rough idea:

struct KEY_EVENT_RECORD {
    bKeyDown: ::BOOL,
    wRepeatCount: ::WORD,
    wVirtualKeyCode: ::WORD,
    wVirtualScanCode: ::WORD,
    uChar: KEY_EVENT_RECORD_uChar,
    dwControlKeyState: ::DWORD,
}

c_union KEY_EVENT_RECORD_uChar {
    unicode_char: ::WCHAR,
    ascii_char: ::CHAR,
}

// Accessing:
let mut record = foo();
{
    // Immutable Ref
    let unicode_c = unsafe { record.wChar.as_unicode_char() };
}
{
    // Mutable Ref
    let ascii_c = unsafe { record.wChar.as_ascii_char_mut() };
}

Setting is still undetermined. My main issue with this syntax would be that adding methods feels too magic-y for rust, but I feel they're important to communicate what's unsafe about what's being done. I want to try to fit in a #[derive(_)] in there to mitigate that, but c_union wouldn't have any use without the derive so it feels wrong to add it.

@ubsan

This comment has been minimized.

Copy link
Contributor

ubsan commented Jan 8, 2016

I could see this going a few ways:

// This is a simplified struct from WinAPI
#[repr(union)]
enum tagVARIANTvariant {
    llVal(LONGLONG),
    lVal(LONG),
    bVal(BYTE),
    iVal(SHORT),
}

#[repr(union)]
enum tagVARIANTtag {
    __tagVARIANT {
        vt: VARTYPE,
        variant: tagVARIANTvariant,
    },
    // other fields here I've taken out
}

struct tagVARIANT {
    tag: tagVARIANTtag,
}

fn test() {
    let x: tagVARIANT;
    // This is the safest bet. However, it's also a pain to use, and it's unlikely that FFI people
    // will go along with it
    (x.tag as tagVARIANTtag::__tagVARIANT).vt = LLVAL;
    (x.tag as tagVARIANTtag::__tagVARIANT).variant = tagVARIANTvariant::llval(100);

    // We could also automatically import the types. This is backwards compatible in the `as` field,
    // although I'm less certain about the = field. This would also make normal enums nicer to use,
    // and I've heard talk of doing this with match statements.
    (x.tag as __tagVARIANT).vt = LLVAL;
    (x.tag as __tagVARIANT).variant = llval(100);

    // This is the last suggestion, and my least favorite. However, if we can't find a solution, we
    // may have to go with it :/
    x.tag.__tagVARIANT.vt = LLVAL;
    x.tag.__tagVARIANT.variant = tagVARIANTvariant::llval(100);
}

To clarify what I'm talking about with match statements, some have talked about:

enum E {
    Var0,
    Var1,
}

match E { // this would be okay, despite never `use`ing Var0 or Var1
    Var0 => {},
    Var1 => {},
}
@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 8, 2016

There is also a need for non-anonymous variants. E.g.

#[repr(C)]
struct TagVariant {
    vt: VARTYPE,
    variant: tagVARIANTvariant,
}

#[repr(union)]
enum tagVARIANTtag {
    __tagVARIANT: TagVariant,
    // other fields here I've taken out
}
@ubsan

This comment has been minimized.

Copy link
Contributor

ubsan commented Jan 8, 2016

@mahkoh That's covered by tagVARIANTvariant.

@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 8, 2016

@ubsan So every access additionally requires a .0?

@ubsan

This comment has been minimized.

Copy link
Contributor

ubsan commented Jan 8, 2016

@mahkoh If you're doing it that way, yes, I assume.

@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 8, 2016

This RFC deals with two orthogonal issues:

  • Allowing enum variants to be used as types
  • Allowing the discriminant to be removed from enums

Therefore it should be split into two RFCs.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Jan 8, 2016

I agree with @mahkoh, as I believe that enum variants as types is less controversial of a feature, and it's an important step towards empowering ADT hierarchies.


## impls

`impl`s may exist for both enum and variant types. There is no explicit sharing

This comment has been minimized.

@japaric

japaric Jan 8, 2016

Member

Would this let us fix rust-lang/rust#5244? Example:

enum Option<T> {
    Some(T),
    None,
}

// we add this
impl<T> Copy for Option<T>::None {}

// then, either this just works
let x: [Option<String>; 10] = [None; 10];

// or this works (can this be written without the temporary `t`?)
let t: [Option<String>::None; 10] = [None; 10];
let x: [Option<String>; 10] = t;

This comment has been minimized.

@retep998

retep998 Jan 8, 2016

Member

It would likely require that the variant types have the same size as the enum itself, so they'd have to have unused padding for things like the discriminant.

This comment has been minimized.

@mahkoh

mahkoh Jan 8, 2016

Contributor

That is the case according to the RFC.

This comment has been minimized.

@eddyb

eddyb Jan 8, 2016

Member

The correct solution here is not using Copy directly, i.e. by either allowing constants, or ADTs of Copy-able types.

While @japaric's proposal may be equivalent to the latter option, let x: [Option<String>; 10] = [None; 10]; would not work with this RFC as written because None would infer to Option<String> which is not Copy - and there is no conversion from [None<T>; N] to [Option<T>; N] (but there could be? not sure what we can do here).

This comment has been minimized.

@nikomatsakis

nikomatsakis Jan 8, 2016

Contributor

On Fri, Jan 08, 2016 at 07:14:59AM -0800, Jorge Aparicio wrote:

Would this let us fix rust-lang/rust#5244?

Yes, perhaps, that's one of the appealing things about having variants
be types.

This comment has been minimized.

@nikomatsakis

nikomatsakis Jan 8, 2016

Contributor

On Fri, Jan 08, 2016 at 10:06:19AM -0800, Eduard-Mihai Burtescu wrote:

While @japaric's proposal may be equivalent to the latter option, let x: [Option<String>; 10] = [None; 10]; would not work with this RFC as written because None would infer to Option<String> which is not Copy - and there is no conversion from [None<T>; N] to [Option<T>; N] (but there could be? not sure what we can do here).

In the RFC as written, I think something like [None: None<String>; 32] would be required?

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 8, 2016

There is a lot here. I'll just try to write some comments as I digest (and catch up on the already lengthy comment thread).

First, I'm not a big fan of foo as Variant downcasting. It's good for unions but kind of "bad style" elsewhere and I'm not sure we need to add it; I'd prefer to encourage people to write if let or match, just as they do today.

Second, I would love to have variants be types, but the integration described in the RFC sounded a bit rougher than I would like. I guess I have to process it more, as the RFC has a number of particular mechanisms, and it's hard for me to judge if they are good choices or not. This may also be something we can also expect to hammer on and experiment with in the implementation phase, but it'd be good to think hard about the rough edges that we expect to emerge.

Finally, on the topic if unions in particular. It's certainly the case that a #[repr(union)] annotation feels more natural on an enum than on a struct, but I also find something in #1444 very appealing. Certainly it's a much more targeted addition, which I like. I also like that you use field projection foo.bar to select which variant you want -- it feels like an "active selection", whereas matching does not. (as, admittedly, feels good too -- but then as seems to be sugar for match, and match feels wrong, since I expect that in a match the compiler will pick which arm, not me.)

@bchallenor

This comment has been minimized.

Copy link

bchallenor commented Jan 21, 2018

Is there any scope for revisiting this now that a year has passed? Coming from Scala, I'm used to having this feature, and I seem to keep running into it.

@alexreg

This comment has been minimized.

Copy link

alexreg commented Feb 12, 2018

I concur with @bchallenor... this seems like a highly desirable feature, and one that we're at a good point to implement, if I'm not mistaken.

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Feb 23, 2018

The "postponed" issue for this RFC was never created, so I've made one - #2347.

@alexreg

This comment has been minimized.

Copy link

alexreg commented May 15, 2018

Any chance of this being revisited soon?

@nielsle

This comment has been minimized.

Copy link

nielsle commented May 15, 2018

For the sake of posterity. The following pattern combined with crates such as https://github.com/JelteF/derive_more and https://github.com/DanielKeep/rust-custom-derive allows you to do some of the stuff mentioned in the RFC.

struct Variant1( i32, String );
struct Variant2 { f1: i32, f2: String };

pub enum Foo {
    Variant1(Variant1),
    Variant2(Variant2),
}

So if the RFC is revived, then the pattern should probably be discussed under alternatives.

delapuente added a commit to delapuente/qasm-rust-sim that referenced this pull request Aug 8, 2018

Opaque declarations, and measure and reset operations.
The instructions of QASM are classified in a hierarchy, I've tried to
implement them in the form of several enums but extending enums is not
supported in Rust (yet).

https://stackoverflow.com/questions/25214064/can-i-extend-an-enum-with-additional-values

The hierarchy is mostly related to QuantumOperation and UnitaryOperation.
A UnitaryOperation is always a QuantumOperation but not the other way around.

I went for the wrapping alternative which will make getting the value a bit
tedious. I hope to be able of implementing some traits to ease this task but
without rust-lang/rfcs#1450 it could be difficult.

In reality, this hierarchy is not strictly neccessary, I could have
grouped all the instructions in the same enum and let the grammar accept
those programas that use the different operations correctly, but I wanted
to experiment with a proper separation of instruction types. Let's see.

Perhaps, a different approach would be following this Thin Traits proposal:
http://smallcultfollowing.com/babysteps/blog/2015/10/08/virtual-structs-part-4-extended-enums-and-thin-traits/
@alexreg

This comment has been minimized.

Copy link

alexreg commented Aug 22, 2018

@nrc Any chance of getting this RFC reopened now, and potentially merged soon? I have a couple of things I'm working on now, but I'd be willing to take this on afterwards perhaps.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Aug 22, 2018

@alexreg You'd probably need to ask @nikomatsakis about the status of what'd be needed to implement this.

@alexreg

This comment has been minimized.

Copy link

alexreg commented Aug 22, 2018

Okay, hopefully @nikomatsakis will see this comment and reply here.

@Ericson2314

This comment has been minimized.

Copy link
Contributor

Ericson2314 commented Aug 22, 2018

"Safe unions", non-lexical &move liveness, and non-lexical enum variant type are all related. I'd love to have all 3!

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Aug 22, 2018

I have a question regarding enum variants as types, is this going to be correct?

fn main() {
    use std::mem::size_of;
    assert_eq!(size_of::<Option<u32>>(), 8);
    assert_eq!(size_of::<Some<u32>>(), 4);
}
@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Aug 22, 2018

@leonardo-m I don't think so, since we'd want to allow coercing &Some<u32> to &Option<u32>, which means it must already be the full size of the enum and have the tag set to 1.

@alexreg

This comment has been minimized.

Copy link

alexreg commented Aug 22, 2018

@Ericson2314 Is there a proposal for "safe unions"? I don't know what that means. Unless perhaps it's similar to one of the ideas suggested here. No idea what the other two mean either. :-P

@jeffreydecker

This comment has been minimized.

Copy link

jeffreydecker commented Sep 24, 2018

I would love to see this get in at some point in the future. Allowing for variants to impl traits on their own would be a nice alternative to requiring a match to essentially do the same.

I'm beginning to use enum variants to help represent states in a state machine and this would go a long way in making the experience a bit better, mainly with regard to code organization.

The pattern referenced above by @nielsle is valid and I have tried it out. IMO it's not nearly as readable/understandable or even maintainable as supporting variants as first class types would be. I've actually stuck with using pure enum variants as opposed to that pattern because of this.

@alexreg

This comment has been minimized.

Copy link

alexreg commented Sep 24, 2018

@jeffreydecker I think a bunch of us would like the feature! Fancy writing an RFC? ;-) You could easily get some help from folks here or on Discord, I reckon.

@nrc

This comment has been minimized.

Copy link
Member

nrc commented Sep 24, 2018

The blocking thing for this feature is properly understanding default type parameters and the interaction with the various kinds of fallbacks to defaults (e.g., numeric fallbacks). I think once we understand that a bit better, then we could re-open this RFC (or something like it).

@alexreg

This comment has been minimized.

Copy link

alexreg commented Sep 24, 2018

Ah, fair enough. We already have default type parameters though, so I don't see the problem.

@burdges

This comment has been minimized.

Copy link

burdges commented Sep 25, 2018

There is another concern in that refinement types would ideally extend enum variant types and be useful for formal verification, so an RFC would ideally consider those possible future directions.

@alexreg

This comment has been minimized.

Copy link

alexreg commented Sep 25, 2018

I think that would be coming a very long way off, if ever... probably not worth much thought at this point.

@varkor

This comment has been minimized.

Copy link
Member

varkor commented Nov 10, 2018

I've submitted an RFC to follow up on this one and permit enum variants to be treated as types:
Enum variant types.

@nrc nrc referenced this pull request Nov 10, 2018

Open

Enum variant types #2593

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment