Unify and nest structs and enums #24

Closed
wants to merge 3 commits into from

9 participants

@nrc
Collaborator

Another alternative to RFC #5 and an extension/variant of RFC #11.

Unify enums and structs by allowing enums to have fields, and structs to have
variants. Allow nested enums/structs. Virtual dispatch of methods on struct/enum
pointers. Remove struct variants. Treat enum variants as first class. Possibly
remove nullary structs and tuple structs.

@cmr

@nick29581 you didn't modify the template

@nrc nrc Unify and nest structs and enums
ef2ff45
@nrc
Collaborator

Whoops! Thanks @cmr , fixed now.

@bjz

Not sure if this is the right place to express this, but I really think our sum types should use a different keyword from C-style enums. Perhaps the union keyword?

union Foo<T, U> {
    A(T, U),
    B { t: T },
    C,
}
enum Foo: c_uint {
    A = 0x01,
    B,
    C = 0x10,
    D,
}

Whilst the intention behind using enum was to make C and C++ developers more at home I feel like it causes too much confusion to be worth it. The only precedent I have found is in Haxe. Using the union keyword would at least still retain the 'friendliness factor'.

Below is an exchange on #ada I had a couple of days ago. Whilst I think the person in question was being overly antagonistic and somewhat closed-minded, it is a good example of the constant confusion I am greeted with when outsiders first encounter Rust's enums. I know I was certainly confused at first.

<Lucretia> had a skim through the rust tutorial, not convinced
<bjz> what do you mean?
<Lucretia> enums as Lists? what drugs are they on?
<Lucretia> see the tutorial
<bjz> oh the linked list tutorial
<Lucretia> the way to implement a list with an enum
<bjz> have you used haskell?
<Lucretia> at uni
<bjz> or an ML?
<Lucretia> didn't get it
<Lucretia> ope
<bjz> an enum is a sum type
<Lucretia> was given a haskell tutorial from here
<bjz> or a 'variant type'
<bjz> or tagged union
<bjz> lots of names for it
<bjz> http://en.wikipedia.org/wiki/Tagged_union
<bjz> using enums for lists isn't really idiomatic in Rust
<bjz> but they are very useful for things like abstract syntax trees
<Lucretia> yeah, I wouldn't say an enum is that at all and neither does that link
<bjz> you can think of them as mini dynamic type systems, but you have to check what type it is before you can do operations on them
<bjz> what do you mean?
<Lucretia> you say an enum is a tagged_union, that link does not say that at all, just searched for enum on that page - it uses an enum to determine the type of a variant record
<bjz> yeah, I don't like the use of the 'enum' keyword either
<bjz> but the semantics are the interesting bit
<Lucretia> but to say that an enum and a list go together in the way they do is just wrong
<Lucretia> I remember "Cons" from uni - the term only, the meaning, not at all
<bjz> think of it like this: you can express the semantics of a C or Java enum with rust's enum
<bjz> but you can also express a lot more
<bjz> C/Java's enum is a subset of what you can do with Rust's enum
<Lucretia> but an enumeration is a set of values, that's it, it's not a list, no matter how you twist things, it's just not
<Lucretia> it's an orthogonal and separate concept
<bjz> it is an enumerated set of *types*
<bjz> I would highly recommend learning some haskell
<bjz> it would probably make lots of this stuff more clear
<bjz> lists and trees spring naturally out of sum types
<bjz> (no matter what you call them)
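
The point that lists fall naturally out of sum types can be made concrete with a small sketch. This is written in present-day Rust rather than the 2014 syntax used in this thread, and the names `List`, `sum`, and `from_slice` are made up for illustration:

```rust
// A singly linked list as a sum type: either the empty list, or a value
// followed by the rest of the list.
#[derive(Debug, PartialEq)]
enum List {
    Nil,
    Cons(i32, Box<List>),
}

// Add up the elements by matching on which variant we have.
fn sum(list: &List) -> i32 {
    match list {
        List::Nil => 0,
        List::Cons(head, tail) => *head + sum(tail),
    }
}

// Build a list from a slice, preserving element order.
fn from_slice(items: &[i32]) -> List {
    match items {
        [] => List::Nil,
        [head, rest @ ..] => List::Cons(*head, Box::new(from_slice(rest))),
    }
}
```

A C-style enumeration is the special case in which no variant carries any data, which is the sense in which C/Java enums are a subset of what Rust's enum can express.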

Edit: Perhaps I should make a separate RFC for this - sorry if I am derailing things.

@cmr

On first (and second) read-through, it sits really unwell with me, and the number of open questions is worrying. I feel this is a drastic increase in complexity for a feature that honestly should be rare.

@bjz

@cmr Yeah, I am rather confused by the RFC :/

I understand if structs and enums were unified because structs are basically just single-variant enums, but I am still unclear about the motivation for this specific proposal.

@SimonSapin

@bjz I agree that the name "enum" is not great for full sum types (since almost everybody else is using that name for (roughly) sums of unit types), but that discussion does not belong here. I think you should write another RFC.

@bjz

@SimonSapin Yeah, sorry. Glad to hear that at least somebody feels the same way though.

@nrc
Collaborator

@cmr - yeah, it is a pretty big change, and the change is complex, but I'm not sure it adds complexity - it certainly removes features from the language and in that way removes complexity. Although I can see that it does make enums harder to grok. The number of open questions is indicative that it is a big change and I wanted to get early feedback (I could have sat on this for a week and not had any questions, but I don't think that is a good approach).

I hope that as well as addressing the inheritance motivation we also address the motivation for refinement types, so we are killing more than one bird here. Also, I think that if it is not used a lot, that is more motivation to fit into existing structures, which I think this does (somewhat) rather than adding new structures, even if the new ones are simpler.

@nrc
Collaborator

@bjz the motivation is the same as PR #5 - to efficiently support the DOM and COM-like stuff. I get to that in a very roundabout way, sorry. I hope this is a more Rust-y way to provide a solution than PR #5 is. I.e., I generalise existing features until things work, rather than adding new features.

@cmr
@nrc
Collaborator

@cmr - no, by using nested enums you can specify a subset, albeit only subsets specified when defining the enum, not any subset (so it is not a total replacement, but I hope can be used to satisfy the common case).

@bill-myers

I think this is better than #5, but I must say I like my own #11 better, which shares the core ideas of using enums, having first class variants, etc.

An issue I see in this RFC (and not in #11) is that structs can be both instantiated and inherited. This makes the language less expressive because there is no easy way to distinguish the types "exactly struct S1" and "struct S1 or any derived struct".

My proposal in #11 is to instead make structs non-inheritable, which means that one has to create an enum with an empty variant instead of a struct with a struct variant as in this RFC. That makes it possible to distinguish the two types above naturally (the former is denoted by the empty variant name, and the latter by the enum name).

The other major difference with #11 is that this RFC uses virtual methods, and allows overriding non-abstract methods while #11 exclusively uses traits for inheritance, and only allows overriding abstract methods.

To sum it up, the idea of #11 is that traits can be implemented using the "impl as match" syntax, which means "derive a trait implementation whose methods match on all variants and call the corresponding method in the impl for the variant" (where the match is likely implemented as vtable dispatch) or explicitly, but you cannot implement a trait explicitly if a base enum also implements it explicitly.

I think that leads to a cleaner, easier-to-understand language, because it forces one to give names to sets of virtual methods, unifies virtual dispatch and enum matching, allows external implementations of virtual methods, and, by only allowing abstract functions to be overridden, makes method lookup far simpler.

A key insight in this area is that the compiler can convert a match in the same crate as the type into a virtual method dispatch by extracting match arms into functions and assigning a vtable slot, and vice versa can implement virtual functions by matching on type tags. Aside from external ABI interoperability constraints, which of the two to use is in fact an implementation detail; thus, we should unify those notions.
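
To illustrate the equivalence, here is a minimal sketch in present-day Rust (not the syntax of either RFC). The types `Shape`, `Circle`, `Square` and the trait `Area` are hypothetical; the first half dispatches by matching on the tag, the second half moves each match arm into a trait method reached through a vtable:

```rust
// Closed set of variants: dispatch by matching on the tag.
enum Shape {
    Circle { radius: f64 },
    Square { side: f64 },
}

fn area_by_match(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Square { side } => side * side,
    }
}

// The same behaviour with one type per variant; each match arm above
// becomes a method body that is reached through a vtable slot.
trait Area {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl Area for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Area for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

fn area_by_vtable(shape: &dyn Area) -> f64 {
    shape.area()
}
```

Note that in today's Rust the second form uses a fat `&dyn Area` pointer, so this only shows that the two dispatch styles compute the same thing, not that they have the same representation.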

@nrc
Collaborator

@bill-myers re inheritable structs - you don't need a struct with a struct variant, just a struct - in fact struct variants would disappear. We could tweak this so that structs were not inheritable, but I think there is value in being able to instantiate non-leaf 'classes' in an inheritance hierarchy - we definitely need this for the DOM. In fact, if we take #11, I think we would have to change this.

re virtual methods, again, I think it is a requirement to allow overriding of non-abstract methods. The rest is just a different syntax really. I'm not really sure if involving traits is an advantage or disadvantage - in particular, it is not clear to me where we would get thin pointers and where fat pointers. I think it is important for it to be clear from the syntax when you fall off a fast path.

I agree that match and virtual dispatch are the same from the implementation point of view. I toyed with the idea of only doing dispatch via match, but I think the syntax would be cumbersome. Unless I misunderstand your proposal, #11 does not really unify since you still have separate match statements and impl ... as match? I guess I don't feel too strongly that we should not have two mechanisms here since the use cases are kind of different, but I could be persuaded.

@bill-myers

To put it with an example, regarding struct inheritance:

struct A
{
    struct B {...}
}

Is equivalent to:

enum A
{
    struct AStruct,
    struct B {...}
}

So you don't need to be able to override structs.

But in the former syntax there is no way to distinguish between "exactly A" and "A or B", making the language less expressive.

In the latter syntax, the former is called AStruct, and the latter is called A.

Now of course you could introduce syntax like "&struct A" to make the distinction, but that complicates the language unnecessarily.

That's why I think allowing structs to be inherited is bad.
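
A rough approximation of the second form in present-day Rust, where variants are not first-class types and so each variant wraps a separate struct. The field names and the functions `base_id` and `describe` are invented for the sketch:

```rust
// "Exactly A": the data of the base on its own.
struct AStruct {
    id: u32,
}

// The derived case, which repeats the base field here because today's
// Rust has no field inheritance.
struct B {
    id: u32,
    label: String,
}

// "A or B": the enum names the whole hierarchy.
enum A {
    AStruct(AStruct),
    B(B),
}

// Accepts only the base, never a B.
fn base_id(a: &AStruct) -> u32 {
    a.id
}

// Accepts anything in the hierarchy and inspects the dynamic variant.
fn describe(a: &A) -> String {
    match a {
        A::AStruct(base) => format!("base {}", base.id),
        A::B(derived) => format!("derived {} ({})", derived.id, derived.label),
    }
}
```

The two function signatures carry the distinction: `&AStruct` cannot be handed a `B`, while `&A` can hold either.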

@bill-myers

Regarding overriding methods, the pseudocode:

class A
{
    virtual void foo() {...}    
}

class A1 : A {}
class A2 : A {}

class B : A
{
    override void foo() {...}
}

Is equivalent to:

abstract class A
{
    void a_foo() {...}
    void foo();
}

class AStruct : A
{
    void foo() {a_foo();}
}

class A1 : A {void foo() {a_foo();}}
class A2 : A {void foo() {a_foo();}}

class B : A
{
    void foo() {...}
}

So there is no need to allow overriding non-abstract methods.

In the first snippet, calling "foo" on B could technically refer to both the "foo" on A and the "foo" on B and you now need an explicit notion of virtual dispatch to distinguish between them, while in the second it can only refer to the "foo" on B because the one on "A" is abstract.

The implication here is that a human reader cannot get confused and think that the "foo" on A is being called rather than the "foo" on B, because the one on A is abstract.

Plus, you need an "override" keyword and concept.

Calling the version of foo in A from B is easily done with "a_foo" in the second snippet without needing to introduce "super.foo()" or "A::foo()".

The second snippet is more verbose, but one could add some syntax sugar to make it less verbose (namely, allowing A1 and A2 to be implemented at once).

This is to some extent a matter of taste, but I think the second snippet makes a simpler language and fits more with current Rust.
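
For comparison, the second snippet maps fairly directly onto traits in present-day Rust. This is only a sketch: the trait name `Foo`, the return values, and `call_all` are made up, and the shared behaviour `a_foo` is modelled as a free function:

```rust
// Shared behaviour lives in an ordinary function (the `a_foo` of the
// pseudocode), not in an overridable method.
fn a_foo() -> &'static str {
    "A::foo"
}

// `foo` is abstract: the trait provides no default body.
trait Foo {
    fn foo(&self) -> &'static str;
}

struct AStruct;
struct A1;
struct A2;
struct B;

// Each concrete type opts in to the shared behaviour explicitly.
impl Foo for AStruct {
    fn foo(&self) -> &'static str {
        a_foo()
    }
}

impl Foo for A1 {
    fn foo(&self) -> &'static str {
        a_foo()
    }
}

impl Foo for A2 {
    fn foo(&self) -> &'static str {
        a_foo()
    }
}

// B supplies its own behaviour; there is nothing to "override".
impl Foo for B {
    fn foo(&self) -> &'static str {
        "B::foo"
    }
}

// Call `foo` on every item through dynamic dispatch.
fn call_all(items: &[&dyn Foo]) -> Vec<&'static str> {
    items.iter().map(|item| item.foo()).collect()
}
```

B can still reach the shared behaviour by calling `a_foo()` directly, with no `super` or `A::foo()` syntax involved.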

@bill-myers

Unless I misunderstand your proposal, #11 does not really unify since you still have separate match statements and impl ... as match? I guess I don't feel too strongly that we should not have two mechanisms here since the use cases are kind of different, but I could be persuaded.

"impl as match" is proposed to be syntax sugar for implementing each method by doing a match on all variants and calling the corresponding method, plus the exception allowing you to override a trait implemented as "impl as match".

I must say I don't like the exception, but I'm not sure how to do it otherwise; the idea is that the exception is fine, because "impl as match" guarantees that there is no difference between calling the function on the parent or on the derived class, since the one in the parent just redirects using match to the one on the derived class.

I suppose we could instead specify that the compiler detects when a trait is implemented using a straight redirecting match, and treats it as "impl as match", although that's not so great either.

[of course, the idea is that the compiler then optimizes the matches to use a vtable in most cases]

I'm not really sure if involving traits is an advantage or disadvantage - in particular, it is not clear to me where we would get thin pointers and where fat pointers.

There's no difference: enum pointers are thin, and trait object pointers are fat.

The idea of invoking traits is that "virtual methods" are put in a trait instead, which is separately implemented on each variant, and where the implementation on the enum does "virtual dispatch" to the impl for the variant corresponding to the dynamic type (either as a built-in language concept of virtual dispatch, or using "impl as match" syntax sugar).

This makes it possible to give a name to sets of virtual methods that must be implemented or overridden together, and makes it naturally possible to define things like "impl as match" that would otherwise have to take raw method names.
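
The thin/fat distinction can be checked in present-day Rust with `std::mem::size_of`. A small sketch, with a hypothetical enum `Node` and trait `Visit`:

```rust
use std::mem::size_of;

// Hypothetical closed hierarchy modelled as an enum.
enum Node {
    Leaf(u8),
    Branch(Box<Node>, Box<Node>),
}

// Hypothetical trait standing in for a named set of "virtual methods".
trait Visit {
    fn leaves(&self) -> usize;
}

// Implemented on the enum by matching on the variant.
impl Visit for Node {
    fn leaves(&self) -> usize {
        match self {
            Node::Leaf(_) => 1,
            Node::Branch(left, right) => left.leaves() + right.leaves(),
        }
    }
}

// Returns (width of `&Node`, width of `&dyn Visit`) in bytes.
fn pointer_widths() -> (usize, usize) {
    (size_of::<&Node>(), size_of::<&dyn Visit>())
}
```

On current compilers the first width is one machine word and the second is two (data pointer plus vtable pointer).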

@bill-myers

Here is a motivating example for forcing virtual methods to be in traits and not allowing them to be overridden.

Let's say you have a web browser with an object hierarchy that supports renderToOpenGL and renderToPixmap, which are supposed to render the same image, but one as an OpenGL texture, and the other as an array of bytes.

You are currently printing by printing the pixmap, but that sucks, so you add a renderToPostscript function, hook it so the postscript is sent to the printer, and implement it for a base class.

With this RFC, or if you were using C++ or Java, your application now compiles, but it is totally broken, because you forgot to implement renderToPostscript for derived classes, so printing a document now no longer looks the same as the on-screen document (since you are instead overriding renderToPixmap).

If instead one were forced to put those methods in a Render trait, then you will be immediately faced with the prospect of changing a trait, and if you do so, all impls will fail to compile until you provide an implementation of the new method.

Let's say you decide instead to add a new RenderToPostscript trait and implement it for the base class.

If overriding trait impls is allowed, then your application will once again compile, and once again be totally broken, since you forgot to implement it for derived classes.

If overriding concrete impls is not allowed, then your implementation will only be for one concrete class, and your program will not compile because you forgot to implement it for the other concrete classes.
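
A sketch of the trait-based version in present-day Rust. The method names are the snake_case equivalents of the ones above; `Paragraph`, `Image`, `print_document`, and the PostScript-like output strings are invented for illustration:

```rust
// All rendering entry points live in one trait, so adding a method here
// breaks every impl until it is filled in.
trait Render {
    fn render_to_pixmap(&self) -> Vec<u8>;
    fn render_to_postscript(&self) -> String;
}

struct Paragraph {
    text: String,
}

struct Image {
    pixels: Vec<u8>,
}

impl Render for Paragraph {
    fn render_to_pixmap(&self) -> Vec<u8> {
        self.text.bytes().collect()
    }

    // Deleting this method is a compile error (E0046), not a silent
    // fallback to some base-class behaviour.
    fn render_to_postscript(&self) -> String {
        format!("({}) show", self.text)
    }
}

impl Render for Image {
    fn render_to_pixmap(&self) -> Vec<u8> {
        self.pixels.clone()
    }

    fn render_to_postscript(&self) -> String {
        format!("% image of {} bytes", self.pixels.len())
    }
}

// Print a document by rendering every node to PostScript.
fn print_document(nodes: &[&dyn Render]) -> String {
    nodes
        .iter()
        .map(|node| node.render_to_postscript())
        .collect::<Vec<_>>()
        .join("\n")
}
```

Because the trait has no default bodies, a type that forgets `render_to_postscript` is rejected at compile time instead of printing something that differs from the screen.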

0000-enum-struct.md
((350 lines not shown))
+virtual call and is only necessary if the static type implements `Drop`.
+
+## Initialisers
+
+Need to think a bit about struct initialisers. We should require all fields to
+be specified. We should support constructors too. I'm not sure how we support
+'struct' initialisers for enums - which should not be instantiable. Since there
+is no kind of cross-module inheritance, perhaps it is not an issue since fields
+can always be accessed.
+
+## Calling overridden methods
+
+If a method is overridden, we should still be able to call it. C++ uses `::`
+syntax to allow this. In the example above we use `Foo::bar(self)` to indicate
+static dispatch of an overridden method. I'm not sure if this is currently
+valid Rust or if it is the optimal tsolution. But it looks nice to me and we
@jaredly
jaredly added a note

typo tsolution

@nrc
Collaborator
nrc commented

Hmm, I fear I have missed something here - what I propose adds behaviour for pointer-to-enum that did not previously exist, but I neglected the enum value case. We certainly don't want all the variants of an enum like this to be the same size, so then matching on an enum value could not be supported. That seems bad. I'm not sure if there is a solution. I guess that is a distinction between enums and inheritance, and perhaps makes me feel less bad about the duplication of behaviour there with an approach like #5.

@nikomatsakis
Owner

@nick29581 I haven't followed the entire conversation. I had a hard time understanding the proposal, I fear, but my biggest fear was precisely what you seemed to be hitting on here -- I didn't quite get how the by-value enum case fits in.

@esummers

We certainly don't want all the variants of an enum like this to be the same size, so then matching on an enum value could not be supported.

@nick29581 Maybe enums could be fixed size or unsized based on context. A pointer to an enum that doesn't allow the variant to be changed could be unsized. All other instances could be sized. Basically the variant tag is immutable, but the fields inside could be mutable or immutable.

@dobkeratops

Interesting to see all this.

[1] would any of this facilitate a future enhancement (or unsafe hacks) where immutable enums could be compacted (the tag implies the size of the type, by type-specific lookup; different sized variants can be placed back to back, eg tree leaves and nodes , reducing the number of pointers required to do that sort of thing). of course i can do that now in C or in rust unsafe code.

[2] if you lose 'tuple-structs' (i don't mind, tuples and real structs are more useful), are the enum variants still going to be able to look the same .. a tag/variant name and a tuple.. i think those are very handy, even though i haven't wanted to use 'tuple-structs' elsewhere

[3] are you going to be declaring actual virtual functions like in C++ classes? it looked to me like you could keep the idea of traits describing vtables, and just use inheritance of structs to say a trait can assume some fields (and vice versa, perhaps inherit or embed the trait in a struct and it would check what's compatible)

[4] I liked the idea of keeping vtables more general, eg adding sugar for accessing them and composing with a struct pointer for a specific call - allowing for layouts and uses beyond what's been formalized in various languages (like 'class-objects' that hold a vtable and metadata applied to a collection of other objects, or using vtable swaps as state-machines.. that could all be done safely if you had proper types for them). I was pleased with the hacks one seemed to be able to do already with transmute. (i saw eddyb's many and had a bash at emulating c++ layout myself)

i'm definitely keen on plain struct single inheritance, that's just shortening the paths to the most common data.

@nrc
Collaborator
nrc commented

@esummers I don't think that addresses the issue - the problem is with values only (pointer-to-variant is not really an issue). The problem is that some variants might be very small and some very large and we don't want to pad the smaller ones to the size of the larger ones. We must have a size for values to be able to compile, so unsized there isn't an option.

@nrc
Collaborator
nrc commented

@dobkeratops 1 - I don't think this would facilitate that, but we would probably need something like that to enable this, that is the downside I noted a few comments up and which I didn't think about initially.

2 - yes, enum variants could still be a name + tuple combo. Struct variants could still be used, they would just be the same as regular structs, so it's not that the idea disappears, only that it is redundant. The syntactic change to a program would just be adding the struct keyword.

3 - Yes, we add virtual fns to impls. We explicitly wanted to avoid traits for this because using traits requires a fat pointer and we want thin ones here. Having this as an optimisation is against Rust's guiding principle of predictable performance. Also, having fields in traits (in any way) further blurs the distinction between data and behaviour. Since we already allow functions (behaviour) for impls for structs, we don't make things worse this way.

4 - this is probably a matter of taste. It is certainly flexible and in some ways elegant. But I am not a fan, I prefer a language to be easier to use and present abstractions for that kind of thing. Having to use unsafe code/transmute for a relatively common and safe use case seems bad to me. It's not clear to me if you can guarantee the performance characteristics we require that way either, but perhaps you can.

@dobkeratops

4- well with just a bit of safe sugar - it wouldn't be an unsafe codepath to do this. Some intrinsic functions..

fn vtable<St,Tr>() -> VTable<St,Tr>,
fn as_trait_obj<St,Tr>(s:&St,v:&VTable<St,Tr>)->&Tr   // St=Struct, Tr=Trait

maybe sleeker syntax is possible: (... for ..) gets the vtable - symmetrical with 'impl for..' .... and could any tuple (&St, &VTable) be vcallable, '&Trait' is just something that coerces to..
https://gist.github.com/dobkeratops/9841737 this was my own hack to make C++ style vtables as it stands now, (i'm sure others have done similar and there might be better ways to do it)

I would see adding this type of sugar as leveraging more of Rust's existing character rather than retrofitting a completely different vtable system centred on structs

@esummers

The problem is that some variants might be very small and some very large and we don't want to pad the smaller ones to the size of the larger ones. We must have a size for values to be able to compile, so unsized there isn't an option.

@nick29581 I guess I didn't really mean unsized. I meant sized to the variant instead of sized to the enum. A pointer to an enum could be sized to the variant and everything else sized with padding to the largest variant. I think that once we just have a reference we don't care about the other variants because we can never become one of those. Basically the size is statically determined when it is constructed based on the size of the variant (but only when it is a pointer). When using virtual inheritance, you will always pass by reference.

Maybe I have a flaw in my reasoning somewhere, but I mean sized to variant when passed by reference.

EDIT: I was assuming heap allocations when using virtual inheritance (so size on stack doesn't matter), but maybe that is a bad assumption.

@nrc
Collaborator
nrc commented

@esummers - the problematic case is given an enum E and a function fn(x: E) you need to know how much space to leave for x on the stack. We solve this at the moment by leaving the same amount of space for every variant, that is the maximum amount. But if we want to allocate lots of the small variants and they are all padded to the size of the largest variant, then that is a waste. You can't size items differently if they are values or pointers-to-values, since you can dereference the pointer to get a value and then you need the size to be the same as if you had started with a value.
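
The padding cost is easy to observe in present-day Rust. In this sketch the enum names and the 1024-byte payload are arbitrary, and the boxed variant is just today's usual workaround, not something proposed in this thread:

```rust
use std::mem::size_of;

// Every value of this enum reserves room for the largest variant, so a
// `Small` occupies as much space as a `Large`.
#[allow(dead_code)]
enum Padded {
    Small(u8),
    Large([u8; 1024]),
}

// Moving the large payload behind a pointer keeps the enum itself small,
// at the price of a heap allocation and an extra indirection.
#[allow(dead_code)]
enum Boxed {
    Small(u8),
    Large(Box<[u8; 1024]>),
}

// Returns (size of `Padded`, size of `Boxed`) in bytes.
fn enum_sizes() -> (usize, usize) {
    (size_of::<Padded>(), size_of::<Boxed>())
}
```

The first size is at least 1024 bytes regardless of which variant a value holds, which is the waste being described for hierarchies with a few very large members.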

@nrc
Collaborator
nrc commented

@dobkeratops I'm afraid this is just going to come down to taste. You are right that we can avoid adding a language feature this way, but I don't think it is worth it in exchange for lots of ugly boilerplate all over the place.

If I understand your example correctly, you are still passing a tuple - so it is two words per pointer, not one.

@dobkeratops

for lots of ugly boilerplate
Something similar could be given in standard library code - and other features(vtable sugar/single inheritance) would streamline them

If I understand your example correctly, you are still passing a tuple
not quite: - in my example, the structure layout is the same as a C++ class: the vtable is "hacked out" by cast::transmute in make_class! and stored in the member 'vtable'. You reference the whole with one pointer.

then a temporary 'trait object' is created for a vcall, by the member function '.as_trait_obj()'. I'm assuming that will inline. (I should add #[inline]). TBAA would cover opt.
That could be 'deref' to streamline the calls visually, but i'd already used that for field access (fields are behind '.data')

eddyb's sample is more interesting, it creates a type "Many" along similar lines that has multiple vtable interfaces carried for one struct .. mine could be seen as a special case of that.

Well I don't know what's easier to implement in the compiler, or what would get more demand. I guess people are familiar with C++ behaviour, and virtuals, single-inheritance + traits wouldn't be so different to virtuals + multiple-inheritance... but this method would keep one vtable concept and make it more versatile

@bill-myers

Regarding sizing, the simplest and default approach should be to have a fixed size like current enums (and thus pad the smaller variants).

As an extension, one can add an "unsized" keyword that makes enums unsized (which of course requires DST to have been implemented first).

However, note that with unsized enums, you must either disallow assigning to an &mut or ~ of an enum, or throw a run-time error if the run-time variant is different (since assignment is impossible if the new variant is of a different size).

This is the same restriction that languages like Java or C# have (note that Java or C# also disallow assigning non-overridable classes, which is unnecessary and a bad idea in Rust).

You can allow unsized enums to be passed by value by padding them, if inheritance is closed; if inheritance is open, then you cannot pass them by value (unless you autobox, but I guess we don't want that).

@nrc
Collaborator
nrc commented

Ah, that might be nice. I think we would indeed prevent dereferencing of DST and pointers to struct objects, so that side of things would all work. We would just need to add the keyword as you suggest to indicate the unsized-ness and forbid referring to such values by their enum (as opposed to variant) type.

Padding (even for closed inheritance, as here) is a non-starter in general, since some variants might be hugely bigger than others (e.g, in the DOM).

@nrc
Collaborator
nrc commented

As a note (which I'll incorporate into the RFC later), JDM pointed out that having all 'classes' in one lexically nested block is impractical. We also need to allow specifying 'classes' in sub-modules (so they can be in different files). Both problems are solvable, but need to be addressed.

@nrc
Collaborator

Superseded by #142

@nrc nrc closed this
Commits on Mar 30, 2014
  1. @nrc

    Unify and nest structs and enums

    nrc authored
Commits on Apr 1, 2014
  1. @nrc

    Added JDM's example

    nrc authored
  2. @nrc

    fix a typo

    nrc authored
Showing with 465 additions and 0 deletions.
  1. +465 −0 0000-enum-struct.md
@@ -0,0 +1,465 @@
+- Start Date: 2014-03-31
+- RFC PR #:
+- Rust Issue #:
+
+
+# Summary
+
+Unify enums and structs by allowing enums to have fields, and structs to have
+variants. Allow nested enums/structs. Virtual dispatch of methods on struct/enum
+pointers. Remove struct variants. Treat enum variants as first class. Possibly
+remove nullary structs and tuple structs.
+
+The motivation for this is to provide an alternative to Java-style single
+inheritance. I.e., efficient sharing of fields, thin pointers, and virtual
+method dispatch. Along the way we simplify the language by unifying two language
+items and making obsolete a few more.
+
+Despite being a fairly radical proposal, I believe this is mostly backwards
+compatible.
+
+# Motivation
+
+Supporting efficient, heterogeneous data structures such as the DOM. Precisely
+we need a form of code sharing which satisfies the following constraints:
+
+* Cheap field access from internal methods;
+* Cheap dynamic dispatch of methods;
+* Cheap downcasting;
+* Thin pointers;
+* Sharing of fields and methods between definitions;
+* Safe, i.e., doesn't require a bunch of transmutes or other unsafe code to be usable.
+
+Example (Java-like pseudo-code):
+
+```
+class Element {
+ Element parent, left-sibling, right-sibling;
+ Element[] children;
+
+ foo();
+
+ template() {
+ x = foo();
+ ...
+ }
+}
+
+class Element1 : Element {
+ Data some-data;
+
+ template() {
+ return some-data;
+ }
+}
+
+final class Element2 : Element {
+ ...
+}
+```
+
+
+# Detailed design
+
+## Extend enums with fields
+
+For example,
+
+```
+enum E {
+ f1: T1,
+ Var1(T5, T6),
+ f2: T2
+}
+```
+
+## Extend structs with Variants
+
+For example,
+
+```
+struct S {
+ f3: T3,
+ Var2(T5, T6),
+ f4: T4
+}
+```
+
+## I.e., unify structs and enums
+
+With the above extensions, enums and structs are basically the same (they have
+the same syntax (modulo the keyword), we would allow the same type
+parameterisation, etc.). The difference is that only structs can be instantiated
+(as opposed to one of the variants; you could think of enums being abstract
+structs). So we could have values of `Var1` and `Var2` (but not `f1`, etc.), and
+`S`, but not `E`. When instantiating `S`, we must specify values for fields `f3`
+and `f4`. Values of `Var1` have named fields `f1` and `f2` and unnamed fields of
+types `T5` and `T6`, all must be specified when instantiating `Var1` (questions
+- what should the syntax look like? How do we specify constructors for the `E`
+part?).
+
+## Allow nested enums/structs
+
+For example,
+
+```
+enum E1 {
+ enum E2 {
+ ...
+ }
+ struct S1 {
+ enum E3 { ... }
+ struct S2 { ... }
+ }
+}
+```
+
+Nesting does not introduce a scope, so from the same scope as `E1` is declared,
+we can refer to `E1`, `E2`, `S1`, `E3`, and `S2` (modulo privacy, see open
+questions). Nested items inherit fields from outer items. So, `S2` would inherit
+fields declared in `E1` and `S1`.
+
+
+## Treat variants as 'first class'
+
+As well as instantiating variants, we allow the use of variants (whether
+structs, enums, tuples, or nullary) as types and allow impls for them. In
+combination with nested enums this is a partial replacement for 'refinement'
+types (that is, specifying a type on a subset of the variants of an enum).
+However, this is not the main motivation. The idea is that a variant (probably a
+struct variant) will replace a base class in a class hierarchy; an enum would
+replace an abstract base class and a struct would replace a non-abstract base
+class or leaf (concrete) class. Making variants first class makes it possible to
+refer to enum/struct objects other than the top level by type, and to provide
+methods for them in impls.
+
+
+## Virtual dispatch of methods for struct/enum objects
+
+We allow methods in impls for struct/enum objects (that is, references to
+struct/enum types) to be marked as `virtual` (allows overriding) and/or
+`override` (overrides a method). Methods declared on outer items are inherited
+by nested items. E.g., from the example above, a method declared on `E1` will
+be inherited by `E2` and `S2` (and others). If a method is declared `virtual`,
+then impls for nested items may override that method. If and only if a method is
+marked `override` then it must override a method declared in an outer item.
+Methods for enums may be declared without a body (as pure abstract/virtual
+methods in Java/C++ or required methods on traits). These must be overridden by
+any non-enum nested items. (Question - should we extend this to structs - i.e.,
+allow pure virtual methods for structs and track these and not allow
+instantiation of such structs?).
+
+
+## V-tables, thin pointers, and down-casting
+
+Struct/enum objects are referred to using thin pointers. Virtual dispatch is
+implemented using Java-style (or C++ with virtual single inheritance and without
+multiple inheritance) v-tables. That is, `&S1` or `~S1` is implemented as a
+pointer to a structure consisting of a pointer into a v-table (which identifies
+the dynamic type) and values for all fields of the dynamic type. Method call is
+implemented via the v-table. Since we identify the dynamic type, we can allow
+safe dynamic downcasting. This can be done by a match statement, continuing the
+example above:
+
+```
+fn f(x: &E1) {
+ match x {
+ y @ &S2 {..} => { ... } // y is effectively a downcast of x to S2
+ y @ &S1 {..} => { ... } // y is effectively a downcast of x to S1
+ _ => { ... } // x isn't an instance of S1 or S2
+ }
+}
+```
+
+We would allow the usual pattern matching too.
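
For comparison, here is a rough sketch of how the same three-way downcast can be approximated in present-day Rust, where the nesting has to be emulated by wrapping one type in another. The names `E1`, `E2`, `S1`, and `S2` follow the example above; the helper enums `E1Kind` and `S1Kind`, the fields `id`, `width`, and `depth`, and the function `describe` are hypothetical, and `E3` is left out for brevity.

```rust
// Fields declared on the outer item are shared by everything nested in it.
struct E1 {
    id: u32,
    kind: E1Kind,
}

enum E1Kind {
    E2,
    S1(S1),
}

struct S1 {
    width: u32,
    kind: S1Kind,
}

enum S1Kind {
    Plain, // the object is exactly an `S1`
    S2(S2),
}

struct S2 {
    depth: u32,
}

// Mirrors `f` from the text: the most specific arm comes first, and each
// arm can see the fields of every enclosing item.
fn describe(x: &E1) -> String {
    match x {
        E1 { id, kind: E1Kind::S1(S1 { width, kind: S1Kind::S2(s2) }) } => {
            format!("S2 #{}: width {}, depth {}", id, width, s2.depth)
        }
        E1 { id, kind: E1Kind::S1(s1) } => format!("S1 #{}: width {}", id, s1.width),
        E1 { id, .. } => format!("neither S1 nor S2 (#{})", id),
    }
}
```

The match itself reads much like the proposed one, but constructing a value requires spelling out every wrapper, which is part of what the nested syntax would avoid.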
+
+Question - might be handy to allow skipping the `{..}` for structs, then again,
+hopefully downcasting won't be commonly used so maybe we don't need to.
+
+
+## Remove struct variants
+
+Unification of structs and enums makes struct variants obsolete. For example,
+
+```
+enum E {
+ Variant1{f: T}
+}
+```
+
+can be written as
+
+```
+enum E {
+ struct Variant1{f: T}
+}
+
+```
+
+Therefore, we may as well remove struct variants (they are currently
+feature-gated).
+
+
+## Coercions (subtyping)
+
+Nesting of enums/structs should give (probably implicit) coercions of
+references. E.g., (again, from the above example), `&S2` <: `&S1` <: `&E1`.
+There is no subtyping between values, to avoid the slicing problem (er, is this
+right? Or my imagination? I think we run into problems when virtual dispatch is
+expected but cannot be done safely, but I probably need to think more about
+this).
+
+We should forbid dereference of pointers to non-leaf items. This is not
+backwards compatible, since for a non-nested enum (as currently present in the
+language), we would allow dereference of references to such enums. We could
+safely allow dereference inside a match expression (as in the downcast example,
+above) and hopefully that covers most of the current use cases. This would need
+a bit of investigation.
+
+
+# Example
+
+The first example in Java-ish syntax would be written as:
+
+```
+enum Element {
+ parent: RC<Element>,
+ children: ~[RC<Element>],
+ left: RC<Element>,
+ right: RC<Element>,
+
+ struct Element1 {
+ x: uint,
+ y: uint,
+ },
+
+ struct Element2 {
+ x: uint,
+ y: uint,
+ }
+}
+
+impl Element {
+ virtual fn foo(&self) -> uint;
+
+ fn template(&self) {
+ let x = self.foo();
+ ...
+ }
+}
+
+impl Element1 {
+ override fn foo(&self) -> uint { self.x + self.y }
+}
+
+impl Element2 {
+ override fn foo(&self) -> uint { self.x + self.y }
+}
+```
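
As a point of reference, the template-method shape of this example can be sketched in present-day Rust with a trait: a required method stands in for the body-less `virtual fn foo`, and a provided method stands in for `template`. This is only an approximation under stated assumptions: the trait name `ElementMethods`, the function `run_all`, and the body of `template` are hypothetical, `Element2::foo` is changed to a product so the two overrides are distinguishable, and the shared fields of `Element` are omitted.

```rust
struct Element1 {
    x: u32,
    y: u32,
}

struct Element2 {
    x: u32,
    y: u32,
}

trait ElementMethods {
    // Plays the role of `virtual fn foo(&self) -> uint;` (no body).
    fn foo(&self) -> u32;

    // Plays the role of `template`: shared code calling the overridable method.
    fn template(&self) -> u32 {
        self.foo() * 2
    }
}

impl ElementMethods for Element1 {
    fn foo(&self) -> u32 {
        self.x + self.y
    }
}

impl ElementMethods for Element2 {
    fn foo(&self) -> u32 {
        self.x * self.y
    }
}

// Dynamic dispatch goes through trait objects here. A `Box<dyn Trait>` is a
// fat pointer (data pointer plus v-table pointer), unlike the thin pointers
// this RFC proposes.
fn run_all(elements: &[Box<dyn ElementMethods>]) -> Vec<u32> {
    elements.iter().map(|e| e.template()).collect()
}
```

What the trait version cannot express is the shared data: `parent`, `children`, `left`, and `right` would have to be repeated in each struct or reached through accessor methods.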
+
+None of this prevents the usual use of traits and impls, which hopefully are an
+alternative to multiple inheritance. For example, `nsIConstraintValidation` is a
+mixin class in the Gecko DOM implementation. It could be implemented in Rust
+as something like:
+
+```
+impl Element {
+ virtual fn bar(&self) -> uint;
+}
+
+trait NSICompositor {
+ fn x(&self) -> uint;
+ fn y(&self) -> uint;
+ fn bar(&self) -> uint { self.x() + self.y() }
+}
+
+impl NSICompositor for Element1 {
+ fn x(&self) -> uint { self.x }
+ fn y(&self) -> uint { self.y }
+}
+
+impl Element1 {
+ override fn bar(&self) -> uint { NSICompositor::bar(self) }
+}
+
+impl NSICompositor for Element2 {
+ fn x(&self) -> uint { self.x }
+ fn y(&self) -> uint { self.y }
+}
+
+impl Element2 {
+ override fn bar(&self) -> uint { NSICompositor::bar(self) }
+}
+```
+
+# JDM's example
+
+From https://gist.github.com/jdm/9900569
+
+```
+struct Node {
+ parent: Rc<Node>,
+ first_child: Rc<Node>,
+
+ struct TextNode {},
+
+ enum Element {
+ attrs: HashMap<str, str>,
+
+ struct HTMLImageElement {},
+
+ struct HTMLVideoElement {
+ cross_origin: bool
+ }
+ }
+}
+
+impl Element {
+ fn set_attribute(&mut self, key: &str, value: &str)
+ {
+ self.before_set_attr(key, value);
+ //...update attrs...
+ self.after_set_attr(key, value);
+ }
+
+ virtual fn before_set_attr(&mut self, key: &str, value: &str);
+ virtual fn after_set_attr(&mut self, key: &str, value: &str);
+}
+
+impl HTMLImageElement {
+ override fn before_set_attr(&mut self, key: &str, value: &str)
+ {
+ if (key == "src") {
+ //..remove cached image with url |value|...
+ }
+ Element::before_set_attr(self, key, value);
+ }
+}
+
+impl HTMLVideoElement {
+ override fn after_set_attr(&mut self, key: &str, value: &str)
+ {
+ if (key == "crossOrigin") {
+ self.cross_origin = value == "true";
+ }
+ Element::after_set_attr(self, key, value);
+ }
+}
+
+fn process_any_element(element: &Element) {
+ // ...
+}
+
+fn foo() {
+ let videoElement: Rc<HTMLVideoElement> = ...;
+ process_any_element(videoElement);
+
+ let node = videoElement.first_child;
+
+ match node {
+ element @ Rc(Element{..}) => { ... }
+ _ => {
+ let text = match node {
+ text @ Rc(TextNode {..}) => Some(text),
+ _ => None,
+ }
+ }
+ }
+}
+```
+
+
+# Alternatives
+
+RFC 5 - virtual structs
+
+RFC 11 - Alternative to virtual struct and functions by extending enums
+
+RFC 9 - RFC for "fat objects" for DSTs
+
+There's also a version of RFC 5 using macros etc. to add fewer language features.
+
+
+# Unresolved questions
+
+## Trait methods
+
+I think requiring indication of overridable and overriding methods is a good
+thing (both Java and C++ have keywords or annotations for this). However, we
+don't require them for methods in traits - should we? Or should we not require
+them for structs/enums for consistency? If we do want them for traits should
+they be in the trait or the impl? Trait seems to make more sense, but impl is
+what I propose here for structs/enums. I would like to have a consistent story
+here.
+
+
+## Remove tuple structs, nullary structs
+
+Unifying structs and enums and making variants first class makes tuple structs
+and nullary (empty) structs obsolete. They can be replaced by an enum with a single tuple
+variant or a single nullary variant, respectively. By combining with privacy
+annotations we might get a nice separation between interface and implementation.
+On the other hand it requires an extra name (maybe we should allow anonymous
+enums?) and a bit more syntax. One use case for tuple structs is new types. Not
+sure if the interface/implementation separation helps there or whether the extra
+`enum` keyword, name, and braces are just extra boilerplate. I think removing
+some language items would be nice.
+
+
+## Privacy
+
+I think all fields should be private by default on enums and structs, and
+variants should be public. We should allow `pub` and `priv` annotations to
+change these defaults. But we need to think about this a bit more deeply.
+
+
+## Destructors
+
+How should they work? I feel the C++ approach is too much of a foot gun. We
+should always be able to infer whether or not a destructor is virtual. Need to
+work out how exactly implementing the drop trait interacts with nested enums. We
+need to cope with the situation where a struct/enum object with static type T1
+and dynamic type T2 goes out of scope and T2 implements `Drop` and T1 doesn't -
+we still need to call T2::drop (and then call the destructors of any types
+between T2 and T1). One solution could be that if a struct implements `Drop`
+then so must the outer struct/enum. Calling `drop` is then just a regular
+virtual call and is only necessary if the static type implements `Drop`.
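
For reference, trait objects in present-day Rust already face an analogous situation: the static type (`dyn Node` below) has no `Drop` impl of its own, yet dropping the box must run the destructor of the dynamic type. The sketch below records the order in which destructors run. All names in it (`Node`, `Outer`, `Inner`, `drop_through_trait_object`) are hypothetical, and composition stands in for nesting, so it illustrates the behaviour one might want rather than the mechanism proposed here.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

trait Node {}

struct Inner {
    log: Log,
}

struct Outer {
    log: Log,
    inner: Inner,
}

impl Drop for Inner {
    fn drop(&mut self) {
        self.log.borrow_mut().push("Inner::drop");
    }
}

impl Drop for Outer {
    fn drop(&mut self) {
        self.log.borrow_mut().push("Outer::drop");
    }
}

impl Node for Outer {}

// Drops an `Outer` through a `Box<dyn Node>` and returns the destructor order.
fn drop_through_trait_object() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let node: Box<dyn Node> = Box::new(Outer {
        log: log.clone(),
        inner: Inner { log: log.clone() },
    });
    // The static type is `dyn Node`; the destructor is found via the v-table.
    drop(node);
    let entries = log.borrow().clone();
    entries
}
```

The destructor of the dynamic type runs first and the destructors of its fields follow, with no requirement that the static type implement `Drop`.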
+
+## Initialisers
+
+Need to think a bit about struct initialisers. We should require all fields to
+be specified. We should support constructors too. I'm not sure how we support
+'struct' initialisers for enums - which should not be instantiable. Since there
+is no kind of cross-module inheritance, perhaps it is not an issue since fields
+can always be accessed.
+
+## Calling overridden methods
+
+If a method is overridden, we should still be able to call it. C++ uses `::`
+syntax to allow this. In the examples above we use the same form (e.g.,
+`Element::before_set_attr(self, key, value)`) to indicate static dispatch of an
+overridden method. I'm not sure if this is currently valid Rust or if it is the
+optimal solution. But it looks nice to me and we need something for such a
+situation.
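
In present-day Rust, path syntax does select a specific implementation statically, at least for trait methods. The sketch below shows an inherent method that shadows a trait method of the same name while the trait version stays reachable by path. The names `Compositor` and `both` and the `+ 100` offset are hypothetical, chosen only to make the two calls distinguishable.

```rust
struct Element1 {
    x: u32,
    y: u32,
}

trait Compositor {
    fn bar(&self) -> u32;
}

impl Compositor for Element1 {
    fn bar(&self) -> u32 {
        self.x + self.y
    }
}

impl Element1 {
    // An inherent method takes priority over a trait method of the same name
    // for `value.bar()` calls; the trait version is still reachable by path.
    fn bar(&self) -> u32 {
        Compositor::bar(self) + 100
    }
}

// Returns the result of the inherent method and of the trait method.
fn both(e: &Element1) -> (u32, u32) {
    (e.bar(), Compositor::bar(e))
}
```

This covers only the trait case; whether the same path form should name an outer item's method under this proposal is the open question.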
+
+## Generics
+
+Not sure exactly how generics would work right now. I assume generics in outer
+items are available (and not overridable/shadowable) in inner items. All actual
+type parameters must be specified or inferred when an item is instantiated or
+used for a type (which is a little counter-intuitive). E.g.,
+
+```
+struct S1<X> {
+ struct S2<Y> {
+ ...
+ }
+}
+```
+
+When we use `S2` we would have to use `S2<T1, T2>`. Or perhaps we should
+require at least as many type variables in inner items as in outer items,
+substitute them implicitly, and make outer type variables unavailable inside
+inner items (i.e., in the example above, `X` and `Y` are implicitly linked and
+`X` can't be used inside `S2`; we would use `S2<T>`). Or perhaps we should make
+the substitution explicit somehow (this would be my preferred solution, but I'm
+not sure how to express it).