Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Using enums like traits #2618

Closed
wants to merge 8 commits into from
Closed

Conversation

porky11
Copy link

@porky11 porky11 commented Dec 25, 2018

@Centril Centril added T-lang Relevant to the language team, which will review and decide on the RFC. A-syntax Syntax related proposals & ideas A-data-types RFCs about data-types A-traits Trait system related proposals & ideas A-sum-types Sum types related proposals. A-typesystem Type system related proposals & ideas A-machine Proposals relating to Rust's abstract machine. labels Dec 26, 2018
@SimonSapin
Copy link
Contributor

Now enums implicitly generate structs [for each variant] like this:

This should probably be in the summary.


This RFC seems similar to #1450.


This new approach will not create enum types just to match them directly afterwards, when it's still known, and removes a little unnecessary overhead.

But this comes with problems. When adding a new event type, it has to be added in many places instead of just once in the enum declaration and once in the match. The match even will care if you forget the branch. So this small performance benefit is most likely not worth it.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The match even will care if you forget the branch" what does this mean?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

match in rust generates a warning, when the match is not exhausting, so when adding a new variant to the enum, you won't forget to add this branch in match. That's nothing new, but an advantage opposed to the more efficient approach I mentioned.
Maybe I should reformulate it this.

text/0000-impl-enum.md Outdated Show resolved Hide resolved

```rust
trait Trait {
fn test<T: Enum>(&self, arg: Enum);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be arg: T? Doesn't make sense otherwise.

text/0000-impl-enum.md Outdated Show resolved Hide resolved
Enum types can just be used like trait types now:

* It's preferred to add a `dyn` modifier, when using an enum directly.
* It's also possible to use enums for generics additional to traits.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this exactly mean?

@clarfonthey
Copy link
Contributor

This basically seems like using enums like sealed traits. All of the prior art regarding sealed traits, and sealed traits as an alternative, should probably be included in the RFC. Alongside the prior art from Kotlin, which allows sealed classes natively in the language.

@L-as
Copy link

L-as commented Dec 26, 2018

I think this is a nice concept, but some things are still unclear to me, and I wish the reference-level explanation were fleshed out. I can see that you think the guide-level explanation is clear enough, but I can't agree; some things are still a bit unclear to me, such as how matching works, what the deal is with using enums as associated types, etc.
The reference-level explanation is explained as such in the template file in this repository:

This is the technical portion of the RFC. Explain the design in sufficient detail that:

    Its interaction with other features is clear.
    It is reasonably clear how the feature would be implemented.
    Corner cases are dissected by example.

The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work.

Also, if I understand it correctly, and you want to be able add associated enums to traits, I do not believe such a feature should be in this RFC, but instead in another RFC. In addition, I don't believe associated enums is a feature that should even be added, since it doesn't make much sense; it would be much better if traits could have associated traits (and if traits could also be generic arguments, like lifetimes and types and now constants too, but that may also be another RFC), since associated traits would also work with enums with this proposal.

@porky11
Copy link
Author

porky11 commented Dec 26, 2018

This basically seems like using enums like sealed traits. All of the prior art regarding sealed traits, and sealed traits as an alternative, should probably be included in the RFC. Alongside the prior art from Kotlin, which allows sealed classes natively in the language.

I have to look at this.

Also, if I understand it correctly, and you want to be able add associated enums to traits, I do not believe such a feature should be in this RFC, but instead in another RFC. In addition, I don't believe associated enums is a feature that should even be added, since it doesn't make much sense; it would be much better if traits could have associated traits (and if traits could also be generic arguments, like lifetimes and types and now constants too, but that may also be another RFC), since associated traits would also work with enums with this proposal.

I also thought first, this should be included in a different RFC, but I couldn't get my example working in a generic way without these. Maybe they should be added to future instead.

@chpio
Copy link

chpio commented Dec 27, 2018

@porky11 please update the rendered link, preferably to a branch

@H2CO3
Copy link

H2CO3 commented Dec 27, 2018

This seems to be a massive change. I'm really concerned about backwards compatibility, especially in the light of the sentence "This would ensure backwards compability" in the Alternatives section. This also severely increases the complexity of the language, which is even more worrying.

In the example, you write:

The event types are known at compile time, so it would be more efficient to write it like this:

In terms of what kind of resource is it more efficient? Execution time? Sure, the event types are known at compile time, but events can still be dispatched in any order at any time, can't they? Do you mean that update doesn't have to switch on the enum discriminant? That might be true, however since the variant is passed in as a constant to the function, I would expect basic inlining and dead code elimination passes in the compiler to eliminate that runtime cost.

will not create enum types just to match them directly afterwards

Why would you create an enum if not for matching on it? The main point of the enum as a language construct is to express alternatives, and the way to extract knowledge about those alternatives is pattern matching against variants. Deeming this useless doesn't sound like a considerate argument to me.

As to the feature itself, I have several conceptual problems with it. It tries to treat two very different constructs (enums and traits) the same, while splitting the behavior of a seemingly unified concept (structs) in two ("regular" vs "variant" structs). This has multiple issues:

  • There's a tried and true user interface design principle, which applies to programming language design as well. It goes like this: things that look similar should work similarly, and things that look different should work differently. (The goal being ease of understanding and learning.) This proposal goes against both parts.
  • Incidentally, it also violates the single responsibility principle. Defining an enum would now define additional struct types and a trait implicitly. There's a lot going on here; for one I would be very surprised (and upset) if this prevented me from defining a struct because I happened to use its name in a variant, or anything like this.

Considering the matching example:

if let Struct2(...) = Struct1 {
    ...
}

while let Struct2(...) = Struct1 {
    ...
}

When two structs implement the same enum, only the matching branch stays. All other branches are removed at compile time.

I don't get what exactly is meant by this. Is this code at the top level? Within a function? How can a type of Struct1 be bound to a pattern of type Struct2? Or is Struct1 now a variable? If so, what is its type? Neither the example code nor the accompanying explanation are clear and sufficiently detailed. If Struct1 is just an instance of the containing enum type, how is this better than letting already-existing const propagation and DCE in the compiler remove the branch? That works today, as demonstrated by this playground.

Since enums cannot be extended after definition, it's allowed to make traits, which contain generic functions, into objects, in case the generics are required to be enums: […] Using multiple dynamic enums for traits, that will be made into an object, should be avoided.

I'm sorry but I don't understand this at all. What is required to be an enum? What are we making into trait objects? The enum itself? Or something else? What is a "dynamic enum", and in what way does it differ from regular enums? What does it mean to "use multiple dynamic enums for traits", and why should it be avoided?

It may be useful to define a trait, specialized for multiple different enums.

What does this do, what kind of behavior does it have, and why is it useful?

To sum up, this RFC would introduce a high amount of complexity and non-intuitive behavior, bringing with itself elements that look and work very differently from what long-time users of the language are accustomed to. It is also (apparently) a breaking change, and quite a big break in fact, because it affects two of the most fundamental parts of the type system, traits and generics, in many ways, as well as the creation and behavior of enum types, which are also a frequently-used feature. I think that so much breakage is unwarranted even between editions.

Unfortunately, the explanations are often unclear and missing important details as well. This makes it even harder to appreciate the motivation. Overall, in the light of this I don't think the benefits weigh up to these significant downsides at all.

Copy link
Contributor

@Centril Centril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My overall feeling is that this RFC needs several times more detail than it currently has, especially wrt. the reference. I also think that #2593 is a more limited proposal that seems to achieve some of the goals set out but with less complexity.


Enum types can just be used like trait types now:

* It's preferred to add a `dyn` modifier, when using an enum directly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what this entails from the phrasing.

// update implementation to match new version of Update
enum Event = Event;
fn update<E: Event>(&self, event: E) -> bool {
match event {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that some sort of type-case is going on here... could you elaborate?
If I didn't know that Event is a bound here I would interpret this as dependent typing...

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

It seems the explanation in the previous part is already detailed enough for most parts and moving parts down here will just make it less clear.
Copy link
Contributor

@Centril Centril Dec 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at all. It's not even clear at a high level from the guide what this does let alone what impact it would have on Rust's static semantics, dynamic semantics, and syntax.

let value: EnumName::Variant1 = EnumName::Variant1;
```

But here it's confusing, why this version is selected, so I decided against.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elaborate, why is it messy?

@porky11
Copy link
Author

porky11 commented Dec 30, 2018

This seems to be a massive change. I'm really concerned about backwards compatibility, especially in the light of the sentence "This would ensure backwards compability" in the Alternatives section. This also severely increases the complexity of the language, which is even more worrying.

Yes, these are true: It's a massive change and increases the complexity. But backwards compatibility should not be a problem. Everything, that currently works, would just work slightly different internally and probably more efficient. Changes to already working code are not needed.

In the example, you write:

The event types are known at compile time, so it would be more efficient to write it like this:

In terms of what kind of resource is it more efficient? Execution time? Sure, the event types are known at compile time, but events can still be dispatched in any order at any time, can't they? Do you mean that update doesn't have to switch on the enum discriminant? That might be true, however since the variant is passed in as a constant to the function, I would expect basic inlining and dead code elimination passes in the compiler to eliminate that runtime cost.

The enums will need less memory. In case, the enum variants differ in size very much, this may even be recognizable in some cases. It will also be more efficient, because it's not necessary to match at runtime anymore.
You are right: Inlining may also be a solution. But in real world cases, you would maybe want to pass the enum through a function hierarchy, before matching, and want every function to be optimized at compile time.
This may cause larger compile times and langer binaries, but giving the programmer the option to easily switch between different kinds of efficency is a good thing normally.

will not create enum types just to match them directly afterwards

Why would you create an enum if not for matching on it? The main point of the enum as a language construct is to express alternatives, and the way to extract knowledge about those alternatives is pattern matching against variants. Deeming this useless doesn't sound like a considerate argument to me.

It's intended to match on the enum. But since it's already known at compile time, which enum variant is chosen, it's possible to select the branch at compile time, too.

As to the feature itself, I have several conceptual problems with it. It tries to treat two very different constructs (enums and traits) the same, while splitting the behavior of a seemingly unified concept (structs) in two ("regular" vs "variant" structs). This has multiple issues:

There is no difference between variant structs and normal structs.
When defining enum variants, they will just be structs with only public fields and public inside the enum namespace.

* There's a tried and true user interface design principle, which applies to programming language design as well. It goes like this: things that look similar should work similarly, and things that look different should work differently. (The goal being ease of understanding and learning.) This proposal goes against both parts.

Enums and traits in this proposal work similar. That's why I chose to use the same syntax.
Dynamic enums/traits are just generic types, where a specialized version is chosen at runtime.
Generic enums/traits are just structs, which select specific versions of behaviour at runtime.

* Incidentally, it also violates the single responsibility principle. Defining an `enum` would now define additional struct types and a trait implicitly. There's a lot going on here; for one I would be very surprised (and upset) if this prevented me from defining a struct because I happened to use its name in a variant, or anything like this.

Considering the matching example:

if let Struct2(...) = Struct1 {
    ...
}

while let Struct2(...) = Struct1 {
    ...
}

When two structs implement the same enum, only the matching branch stays. All other branches are removed at compile time.

In this case, when both structs are enum variants, the valid branch would never match, so it's never called. This can be ensured at compile time here, so this branch can be removed.

I don't get what exactly is meant by this. Is this code at the top level? Within a function? How can a type of Struct1 be bound to a pattern of type Struct2? Or is Struct1 now a variable? If so, what is its type? Neither the example code nor the accompanying explanation are clear and sufficiently detailed. If Struct1 is just an instance of the containing enum type, how is this better than letting already-existing const propagation and DCE in the compiler remove the branch? That works today, as demonstrated by this playground.

So no, the pattern of one can't be bound to the other. Only if they implement the same enum, the match will not be an error, but instead just be discarded.
Using const propagation, the compiler could not remove the branch, when using it in a nested function.
The function needs to be declared inline in order to get this work.
I'm not sure, if inlining is a better option here. It will almost achieve the same effect as my approach. The problem with inlining is, the code is inlined for each function call, even if the arguments are exactly the same. This would not happen, when using the generic function approach.

Since enums cannot be extended after definition, it's allowed to make traits, which contain generic functions, into objects, in case the generics are required to be enums: […] Using multiple dynamic enums for traits, that will be made into an object, should be avoided.

I'm sorry but I don't understand this at all. What is required to be an enum? What are we making into trait objects? The enum itself? Or something else? What is a "dynamic enum", and in what way does it differ from regular enums? What does it mean to "use multiple dynamic enums for traits", and why should it be avoided?

A dynamic enum is the current version of an enum.
That's not what I'm talking about. I made a mistake here.
Multiple generic enums should be avoided, because every combination of variants may generate a new function implementation.
But when not using all combinations, it wouldn't be a problem anyway, I think, like it isn't with generics.

It may be useful to define a trait, specialized for multiple different enums.

What does this do, what kind of behavior does it have, and why is it useful?

This is necessary to get my example work in a generic way. I could not find a different way.

To sum up, this RFC would introduce a high amount of complexity and non-intuitive behavior, bringing with itself elements that look and work very differently from what long-time users of the language are accustomed to. It is also (apparently) a breaking change, and quite a big break in fact, because it affects two of the most fundamental parts of the type system, traits and generics, in many ways, as well as the creation and behavior of enum types, which are also a frequently-used feature. I think that so much breakage is unwarranted even between editions.

Yeah, the complexity is also the main drawback for this. But I think, this is a pretty intuitive behavour and also shouldn't break anything.

Unfortunately, the explanations are often unclear and missing important details as well. This makes it even harder to appreciate the motivation. Overall, in the light of this I don't think the benefits weigh up to these significant downsides at all.

I try to make it clearer. I hope, this already helped.

@H2CO3
Copy link

H2CO3 commented Jan 1, 2019

The enums will need less memory. In case, the enum variants differ in size very much, this may even be recognizable in some cases.

There's already a working Clippy lint for that. It also comes with the suggested solution: boxing the large variant(s). It doesn't need to be any more elaborate than that.

But in real world cases, you would maybe want to pass the enum through a function hierarchy, before matching, and want every function to be optimized at compile time.

I'm not sure what you mean by "you want every function to be optimized at compile time". Surely the compiler optimizes every function at compile time (if you so request, e.g. using the --release flag to cargo). How is inlining "less compile-time"? It also seems to me that you aren't aware that inlining occurs automatically (in optimized builds), and not only for functions marked #[inline]. It can (and usually does) also happen across multiple levels of function calls. Optimizers these days are very smart.

Later you assert:

Using const propagation, the compiler could not remove the branch, when using it in a nested function.
The function needs to be declared inline in order to get this work.

So basically this is not true, and even if it were, a simple, local, well-understood #[inline] attribute is far more preferable to a huge amount of complication to the type system.

Overall it seems to me that you are heavily underestimating the standard optimization capabilities of modern compilers (including rustc and its backend, LLVM). I suspect that the vast majority of the improvements you assign to this RFC are already realized by existing optimization strategies in the compiler. This makes the perceived gains marginal compared to the increase in language complexity.

There is no difference between variant structs and normal structs.

The very RFC text says that this is not true:

How matching works

Matching variant structs works different to matching normal structs:

Enums and traits in this proposal work similar.

They might under this proposal, that's exactly my complaint. Currently they don't, and that is the right thing to do, but which the RFC wants to change. One could probably bend any two unrelated language features to work similarly, but that's not a feat unto itself. There's little point in merely duplicating features, and forcing the same behavior upon them even if they had been initially designed to solve different problems is actively harmful.

Dynamic enums/traits are just generic types, where a specialized version is chosen at runtime.

What's the difference between dynamic enums and today's (regular) enums then?

Generic enums/traits are just structs, which select specific versions of behaviour at runtime.

What kind of "behavior", precisely? Trait methods? Enum variants? Doesn't the former just have the same effect as a trait object, and the latter that of the existing enum type?

This is necessary to get my example work in a generic way. I could not find a different way.

I do believe that, it's just that I missed a more detailed (possibly technical) explanation as to why that is the case.

@alercah alercah mentioned this pull request Jan 2, 2019
@nikomatsakis
Copy link
Contributor

@rfcbot fcp close

In general, I do think that enums and traits have interesting overlap -- there is a clear desire to have types for enum variants (as in #2593) and it seems like enums and sealed traits are conceptually very similar. Despite that, I am moving to close this RFC, for a few reasons:

  • the text itself is quite vague -- it is a bit hard for me to tell precisely problem is being attacked here (and I see @Centril made some similar comments in the past).
  • I don't think that major rework of enums currently aligns well with our current priorities or those priorities we are likely to set in the upcoming roadmap.
  • there are other pending RFCs (e.g., Enum variant types #2593) which seem to be more clearly specified and likely have substantial overlap. I would encourage the author to take a look at Enum variant types #2593 and see what overlap exists.

@rfcbot
Copy link
Collaborator

rfcbot commented Jan 3, 2019

Team member @nikomatsakis has proposed to close this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-close This RFC is in PFCP or FCP with a disposition to close it. labels Jan 3, 2019
@porky11
Copy link
Author

porky11 commented Jan 3, 2019

There's already a working Clippy lint for that. It also comes with the suggested solution: boxing the large variant(s). It doesn't need to be any more elaborate than that.

This would then be possible without the need of boxing values. But that's not a large benefit. I just wanted to mention it.

I'm not sure what you mean by "you want every function to be optimized at compile time". Surely the compiler optimizes every function at compile time (if you so request, e.g. using the --release flag to cargo). How is inlining "less compile-time"? It also seems to me that you aren't aware that inlining occurs automatically (in optimized builds), and not only for functions marked #[inline]. It can (and usually does) also happen across multiple levels of function calls. Optimizers these days are very smart.

Overall it seems to me that you are heavily underestimating the standard optimization capabilities of modern compilers (including rustc and its backend, LLVM). I suspect that the vast majority of the improvements you assign to this RFC are already realized by existing optimization strategies in the compiler. This makes the perceived gains marginal compared to the increase in language complexity.

If the compiler is already capable of creating specialized versions of every function, that uses a variant, known at compile time, this RFC really isn't needed. But I don't think, it is. This would need a lot of rust specific knowledge, which LLVM doesn't have anymore.

There is no difference between variant structs and normal structs.

The very RFC text says that this is not true:

How matching works

Matching variant structs works different to matching normal structs:

If structs are defined to implement a trait, they just have additional capabilities. They can be matched as normal and additionally with other structs, that implement the same trait, which will just remove the match arm at compile time.

Enums and traits in this proposal work similar.

They might under this proposal, that's exactly my complaint. Currently they don't, and that is the right thing to do, but which the RFC wants to change. One could probably bend any two unrelated language features to work similarly, but that's not a feat unto itself. There's little point in merely duplicating features, and forcing the same behavior upon them even if they had been initially designed to solve different problems is actively harmful.

As said, they solve kind of the same problem: Generating specialized versions of functions at compile time.

Dynamic enums/traits are just generic types, where a specialized version is chosen at runtime.

What's the difference between dynamic enums and today's (regular) enums then?

There is no difference at all.

Generic enums/traits are just structs, which select specific versions of behaviour at runtime.

What kind of "behavior", precisely? Trait methods? Enum variants? Doesn't the former just have the same effect as a trait object, and the latter that of the existing enum type?

"behaviour" is, what the funciton does. Not sure, what you want to know here exactly.

This is necessary to get my example work in a generic way. I could not find a different way.

I do believe that, it's just that I missed a more detailed (possibly technical) explanation as to why that is the case.

It's needed for some type to implement some Enum, but the Enum to be required to be implemented may be different.
It's similar if you want some generic trait for doing something like this:

trait Doer {
    trait Requirement;
    fn do_something<T: Self::Requirement>(&self, &mut T);
}

This way it would be allowed to define multiple traits, where the method accepts types with multiple different trait requirements. This is not really useful yet, since you would rather add the generic type directly to the trait. But if it was possible to ensure, that there is a fixed set of types, that can implement requirement, it's useful to set the Requirement, so it's possible to generate trait objects for traits with generic functions.
This may also be a method to write an optimized version in a generic way, but would require a larger change from using enums.

@rfcbot
Copy link
Collaborator

rfcbot commented Jan 11, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Jan 11, 2019
@rfcbot
Copy link
Collaborator

rfcbot commented Jan 21, 2019

The final comment period, with a disposition to close, as per the review above, is now complete.

By the power vested in me by Rust, I hereby close this RFC.

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. closed This FCP has been closed (as opposed to postponed) and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. disposition-close This RFC is in PFCP or FCP with a disposition to close it. labels Jan 21, 2019
@rfcbot rfcbot closed this Jan 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-data-types RFCs about data-types A-machine Proposals relating to Rust's abstract machine. A-sum-types Sum types related proposals. A-syntax Syntax related proposals & ideas A-traits Trait system related proposals & ideas A-typesystem Type system related proposals & ideas closed This FCP has been closed (as opposed to postponed) finished-final-comment-period The final comment period is finished for this RFC. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants