Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extensible enums #757

Closed
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
@nikomatsakis
Copy link
Contributor

nikomatsakis commented Jan 28, 2015

A mechanism for declaring an enum to be extensible, which prevents exhausting matching in downstream crates.

Rendered text.

@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Jan 28, 2015

👍

@cmr

This comment has been minimized.

Copy link
Member

cmr commented Jan 28, 2015

Easy, solves a real problem, I've wanted this before. 👍

@Kimundi

This comment has been minimized.

Copy link
Member

Kimundi commented Jan 28, 2015

It seems weird to have pattern matching behave differently depending on whether the enum is external to the crate or not - afaik apart from coherence we don't have such behavior anywhere else, and it would hinder refactoring of code that moves an enum into its own crate.

So I'd be for consistently treating such an enum as non-exhaustivly matchable even in the crate that defines it.

@seanmonstar

This comment has been minimized.

Copy link
Contributor

seanmonstar commented Jan 28, 2015

:+9000:

@DiamondLovesYou

This comment has been minimized.

Copy link

DiamondLovesYou commented Jan 28, 2015

Ditto @seanmonstar

@reem

This comment has been minimized.

Copy link

reem commented Jan 28, 2015

Sounds great.

I wonder if this could be extended to allow an enum which can be extended by its users, i.e. they can add variants to an existing enum. I'm unsure how this would work in terms of representation, but it would be really useful for generic error types, where the only real solution today is using Any and downcasting.

# Detailed design

Enum grammar is extended to permit a list of variants to be terminated
with `..`. An enum declared with `..` is considered *extensible*.

This comment has been minimized.

@nrc

nrc Jan 28, 2015

Member

Excuse the bikeshedding, but I'd like to propose ... rather than .. because a) .. is much more heavily overloaded in Rust already than ... and b) ... actually means 'elided things' in English writing.

This comment has been minimized.

@nrc

nrc Jan 28, 2015

Member

I guess .. is consistent with elided fields in struct patterns (fwiw, I don't like that either, especially now we have .. for ranges).

This comment has been minimized.

@liigo

liigo Jan 29, 2015

Contributor

+1 for ...
On Jan 29, 2015 7:50 AM, "Nick Cameron" notifications@github.com wrote:

In text/0000-extensible-enums-and-structs.md
#757 (comment):

  • _ => ...,
    +}
    +```

+Due to the presence of this wildcard, you can safely add variants
+without fear of breaking source compatibility with downstream clients.
+
+Note that extensibility is ignored within the current
+crate. Therefore, your library may internally write a match that
+exhaustively covers all sources of error. You don't need to worry
+about source compatibility with yourself, after all.
+
+# Detailed design
+
+Enum grammar is extended to permit a list of variants to be terminated
+with... An enum declared with.. is considered extensible.

Excuse the bikeshedding, but I'd like to propose ... rather than ..
because a) .. is much more heavily overloaded in Rust already than ...
and b) ... actually means 'elided things' in English writing.


Reply to this email directly or view it on GitHub
https://github.com/rust-lang/rfcs/pull/757/files#r23735568.

More language syntax.

# Alternatives

This comment has been minimized.

@nrc

nrc Jan 28, 2015

Member

I'm not sure it is better, but an alternative design would be for matches in the same crate to also require a wildcard match arm. I realise this is not necessary, but it would make things be more consistent.

This comment has been minimized.

@nikomatsakis

nikomatsakis Jan 29, 2015

Author Contributor

Yes, true.

This comment has been minimized.

@cristicbz

cristicbz Feb 3, 2015

Edit: I saw the comment below only now, sorry. Consider my question answered.
I also buy the consistency argument: in particular since moving modules between crates should really not break pattern matching expressions. Maybe place that relaxation on modules rather than crates?

This comment has been minimized.

@nikomatsakis

nikomatsakis Feb 4, 2015

Author Contributor

On Tue, Feb 03, 2015 at 02:39:32AM -0800, Cristi Cobzarenco wrote:

I also buy the consistency argument: in particular since moving modules between crates should really not break pattern matching expressions. Maybe place that relaxation on modules rather than crates?

I'm not sure why you would think this. There are already things that
are affected by moving between crates: impls come to mind.

@gsingh93

This comment has been minimized.

Copy link

gsingh93 commented Jan 29, 2015

I don't completely understand. If a library adds more errors to an enum, I don't want my code to compile until I handle all of the errors, right? Isn't putting a catch all _ -> ... dangerous?

@flaper87

This comment has been minimized.

Copy link

flaper87 commented Jan 29, 2015

👍 It doesn't sound invasive and it provides more flexibility to one of the types that's broadly used.

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Jan 29, 2015

Maybe the downstream crate really wants to match every variant. Using _ => {} makes it easy to miss one match when updating code. Perhaps it should be something like .. => {}, and emits a warning if this .. arm is reachable.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Jan 29, 2015

@kennytm hmm, that's an interesting thought. I do agree that it'd be nice to give downstream crates the option to try and exhaustively match everything, even if they must provide a backup.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Jan 29, 2015

@reem I think that enums which are extensible by end-users is kind of a separate feature. That is typically the role of a trait object. There might be some overlap with virtual structs. Certainly, though, to support user-defined variants, we would need to guarantee that the base type is considered unsized so that it is only manipulated by pointer (which was a key ingredient in most of the virtual struct designs). Those designs however were typically targeted at non-extensible use cases, though it seems that there is demand for extensible cases as well.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Jan 29, 2015

@gsingh93 there is certainly some tension between the desire for stability -- your code keeps compiling even with newer versions of my library -- and your desire for completeness. I think perhaps something like what @kennytm is suggesting offers a compromise path, but we'd need to find an acceptable syntax.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Jan 29, 2015

(Using .. as a substitute for _ is clearly ambiguous, e.g. Foo(..) vs Foo(_) have distinct meanings today.)

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Jan 29, 2015

@nick29581 the reason I chose .. is just that it's what we use everywhere else (e.g., Foo(..) in patterns). I think using ... feels like a kind of random inconsistency. I'd rather restrict the use of ... to inclusive ranges of various kinds (or, alternatively, restrict the use .. to exclusive ranges, but that ship has sailed). Point is, I don't want to have to guess how many dots!

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Jan 29, 2015

Updated the RFC to include an alternative design (no exhaustive matching) and unresolved question (can we have a nice syntax for "wildcard that should never match").

Regarding the first point -- whether to draw a distinction between crate-local matches and external ones -- I am on the fence. I like the extra expressive power of being able to do exhaustive matches locally, but I am not sure if real-world use cases (like IoError) will ever take advantage of it.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Jan 29, 2015

In ye olde draft RFC (under the more sophisticated variant) abstract enum types would have this behavior (though per-mod, not per-crate).

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Jan 29, 2015

@glaebhoerl yes, I considered per-module, but per-crate seems superior to me, since the enum is public and hence it seems likely that matches against that enum will occur outside the module where it was declared (moreover, per-crate is the largest scope you can get without affecting semver guarantees what-so-ever). There is probably a use case for one module within a crate "defending itself" against another adding variants, but it seems less important, and is counterbalanced by the fact that being able to write exhaustive matches is legitimately useful for maintaining overall consistency within a package (it happens somewhat regularly that I find bugs due to over-eager use of _).

@jfager

This comment has been minimized.

Copy link

jfager commented Jan 29, 2015

There's still source incompatibility when I forget to make an enum extensible and then break downstream when I realize I might need to add variants in the future, right? So people should always make enums extensible unless they're really really sure the current set of variants is exhaustive? What's the perception of how frequently one or the other is the case? Should 'extensible' be the default and something like 'sealed' get added instead?

Another alternative is to allow downstream consumers to explicitly protect themselves by not throwing an 'unreachable pattern' error for terminal _ match arms. This allows consumers who want to break when their match is no longer exhaustive to just not provide that catch-all.

@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Jan 29, 2015

@jfager It seems to me that most enums would probably end up "closed" or sealed or whatever it'd be called. For example, we're never going to want to add any more variants to Option, Result, Ordering, etc. The most common use case for extensible enums that I can see is for error handling like IoError or SqlState. Having to break backwards compatibility to "extensify" an enum seems like not that much of a problem in practice, since you're presumably doing it because you want to add more variants, so you can just fold the extensification change in with the addition of more variants.

@jfager

This comment has been minimized.

Copy link

jfager commented Jan 29, 2015

@sfackler Option, Result, Ordering, and many others in the std lib are basic, core enums that I don't think are representative of the sort of enums people usually create in their own code. I don't think that sealed enums would be rare by any stretch, but I think it's questionable whether they're more common than extensible ones:

  • Errors are going to get created in user code all the time.
  • Standards, like http status codes, which grow over time.
  • Games might decide to add new monster types or whatever.
  • Supported machine architectures, operating systems, browser types, etc.

To your point about just folding it into adding more variants: sure, but that's still part of a breaking change. The point being that there's no way to make an enum extensible w/o breaking downstream, so maybe the default needs to get thought through more.

@Diggsey

This comment has been minimized.

Copy link
Contributor

Diggsey commented Jan 29, 2015

@nikomatsakis Would having an arm such as this work:

_ => static_assert!(false)

When the _ case is unreachable it could disable asserts within it. I'm not sure if there's an equivalent macro for warnings.

@mahkoh

This comment has been minimized.

Copy link
Contributor

mahkoh commented Jan 29, 2015

As it's written right now this RFC turns a compile-time error into a potential run-time bug. -1 for this reason. Making it a warning doesn't solve anything. Downstream often does not know if a warning is serious or not and upstream will have to fix the warning for the same reason.

Any change that aims to make Rust more semver compatible should be part of a comprehensive solution that addresses all points in rust-lang/rust#17152.

@gsingh93

This comment has been minimized.

Copy link

gsingh93 commented Jan 29, 2015

My point is the same as @mahkoh's, which is why I'm still against this.

@gsingh93

This comment has been minimized.

Copy link

gsingh93 commented Jan 29, 2015

@Diggsey, if you did that it wouldn't compile, and you'd be right back where you started.

@jmesmon

This comment has been minimized.

Copy link

jmesmon commented Jan 29, 2015

While I'd really like extensibility, I'm not seeing how we can safely allow the crate defining the enum to have exhaustive matches:

If another crate B extends an enum in A, and then some code in B passes an instance of one of the new variants in that enum into the original defining crate A, A will need to have wild card matches.

@Manishearth

This comment has been minimized.

Copy link
Member

Manishearth commented Feb 4, 2015

@nikomatsakis Reordering enum variants can probably be solved by sorting them, however with extensible enums, that doesn't work so well unless the author sticks to some scheme of annotations on the variants (stability attributes work here, since we can just sort by version first and then alphabetically/whatever). #[repr(sorted)] would be one way of doing this.

Still has the size issue though. #[extensible(max_size=foo)] enum Bar{...} is one hacky way of solving this.

But I agree that binary incompatibilities isn't something we need to be worried about at the moment. Eventually we should have some form of ABI stabilization, but right now that seems far off.

@jfager

This comment has been minimized.

Copy link

jfager commented Feb 4, 2015

I already mentioned this in a previous comment, but I'd like to push harder for a different approach: moving the decision about whether adding variants to an enum breaks downstream code to the downstream code itself, by not considering the _ => case unreachable.

So for

enum Foo {
    First,
    Second,
    // Third // add this later
}

consumer code that doesn't want to break when the Third variant gets added would just look like:

match foo {
    Foo::First => ...
    Foo::Second => ...
    _ => ... // would error as unreachable in current rust
}

This is consistent with how you already write code for enums with cases you're willing to ignore; conceptually this just extends those cases to those that might not exist yet.

The advantage of this over the RFC approach is simplicity. The RFC is talking about adding new syntax, adding complexity around how things behave across crate or mod boundaries, and adding more new syntax or lints for clients who would still want to break. All this and enum authors still have to pick right the first time or risk breaking consumer code when they need to update, or conversely default to making enums extensible.

@Manishearth

This comment has been minimized.

Copy link
Member

Manishearth commented Feb 4, 2015

I implemented this as a plugin.

It's not perfect yet, but that can be fixed.

Of course, this has the obvious flaw that it only works if the user imports the plugin. Library authors can reregister the lint pass, however this still requires the library to be imported with #[plugin]

@ghost

This comment has been minimized.

Copy link

ghost commented Feb 5, 2015

I am in favour. I was a little hesitant about this feature being baked-in to the language but I do see the appeal.

I am not sure I agree we should be satisfied with how structs would (not) be affected by the same problem - I think for data types such as models in Web applications, the similar issue will prove itself to be as significant. Using a convention such as an extra private field is not something I would expect user code to follow easily.

@nielsle

This comment has been minimized.

Copy link

nielsle commented Feb 7, 2015

Perhaps you could introduce enum inheritance. That allows you to distinguish between Foo and ExtendedFoo, but I am not sure if it covers all use cases.

enum Foo {A, B}

// ExtendedFoo inherits from  Foo
fn bar<ExtendedFoo:  Foo>(x: ExtendedFoo) {

    match x {
        Foo::A => ...,
        Foo::B => ...,
        _ => ...,
    }
}
@bombless

This comment has been minimized.

Copy link

bombless commented Feb 7, 2015

I think there would be pretty strong pressure against making enums extensible without good
reason.

Agree with that.

-1 for enum inheritance, if you really need to check another type, just check it.

For example when I handle C language data structrues:

struct Struct(HashMap<String, (usize, usize)>);
struct Union(HashMap<String, (usize, usize)>);
enum Type { 
    Struct(Struct), 
    Union(Union), 
    Primitive(usize), 
    Pointer(Rc<Type>), 
    Unknown(TypeName) 
} 

This is a scenario where you think inheritance is needed at first glance. And if Rust allow enum inheritance I'll probably do that, which is wrong.

When you modify an enum's definition, almost everywhere it is used should be pointed out by compiler, this behavior is significant and should not be changed.

@pnkfelix

This comment has been minimized.

Copy link
Member

pnkfelix commented Feb 8, 2015

@jfager I don't understand: With your proposal, how would a library author force its clients to write the extra _ => ... match arm?

(I am not opposed to your proposal, but it does not seem to address the full problem here.)

@pnkfelix

This comment has been minimized.

Copy link
Member

pnkfelix commented Feb 8, 2015

@bombless FYI: you left some important context out of your quotation.

At least to me, the way you have written that quotation, it could be interpreted as "there will be strong pressure against the language change from this RFC without better motivation."

But the author of the quote is the author of the RFC, and the sentence preceding that quote was "Nobody is proposing removing exhausting matching."; I believe @nikomatsakis 's intention was to say that "there will be strong pressure against a library making one of its enum definitions extensible." I.e., you're only supposed to do it in scenarios like the one laid out in the RFC itself.

(I do not think @bombless was actively trying to distort niko's words; the quote just struck me as a difficult to understand, and so I am trying to fix that.)

@jfager

This comment has been minimized.

Copy link

jfager commented Feb 8, 2015

@pnkfelix In current Rust, an enum client can either exhaustively match the current set of variants and break when new variants are added, or partially match the current set of variants and not break when new variants are added. There is no way to exhaustively match the current set of variants and not break when new variants are added. I'm saying we should add that ability.

I'm further claiming that if we add that ability, the need for this rfc and its attendant complexity is reduced, as clients then have a way to make themselves source compatible with upstream changes if that's what they want. While it's true that doesn't give library authors a way to force clients to stay source compatible, I think that's fine, as it's completely reasonable for clients to want to break regardless of the library author's intent.

If some way for the library author to signal intent is still desired, I think an attribute and a lint as @mahkoh suggested would be preferable to a full-blown language feature, especially if that language feature is going to come with a bunch of extra baggage like different behavior around crate boundaries and new counter-features to continue allowing people to do exhaustive matching.

@nrc

This comment has been minimized.

Copy link
Member

nrc commented Feb 11, 2015

Having chewed this over in my sub-concious a little more, I'm thinking that an attribute + lint (as suggested by @markoh) would be a good approach, rather than adding new syntax. To make things more concrete: adding #[extensible] to an enum would allow match statements which match it to have a wildcard arm, even if the other arms are exhaustive. A lint, error by default, would check that all uses in a different crate have such an arm.

I like this because the extensible-ness doesn't affect the dynamic semantics of the enum and the only way it affects the static semantics (by forcing a wildcard arm) can't cause a soundness issues, i.e., if type checking is wrong, there still won't be an error. Together these signal to me that the feature should not have first class syntax.

I like a lint because it seems a reasonable position to take to value exhaustiveness more highly than avoiding future errors (although I believe the other preference is more reasonable), So, it seems justifiable to allow this switch to be flicked. I believe we can justify any future backwards compatibility issues by including error-by-default-lints as errors for the sake of compatibility.

@cmr

This comment has been minimized.

Copy link
Member

cmr commented Feb 11, 2015

Making it a lint would make it near-useless for the stdlib unless
disabling this lint required a feature gate, because otherwise you'd be
able to subvert our stability guarantees.

One thing I've become increasingly disturbed about with this RFC is the
inability to exhaustively match against an extensible enum. I want the
ability to opt-in to instabilty for the sake of robustness.

On Wed, Feb 11, 2015, at 00:20, Nick Cameron wrote:

Having chewed this over in my sub-concious a little more, I'm thinking
that an attribute + lint (as suggested by @markoh[1]) would be a good
approach, rather than adding new syntax. To make things more concrete:
adding #[extensible] to an enum would allow match statements which
match it to have a wildcard arm, even if the other arms are
exhaustive. A lint, error by default, would check that all uses in a
different crate have such an arm.

I like this because the extensible-ness doesn't affect the dynamic
semantics of the enum and the only way it affects the static semantics
(by forcing a wildcard arm) can't cause a soundness issues, i.e., if
type checking is wrong, there still won't be an error. Together these
signal to me that the feature should not have first class syntax.

I like a lint because it seems a reasonable position to take to value
exhaustiveness more highly than avoiding future errors (although I
believe the other preference is more reasonable), So, it seems
justifiable to allow this switch to be flicked. I believe we can
justify any future backwards compatibility issues by including
error-by-default-lints as errors for the sake of compatibility.

— Reply to this email directly or view it on GitHub[2].

Links:

  1. https://github.com/markoh
  2. #757 (comment)
@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Feb 11, 2015

@cmr Stable Rust releases will ship with unstable features disabled explicitly to avoid allowing people to "opt into instability" because if you allow people to opt in, they'll do it. How is this case materially different?

@nrc

This comment has been minimized.

Copy link
Member

nrc commented Feb 11, 2015

@cmr, being able to subvert the stability guarantees doesn't seem like an issue to me - you can always do it if you are determined enough. The fact that you have to opt in, seems like its enough of a disclaimer to me.

I'm not sure I understand your second point, that seems to be asking for exactly the lint that is proposed.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Mar 5, 2015

Closing as postponed; RFCs issue

@aturon aturon closed this Mar 5, 2015

@aturon aturon added the postponed label Mar 5, 2015

@nikomatsakis

This comment has been minimized.

Copy link
Contributor Author

nikomatsakis commented Mar 5, 2015

@cmr we decided to postpone this for the time being, but let me leave just one quick note. I agree making this into a lint has the potential to make it near useless, but this is already a concern we should address. One of the changes I've been meaning to propose to address this particular point is that rustc should have a command line flag to override and cap all lints at "warn" status so that a lint can never block compilation. cargo can then use this for dependencies. This would ensure that while you get the full benefit of the lints in the current crate you are working on, you are not blocked from using other people's work. Note also that the way the lint was proposed, you would still be required to put a _ case, so there would be some defined fallback and behavior if additional variants are added, even if you also get a compilation warning.

UPDATE: Edited slightly for clarity.

@Virtlink

This comment has been minimized.

Copy link

Virtlink commented Apr 15, 2015

Enums being non-extensible by default, with opt-in extensibility, is a good idea.

However, for structs it's the wrong way round. Adding an implementation detail such as a private field to an all public struct mustn't cause third-party code to break.

Instead, by default structs should be extensible and not allow public instantiation. Then adding a private field to a struct is never a breaking change.

Of course, there would be an opt-in that makes the struct non-extensible and allows public instantiation. This opt-in also requires all fields to be public. Adding a private field would cause a compile-time error.

One way to implement this: the compiler by default adds a zero-sized hidden dummy field to every struct. When the user opts-in, the dummy field is not added and all fields must be public.

@pnkfelix

This comment has been minimized.

Copy link
Member

pnkfelix commented Apr 15, 2015

@Virtlink would you also make destructuring bind and FRU illegal without the opt-in? See e.g.

rust-lang/rust#21407

Seems like a deal-breaker to me. But maybe I am missing something

An analogous #[extensible] attribute for (all pub) structs that enforces your rule via a lint (and lint-warns on destructuring bind and FRU too) seems like a way around that, and is is a non-breaking change to boot.

@bombless

This comment has been minimized.

Copy link

bombless commented Apr 16, 2015

It's funny that in Rust we already can build a enum that always has potential to grow.

mod some_module {
    enum _Enum { A, B, _Dummy }
    pub type Enum = self::_Enum;
    pub use self::_Enum::{A, B};
}
fn to_string(e: some_module::Enum) -> String {
    match e {
        some_module::A => 'A'.to_string(),
        some_module::B => 'B'.to_string(),
        _ => "_not_supported".to_string()
    }
}

I think it's related to rust-lang/rust#22261 though.

@Virtlink

This comment has been minimized.

Copy link

Virtlink commented Apr 16, 2015

@pnkfelix Indeed, function record updates outside the module would only be allowed with the opt-in, as public FRU would break as soon as a private field is added. If the author wants to add a private field to the struct later, then it wasn't his intention to allow FRU in the first place. Or if it was, removing the opt-in makes the author aware he's now making a breaking change.

Destructuring outside the module is still possible with a wildcard, as this won't break if a private field is added:

pub struct Foo { pub value: u32, }

// Somewhere outside the module:
let Foo { value: v, .. } = Foo::new();

Again, only with the opt-in has the author stated that no private fields will ever be added and is it safe to destructure without the wildcard.


In short my proposal is to treat every struct as if it contains private fields (even empty structs and all-public structs), unless explicitly stated (opt-in) that the struct will never contain private fields. Everything else (destructuring, FRU, instantiation) follows from that.

sgrif added a commit to diesel-rs/diesel that referenced this pull request Nov 29, 2015

Ensure that `Error` cannot be exhaustively matched on
This is going to be the "generic thing which can fail in a myriad of
ways" type. I want to be able to have specific useful cases which you
can recover from (e.g. no `Box<std::error::Error>`), but I don't want
people to exhaustively match on it.

Techincally adding a variant to this is still a violation of
[RFC #1105](https://github.com/rust-lang/rfcs/blob/master/text/1105-api-evolution.md),
but if you're actually matching against `__Nonexhaustive` and expected
it not to break eventually, you knew what was coming.

If some form of [RFC #757](rust-lang/rfcs#757)
ever lands, we'll switch to using that instead.

@aturon aturon referenced this pull request Apr 6, 2016

Closed

Private enum variants #32770

@petrochenkov petrochenkov added T-lang and removed postponed labels Feb 24, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.