-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Future-proofing enums/structs with #[non_exhaustive] attribute #2008
Changes from 1 commit
ac2b8be
25d8980
aae8a31
c9c82cf
c358154
61a4b14
3d56889
aa2502f
d7c62c4
09eeb1c
60cf588
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,390 @@ | ||
- Feature Name: non_exhaustive | ||
- Start Date: 2017-05-24 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
|
||
This RFC introduces the `#[non_exhaustive]` attribute for enums, which indicates | ||
that more variants may be added to an enum in the future. Adding this hint | ||
will force downstream crates to add a wildcard arm to `match` statements | ||
involving the enum, ensuring that adding new variants is not a breaking change. | ||
|
||
This is a post-1.0 version of [RFC 757], modified to use an attribute instead of | ||
a custom syntax. | ||
|
||
# Motivation | ||
|
||
The most common use of this feature is error types. Because adding features to a | ||
crate may result in different possibilities for errors, it makes sense that more | ||
types of errors will be added in the future. | ||
|
||
For example, the rustdoc for [`std::io::ErrorKind`] shows: | ||
|
||
```rust | ||
pub enum ErrorKind { | ||
NotFound, | ||
PermissionDenied, | ||
ConnectionRefused, | ||
ConnectionReset, | ||
ConnectionAborted, | ||
NotConnected, | ||
AddrInUse, | ||
AddrNotAvailable, | ||
BrokenPipe, | ||
AlreadyExists, | ||
WouldBlock, | ||
InvalidInput, | ||
InvalidData, | ||
TimedOut, | ||
WriteZero, | ||
Interrupted, | ||
Other, | ||
UnexpectedEof, | ||
// some variants omitted | ||
} | ||
``` | ||
|
||
Because the standard library continues to grow, it makes sense to eventually add | ||
more error types. However, this can be a breaking change if we're not careful: | ||
|
||
```rust | ||
use std::io::ErrorKind::*; | ||
|
||
match error_kind { | ||
NotFound => ..., | ||
PermissionDenied => ..., | ||
ConnectionRefused => ..., | ||
ConnectionReset => ..., | ||
ConnectionAborted => ..., | ||
NotConnected => ..., | ||
AddrInUse => ..., | ||
AddrNotAvailable => ..., | ||
BrokenPipe => ..., | ||
AlreadyExists => ..., | ||
WouldBlock => ..., | ||
InvalidInput => ..., | ||
InvalidData => ..., | ||
TimedOut => ..., | ||
WriteZero => ..., | ||
Interrupted => ..., | ||
Other => ..., | ||
UnexpectedEof => ..., | ||
} | ||
``` | ||
|
||
If we were to add another variant to this enum, this `match` would fail, | ||
requiring an additional arm to handle the extra case. But, if force users to | ||
add an arm like so: | ||
|
||
```rust | ||
match error_kind { | ||
// ... | ||
_ => ..., | ||
} | ||
``` | ||
|
||
Then we can add as many variants as we want! | ||
|
||
## How we do this today | ||
|
||
We force users add this arm for [`std::io::ErrorKind`] by adding a hidden | ||
variant: | ||
|
||
```rust | ||
#[unstable(feature = "io_error_internals", | ||
reason = "better expressed through extensible enums that this \ | ||
enum cannot be exhaustively matched against", | ||
issue = "0")] | ||
#[doc(hidden)] | ||
__Nonexhaustive, | ||
``` | ||
|
||
Because this feature doesn't show up in the docs, and doesn't work in stable | ||
rust, we can safely assume that users won't use it. | ||
|
||
A lot of crates take advantage of `#[doc(hidden)]` variants to tell users that | ||
they should add a wildcard branch to matches. However, the standard library | ||
takes this trick further by making the variant `unstable`, ensuring that it | ||
cannot be used in stable Rust. Outside the standard library, here's a look at | ||
[`diesel::result::Error`]: | ||
|
||
```rust | ||
pub enum Error { | ||
InvalidCString(NulError), | ||
DatabaseError(String), | ||
NotFound, | ||
QueryBuilderError(Box<StdError+Send+Sync>), | ||
DeserializationError(Box<StdError+Send+Sync>), | ||
#[doc(hidden)] | ||
__Nonexhaustive, | ||
} | ||
``` | ||
|
||
Even though the variant is hidden in the rustdoc, there's nothing actually stopping a | ||
user from using the `__Nonexhaustive` variant. This code works totally fine on | ||
stable rust: | ||
|
||
```rust | ||
use diesel::Error::*; | ||
match error { | ||
InvalidCString(..) => ..., | ||
DatabaseError(..) => ..., | ||
NotFound => ..., | ||
QueryBuilderError(..) => ..., | ||
DeserializationError(..) => ..., | ||
__Nonexhaustive => ..., | ||
} | ||
``` | ||
|
||
This is obviously unintended, and this is currently the best way to make | ||
non-exhaustive enums outside the standard library. Plus, even the standard | ||
library remarks that this is a hack. Recall the hidden variant for | ||
[`std::io::ErrorKind`]: | ||
|
||
```rust | ||
#[unstable(feature = "io_error_internals", | ||
reason = "better expressed through extensible enums that this \ | ||
enum cannot be exhaustively matched against", | ||
issue = "0")] | ||
#[doc(hidden)] | ||
__Nonexhaustive, | ||
``` | ||
|
||
Using `#[doc(hidden)]` will forever feel like a hack to fix this problem. | ||
Additionally, while plenty of crates could benefit from the idea of | ||
non-exhaustiveness, plenty don't because this isn't documented in the Rust book, | ||
and only documented elsewhere as a hack until a better solution is proposed. | ||
|
||
## Opportunity for optimisation | ||
|
||
Currently, the `#[doc(hidden)]` hack leads to a few missed opportunities | ||
for optimisation. For example, take this enum: | ||
|
||
```rust | ||
pub enum Error { | ||
Message(String), | ||
Other, | ||
} | ||
``` | ||
|
||
Currently, this enum takes up the same amount of space as `String` because of | ||
the non-zero optimisation. If we add our non-exhaustive variant: | ||
|
||
```rust | ||
pub enum Error { | ||
Message(String), | ||
Other, | ||
#[doc(hidden)] | ||
__Nonexhaustive, | ||
} | ||
``` | ||
|
||
Then this enum needs an extra bit to distinguish `Other` and `__Nonexhaustive`, | ||
which is ultimately never used. This will likely add an extra 8 bytes on a | ||
64-bit system to ensure alignment. | ||
|
||
More importantly, take the following code: | ||
|
||
```rust | ||
use Error::*; | ||
match error { | ||
Message(ref s) => /* lots of code */, | ||
Other => /* lots of code */, | ||
_ => /* lots of code */, | ||
} | ||
``` | ||
|
||
As a human, we can determine that the wildcard match is dead code and can be | ||
removed from the binary. Unfortunately, Rust can't make this distinction because | ||
we could still *technically* use that wildcard branch. | ||
|
||
Although these options will unlikely matter in this example because | ||
error-handling code (hopefully) shouldn't run very often, it could matter for | ||
other use cases. | ||
|
||
# Detailed design | ||
|
||
An attribute `#[non_exhaustive]` is added to the language, which will (for now) | ||
fail to compile if it's used on anything other than an enum definition. | ||
|
||
Within the crate that defines the enum, this attribute is essentially ignored, | ||
so that the current crate can continue to exhaustively match the enum. The | ||
justification for this is that any changes to the enum will likely result in | ||
more changes to the rest of the crate. Consider this example: | ||
|
||
```rust | ||
use std::error::Error as StdError; | ||
|
||
#[non_exhaustive] | ||
pub enum Error { | ||
Message(String), | ||
Other, | ||
} | ||
impl StdError for Error { | ||
fn description(&self) -> &str { | ||
match *self { | ||
Message(ref s) => s, | ||
Other => "other or unknown error", | ||
} | ||
} | ||
} | ||
``` | ||
|
||
It seems undesirable for the crate author to use a wildcard arm here, to | ||
ensure that an appropriate description is given for every variant. In fact, if | ||
they use a wildcard arm in addition to the existing variants, it should be | ||
identified as dead code, because it will never be run. | ||
|
||
Outside the crate that defines the enum, users should be required to add a | ||
wildcard arm to ensure forward-compatibility, like so: | ||
|
||
```rust | ||
use mycrate::Error; | ||
|
||
match error { | ||
Message(ref s) => ..., | ||
Other => ..., | ||
_ => ..., | ||
} | ||
``` | ||
|
||
And it should *not* be marked as dead code, even if the compiler does mark it as | ||
dead and remove it. | ||
|
||
## Changes to rustdoc | ||
|
||
Right now, the only indicator that rustdoc gives for non-exhaustive enums is a | ||
comment saying "some variants omitted." This shows up whenever variants are | ||
marked as `#[doc(hidden)]`, and rustdoc should continue to emit this message. | ||
|
||
However, after this message (if any), it should offer an additional message | ||
saying "more variants may be added in the future," to clarify that the enum is | ||
non-exhaustive. It also hints to the user that in the future, they may want to | ||
fine-tune their match code to include future variants when they are added. | ||
|
||
These two messages should be distinct; the former says "this enum has stuff | ||
that you shouldn't see," while the latter says "this enum is incomplete and may | ||
be extended in the future." | ||
|
||
# How We Teach This | ||
|
||
Changes to rustdoc should make it easier for users to understand the concept of | ||
non-exhaustive enums in the wild. | ||
|
||
Additionally, in the chapter on enums, a section should be added specifically | ||
for non-exhaustive enums. Because error types are common in almost all crates, | ||
this case is important enough to be taught when a user learns Rust for the first | ||
time. | ||
|
||
# Drawbacks | ||
|
||
* The `#[doc(hidden)]` hack in practice is usually good enough. | ||
* An attribute may be more confusing than a dedicated syntax. | ||
* `non_exhaustive` may not be the clearest name. | ||
* It's unclear if this attribute should apply to other aspects of the language | ||
than just enums. | ||
|
||
# Alternatives | ||
|
||
* Provide a dedicated syntax for enums instead of an attribute. This would | ||
likely be done by adding a `..` variant, as proposed by the original | ||
[extensible enums RFC][RFC 757]. | ||
* Allow creating private enum variants, giving a less-hacky way to create a | ||
hidden variant. | ||
* Document the `#[doc(hidden)]` hack and make it more well-known. | ||
|
||
# Unresolved questions | ||
|
||
Should there be a way to warn downstream crate users when their match is | ||
non-exuhaustive? What if a user wants to add a warning to their matches using | ||
non-exhaustive enums, to indicate that more code should be added to handle a new | ||
variant? | ||
|
||
Should the below extensions be added? | ||
|
||
## Extensions to structs | ||
|
||
It makes sense to eventually allow this attribute on structs as well. Unlike | ||
enums, it's very easy to make a struct non-exhaustive: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This seems like a good idea to me. However in this case non-exhaustive seems a less relevant keyword word because it is usually used to describe matching. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd also note here that the term "extensible" was used on the previous RFC. |
||
|
||
```rust | ||
pub struct Thing { | ||
pub number_of_eggs: usize, | ||
pub friction_constant: f64, | ||
pub can_swim: bool, | ||
non_exhaustive: (), | ||
} | ||
``` | ||
|
||
Because the `non_exhaustive` field is private, downstream crates cannot | ||
construct a value of `Thing` or exhaustively match its fields, even though they | ||
can access other fields. However, this seems less ergonomic than: | ||
|
||
```rust | ||
#[non_exhaustive] | ||
pub struct Thing { | ||
pub number_of_eggs: usize, | ||
pub friction_constant: f64, | ||
pub can_swim: bool, | ||
} | ||
``` | ||
|
||
Because then the upstream crate won't have to add the `non_exhaustive` field | ||
when constructing a value, or exhaustively matching patterns. | ||
|
||
## Extensions to traits | ||
|
||
Tangentially, it also makes sense to have non-exhaustive traits as well, even | ||
though they'd be non-exhaustive in a different way. Take this example from | ||
[`byteorder`]: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This concept seems sufficiently different to deserve a different attribute. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would say
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are right. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You're right that they seem sufficiently different, but I see them all as solving the same problem: instead of preventing a user from implementing/using something by creating a hidden field, this simply states that not all of the dependencies can be listed by the user. |
||
|
||
```rust | ||
pub trait ByteOrder: Clone + Copy + Debug + Default + Eq + Hash + Ord + PartialEq + PartialOrd { | ||
// ... | ||
} | ||
``` | ||
|
||
The `ByteOrder` trait requires these traits so that a user can simply write a | ||
bound of `T: ByteOrder` without having to add other useful traits, like `Hash` | ||
or `Eq`. | ||
|
||
This trait is useful, but the crate has no intention of letting other users | ||
implement this trait themselves, because then adding an additional trait | ||
dependency for `ByteOrder` could be a breaking change. | ||
|
||
The way that this crate solves this problem is by adding a hidden trait | ||
dependency: | ||
|
||
```rust | ||
mod private { | ||
pub trait Sealed {} | ||
impl Sealed for super::LittleEndian {} | ||
impl Sealed for super::BigEndian {} | ||
} | ||
|
||
pub trait ByteOrder: /* ... */ + private::Sealed { | ||
// ... | ||
} | ||
``` | ||
|
||
This way, although downstream crates can use this trait, they cannot actually | ||
implement things for this trait. | ||
|
||
This pattern could again be solved by using `#[non_exhaustive]`: | ||
|
||
```rust | ||
#[non_exhaustive] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think |
||
pub trait ByteOrder: /* ... */ { | ||
// ... | ||
} | ||
``` | ||
|
||
This would indicate to downstream traits that this trait might gain additional | ||
requirements (dependent traits or methods to implement), and as such, cannot be | ||
implemented downstream. | ||
|
||
[RFC 757]: https://github.com/rust-lang/rfcs/pull/757 | ||
[`std::io::ErrorKind`]: https://doc.rust-lang.org/1.17.0/std/io/enum.ErrorKind.html | ||
[`diesel::result::Error`]: https://docs.rs/diesel/0.13.0/diesel/result/enum.Error.html | ||
[`byteorder`]: https://github.com/BurntSushi/byteorder/tree/f8e7685b3a81c52f5448fd77fb4e0535bc92f880 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that if you expect users to handle every case then this isn't for you. You should not use this attribute and bump your major version. A warning on not handling a particular case is also incredibly likely to be low-value noise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair! I mostly carried this over from #757 because it was discussed a lot in the RFC there.