Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Future-proofing enums/structs with #[non_exhaustive] attribute #2008

Merged
merged 11 commits into from
Aug 26, 2017
390 changes: 390 additions & 0 deletions text/0000-non-exhaustive.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,390 @@
- Feature Name: non_exhaustive
- Start Date: 2017-05-24
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

This RFC introduces the `#[non_exhaustive]` attribute for enums, which indicates
that more variants may be added to an enum in the future. Adding this hint
will force downstream crates to add a wildcard arm to `match` statements
involving the enum, ensuring that adding new variants is not a breaking change.

This is a post-1.0 version of [RFC 757], modified to use an attribute instead of
a custom syntax.

# Motivation

The most common use of this feature is error types. Because adding features to a
crate may result in different possibilities for errors, it makes sense that more
types of errors will be added in the future.

For example, the rustdoc for [`std::io::ErrorKind`] shows:

```rust
pub enum ErrorKind {
NotFound,
PermissionDenied,
ConnectionRefused,
ConnectionReset,
ConnectionAborted,
NotConnected,
AddrInUse,
AddrNotAvailable,
BrokenPipe,
AlreadyExists,
WouldBlock,
InvalidInput,
InvalidData,
TimedOut,
WriteZero,
Interrupted,
Other,
UnexpectedEof,
// some variants omitted
}
```

Because the standard library continues to grow, it makes sense to eventually add
more error types. However, this can be a breaking change if we're not careful:

```rust
use std::io::ErrorKind::*;

match error_kind {
NotFound => ...,
PermissionDenied => ...,
ConnectionRefused => ...,
ConnectionReset => ...,
ConnectionAborted => ...,
NotConnected => ...,
AddrInUse => ...,
AddrNotAvailable => ...,
BrokenPipe => ...,
AlreadyExists => ...,
WouldBlock => ...,
InvalidInput => ...,
InvalidData => ...,
TimedOut => ...,
WriteZero => ...,
Interrupted => ...,
Other => ...,
UnexpectedEof => ...,
}
```

If we were to add another variant to this enum, this `match` would fail,
requiring an additional arm to handle the extra case. But, if force users to
add an arm like so:

```rust
match error_kind {
// ...
_ => ...,
}
```

Then we can add as many variants as we want!

## How we do this today

We force users add this arm for [`std::io::ErrorKind`] by adding a hidden
variant:

```rust
#[unstable(feature = "io_error_internals",
reason = "better expressed through extensible enums that this \
enum cannot be exhaustively matched against",
issue = "0")]
#[doc(hidden)]
__Nonexhaustive,
```

Because this feature doesn't show up in the docs, and doesn't work in stable
rust, we can safely assume that users won't use it.

A lot of crates take advantage of `#[doc(hidden)]` variants to tell users that
they should add a wildcard branch to matches. However, the standard library
takes this trick further by making the variant `unstable`, ensuring that it
cannot be used in stable Rust. Outside the standard library, here's a look at
[`diesel::result::Error`]:

```rust
pub enum Error {
InvalidCString(NulError),
DatabaseError(String),
NotFound,
QueryBuilderError(Box<StdError+Send+Sync>),
DeserializationError(Box<StdError+Send+Sync>),
#[doc(hidden)]
__Nonexhaustive,
}
```

Even though the variant is hidden in the rustdoc, there's nothing actually stopping a
user from using the `__Nonexhaustive` variant. This code works totally fine on
stable rust:

```rust
use diesel::Error::*;
match error {
InvalidCString(..) => ...,
DatabaseError(..) => ...,
NotFound => ...,
QueryBuilderError(..) => ...,
DeserializationError(..) => ...,
__Nonexhaustive => ...,
}
```

This is obviously unintended, and this is currently the best way to make
non-exhaustive enums outside the standard library. Plus, even the standard
library remarks that this is a hack. Recall the hidden variant for
[`std::io::ErrorKind`]:

```rust
#[unstable(feature = "io_error_internals",
reason = "better expressed through extensible enums that this \
enum cannot be exhaustively matched against",
issue = "0")]
#[doc(hidden)]
__Nonexhaustive,
```

Using `#[doc(hidden)]` will forever feel like a hack to fix this problem.
Additionally, while plenty of crates could benefit from the idea of
non-exhaustiveness, plenty don't because this isn't documented in the Rust book,
and only documented elsewhere as a hack until a better solution is proposed.

## Opportunity for optimisation

Currently, the `#[doc(hidden)]` hack leads to a few missed opportunities
for optimisation. For example, take this enum:

```rust
pub enum Error {
Message(String),
Other,
}
```

Currently, this enum takes up the same amount of space as `String` because of
the non-zero optimisation. If we add our non-exhaustive variant:

```rust
pub enum Error {
Message(String),
Other,
#[doc(hidden)]
__Nonexhaustive,
}
```

Then this enum needs an extra bit to distinguish `Other` and `__Nonexhaustive`,
which is ultimately never used. This will likely add an extra 8 bytes on a
64-bit system to ensure alignment.

More importantly, take the following code:

```rust
use Error::*;
match error {
Message(ref s) => /* lots of code */,
Other => /* lots of code */,
_ => /* lots of code */,
}
```

As a human, we can determine that the wildcard match is dead code and can be
removed from the binary. Unfortunately, Rust can't make this distinction because
we could still *technically* use that wildcard branch.

Although these options will unlikely matter in this example because
error-handling code (hopefully) shouldn't run very often, it could matter for
other use cases.

# Detailed design

An attribute `#[non_exhaustive]` is added to the language, which will (for now)
fail to compile if it's used on anything other than an enum definition.

Within the crate that defines the enum, this attribute is essentially ignored,
so that the current crate can continue to exhaustively match the enum. The
justification for this is that any changes to the enum will likely result in
more changes to the rest of the crate. Consider this example:

```rust
use std::error::Error as StdError;

#[non_exhaustive]
pub enum Error {
Message(String),
Other,
}
impl StdError for Error {
fn description(&self) -> &str {
match *self {
Message(ref s) => s,
Other => "other or unknown error",
}
}
}
```

It seems undesirable for the crate author to use a wildcard arm here, to
ensure that an appropriate description is given for every variant. In fact, if
they use a wildcard arm in addition to the existing variants, it should be
identified as dead code, because it will never be run.

Outside the crate that defines the enum, users should be required to add a
wildcard arm to ensure forward-compatibility, like so:

```rust
use mycrate::Error;

match error {
Message(ref s) => ...,
Other => ...,
_ => ...,
}
```

And it should *not* be marked as dead code, even if the compiler does mark it as
dead and remove it.

## Changes to rustdoc

Right now, the only indicator that rustdoc gives for non-exhaustive enums is a
comment saying "some variants omitted." This shows up whenever variants are
marked as `#[doc(hidden)]`, and rustdoc should continue to emit this message.

However, after this message (if any), it should offer an additional message
saying "more variants may be added in the future," to clarify that the enum is
non-exhaustive. It also hints to the user that in the future, they may want to
fine-tune their match code to include future variants when they are added.

These two messages should be distinct; the former says "this enum has stuff
that you shouldn't see," while the latter says "this enum is incomplete and may
be extended in the future."

# How We Teach This

Changes to rustdoc should make it easier for users to understand the concept of
non-exhaustive enums in the wild.

Additionally, in the chapter on enums, a section should be added specifically
for non-exhaustive enums. Because error types are common in almost all crates,
this case is important enough to be taught when a user learns Rust for the first
time.

# Drawbacks

* The `#[doc(hidden)]` hack in practice is usually good enough.
* An attribute may be more confusing than a dedicated syntax.
* `non_exhaustive` may not be the clearest name.
* It's unclear if this attribute should apply to other aspects of the language
than just enums.

# Alternatives

* Provide a dedicated syntax for enums instead of an attribute. This would
likely be done by adding a `..` variant, as proposed by the original
[extensible enums RFC][RFC 757].
* Allow creating private enum variants, giving a less-hacky way to create a
hidden variant.
* Document the `#[doc(hidden)]` hack and make it more well-known.

# Unresolved questions

Should there be a way to warn downstream crate users when their match is
non-exuhaustive? What if a user wants to add a warning to their matches using
non-exhaustive enums, to indicate that more code should be added to handle a new
variant?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that if you expect users to handle every case then this isn't for you. You should not use this attribute and bump your major version. A warning on not handling a particular case is also incredibly likely to be low-value noise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair! I mostly carried this over from #757 because it was discussed a lot in the RFC there.


Should the below extensions be added?

## Extensions to structs

It makes sense to eventually allow this attribute on structs as well. Unlike
enums, it's very easy to make a struct non-exhaustive:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good idea to me. However in this case non-exhaustive seems a less relevant keyword word because it is usually used to describe matching. incomplete, extensible or will_extend now start to sound better and work for both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also note here that the term "extensible" was used on the previous RFC.


```rust
pub struct Thing {
pub number_of_eggs: usize,
pub friction_constant: f64,
pub can_swim: bool,
non_exhaustive: (),
}
```

Because the `non_exhaustive` field is private, downstream crates cannot
construct a value of `Thing` or exhaustively match its fields, even though they
can access other fields. However, this seems less ergonomic than:

```rust
#[non_exhaustive]
pub struct Thing {
pub number_of_eggs: usize,
pub friction_constant: f64,
pub can_swim: bool,
}
```

Because then the upstream crate won't have to add the `non_exhaustive` field
when constructing a value, or exhaustively matching patterns.

## Extensions to traits

Tangentially, it also makes sense to have non-exhaustive traits as well, even
though they'd be non-exhaustive in a different way. Take this example from
[`byteorder`]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This concept seems sufficiently different to deserve a different attribute.

Copy link
Member

@whitequark whitequark May 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say #[extensible] applies here just as well--as in extending the set of supertraits. In case we go for .. for enum and struct (which I think is a good idea), we could go with:

trait A: B + C + .. {}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. extensible does sound good. Not only for extending traits but also extending the provided methods. And when I do look at it that way it does seem fairly related.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that they seem sufficiently different, but I see them all as solving the same problem: instead of preventing a user from implementing/using something by creating a hidden field, this simply states that not all of the dependencies can be listed by the user.


```rust
pub trait ByteOrder: Clone + Copy + Debug + Default + Eq + Hash + Ord + PartialEq + PartialOrd {
// ...
}
```

The `ByteOrder` trait requires these traits so that a user can simply write a
bound of `T: ByteOrder` without having to add other useful traits, like `Hash`
or `Eq`.

This trait is useful, but the crate has no intention of letting other users
implement this trait themselves, because then adding an additional trait
dependency for `ByteOrder` could be a breaking change.

The way that this crate solves this problem is by adding a hidden trait
dependency:

```rust
mod private {
pub trait Sealed {}
impl Sealed for super::LittleEndian {}
impl Sealed for super::BigEndian {}
}

pub trait ByteOrder: /* ... */ + private::Sealed {
// ...
}
```

This way, although downstream crates can use this trait, they cannot actually
implement things for this trait.

This pattern could again be solved by using `#[non_exhaustive]`:

```rust
#[non_exhaustive]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thinkno_extern_impl is a better name, because exhaustivenes feels weird for traits.

pub trait ByteOrder: /* ... */ {
// ...
}
```

This would indicate to downstream traits that this trait might gain additional
requirements (dependent traits or methods to implement), and as such, cannot be
implemented downstream.

[RFC 757]: https://github.com/rust-lang/rfcs/pull/757
[`std::io::ErrorKind`]: https://doc.rust-lang.org/1.17.0/std/io/enum.ErrorKind.html
[`diesel::result::Error`]: https://docs.rs/diesel/0.13.0/diesel/result/enum.Error.html
[`byteorder`]: https://github.com/BurntSushi/byteorder/tree/f8e7685b3a81c52f5448fd77fb4e0535bc92f880