Rust has a rich type system that allows us to express program primitives, entities, notions, logic and semantics mostly in types rather than in data/values, an approach known as "programming with types". The benefit is straightforward: the more the compiler knows about our problem, the more incorrect programs it will reject. Or, rephrased: the more of the program we describe in types, the lower the probability that the program is incorrect.
"Programming with types" inevitably implies its own idioms and patterns. The most common are described below.
Consider the following example, which demonstrates a possible bug:
#[derive(Clone)]
struct Post {
    id: u64,
    user_id: u64,
    title: String,
    body: String,
}

fn repost(post: &Post, new_author_id: u64) -> Post {
    let mut new_post = post.clone();
    new_post.id = new_author_id; // Oops!
    new_post
}
Here the problem occurs because our entities are expressed in values, so the compiler cannot distinguish between Post::id and Post::user_id, as they have the same type.
Let's express those entities in types:
mod post {
    #[derive(Clone, Debug, PartialEq)]
    pub struct Id(u64);

    #[derive(Clone, Debug, PartialEq)]
    pub struct Title(String);

    #[derive(Clone, Debug, PartialEq)]
    pub struct Body(String);
}

mod user {
    #[derive(Clone, Debug, PartialEq)]
    pub struct Id(u64);
}

#[derive(Clone)]
struct Post {
    id: post::Id,
    user_id: user::Id,
    title: post::Title,
    body: post::Body,
}

fn repost(post: &Post, new_author_id: user::Id) -> Post {
    let mut new_post = post.clone();
    new_post.id = new_author_id; // Does not compile!
    new_post
}
Now the compiler is able to rule out this kind of bug entirely at compile time, and to be quite informative with errors:
error[E0308]: mismatched types
  --> src/main.rs:27:19
   |
27 |     new_post.id = new_author_id;
   |                   ^^^^^^^^^^^^^ expected struct `post::Id`, found struct `user::Id`
   |
   = note: expected type `post::Id`
              found type `user::Id`
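For comparison, here is a sketch of how the function might look once the mix-up is fixed. It assumes the intent was to change only the author (user_id) and keep everything else as is. The tuple fields are made pub here purely so that values can be constructed outside their modules, which the snippet above does not do.

```rust
mod post {
    #[derive(Clone, Debug, PartialEq)]
    pub struct Id(pub u64); // `pub` field is an assumption made for this sketch

    #[derive(Clone, Debug, PartialEq)]
    pub struct Title(pub String);

    #[derive(Clone, Debug, PartialEq)]
    pub struct Body(pub String);
}

mod user {
    #[derive(Clone, Debug, PartialEq)]
    pub struct Id(pub u64);
}

#[derive(Clone, Debug, PartialEq)]
pub struct Post {
    pub id: post::Id,
    pub user_id: user::Id,
    pub title: post::Title,
    pub body: post::Body,
}

/// Copies `post` and assigns the copy to a new author.
pub fn repost(post: &Post, new_author_id: user::Id) -> Post {
    let mut new_post = post.clone();
    // Only a `user::Id` fits this field; `new_post.id = new_author_id`
    // would be rejected with the E0308 error shown above.
    new_post.user_id = new_author_id;
    new_post
}
```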
This is called the "newtype pattern". Newtypes are a zero-cost abstraction: there is no runtime overhead. Additionally, you may enforce desired invariants on values of the type (for example, an Email type may allow only valid email address strings as its values; another good example is the uom crate). The newtype pattern also makes code more understandable for developers, as domain knowledge is reflected in types and is therefore described and documented more explicitly.
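A minimal sketch of enforcing an invariant with a newtype might look as follows. The names email, Email, InvalidEmail, parse and as_str are hypothetical, and the validation rule is deliberately simplistic (real email validation is much stricter). The point is only that the private field leaves the checked constructor as the single way to obtain a value from outside the module.

```rust
pub mod email {
    /// An email address that has passed the (simplistic) check in `parse`.
    /// The inner field is private, so code outside this module cannot
    /// construct an `Email` without going through `parse`.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct Email(String);

    /// Error returned when the input does not satisfy the check.
    #[derive(Debug, PartialEq, Eq)]
    pub struct InvalidEmail;

    impl Email {
        /// Illustrative rule only: exactly one `@`, with non-empty text
        /// on both sides of it.
        pub fn parse(raw: &str) -> Result<Self, InvalidEmail> {
            match raw.split_once('@') {
                Some((local, domain))
                    if !local.is_empty()
                        && !domain.is_empty()
                        && !domain.contains('@') =>
                {
                    Ok(Email(raw.to_owned()))
                }
                _ => Err(InvalidEmail),
            }
        }

        /// Read-only access to the validated string.
        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}
```

Any function that accepts an email::Email can then rely on the check having already happened, instead of re-validating a plain String.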
The downside of the newtype pattern is the need to write more boilerplate code, because you have to provide implementations of common traits yourself (like Clone, Copy, From/Into/AsRef/AsMut), as without them the type won't be ergonomic to use. However, most of them can be derived automatically with std capabilities or third-party derive crates (like derive_more), so the cost is acceptable in most cases. Furthermore, the excellent nutype crate pushes this idea even further, aiming to provide the best ergonomics for the newtype pattern without compromising any of the guarantees it gives.
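To give a feel for the boilerplate in question, here is a sketch that uses only std: the built-in derives cover Clone, Copy and the comparison traits, while the conversions are written by hand. UserId is a hypothetical type chosen for this illustration. Crates such as derive_more aim to generate the hand-written part, but their exact syntax is not shown here.

```rust
/// Hypothetical newtype used only for this illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct UserId(u64);

// Wrapping: `u64` -> `UserId`. Implementing `From` also provides
// `Into<UserId> for u64` through the blanket impl in std.
impl From<u64> for UserId {
    fn from(raw: u64) -> Self {
        UserId(raw)
    }
}

// Unwrapping: `UserId` -> `u64`.
impl From<UserId> for u64 {
    fn from(id: UserId) -> Self {
        id.0
    }
}

// Cheap borrowed access to the inner value.
impl AsRef<u64> for UserId {
    fn as_ref(&self) -> &u64 {
        &self.0
    }
}
```

Whether to expose conversions like these at all is a design choice: for a type that enforces an invariant, an unrestricted From impl would bypass the check.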
For a better understanding of the newtype pattern, read through the following articles:
- Rust Design Patterns: Newtype
- Rust By Example: 14.7. New Type Idiom
- Alexis King: Parse, don’t validate (ru)
- Stefan Baumgartner: Refactoring in Rust: Abstraction with the Newtype Pattern
- Official nutype crate docs
The newtype pattern prevents invalid use of data. But what about behavior? Can we enforce some behavioral invariants at compile time, so that the compiler rules out incorrect behavior entirely?
Not always, but in some cases yes. One possible way is to use typestates: represent (in types) the sequence of states our type can be in, and declare transitions (via functions) between these states. Doing so allows the compiler to reject incorrect state transitions at compile time.
A real-world example of applying this idiom in Rust is the awesome state_machine_future crate.
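The idea can be sketched in plain Rust without any crate. The example below is a minimal illustration with hypothetical names (Connection, Closed, Open and their methods), not the API of state_machine_future: each state is a zero-sized marker type, and each transition consumes the value in one state and returns it in another.

```rust
use std::marker::PhantomData;

// State markers: zero-sized types that exist only at the type level.
pub struct Closed;
pub struct Open;

/// Hypothetical connection whose current state is the type parameter `S`.
pub struct Connection<S> {
    address: String,
    sent: usize,
    _state: PhantomData<S>,
}

impl Connection<Closed> {
    /// A connection always starts in the `Closed` state.
    pub fn new(address: &str) -> Self {
        Connection { address: address.to_owned(), sent: 0, _state: PhantomData }
    }

    /// Transition `Closed -> Open`; takes `self` by value, so the closed
    /// connection cannot be used afterwards.
    pub fn open(self) -> Connection<Open> {
        Connection { address: self.address, sent: self.sent, _state: PhantomData }
    }
}

impl Connection<Open> {
    /// Exists only in the `Open` state; here it merely counts messages.
    pub fn send(&mut self, _message: &str) {
        self.sent += 1;
    }

    /// Number of messages sent so far.
    pub fn sent_count(&self) -> usize {
        self.sent
    }

    /// Transition `Open -> Closed`.
    pub fn close(self) -> Connection<Closed> {
        Connection { address: self.address, sent: self.sent, _state: PhantomData }
    }
}

// Behavior shared by every state.
impl<S> Connection<S> {
    pub fn address(&self) -> &str {
        &self.address
    }
}
```

With this sketch, calling send() on a Connection<Closed> does not compile, because send() is defined only for Connection<Open>. And since open() and close() take self by value, a value in the old state cannot be reused after a transition.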
For a better understanding of typestates, read through the following articles:
- David Teller: Typestates in Rust
- Cliff L. Biffle: The Typestate Pattern in Rust
- Ana Hobden: Pretty State Machine Patterns in Rust
- Will Crichton: Type-level Programming in Rust
- Sergey Potapov: Builder with typestate in Rust
- State Pattern - Design Patterns
- Azriel Hoh: Compile Time Correctness: Type State
Estimated time: 1 day
For the Post type described above, assume the following behavior in our application:
+-----+              +-------------+            +-----------+
| New |--publish()-->| Unmoderated |--allow()-->| Published |
+-----+              +-------------+            +-----------+
                            |                          |
                          deny()                   delete()
                            |       +---------+        |
                            +------>| Deleted |<-------+
                                    +---------+
Implement this behavior using the typestates idiom, so that calling delete() on a New post (or calling deny() on a Deleted post) is a compile-time error.
Write simple tests for the task.
After completing everything above, you should be able to answer (and understand why) the following questions:
- Why is expressing semantics in types good? What are the benefits and downsides?
- What is the newtype pattern? How does it work? Which guarantees does it give?
- What is the typestates pattern? How does it work? Which guarantees does it give?