Step 2.1: Rich types ensure correctness

Rust has a rich type system that lets us express program primitives, entities, notions, logic and semantics mostly in types rather than in data/values. This is known as the "programming with types" concept. The benefit is direct: the more the compiler knows about our problem, the more incorrect programs it will reject. Or, rephrased: the more of the program we describe in types, the lower the probability that the program is incorrect.

"Programming with types" inevitably implies its own idioms and patterns. The most common are described below.

Newtype

Consider the following example, which demonstrates a possible bug:

#[derive(Clone)]
struct Post {
    id: u64,
    user_id: u64,
    title: String,
    body: String,
}

fn repost(post: &Post, new_author_id: u64) -> Post {
    let mut new_post = post.clone();
    new_post.id = new_author_id;  // Oops!
    new_post
}

Here the problem occurs because our entities are expressed in values, so the compiler cannot distinguish Post::id from Post::user_id: they have the same type.

Let's express those entities in types:

mod post {
    #[derive(Clone, Debug, PartialEq)]
    pub struct Id(u64);

    #[derive(Clone, Debug, PartialEq)]
    pub struct Title(String);

    #[derive(Clone, Debug, PartialEq)]
    pub struct Body(String);
}
mod user {
    #[derive(Clone, Debug, PartialEq)]
    pub struct Id(u64);
}

#[derive(Clone)]
struct Post {
    id: post::Id,
    user_id: user::Id,
    title: post::Title,
    body: post::Body,
}

fn repost(post: &Post, new_author_id: user::Id) -> Post {
    let mut new_post = post.clone();
    new_post.id = new_author_id;  // Does not compile!
    new_post
}

Now the compiler is able to rule out this kind of bug entirely at compile time, and to be quite informative with errors:

error[E0308]: mismatched types
  --> src/main.rs:27:19
   |
27 |     new_post.id = new_author_id;
   |                   ^^^^^^^^^^^^^ expected struct `post::Id`, found struct `user::Id`
   |
   = note: expected type `post::Id`
              found type `user::Id`

This is called the "newtype pattern". Newtypes are a zero-cost abstraction - there is no runtime overhead. Additionally, you may enforce desired invariants on values of the type (for example, an Email type may allow only valid email address strings as its values; another good example is the uom crate). The newtype pattern also makes code more understandable for developers: domain knowledge is reflected in types, so it is described and documented more explicitly.
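As a minimal sketch of enforcing an invariant with a newtype: the inner field is private to its module, so the only way to obtain a value is through a checking constructor. The names `Email::new`, `InvalidEmail` and `as_str` are hypothetical, and the validation rule is deliberately simplistic (it is an assumption for illustration, not real email validation).

```rust
mod user {
    /// An email address that has passed the check in `Email::new`.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct Email(String); // Private field: cannot be built from outside.

    /// Hypothetical error type returned for rejected input.
    #[derive(Debug, PartialEq, Eq)]
    pub struct InvalidEmail;

    impl Email {
        /// The only way to construct an `Email`, so every value
        /// of this type is known to have passed the check.
        pub fn new(raw: &str) -> Result<Self, InvalidEmail> {
            // Simplistic rule, assumed for illustration only:
            // exactly one '@' with non-empty text on both sides.
            match raw.split_once('@') {
                Some((local, domain))
                    if !local.is_empty()
                        && !domain.is_empty()
                        && !domain.contains('@') =>
                {
                    Ok(Email(raw.to_owned()))
                }
                _ => Err(InvalidEmail),
            }
        }

        /// Read-only access to the validated string.
        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}
```

Any function that accepts `user::Email` can rely on the check having been performed, instead of re-validating a plain `String` at every call site.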

The downside of the newtype pattern is the need to write more boilerplate code, because you have to provide implementations of common traits yourself (like Clone, Copy, From/Into/AsRef/AsMut); without them the type won't be ergonomic to use. However, most of them can be derived automatically with std capabilities or third-party derive crates (like derive_more), so the cost is acceptable in most cases. Furthermore, the excellent nutype crate pushes this idea even further, aiming to provide the best ergonomics for the newtype pattern without compromising any of the guarantees it gives.
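To give a feeling for that boilerplate, here is a sketch that uses only std: the comparison and cloning traits are derived, while the conversion and formatting traits are written by hand. The `Title` type and the exact set of traits are chosen for illustration; a real type may need more or fewer of them.

```rust
use std::fmt;

/// Hypothetical newtype over a post title (illustration only).
#[derive(Clone, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Title(String);

// Wrap a raw `String`; this also gives `String: Into<Title>` for free.
impl From<String> for Title {
    fn from(raw: String) -> Self {
        Title(raw)
    }
}

// Unwrap back into the raw `String`.
impl From<Title> for String {
    fn from(title: Title) -> Self {
        title.0
    }
}

// Borrow the inner text without exposing the field.
impl AsRef<str> for Title {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

// Print the title like a plain string.
impl fmt::Display for Title {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}
```

Every hand-written impl here is a one-liner that forwards to the inner value, which is the kind of repetitive code derive crates aim to generate.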

To better understand the newtype pattern, read through the following articles:

Typestates

The newtype pattern prevents invalid use of data. But what about behavior? Can we enforce some behavioral invariants at compile time, so that the compiler rejects incorrect behavior entirely?

Not always, but in some cases yes. One possible way is to use typestates: represent (in types) the sequence of states our type can be in, and declare the transitions between these states via functions. This allows the compiler to reject incorrect state transitions at compile time.
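A minimal sketch of the idea, using a hypothetical `Order` type (unrelated to the Post example) with three assumed states. Each state is a zero-sized marker type, and each transition consumes the value in the old state and returns it in the new one, so a method is simply not available in states where it makes no sense.

```rust
use std::marker::PhantomData;

// Marker types: zero-sized, they exist only at the type level.
pub struct Draft;
pub struct Submitted;
pub struct Shipped;

/// Hypothetical order whose lifecycle state is a type parameter.
pub struct Order<State> {
    items: Vec<String>,
    _state: PhantomData<State>,
}

impl Order<Draft> {
    /// Every order starts as a draft.
    pub fn new() -> Self {
        Order { items: Vec::new(), _state: PhantomData }
    }

    /// Items can only be added while the order is a draft.
    pub fn add_item(mut self, item: &str) -> Self {
        self.items.push(item.to_owned());
        self
    }

    /// Draft -> Submitted. Takes `self` by value, so the draft
    /// cannot be used again after this call.
    pub fn submit(self) -> Order<Submitted> {
        Order { items: self.items, _state: PhantomData }
    }
}

impl Order<Submitted> {
    /// Submitted -> Shipped.
    pub fn ship(self) -> Order<Shipped> {
        Order { items: self.items, _state: PhantomData }
    }
}

// Methods that are valid in every state.
impl<State> Order<State> {
    pub fn item_count(&self) -> usize {
        self.items.len()
    }
}

// Order::new().ship();  // Does not compile: `ship` exists only on `Order<Submitted>`.
```

Because the state lives in a `PhantomData` field, it takes no space at runtime; the transitions are checked purely by the type checker.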

A real-world example of applying this idiom in Rust would be the awesome state_machine_future crate.

To better understand typestates, read through the following articles:

Task

Estimated time: 1 day

For the Post type described above, assume the following behavior in our application:

+-----+              +-------------+            +-----------+
| New |--publish()-->| Unmoderated |--allow()-->| Published |
+-----+              +-------------+            +-----------+
                           |                          |
                         deny()                    delete()
                           |       +---------+        |
                           +------>| Deleted |<-------+
                                   +---------+

Implement this behavior using the typestate idiom, so that calling delete() on a New post (or calling deny() on a Deleted post) is a compile-time error. Write simple tests for the task.

Questions

After completing everything above, you should be able to answer (and understand why) the following questions:

  • Why is expressing semantics in types good? What are the benefits and downsides?
  • What is the newtype pattern? How does it work? Which guarantees does it give?
  • What is the typestate pattern? How does it work? Which guarantees does it give?