diff --git a/www/_blog/2022-06-30-error-handling.mdx b/www/_blog/2022-06-30-error-handling.mdx new file mode 100644 index 000000000..06d45104e --- /dev/null +++ b/www/_blog/2022-06-30-error-handling.mdx @@ -0,0 +1,523 @@ +--- +title: More than you've ever wanted to know about errors in Rust +description: A (mostly) complete guide to error handling in Rust +author: ben +tags: [rust, tutorial] +thumb: ferris-error-handling.png +cover: ferris-error-handling.png +date: "2022-06-30T18:00:00" +--- + +This blog post is powered by shuttle! The serverless platform built for Rust. + +--- + +To quote the Rust Book, 'errors are a fact of life in software'. This post goes +over how to handle them. + +Before talking about recoverable errors and the `Result` type, let's first +touch on unrecoverable errors - a.k.a panics. + +## Panics + +[Panics](https://doc.rust-lang.org/std/macro.panic.html) are exceptions a program can throw. It stops all execution in the current thread. When a panic is thrown it returns a short description of what went wrong as well as information about the position of the the panic. + +```rust +fn main() { + panic!("error!"); + println!("Never reached :("); +} +``` + +Running the above causes: + +``` +thread 'main' panicked at 'error!', examples\panics.rs:2:5 +``` + +They are similar to `throw` in JavaScript and other languages, in that they +don't require an annotation on the function to run and they can pass through +function boundaries. However in Rust, panics cannot be recovered from, there is no way to incept a panic in the current thread. + +```rust +fn send_message(s: String) { + if s.is_empty() { + panic!("Cannot send empty message"); + } else { + // ... + } +} +``` + +The `send_message` function is fallible (can go wrong). If this is called +with an empty message then the program stops running. There is no way for the callee to track that an error has occurred. + +For recoverable errors, Rust has a type for error handling in the standard +library called a **`Result`**. It is a generic type, which means the result +and error variant can basically be whatever you want. + +```rust +pub enum Result { + Ok(T), + Err(E), +} +``` + +## Basic error creation and handling + +At the moment our `send_message` function doesn't return anything. This means no information can be received by the callee. We can change the definition to instead return a `Result` and rather than panicking we can early return a `Result::Err`. + +```rust +fn send_message(s: String) -> Result<(), &'static str> { + if s.is_empty() { + // Note the standard prelude includes `Err` so the `Result::Err` and `Err` are equivalent + return Result::Err("message is empty") + } else { + // ... + } + Ok(()) +} +``` + +Now our function actually returns information about what went wrong we can handle it when we call it: + +```rust +if let Err(send_error) = send_message(message) { + show_user_error(send_error); +} +``` + +### Rust knows when a Result is unused. + +In the above example we inspect the value of the item and branch on it. +However, if we didn't inspect and handle the returned Result then the Rust +compiler gives us a helpful warning about it so that you don't forget to +explicitly deal with errors in your program. + +``` +| send_message(); +| ^^^^^^^^^^^^^^^ += note: `#[warn(unused_must_use)]` on by default += note: this `Result` may be an `Err` variant, which should be handled +``` + +### Examples of `Result` in the standard library + +`Result` can be found in most libraries. One of my favorite examples is the +return type of the [FromStr::from_str](https://doc.rust-lang.org/std/str/trait.FromStr.html#tymethod.from_str) trait method. With [str::parse](https://doc.rust-lang.org/std/primitive.str.html#method.parse) (which uses the `FromStr` trait) we can do the following: + +```rust +fn main() { + let mut input = String::new(); + std::io::stdin().read_line(&mut input).unwrap(); + + match input.trim_end().parse::() { + Ok(number) => { + dbg!(number); + } + Err(err) => { + dbg!(err); + } + }; +} +``` + +(We'll ignore the `unwrap` for now 😉) + +```js +$ cargo r --example input -q +10 +[examples\input.rs:7] number = 10.0 + +$ cargo r --example input -q +100 +[examples\input.rs:7] number = 100.0 + +$ cargo r --example input -q +bad +[examples\input.rs:10] err = ParseFloatError { + kind: Invalid, +} +``` + +Here we can see when we type in a number we get a `Ok` variant with the number else we get a [ParseFloatError](https://doc.rust-lang.org/std/num/struct.ParseFloatError.html) + +## Files, networks and databases + +**All errors occur when you interact with the outside world or things +outside the Rust runtime**. One of the places where a lot of errors can +occur is interacting with the file system. The `File::open` function +attempts to open a file. This can fail for a variety of reasons. The +filename is invalid, the file doesn't exist or you simply don't have +permission to read the file. Notice the errors are well-defined and known +before-hand. You can even access the error variants with the [`kind`](https://doc.rust-lang.org/std/io/struct.Error.html#method.kind) function +and in order to implement your program logic or return an instructive error +message to the user. + +### Aliasing Results and errors + +When you're working on a project you'll often find yourself repeating +yourself when it comes to return types in function signatures: + +```rust +fn foo() -> Result { +... +} +``` + +To give a concrete example, all functions which operate on the file system have +the same +errors (file not exists, invalid permissions). [io::Result](https://doc.rust-lang.org/std/io/type.Result.html) is a alias over a result but means that every function does not have to specify the error type: + +```rust +pub type Result = Result; +``` + +If you have an API which has a common error type, you may want to +consider this pattern. + +### The question mark operator + +One of the best things about Results is the question mark operator, The +question mark operator can short circuit Result +error values. Let's look at a simple function which uploads text from a file. +This can error in a bunch of different ways: + +```rust +fn upload_file() -> Result<(), &'static str> { + let text = match std::fs::read_to_string("file.txt").map_err(|_| "read file error") { + Ok(value) => value, + Err(err) => { + return err; + } + }; + if let Err(err) = upload_text(text) { + return err + } + Ok(()) +} +``` + +Hang on, we're writing Rust not Go! + +If a `?` is postfixed on to a Result (or anything that implements [`try`](https://doc.rust-lang.org/std/ops/trait.Try.html) so also `Option`) we can +obtain a functionally equivalent outcome with a much more readable and +concise syntax. + +```rust +fn upload_file() -> Result<(), &'static str> { + let text = std::fs::read_to_string("file.txt").map_err(|_| "read file error")?; + upload_text(text)?; + Ok(()) +} +``` + +As long as the calling function also returns a `Result` with the +same `Error` type, `?` saves a ton of explicit code being written. Moreover, +the question-mark implicitly runs [Into::into](https://doc.rust-lang.org/std/convert/trait.Into.html#tymethod.into) +(which is automatically implemented for [From](https://doc.rust-lang.org/std/convert/trait.From.html) implementors) on the error value. So we don't have to worry about converting the error before we use the operator: + +```rust +// This derive an into implementation for `std::io::Error -> MyError` +#[derive(derive_enum_from_into::EnumFrom)] +enum MyError { + IoError(std::io::Error) + // ... +} + +fn do_stuff() -> Result<(), MyError> { + let file = File::open("data.csv")?; + // ... +} +``` + +We will look at more patterns for combining error types later! + +## [The Error trait](https://doc.rust-lang.org/std/error/trait.Error.html) + +The [Error](https://doc.rust-lang.org/std/error/trait.Error.html#) trait is +defined in the standard library. It basically represents the expectations of +error values - values of type `E` in `Result`. +[The Error trait is implemented for many errors](https://doc.rust-lang.org/std/error/trait.Error.html#implementors) +and provides a unified API for information on errors. The Error trait is a bit needy and requires that the error implements both [Debug](https://doc.rust-lang.org/std/fmt/trait.Debug.html) and [Display](https://doc.rust-lang.org/std/fmt/trait.Display.html). While it can be cumbersome to implement we will see some helper libraries for doing so later on. + +In the standard library [VarError](https://doc.rust-lang.org/std/env/enum.VarError.html) (for reading environment variables) and [ParseIntError](https://doc.rust-lang.org/std/num/struct.ParseIntError.html) (for parsing a string slice as a integer) are different errors. When we interact them we need to differentiate between the types because they have different properties and different stack sizes. To build a combination of them we could build a sum type using an enum. Alternatively we can use dynamically dispatched traits which handle varying stack sized items and other type information. + +Using the above mentioned try syntax (`?`) we can convert the above errors to be dynamically dispatched. This makes it easy to handle different errors without building enums to combine errors. + +```rust +fn main() -> Result<(), Box> { + let key = std::env::var("NUMBER_IN_ENV")?; + let number = key.parse::()?; + println!("\"NUMBER_IN_ENV\" is {}", number); + Ok(()) +} +``` + +While this is an easy way to handle errors, it isn't easy to differentiate +between the types and can make handling errors in libraries hard. More information on this later. + +### The Error trait vs Results and enums + +One thing when using an enum is we can use `match` to branch on the enum +error variants. On the other hand, with the `dyn` trait unless you go down +the down casting path it is very hard to get specific information about the +error: + +```rust +match my_enum_error { + FsError(err) => { + report_fs_error(err) + }, + DbError(DbError { err, database }) => { + report_db_error(database, err) + }, +} +``` + +For reusable libraries it is better to use enums to combine errors so that +users of your library can handle the specifics themselves. But for CLIs and +other applications using the trait can be a lot simpler. + +## Methods on Result + +Result and Option contains many useful functions. Here are some functions I +commonly use: + +### [Result::map](https://doc.rust-lang.org/std/result/enum.Result.html#method.map) + +This maps or converts the `Ok` value if it exists. This can be more concise +than using the `?` operator. + +```rust +fn string_to_plus_one(s: &str) -> Result { + s.parse::().map(|num| num + 1) +} +``` + +### [Result::ok](https://doc.rust-lang.org/std/result/enum.Result.html#method.ok) + +Useful for converting Results to Options + +```rust +assert_eq!(Ok(2).ok(), Some(2)); +assert_eq!(Err("err!").ok(), None); +``` + +### [Option::ok_or_else](https://doc.rust-lang.org/std/option/enum.Option.html#method.ok_or_else) + +Useful for going the other way in converting from Options to Results + +```rust +fn get_first(vec: &Vec) -> Result<&i32, NotInVec> { + vec.first().ok_or_else(|| NotInVec) +} +``` + +### Error handling for iteration + +Using results in iterator chains can be a little confusing. Luckily `Result` +implements [collect](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.collect). +We can use this to short circuit an iterator if an error +occurs. In the following, if all the `parse`s succeed then we get collected vec of numbers result. If one fails then it instead returns a Result with the failing Err. + +```rust +fn main() { + let a = ["1", "2", "not a number"] + .into_iter() + .map(|a| a.parse::()) + .collect::, _>>(); + dbg!(a); +} +``` + +``` +[examples\iteration.rs:6] a = Err( ParseFloatError { kind: Invalid, }, ) +``` + +Removing the `"not a number"` entry + +``` +[examples\iteration.rs:3] a = Ok( [ 1.0, 2.0, ], ) +``` + +Because Rust iterators are *piecewise* and lazy the iterator can short circuit without evaluating parse on any of the later items. + +## More Panic + +### Special panics + +`todo!()`, `unimplemented!()`, `unreachable!()` are all wrappers for `panic! +()` which but are specialized to their situation. Panics have a special [`!`](https://doc.rust-lang.org/reference/types/never.html) +type, called the 'never type', which represents the result of computations +that never complete (also means it can be passed anywhere): + +```rust +fn func_i_havent_written_yet() -> u32 { + todo!() +} +``` + +Sometimes there is Rust code which the compiler cannot properly infer is +valid. For this type of situation, the `unreachable!` panic can be used: + +```rust +fn get_from_vec_else_zero(a: Vec) -> i32 { + if let Some(value) = a.get(2) { + if let Some(prev_value) = a.get(1) { + prev_value + } else { + unreachable!() + } + } else { + 0 + } +} +``` + +### Unwrapping + +`unwrap` is a method on `Result` and `Option`. They return the `Ok` or `Some` +variant or else panic... + +```rust +// result.unwrap() + +let value = if let Ok(value) = result { + value +} else { + panic!("Unwrapped!") +}; +``` + +The uses-cases for this are developer error and situations the compiler can't +quite figure out. +If you are just trying something and don't want to set up a full error handling system then they can be used to ignore compiler warnings. + +Even if the situation calls for `unwrap` you are better off using `expect` +which has an accompanying message - you'll be thanking your past self when the +`expect` error message helps you find the root cause of an issue 2 weeks +down the line. + +### Panics in the standard library + +It is important to note that some of the APIs in the standard library *can* +panic. You should look out for these annotations in the docs. One of them is +[Vec::remove](https://doc.rust-lang.org/std/vec/struct.Vec.html#panics-6). +If you use this you should ensure that the argument is in its indexable range. + +```rust +fn remove_at_idx(a: usize, vec: &mut Vec) -> Option { + if a < idx.len() { + Some(vec.remove(a)) + } else { + None + } +} +``` + +## Handling multiple errors and helper crates: + +Handling errors from multiple libraries and APIs can become challenging as +you have to deal with a bunch of different types. They are different sizes +and contain different information. To unify the types we have to build a sum +type using an enum, in order to ensure they have the same size at compile time. + +```rust +enum Errors { + FileSystemError(..), + StringParseError(..), + NetworkError(..), +} +``` + +Some crates for making creating these unifying enums easier: + +### [thiserror](https://crates.io/crates/thiserror) + +`thiserror` provides a derive implementation which adds the Error trait for us. +As previously mentioned, to implement Error we have to implement display and +thiserrors' `#[error]` attributes provide templating for the displayed errors. + +```rust +use thiserror::Error; + +#[derive(Error, Debug)] +pub enum DataStoreError { + #[error("data store disconnected")] + Disconnect(#[from] io::Error), + #[error("the data for key `{0}` is not available")] + Redaction(String), + #[error("invalid header (expected {expected:?}, found {found:?})")] + InvalidHeader { + expected: String, + found: String, + }, + #[error("unknown data store error")] + Unknown, +} +``` + +### [anyhow](https://crates.io/crates/anyhow) + +`anyhow` provides an ergonomic and idiomatic alternative to explicitly +handling errors. It is similar to the previously mentioned error trait but +has additional features such as adding context to thrown errors. + +This is really, really, useful when you want to convey errors to an +application's users in a context-aware fashion: + +```rust +use anyhow::{bail, Result, Context}; + +fn main() -> Result<()> { + println!("Hello World!"); + func1().context("while calling func1")?; + Ok(()) +} + +fn func1() -> Result<()> { + func2().context("while calling func2") +} + +fn func2() -> Result<()> { + bail!("Hmm something went wrong ") +} +``` + +``` +Error: while calling func1 + +Caused by: + 0: while calling func2 + 1: Hmm something went wrong +``` + + +Similar to the `Error` trait, `anyhow` suffers from the fact you can't match on +`anyhow`'s result error variant. This is why it is suggested in `anyhow`'s +docs to use `anyhow` for applications and `thiserror` for libraries. + +### [eyre](https://crates.io/crates/eyre) + +Finally, `eyre` is a fork of `anyhow` and adds more backtrace information. +It's highly customisable and using [color-eyre](https://lib.rs/crates/color-eyre) we get colors in our panic messages - a little color +always brightens up the dev experience. + +``` +The application panicked (crashed). +Message: test +Location: examples\color_eyre.rs:6 + + ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ + ⋮ 13 frames hidden ⋮ + 14: core::ops::function::FnOnce::call_once,eyre::Report>, 1, 18446744073709551615, Err> (*)(),tuple$<> > + at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c\library\core\src\ops\function.rs:227 + ⋮ 17 frames hidden ⋮ +``` + +## [Shuttle](https://www.shuttle.rs/): Stateful Serverless for Rust + +Deploying and managing your Rust web apps can be an expensive, anxious and time consuming process. + +If you want a batteries included and ops-free experience, [try out Shuttle](https://docs.rs/shuttle-service/latest/shuttle_service/). diff --git a/www/public/images/blog/ferris-error-handling.png b/www/public/images/blog/ferris-error-handling.png new file mode 100644 index 000000000..4a7316394 Binary files /dev/null and b/www/public/images/blog/ferris-error-handling.png differ