diff --git a/src/doc/intro.md b/src/doc/intro.md index 14027b93419bd..b45610faebc41 100644 --- a/src/doc/intro.md +++ b/src/doc/intro.md @@ -1,435 +1,576 @@ % A 30-minute Introduction to Rust -Rust is a systems programming language that combines strong compile-time correctness guarantees with fast performance. -It improves upon the ideas of other systems languages like C++ -by providing guaranteed memory safety (no crashes, no data races) and complete control over the lifecycle of memory. -Strong memory guarantees make writing correct concurrent Rust code easier than in other languages. -This introduction will give you an idea of what Rust is like in about thirty minutes. -It expects that you're at least vaguely familiar with a previous 'curly brace' language, -but does not require prior experience with systems programming. -The concepts are more important than the syntax, -so don't worry if you don't get every last detail: -the [guide](guide.html) can help you out with that later. - -Let's talk about the most important concept in Rust, "ownership," -and its implications on a task that programmers usually find very difficult: concurrency. - -# The power of ownership - -Ownership is central to Rust, -and is the feature from which many of Rust's powerful capabilities are derived. -"Ownership" refers to which parts of your code are allowed to read, -write, and ultimately release, memory. -Let's start by looking at some C++ code: - -```cpp -int* dangling(void) -{ - int i = 1234; - return &i; -} +Rust is a modern systems programming language focusing on safety and speed. It +accomplishes these goals by being memory safe without using garbage collection. -int add_one(void) -{ - int* num = dangling(); - return *num + 1; -} +This introduction will give you a rough idea of what Rust is like, eliding many +details. 
It does not require prior experience with systems programming, but you +may find the syntax easier if you've used a 'curly brace' programming language +before, like C or JavaScript. The concepts are more important than the syntax, +so don't worry if you don't get every last detail: you can read [the +Guide](guide.html) to get a more complete explanation. + +Because this is about high-level concepts, you don't need to actually install +Rust to follow along. If you'd like to anyway, check out [the +homepage](http://rust-lang.org) for explanation. + +To show off Rust, let's talk about how easy it is to get started with Rust. +Then, we'll talk about Rust's most interesting feature, **ownership**, and +then discuss how it makes concurrency easier to reason about. Finally, +we'll talk about how Rust breaks down the perceived dichotomy between speed +and safety. + +# Tools + +Getting started on a new Rust project is incredibly easy, thanks to Rust's +package manager, [Cargo](http://crates.io). + +To start a new project with Cargo, use `cargo new`: + +```{bash} +$ cargo new hello_world --bin ``` -**Note: The above C++ code is deliberately simple and non-idiomatic for the purpose -of demonstration. It is not representative of production-quality C++ code.** - -This function allocates an integer on the stack, -and stores it in a variable, `i`. -It then returns a reference to the variable `i`. -There's just one problem: -stack memory becomes invalid when the function returns. -This means that in the second line of `add_one`, -`num` points to some garbage values, -and we won't get the effect that we want. -While this is a trivial example, -it can happen quite often in C++ code. -There's a similar problem when memory on the heap is allocated with `malloc` (or `new`), -then freed with `free` (or `delete`), -yet your code attempts to do something with the pointer to that memory. -This problem is called a 'dangling pointer,' -and it's not possible to write Rust code that has it. 
-Let's try writing it in Rust: - -```ignore -fn dangling() -> &int { - let i = 1234; - return &i; -} +We're passing `--bin` because we're making a binary program: if we +were making a library, we'd leave it off. -fn add_one() -> int { - let num = dangling(); - return *num + 1; -} +Let's check out what Cargo has generated for us: + +```{bash} +$ cd hello_world +$ tree . +. +├── Cargo.toml +└── src + └── main.rs + +1 directory, 2 files +``` + +This is all we need to get started. First, let's check out `Cargo.toml`: +```{toml} +[package] + +name = "hello_world" +version = "0.0.1" +authors = ["Your Name "] +``` + +This is called a **manifest**, and it contains all of the metadata that Cargo +needs to compile your project. + +Here's what's in `src/main.rs`: + +```{rust} fn main() { - add_one(); + println!("Hello, world!") } ``` -Save this program as `dangling.rs`. When you try to compile this program with `rustc dangling.rs`, you'll get an interesting (and long) error message: - -```text -dangling.rs:3:12: 3:14 error: `i` does not live long enough -dangling.rs:3 return &i; - ^~ -dangling.rs:1:23: 4:2 note: reference must be valid for the anonymous lifetime #1 defined on the block at 1:22... -dangling.rs:1 fn dangling() -> &int { -dangling.rs:2 let i = 1234; -dangling.rs:3 return &i; -dangling.rs:4 } -dangling.rs:1:23: 4:2 note: ...but borrowed value is only valid for the block at 1:22 -dangling.rs:1 fn dangling() -> &int { -dangling.rs:2 let i = 1234; -dangling.rs:3 return &i; -dangling.rs:4 } -error: aborting due to previous error +Cargo generated a 'hello world' for us. We'll talk more about the syntax here +later, but that's what Rust code looks like! Let's compile and run it: + +```{bash} +$ cargo run + Compiling hello_world v0.0.1 (file:///Users/you/src/hello_world) + Running `target/hello_world` +Hello, world! ``` -In order to fully understand this error message, -we need to talk about what it means to "own" something. 
-So for now, -let's just accept that Rust will not allow us to write code with a dangling pointer, -and we'll come back to this code once we understand ownership. - -Let's forget about programming for a second and talk about books. -I like to read physical books, -and sometimes I really like one and tell my friends they should read it. -While I'm reading my book, I own it: the book is in my possession. -When I loan the book out to someone else for a while, they "borrow" it from me. -And when you borrow a book, it's yours for a certain period of time, -and then you give it back to me, and I own it again. Right? - -This concept applies directly to Rust code as well: -some code "owns" a particular pointer to memory. -It's the sole owner of that pointer. -It can also lend that memory out to some other code for a while: -that code "borrows" the memory, -and it borrows it for a precise period of time, -called a "lifetime." - -That's all there is to it. -That doesn't seem so hard, right? -Let's go back to that error message: -`error: 'i' does not live long enough`. -We tried to loan out a particular variable, `i`, -using a reference (the `&` operator) but Rust knew that the variable would be invalid after the function returns, -and so it tells us that: -`reference must be valid for the anonymous lifetime #1...`. -Neat! - -That's a great example for stack memory, -but what about heap memory? -Rust has a second kind of pointer, -an 'owned box', -that you can create with the `box` operator. -Check it out: +Using an external dependency in Rust is incredibly easy. You add a line to +your `Cargo.toml`: +```{toml} +[package] + +name = "hello_world" +version = "0.0.1" +authors = ["Your Name "] + +[dependencies.semver] + +git = "https://github.com/rust-lang/semver.git" ``` -fn dangling() -> Box { - let i = box 1234i; - return i; -} +You added the `semver` library, which parses version numbers and compares them +according to the [SemVer specification](http://semver.org/). 
+ +Now, you can pull in that library using `extern crate` in +`main.rs`. + +```{rust,ignore} +extern crate semver; -fn add_one() -> int { - let num = dangling(); - return *num + 1; +use semver::Version; + +fn main() { + assert!(Version::parse("1.2.3") == Ok(Version { + major: 1u, + minor: 2u, + patch: 3u, + pre: vec!(), + build: vec!(), + })); + + println!("Versions compared successfully!"); } ``` -Now instead of a stack allocated `1234i`, -we have a heap allocated `box 1234i`. -Whereas `&` borrows a pointer to existing memory, -creating an owned box allocates memory on the heap and places a value in it, -giving you the sole pointer to that memory. -You can roughly compare these two lines: +Again, we'll discuss the exact details of all of this syntax soon. For now, +let's compile and run it: +```{bash} +$ cargo run + Updating git repository `https://github.com/rust-lang/semver.git` + Compiling semver v0.0.1 (https://github.com/rust-lang/semver.git#bf739419) + Compiling hello_world v0.0.1 (file:///home/you/projects/hello_world) + Running `target/hello_world` +Versions compared successfully! ``` -// Rust -let i = box 1234i; + +Because we only specified a repository without a version, if someone else were +to try out our project at a later date, when `semver` was updated, they would +get a different, possibly incompatible version. To solve this problem, Cargo +produces a file, `Cargo.lock`, which records the versions of any dependencies. +This gives us repeatable builds. + +There is a lot more here, and this is a whirlwind tour, but you should feel +right at home if you've used tools like [Bundler](http://bundler.io/), +[npm](https://www.npmjs.org/), or [pip](https://pip.pypa.io/en/latest/). +There's no `Makefile`s or endless `autotools` output here. (Rust's tooling does +[play nice with external libraries written in those +tools](http://crates.io/native-build.html), if you need to.) + +Enough about tools, let's talk code! 
+ +# Ownership + +Rust's defining feature is 'memory safety without garbage collection.' Let's +take a moment to talk about what that means. **Memory safety** means that the +programming language eliminates certain kinds of bugs, such as [buffer +overflows](http://en.wikipedia.org/wiki/Buffer_overflow) and [dangling +pointers](http://en.wikipedia.org/wiki/Dangling_pointer). These problems occur +when you have unrestricted access to memory. As an example, here's some Ruby +code: + +```{ruby} +v = []; + +v.push("Hello"); + +x = v[0]; + +v.push("world"); + +puts x ``` -```cpp -// C++ -int *i = new int; -*i = 1234; +We make an array, `v`, and then call `push` on it. `push` is a method which +adds an element to the end of an array. + +Next, we make a new variable, `x`, that's equal to the first element of +the array. Simple, but this is where the 'bug' will appear. + +Let's keep going. We then call `push` again, pushing "world" onto the +end of the array. `v` now is `["Hello", "world"]`. + +Finally, we print `x` with the `puts` method. This prints "Hello." + +All good? Let's go over a similar, but subtly different example, in C++: + +```{cpp} +#include <iostream> +#include <vector> +#include <string> + +int main() { + std::vector<std::string> v; + + v.push_back("Hello"); + + std::string& x = v[0]; + + v.push_back("world"); + + std::cout << x; +} ``` -Rust infers the correct type, -allocates the correct amount of memory and sets it to the value you asked for. -This means that it's impossible to allocate uninitialized memory: -*Rust does not have the concept of null*. -Hooray! -There's one other difference between this line of Rust and the C++: -The Rust compiler also figures out the lifetime of `i`, -and then inserts a corresponding `free` call after it's invalid, -like a destructor in C++. -You get all of the benefits of manually allocated heap memory without having to do all the bookkeeping yourself. -Furthermore, all of this checking is done at compile time, -so there's no runtime overhead.
-You'll get (basically) the exact same code that you'd get if you wrote the correct C++, -but it's impossible to write the incorrect version, thanks to the compiler. - -You've seen one way that ownership and borrowing are useful to prevent code that would normally be dangerous in a less-strict language, -but let's talk about another: concurrency. - -# Owning concurrency - -Concurrency is an incredibly hot topic in the software world right now. -It's always been an interesting area of study for computer scientists, -but as usage of the Internet explodes, -people are looking to improve the number of users a given service can handle. -Concurrency is one way of achieving this goal. -There is a pretty big drawback to concurrent code, though: -it can be hard to reason about, because it is non-deterministic. -There are a few different approaches to writing good concurrent code, -but let's talk about how Rust's notions of ownership and lifetimes contribute to correct but concurrent code. - -First, let's go over a simple concurrency example. -Rust makes it easy to create "tasks", -otherwise known as "threads". -Typically, tasks do not share memory but instead communicate amongst each other with 'channels', like this: +It's a little more verbose due to the static typing, but it's almost the same +thing. We make a `std::vector` of `std::string`s, we call `push_back` (same as +`push`) on it, take a reference to the first element of the vector, call +`push_back` again, and then print out the reference. +There's two big differences here: one, they're not _exactly_ the same thing, +and two... + +```{bash} +$ g++ hello.cpp -Wall -Werror +$ ./a.out +Segmentation fault (core dumped) ``` + +A crash! (Note that this is actually system-dependent. Because referring to an +invalid reference is undefined behavior, the compiler can do anything, +including the right thing!) 
Even though we compiled with flags to give us as +many warnings as possible, and to treat those warnings as errors, we got no +errors. When we ran the program, it crashed. + +Why does this happen? When we append to an array, its length changes. Since +its length changes, we may need to allocate more memory. In Ruby, this happens +as well; we just don't think about it very often. So why does the C++ version +segfault when we allocate more memory? + +The answer is that in the C++ version, `x` is a **reference** to the memory +location where the first element of the array is stored. But in Ruby, `x` is a +standalone value, not connected to the underlying array at all. Let's dig into +the details for a moment. Your program has access to memory, provided to it by +the operating system. Each location in memory has an address. So when we make +our vector, `v`, it's stored in a memory location somewhere: + +| location | name | value | +|----------|------|-------| +| 0x30 | v | | + +(Address numbers made up, and in hexadecimal. Those of you with deep C++ +knowledge, there are some simplifications going on here, like the lack of an +allocated length for the vector. This is an introduction.) + +When we push our first string onto the array, we allocate some memory, +and `v` refers to it: + +| location | name | value | +|----------|------|----------| +| 0x30 | v | 0x18 | +| 0x18 | | "Hello" | + +We then make a reference to that first element. A reference is a variable +that points to a memory location, so its value is the memory location of +the `"Hello"` string: + +| location | name | value | +|----------|------|----------| +| 0x30 | v | 0x18 | +| 0x18 | | "Hello" | +| 0x14 | x | 0x18 | + +When we push `"world"` onto the vector with `push_back`, there's no room: +we only allocated one element. So, we need to allocate two elements, +copy the `"Hello"` string over, and update the reference.
Like this: + +| location | name | value | +|----------|------|----------| +| 0x30 | v | 0x08 | +| 0x18 | | GARBAGE | +| 0x14 | x | 0x18 | +| 0x08 | | "Hello" | +| 0x04 | | "world" | + +Note that `v` now refers to the new list, which has two elements. It's all +good. But our `x` didn't get updated! It still points at the old location, +which isn't valid anymore. In fact, [the documentation for `push_back` mentions +this](http://en.cppreference.com/w/cpp/container/vector/push_back): + +> If the new `size()` is greater than `capacity()` then all iterators and +> references (including the past-the-end iterator) are invalidated. + +Finding where these iterators and references are is a difficult problem, and +even in this simple case, `g++` can't help us here. While the bug is obvious in +this case, in real code, it can be difficult to track down the source of the +error. + +Before we talk about this solution, why didn't our Ruby code have this problem? +The semantics are a little more complicated, and explaining Ruby's internals is +out of the scope of a guide to Rust. But in a nutshell, Ruby's garbage +collector keeps track of references, and makes sure that everything works as +you might expect. This comes at an efficiency cost, and the internals are more +complex. If you'd really like to dig into the details, [this +article](http://patshaughnessy.net/2012/1/18/seeing-double-how-ruby-shares-string-values) +can give you more information. + +Garbage collection is a valid approach to memory safety, but Rust chooses a +different path. Let's examine what the Rust version of this looks like: + +```{rust,ignore} fn main() { - let numbers = vec![1i, 2i, 3i]; + let mut v = vec![]; - let (tx, rx) = channel(); - tx.send(numbers); + v.push("Hello"); - spawn(proc() { - let numbers = rx.recv(); - println!("{}", numbers[0]); - }) + let x = &v[0]; + + v.push("world"); + + println!("{}", x); } ``` -In this example, we create a boxed array of numbers. 
-We then make a 'channel', -Rust's primary means of passing messages between tasks. -The `channel` function returns two different ends of the channel: -a `Sender` and `Receiver` (commonly abbreviated `tx` and `rx`). -The `spawn` function spins up a new task, -given a *heap allocated closure* to run. -As you can see in the code, -we call `tx.send()` from the original task, -passing in our boxed array, -and we call `rx.recv()` (short for 'receive') inside of the new task: -values given to the `Sender` via the `send` method come out the other end via the `recv` method on the `Receiver`. - -Now here's the exciting part: -because `numbers` is an owned type, -when it is sent across the channel, -it is actually *moved*, -transferring ownership of `numbers` between tasks. -This ownership transfer is *very fast* - -in this case simply copying a pointer - -while also ensuring that the original owning task cannot create data races by continuing to read or write to `numbers` in parallel with the new owner. - -To prove that Rust performs the ownership transfer, -try to modify the previous example to continue using the variable `numbers`: - -```ignore +This looks like a bit of both: fewer type annotations, but we do create new +variables with `let`. The method name is `push`, some other stuff is different, +but it's pretty close. So what happens when we compile this code? Does Rust +print `"Hello"`, or does Rust crash? + +Neither. It refuses to compile: + +```{notrust,ignore} +$ cargo run + Compiling hello_world v0.0.1 (file:///Users/you/src/hello_world) +main.rs:8:5: 8:6 error: cannot borrow `v` as mutable because it is also borrowed as immutable +main.rs:8 v.push("world"); + ^ +main.rs:6:14: 6:15 note: previous borrow of `v` occurs here; the immutable borrow prevents subsequent moves or mutable borrows of `v` until the borrow ends +main.rs:6 let x = &v[0]; + ^ +main.rs:11:2: 11:2 note: previous borrow ends here +main.rs:1 fn main() { +... 
+main.rs:11 } + ^ +error: aborting due to previous error +``` + +When we try to mutate the array by `push`ing it the second time, Rust throws +an error. It says that we "cannot borrow v as mutable because it is also +borrowed as immutable." What's up with "borrowed"? + +In Rust, the type system encodes the notion of **ownership**. The variable `v` +is an "owner" of the vector. When we make a reference to `v`, we let that +variable (in this case, `x`) 'borrow' it for a while. Just like if you own a +book, and you lend it to me, I'm borrowing the book. + +So, when I try to modify the vector with the second call to `push`, I need +to be owning it. But `x` is borrowing it. You can't modify something that +you've lent to someone. And so Rust throws an error. + +So how do we fix this problem? Well, we can make a copy of the element: + + +```{rust} fn main() { - let numbers = vec![1i, 2i, 3i]; + let mut v = vec![]; - let (tx, rx) = channel(); - tx.send(numbers); + v.push("Hello"); - spawn(proc() { - let numbers = rx.recv(); - println!("{}", numbers[0]); - }); + let x = v[0].clone(); + + v.push("world"); - // Try to print a number from the original task - println!("{}", numbers[0]); + println!("{}", x); } ``` -The compiler will produce an error indicating that the value is no longer in scope: +Note the addition of `clone()`. This creates a copy of the element, leaving +the original untouched. Now, we no longer have two references to the same +memory, and so the compiler is happy. Let's give that a try: -```text -concurrency.rs:12:20: 12:27 error: use of moved value: 'numbers' -concurrency.rs:12 println!("{}", numbers[0]); - ^~~~~~~ +```{bash} +$ cargo run + Compiling hello_world v0.0.1 (file:///Users/you/src/hello_world) + Running `target/hello_world` +Hello ``` -Since only one task can own a boxed array at a time, -if instead of distributing our `numbers` array to a single task we wanted to distribute it to many tasks, -we would need to copy the array for each. 
-Let's see an example that uses the `clone` method to create copies of the data: +Same result. Now, making a copy can be inefficient, so this solution may not be +acceptable. There are other ways to get around this problem, but this is a toy +example, and because we're in an introduction, we'll leave that for later. -``` -fn main() { - let numbers = vec![1i, 2i, 3i]; +The point is, the Rust compiler and its notion of ownership has saved us from a +bug that would crash the program. We've achieved safety, at compile time, +without needing to rely on a garbage collector to handle our memory. + +# Concurrency + +Rust's ownership model can help in other ways, as well. For example, take +concurrency. Concurrency is a big topic, and an important one for any modern +programming language. Let's take a look at how ownership can help you write +safe concurrent programs. - for num in range(0u, 3) { - let (tx, rx) = channel(); - // Use `clone` to send a *copy* of the array - tx.send(numbers.clone()); +Here's an example of a concurrent Rust program: +```{rust} +fn main() { + for _ in range(0u, 10u) { spawn(proc() { - let numbers = rx.recv(); - println!("{:d}", numbers[num]); - }) + println!("Hello, world!"); + }); } } ``` -This is similar to the code we had before, -except now we loop three times, -making three tasks, -and *cloning* `numbers` before sending it. +This program creates ten threads, who all print `Hello, world!`. The `spawn` +function takes one argument, a `proc`. 'proc' is short for 'procedure,' and is +a form of closure. This closure is executed in a new thread, created by `spawn` +itself. -However, if we're making a lot of tasks, -or if our data is very large, -creating a copy for each task requires a lot of work and a lot of extra memory for little benefit. -In practice, we might not want to do this because of the cost. -Enter `Arc`, -an atomically reference counted box ("A.R.C." == "atomically reference counted"). 
-`Arc` is the most common way to *share* data between tasks. -Here's some code: +One common form of problem in concurrent programs is a 'data race.' This occurs +when two different threads attempt to access the same location in memory in a +non-synchronized way, where at least one of them is a write. If one thread is +attempting to read, and one thread is attempting to write, you cannot be sure +that your data will not be corrupted. Note the first half of that requirement: +two threads that attempt to access the same location in memory. Rust's +ownership model can track which pointers own which memory locations, which +solves this problem. -``` -use std::sync::Arc; +Let's see an example. This Rust code will not compile: +```{rust,ignore} fn main() { - let numbers = Arc::new(vec![1i, 2i, 3i]); - - for num in range(0u, 3) { - let (tx, rx) = channel(); - tx.send(numbers.clone()); + let mut numbers = vec![1i, 2i, 3i]; + for i in range(0u, 3u) { spawn(proc() { - let numbers = rx.recv(); - println!("{:d}", (*numbers)[num as uint]); - }) + for j in range(0, 3) { numbers[j] += 1 } + }); } } ``` -This is almost exactly the same, -except that this time `numbers` is first put into an `Arc`. -`Arc::new` creates the `Arc`, -`.clone()` makes another `Arc` that refers to the same contents. -So we clone the `Arc` for each task, -send that clone down the channel, -and then use it to print out a number. -Now instead of copying an entire array to send it to our multiple tasks we are just copying a pointer (the `Arc`) and *sharing* the array. - -How can this work though? -Surely if we're sharing data then can't we cause data races if one task writes to the array while others read? - -Well, Rust is super-smart and will only let you put data into an `Arc` that is provably safe to share. -In this case, it's safe to share the array *as long as it's immutable*, -i.e. many tasks may read the data in parallel as long as none can write. 
-So for this type and many others `Arc` will only give you an immutable view of the data. - -Arcs are great for immutable data, -but what about mutable data? -Shared mutable state is the bane of the concurrent programmer: -you can use a mutex to protect shared mutable state, -but if you forget to acquire the mutex, bad things can happen, including crashes. -Rust provides mutexes but makes it impossible to use them in a way that subverts memory safety. - -Let's take the same example yet again, -and modify it to mutate the shared state: +It gives us this error: +```{notrust,ignore} +6:71 error: capture of moved value: `numbers` + for j in range(0, 3) { numbers[j] += 1 } + ^~~~~~~ +7:50 note: `numbers` moved into closure environment here because it has type `proc():Send`, which is non-copyable (perhaps you meant to use clone()?) + spawn(proc() { + for j in range(0, 3) { numbers[j] += 1 } + }); +6:79 error: cannot assign to immutable dereference (dereference is implicit, due to indexing) + for j in range(0, 3) { numbers[j] += 1 } + ^~~~~~~~~~~~~~~ ``` -use std::sync::{Arc, Mutex}; -fn main() { - let numbers_lock = Arc::new(Mutex::new(vec![1i, 2i, 3i])); +It mentions that "numbers moved into closure environment". Because we referred +to `numbers` inside of our `proc`, and we create three `proc`s, we would have three +references. Rust detects this and gives us the error: we claim that `numbers` +has ownership, but our code tries to make three owners. This may cause a safety +problem, so Rust disallows it. + +What to do here? Rust has two types that help us: `Arc` and `Mutex`. +"Arc" stands for "atomically reference counted." In other words, an Arc will +keep track of the number of references to something, and not free the +associated resource until the count is zero. The 'atomic' portion refers to an +Arc's usage of concurrency primitives to atomically update the count, making it +safe across threads. If we use an Arc, we can have our three references.
But, an +Arc does not allow mutable borrows of the data it holds, and we want to modify +what we're sharing. In this case, we can use a `Mutex` inside of our Arc. A +Mutex will synchronize our accesses, so that we can ensure that our mutation +doesn't cause a data race. + +Here's what using an Arc with a Mutex looks like: + +```{rust} +use std::sync::{Arc,Mutex}; - for num in range(0u, 3) { - let (tx, rx) = channel(); - tx.send(numbers_lock.clone()); +fn main() { + let numbers = Arc::new(Mutex::new(vec![1i, 2i, 3i])); + for i in range(0u, 3u) { + let number = numbers.clone(); spawn(proc() { - let numbers_lock = rx.recv(); + let mut array = number.lock(); + + (*(*array).get_mut(i)) += 1; - // Take the lock, along with exclusive access to the underlying array - let mut numbers = numbers_lock.lock(); + println!("numbers[{}] is {}", i, (*array)[i]); + }); + } +} +``` + +We first have to `use` the appropriate library, and then we wrap our vector in +an Arc with the call to `Arc::new()`. Inside of the loop, we make a new +reference to the Arc with the `clone()` method. This will increment the +reference count. When each new `number` variable binding goes out of scope, it +will decrement the count. The `lock()` call will return us a reference to the +value inside the Mutex, and block any other calls to `lock()` until said +reference goes out of scope. + +We can compile and run this program without error, and in fact, see the +non-deterministic aspect: + +```{shell} +$ cargo run + Compiling hello_world v0.0.1 (file:///Users/you/src/hello_world) + Running `target/hello_world` +numbers[1] is 3 +numbers[0] is 2 +numbers[2] is 4 +$ cargo run + Running `target/hello_world` +numbers[2] is 4 +numbers[1] is 3 +numbers[0] is 2 +``` - // This is ugly for now because of the need for `get_mut`, but - // will be replaced by `numbers[num as uint] += 1` - // in the near future.
- // See: https://github.com/rust-lang/rust/issues/6515 - *numbers.get_mut(num as uint) += 1; +Each time, we get a slightly different output, because each thread works in a +different order. You may not get the same output as this sample, even. - println!("{}", (*numbers)[num as uint]); +The important part here is that the Rust compiler was able to use ownership to +give us assurance _at compile time_ that we weren't doing something incorrect +with regards to concurrency. In order to share ownership, we were forced to be +explicit and use a mechanism to ensure that it would be properly handled. - // When `numbers` goes out of scope the lock is dropped - }) +# Safety _and_ speed + +Safety and speed are always presented as a continuum. On one hand, you have +maximum speed, but no safety. On the other, you have absolute safety, with no +speed. Rust seeks to break out of this mode by introducing safety at compile +time, ensuring that you haven't done anything wrong, while compiling to the +same low-level code you'd expect without the safety. + +As an example, Rust's ownership system is _entirely_ at compile time. The +safety check that makes this an error about moved values: + +```{rust,ignore} +fn main() { + let vec = vec![1i, 2, 3]; + + for i in range(1u, 3) { + spawn(proc() { + println!("{}", vec[i]); + }); } } ``` -This example is starting to get more subtle, -but it hints at the powerful composability of Rust's concurrent types. -This time we've put our array of numbers inside a `Mutex` and then put *that* inside the `Arc`. -Like immutable data, -`Mutex`es are sharable, -but unlike immutable data, -data inside a `Mutex` may be mutated as long as the mutex is locked. - -The `lock` method here returns not your original array or a pointer thereof, -but a `MutexGuard`, -a type that is responsible for releasing the lock when it goes out of scope. 
-This same `MutexGuard` can transparently be treated as if it were the value the `Mutex` contains, -as you can see in the subsequent indexing operation that performs the mutation. - -OK, let's stop there before we get too deep. - -# A footnote: unsafe - -The Rust compiler and libraries are entirely written in Rust; -we say that Rust is "self-hosting". -If Rust makes it impossible to unsafely share data between threads, -and Rust is written in Rust, -then how does it implement concurrent types like `Arc` and `Mutex`? -The answer: `unsafe`. - -You see, while the Rust compiler is very smart, -and saves you from making mistakes you might normally make, -it's not an artificial intelligence. -Because we're smarter than the compiler - -sometimes - we need to over-ride this safe behavior. -For this purpose, Rust has an `unsafe` keyword. -Within an `unsafe` block, -Rust turns off many of its safety checks. -If something bad happens to your program, -you only have to audit what you've done inside `unsafe`, -and not the entire program itself. - -If one of the major goals of Rust was safety, -why allow that safety to be turned off? -Well, there are really only three main reasons to do it: -interfacing with external code, -such as doing FFI into a C library; -performance (in certain cases); -and to provide a safe abstraction around operations that normally would not be safe. -Our `Arc`s are an example of this last purpose. -We can safely hand out multiple pointers to the contents of the `Arc`, -because we are sure the data is safe to share. -But the Rust compiler can't know that we've made these choices, -so _inside_ the implementation of the Arcs, -we use `unsafe` blocks to do (normally) dangerous things. -But we expose a safe interface, -which means that the `Arc`s are impossible to use incorrectly. - -This is how Rust's type system prevents you from making some of the mistakes that make concurrent programming difficult, -yet get the efficiency of languages such as C++. 
- -# That's all, folks - -I hope that this taste of Rust has given you an idea if Rust is the right language for you. -If that's true, -I encourage you to check out [the guide](guide.html) for a full, +carries no runtime penalty. And while some of Rust's safety features do have +a run-time cost, there's often a way to write your code in such a way that +you can remove it. As an example, this is a poor way to iterate through +a vector: + +```{rust} +let vec = vec![1i, 2, 3]; + +for i in range(0u, vec.len()) { + println!("{}", vec[i]); +} +``` + +The reason is that the access of `vec[i]` does bounds checking, to ensure +that we don't try to access an invalid index. However, we can remove this +while retaining safety. The answer is iterators: + +```{rust} +let vec = vec![1i, 2, 3]; + +for x in vec.iter() { + println!("{}", x); +} +``` + +This version uses an iterator that yields each element of the vector in turn. +Because we have a reference to the element, rather than the whole vector itself, +there are no array bounds to check. + +# Learning More + +I hope that this taste of Rust has given you an idea of whether Rust is the right +language for you. We talked about Rust's tooling, how encoding ownership into +the type system helps you find bugs, how Rust can help you write correct +concurrent code, and how you don't have to pay a speed cost for much of this +safety. + +To continue your Rustic education, read [the guide](guide.html) for a more in-depth exploration of Rust's syntax and concepts.