From ad66f56afd7fd3127d6991bd6078246a799573c0 Mon Sep 17 00:00:00 2001
From: Brian Anderson
Date: Tue, 8 Apr 2014 16:13:33 -0700
Subject: [PATCH] doc: Add "A 30-minute Introduction to Rust"

By Steve Klabnik.
---
 mk/docs.mk       |   2 +-
 src/doc/index.md |   1 +
 src/doc/intro.md | 364 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 366 insertions(+), 1 deletion(-)
 create mode 100644 src/doc/intro.md

diff --git a/mk/docs.mk b/mk/docs.mk
index fab828571cd4f..7fa943283373b 100644
--- a/mk/docs.mk
+++ b/mk/docs.mk
@@ -26,7 +26,7 @@
 # L10N_LANGS are the languages for which the docs have been
 # translated.
 ######################################################################
-DOCS := index tutorial guide-ffi guide-macros guide-lifetimes \
+DOCS := index intro tutorial guide-ffi guide-macros guide-lifetimes \
 guide-tasks guide-container guide-pointers guide-testing \
 guide-runtime complement-bugreport complement-cheatsheet \
 complement-lang-faq complement-project-faq rust rustdoc \
diff --git a/src/doc/index.md b/src/doc/index.md
index dbf8510d2506d..4f01f7f0e04cb 100644
--- a/src/doc/index.md
+++ b/src/doc/index.md
@@ -7,6 +7,7 @@
 li {list-style-type: none; }
+* [A 30-minute Intro to Rust](intro.html) (read this first)
 * [The Rust tutorial](tutorial.html) (* [PDF](tutorial.pdf))
 * [The Rust reference manual](rust.html) (* [PDF](rust.pdf))
diff --git a/src/doc/intro.md b/src/doc/intro.md
new file mode 100644
index 0000000000000..9948895a2cba4
--- /dev/null
+++ b/src/doc/intro.md
@@ -0,0 +1,364 @@
+% A 30-minute Introduction to Rust
+
+Rust is a systems programming language that focuses on strong compile-time correctness guarantees.
+It improves upon the ideas of other systems languages like C++, D,
+and Cyclone by providing very strong guarantees and explicit control over the life cycle of memory.
+Strong memory guarantees make writing correct concurrent Rust code easier than in other languages.
+This might sound very complex, but it's easier than it sounds!
+This tutorial will give you an idea of what Rust is like in about thirty minutes.
+It expects that you're at least vaguely familiar with a previous 'curly brace' language.
+The concepts are more important than the syntax,
+so don't worry if you don't get every last detail:
+the [tutorial](http://static.rust-lang.org/doc/master/tutorial.html) can help you out with that later.
+
+Let's talk about the most important concept in Rust, "ownership,"
+and its implications for a task that programmers usually find very difficult: concurrency.
+
+## Ownership
+
+Ownership is central to Rust,
+and is one of its more interesting and unique features.
+"Ownership" refers to which parts of your code are allowed to modify various parts of memory.
+Let's start by looking at some C++ code:
+
+```
+int *dangling(void)
+{
+    int i = 1234;
+    return &i;
+}
+
+int add_one(void)
+{
+    int *num = dangling();
+    return *num + 1;
+}
+```
+
+The `dangling` function allocates an integer on the stack,
+and stores it in a variable, `i`.
+It then returns a pointer to the variable `i`.
+There's just one problem:
+stack memory becomes invalid when the function returns.
+This means that in the second line of `add_one`,
+`num` points to some garbage values,
+and we won't get the effect that we want.
+While this is a trivial example,
+it can happen quite often in C++ code.
+There's a similar problem when memory on the heap is allocated with `malloc` (or `new`),
+then freed with `free` (or `delete`),
+yet your code attempts to do something with the pointer to that memory.
+More modern C++ uses RAII with constructors/destructors,
+but it amounts to the same thing.
+This problem is called a 'dangling pointer,'
+and it's not possible to write Rust code that has it.
+Let's try:
+
+```
+fn dangling() -> &int {
+    let i = 1234;
+    return &i;
+}
+
+fn add_one() -> int {
+    let num = dangling();
+    return *num + 1;
+}
+```
+
+When you try to compile this program, you'll get an interesting (and long) error message:
+
+```
+temp.rs:3:11: 3:13 error: borrowed value does not live long enough
+temp.rs:3     return &i;
+
+temp.rs:1:22: 4:1 note: borrowed pointer must be valid for the anonymous lifetime #1 defined on the block at 1:22...
+temp.rs:1 fn dangling() -> &int {
+temp.rs:2     let i = 1234;
+temp.rs:3     return &i;
+temp.rs:4 }
+
+temp.rs:1:22: 4:1 note: ...but borrowed value is only valid for the block at 1:22
+temp.rs:1 fn dangling() -> &int {
+temp.rs:2     let i = 1234;
+temp.rs:3     return &i;
+temp.rs:4 }
+error: aborting due to previous error
+```
+
+In order to fully understand this error message,
+we need to talk about what it means to "own" something.
+So for now,
+let's just accept that Rust will not allow us to write code with a dangling pointer,
+and we'll come back to this code once we understand ownership.
+
+Let's forget about programming for a second and talk about books.
+I like to read physical books,
+and sometimes I really like one and tell my friends they should read it.
+While I'm reading my book, I own it: the book is in my possession.
+When I loan the book out to someone else for a while, they "borrow" it from me.
+And when you borrow a book, it's yours for a certain period of time,
+and then you give it back to me, and I own it again. Right?
+
+This concept applies directly to Rust code as well:
+some code "owns" a particular pointer to memory.
+It's the sole owner of that pointer.
+It can also lend that memory out to some other code for a while:
+the code "borrows" it.
+It borrows it for a certain period of time, called a "lifetime."
+
+That's all there is to it.
+That doesn't seem so hard, right?
+Let's go back to that error message:
+`error: borrowed value does not live long enough`.
+We tried to loan out a particular variable, `i`,
+using Rust's borrowed pointers: the `&`.
+But Rust knew that the variable would be invalid after the function returns,
+and so it tells us that:
+`borrowed pointer must be valid for the anonymous lifetime #1... but borrowed value is only valid for the block`.
+Neat!
+
+That's a great example for stack memory,
+but what about heap memory?
+Rust has a second kind of pointer,
+a 'unique' pointer,
+that you can create with a `~`.
+Check it out:
+
+```
+fn dangling() -> ~int {
+    let i = ~1234;
+    return i;
+}
+
+fn add_one() -> int {
+    let num = dangling();
+    return *num + 1;
+}
+```
+
+This code will successfully compile.
+Note that instead of a stack allocated `1234`,
+we use an owned pointer to that value: `~1234`.
+You can roughly compare these two lines:
+
+```
+// rust
+let i = ~1234;
+
+// C++
+int *i = new int;
+*i = 1234;
+```
+
+Rust is able to infer the size of the type,
+then allocates the correct amount of memory and sets it to the value you asked for.
+This means that it's impossible to allocate uninitialized memory:
+Rust does not have the concept of null.
+Hooray!
+There's one other difference between this line of Rust and the C++:
+The Rust compiler also figures out the lifetime of `i`,
+and then inserts a corresponding `free` call after it's invalid,
+like a destructor in C++.
+You get all of the benefits of manually allocated heap memory without having to do all the bookkeeping yourself.
+Furthermore, all of this checking is done at compile time,
+so there's no runtime overhead.
+You'll get (basically) the exact same code that you'd get if you wrote the correct C++,
+but it's impossible to write the incorrect version, thanks to the compiler.
+
+You've seen one way that ownership and lifetimes are useful to prevent code that would normally be dangerous in a less-strict language,
+but let's talk about another: concurrency.
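For readers following along with a current Rust toolchain, here is a minimal sketch of the same owned-pointer idea. It assumes modern syntax, which differs from the 2014-era code in this patch: `Box<i32>` stands in for `~int` and `Box::new(1234)` for `~1234`. The function name `make_number` is illustrative only and does not come from the tutorial.

```rust
// Sketch in current Rust (an assumption; the patch itself uses older syntax).
// `Box<i32>` plays the role of the `~int` owned pointer described above.
fn make_number() -> Box<i32> {
    // Allocate 1234 on the heap; ownership of the box moves to the caller.
    let i = Box::new(1234);
    i
}

fn add_one() -> i32 {
    let num = make_number();
    // `num` owns the allocation, which is freed automatically when `num`
    // goes out of scope at the end of this function.
    *num + 1
}

fn main() {
    println!("{}", add_one());
}
```

As in the original example, there is no explicit `free`: the compiler inserts the deallocation where the owner's scope ends.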
+
+## Concurrency
+
+Concurrency is an incredibly hot topic in the software world right now.
+It's always been an interesting area of study for computer scientists,
+but as usage of the Internet explodes,
+people are looking to improve the number of users a given service can handle.
+Concurrency is one way of achieving this goal.
+There is a pretty big drawback to concurrent code, though:
+it can be hard to reason about,
+because it is non-deterministic.
+There are a few different approaches to writing good concurrent code,
+but let's talk about how Rust's notions of ownership and lifetimes can assist with achieving correct but concurrent code.
+
+First, let's go over a simple concurrency example in Rust.
+Rust allows you to spin up 'tasks,'
+which are lightweight, 'green' threads.
+These tasks do not have any shared memory, and so,
+we communicate between tasks with a 'channel'.
+Like this:
+
+```
+fn main() {
+    let numbers = [1,2,3];
+
+    let (port, chan) = Chan::new();
+    chan.send(numbers);
+
+    do spawn {
+        let numbers = port.recv();
+        println!("{:d}", numbers[0]);
+    }
+}
+```
+
+In this example, we create a vector of numbers.
+We then make a new `Chan`,
+which is the name of the package Rust implements channels with.
+This returns two different ends of the channel:
+a channel and a port.
+You send data into the channel end, and it comes out the port end.
+The `spawn` function spins up a new task.
+As you can see in the code,
+we call `port.recv()` (short for 'receive') inside of the new task,
+and we call `chan.send()` outside,
+passing in our vector.
+We then print the first element of the vector.
+
+This works out because Rust copies the vector when it is sent through the channel.
+That way, if it were mutable, there wouldn't be a race condition.
+However, if we're making a lot of tasks, or if our data is very large,
+making a copy for each task inflates our memory usage with no real benefit.
+
+Enter Arc.
+Arc stands for 'atomically reference counted,'
+and it's a way to share immutable data between multiple tasks.
+Here's some code:
+
+```
+extern mod extra;
+use extra::arc::Arc;
+
+fn main() {
+    let numbers = [1,2,3];
+
+    let numbers_arc = Arc::new(numbers);
+
+    for num in range(0, 3) {
+        let (port, chan) = Chan::new();
+        chan.send(numbers_arc.clone());
+
+        do spawn {
+            let local_arc = port.recv();
+            let task_numbers = local_arc.get();
+            println!("{:d}", task_numbers[num]);
+        }
+    }
+}
+```
+
+This is very similar to the code we had before,
+except now we loop three times,
+making three tasks,
+and sending an `Arc` between them.
+`Arc::new` creates a new Arc,
+`.clone()` makes a new reference to that Arc,
+and `.get()` gets the value out of the Arc.
+So we make a new reference for each task,
+send that reference down the channel,
+and then use the reference to print out a number.
+Now we're not copying our vector.
+
+Arcs are great for immutable data,
+but what about mutable data?
+Shared mutable state is the bane of the concurrent programmer.
+You can use a mutex to protect shared mutable state,
+but if you forget to acquire the mutex, bad things can happen.
+
+Rust provides a tool for shared mutable state: `RWArc`.
+This variant of an Arc allows the contents of the Arc to be mutated.
+Check it out:
+
+```
+extern mod extra;
+use extra::arc::RWArc;
+
+fn main() {
+    let numbers = [1,2,3];
+
+    let numbers_arc = RWArc::new(numbers);
+
+    for num in range(0, 3) {
+        let (port, chan) = Chan::new();
+        chan.send(numbers_arc.clone());
+
+        do spawn {
+            let local_arc = port.recv();
+
+            local_arc.write(|nums| {
+                nums[num] += 1
+            });
+
+            local_arc.read(|nums| {
+                println!("{:d}", nums[num]);
+            })
+        }
+    }
+}
+```
+
+We now use the `RWArc` package to get a read/write Arc.
+The read/write Arc has a slightly different API than `Arc`:
+`read` and `write` allow you to, well, read and write the data.
+They both take closures as arguments,
+and the read/write Arc will, in the case of write,
+acquire a mutex,
+and then pass the data to this closure.
+After the closure does its thing, the mutex is released.
+
+You can see how this makes it impossible to mutate the state without remembering to acquire the lock.
+We gain the efficiency of shared mutable state,
+while retaining the safety of disallowing shared mutable state.
+
+But wait, how is that possible?
+We can't both allow and disallow mutable state.
+What gives?
+
+## A footnote: unsafe
+
+So, the Rust language does not allow for shared mutable state,
+yet I just showed you some code that has it.
+How's this possible? The answer: `unsafe`.
+
+You see, while the Rust compiler is very smart,
+and saves you from making mistakes you might normally make,
+it's not an artificial intelligence.
+Because we're smarter than the compiler,
+sometimes, we need to override this safe behavior.
+For this purpose, Rust has an `unsafe` keyword.
+Within an `unsafe` block,
+Rust turns off many of its safety checks.
+If something bad happens to your program,
+you only have to audit what you've done inside `unsafe`,
+and not the entire program itself.
+
+If one of the major goals of Rust was safety,
+why allow that safety to be turned off?
+Well, there are really only three main reasons to do it:
+interfacing with external code,
+such as doing FFI into a C library,
+performance (in certain cases),
+and to provide a safe abstraction around operations that normally would not be safe.
+Our Arcs are an example of this last purpose.
+We can safely hand out multiple references to the `Arc`,
+because we are sure the data is immutable,
+and therefore it is safe to share.
+We can hand out multiple references to the `RWArc`,
+because we know that we've wrapped the data in a mutex,
+and therefore it is safe to share.
+But the Rust compiler can't know that we've made these choices,
+so _inside_ the implementation of the Arcs,
+we use `unsafe` blocks to do (normally) dangerous things.
+But we expose a safe interface,
+which means that the Arcs are impossible to use incorrectly.
+
+This is how Rust's type system allows you to not make some of the mistakes that make concurrent programming difficult,
+yet get the efficiency of languages such as C++.
+
+## That's all, folks
+
+I hope that this taste of Rust has given you an idea of whether Rust is the right language for you.
+If it is,
+I encourage you to check out [the tutorial](http://static.rust-lang.org/doc/0.9/tutorial.html) for a full,
+in-depth exploration of Rust's syntax and concepts.
\ No newline at end of file
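The read/write Arc pattern from the Concurrency section can also be sketched against a current Rust toolchain. This is an illustration under stated assumptions, not the API used in the patch: `std::sync::Arc` wrapping `std::sync::RwLock` stands in for `RWArc`, OS threads from `std::thread` stand in for tasks, and the helper name `increment_all` is hypothetical.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Sketch in current Rust (an assumption; the patch uses `extra::arc::RWArc`).
// Each thread increments one element of a shared vector behind a lock.
fn increment_all(numbers: Vec<i32>) -> Vec<i32> {
    let shared = Arc::new(RwLock::new(numbers));
    let len = shared.read().unwrap().len();

    let handles: Vec<_> = (0..len)
        .map(|i| {
            // Each thread gets its own reference to the same data.
            let local = Arc::clone(&shared);
            thread::spawn(move || {
                // The data is only reachable through the lock guard, so it
                // cannot be mutated without acquiring the lock first.
                let mut nums = local.write().unwrap();
                nums[i] += 1;
                // The guard is dropped here, releasing the lock.
            })
        })
        .collect();

    // Wait for every thread to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    // Copy the final contents out from behind the lock.
    let result = shared.read().unwrap().clone();
    result
}

fn main() {
    println!("{:?}", increment_all(vec![1, 2, 3]));
}
```

The guard returned by `write()` plays the part of the closure argument in the `RWArc` example: access to the data and possession of the lock are tied together by the type system.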