Guide: motivate Box and Rc pointers with need, uses, benefits, and examples.

Explain that Rust has different pointer types because there is a
tradeoff between flexibility and efficiency. Motivate boxes as
fixed-size containers of variable-sized objects. Clarify that Box and Rc
are pointer types that you deref with * just like references. Stick to
explaining the semantics and avoid implementation details.  Scope isn't
the most accurate framework to think about deallocation (since you
return boxes and otherwise move values out of scopes); it's more "when
the value is done being used," i.e., lifetime. Provide a connection
between Rust's pointer types by locating them on a flexibility /
performance scale. Explain the compiler can't statically analyze
lifetimes with multiple owners; hence the need for (runtime) reference
counting.
John Kleint committed Oct 27, 2014
1 parent f037452 commit d257b37
Showing 1 changed file with 78 additions and 44 deletions.
122 changes: 78 additions & 44 deletions src/doc/guide.md
@@ -3637,40 +3637,72 @@ pub fn as_maybe_owned(&self) -> MaybeOwned<'a> { ... }

## Boxes

All of our references so far have been to variables we've created on the stack.
In Rust, the simplest way to allocate heap variables is using a *box*. To
create a box, use the `box` keyword:
Most of the types we've seen so far have a fixed size or number of components.
The compiler needs this fact to lay out values in memory. However, some data
structures, such as a linked list, do not have a fixed size. You might think to
implement a linked list with an enum that's either a `Node` or the end of the
list (`Nil`), like this:

```{rust,ignore}
enum List { // error: illegal recursive enum type
Node(u32, List),
Nil
}
```

But the compiler complains that the type is recursive, that is, it could be
arbitrarily large. To remedy this, Rust provides a fixed-size container called
a **box** that can hold any type. You can box up any value with the `box`
keyword. Our boxed List gets the type `Box<List>` (more on the notation when we
get to generics):

```{rust}
let x = box 5i;
enum List {
Node(u32, Box<List>),
Nil
}
fn main() {
let list = Node(0, box Node(1, box Nil));
}
```
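
For readers on a current toolchain, here is a minimal sketch of the same boxed list. It assumes present-day stable Rust, where `Box::new(...)` replaces the 2014 `box` keyword and enum variants are written with their `List::` prefix. The `sum` helper is hypothetical, added only to show that the list can be walked through its boxes.

```rust
// Sketch in current stable Rust; the guide text in this commit uses the
// older `box` keyword instead of `Box::new`.
enum List {
    Node(u32, Box<List>),
    Nil,
}

/// Hypothetical helper: adds up every value by following each box in turn.
fn sum(list: &List) -> u32 {
    match list {
        // `rest` is a `&Box<List>`, which coerces to `&List` in the call.
        List::Node(value, rest) => *value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let list = List::Node(0, Box::new(List::Node(1, Box::new(List::Nil))));
    println!("{}", sum(&list)); // prints 1
}
```

Because each `Node` holds a pointer-sized `Box<List>` rather than a whole `List`, the compiler can compute a fixed size for the enum.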

This allocates an integer `5` on the heap, and creates a binding `x` that
refers to it. The great thing about boxed pointers is that we don't have to
manually free this allocation! If we write
A box dynamically allocates memory to hold its contents. The great thing about
Rust is that that memory is *automatically*, *efficiently*, and *predictably*
deallocated when you're done with the box.

A box is a pointer type, and you access what's inside using the `*` operator,
just like regular references. This (rather silly) example dynamically allocates
an integer `5` and makes `x` a pointer to it:

```{rust}
{
let x = box 5i;
// do stuff
println!("{}", *x); // Prints 5
}
```

then Rust will automatically free `x` at the end of the block. This isn't
because Rust has a garbage collector -- it doesn't. Instead, when `x` goes out
of scope, Rust `free`s `x`. This Rust code will do the same thing as the
following C code:
The great thing about boxes is that we don't have to manually free this
allocation! Instead, when `x` reaches the end of its lifetime -- in this case,
when it goes out of scope at the end of the block -- Rust `free`s `x`. This
isn't because Rust has a garbage collector (it doesn't). Instead, by tracking
the ownership and lifetime of a variable (with a little help from you, the
programmer), the compiler knows precisely when it is no longer used.

The Rust code above will do the same thing as the following C code:

```{c,ignore}
{
int *x = (int *)malloc(sizeof(int));
// do stuff
if (!x) abort();
*x = 5;
printf("%d\n", *x);
free(x);
}
```

This means we get the benefits of manual memory management, but the compiler
ensures that we don't do something wrong. We can't forget to `free` our memory.
We get the benefits of manual memory management, while ensuring we don't
introduce any bugs. We can't forget to `free` our memory.
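
One way to observe this is to box a value whose destructor records that it ran. The sketch below assumes current stable Rust (`Box::new` instead of the `box` keyword). `Tracked`, `freed_at_end_of_block`, and `moved_box_survives` are hypothetical names introduced for illustration. The second function shows why "lifetime" is a better mental model than "scope": a box that is moved out of a block is not freed when that block ends.

```rust
use std::cell::Cell;

/// Hypothetical type: its destructor bumps a counter so drops are visible.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Tracked<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns the drop count inside the block and after it.
fn freed_at_end_of_block() -> (u32, u32) {
    let drops = Cell::new(0);
    let inside;
    {
        let _x = Box::new(Tracked { drops: &drops });
        inside = drops.get(); // box is still alive here
    } // `_x` reaches the end of its lifetime: the contents are dropped
    (inside, drops.get())
}

/// Returns the drop count after the block when the box was moved out of it.
fn moved_box_survives() -> u32 {
    let drops = Cell::new(0);
    let outer;
    {
        let inner = Box::new(Tracked { drops: &drops });
        outer = inner; // ownership moves out of the block
    }
    let after_block = drops.get(); // nothing has been freed yet
    drop(outer); // the box is freed only now
    after_block
}

fn main() {
    println!("{:?}", freed_at_end_of_block()); // prints (0, 1)
    println!("{}", moved_box_survives()); // prints 0
}
```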

Boxes are the sole owner of their contents, so you cannot take a mutable
reference to them and then use the original box:
@@ -3706,48 +3738,50 @@ let mut x = box 5i;
*x;
```

## Rc and Arc

Sometimes, you need to allocate something on the heap, but give out multiple
references to the memory. Rust's `Rc<T>` (pronounced 'arr cee tee') and
`Arc<T>` types (again, the `T` is for generics, we'll learn more later) provide
you with this ability. **Rc** stands for 'reference counted,' and **Arc** for
'atomically reference counted.' This is how Rust keeps track of the multiple
owners: every time we make a new reference to the `Rc<T>`, we add one to its
internal 'reference count.' Every time a reference goes out of scope, we
subtract one from the count. When the count is zero, the `Rc<T>` can be safely
deallocated. `Arc<T>` is almost identical to `Rc<T>`, except for one thing: The
'atomically' in 'Arc' means that increasing and decreasing the count uses a
thread-safe mechanism to do so. Why two types? `Rc<T>` is faster, so if you're
not in a multi-threaded scenario, you can have that advantage. Since we haven't
talked about threading yet in Rust, we'll show you `Rc<T>` for the rest of this
section.
Boxes are simple and efficient pointers to dynamically allocated values with a
single owner. They are useful for tree-like structures where the lifetime of a
child depends solely on the lifetime of its (single) parent. If you need a
value that must persist as long as any of several referrers, read on.

To create an `Rc<T>`, use `Rc::new()`:
## Rc and Arc

```{rust}
use std::rc::Rc;
Sometimes, you need a variable that is referenced from multiple places
(immutably!), lasting as long as any of those places, and disappearing when it
is no longer referenced. For instance, in a graph-like data structure, a node
might be referenced from all of its neighbors. In this case, it is not possible
for the compiler to determine ahead of time when the value can be freed -- it
needs a little run-time support.

let x = Rc::new(5i);
```
Rust's **Rc** type provides shared ownership of a dynamically allocated value
that is automatically freed at the end of its last owner's lifetime. (`Rc`
stands for 'reference counted,' referring to the way these library types are
implemented.) This provides more flexibility than single-owner boxes, but has
some runtime overhead.

To create a second reference, use the `.clone()` method:
To create an `Rc` value, use `Rc::new()`. To create a second owner, use the
`.clone()` method:

```{rust}
use std::rc::Rc;
let x = Rc::new(5i);
let y = x.clone();
println!("{} {}", *x, *y); // Prints 5 5
```

The `Rc<T>` will live as long as any of its references are alive. After they
all go out of scope, the memory will be `free`d.
The `Rc` will live as long as any of its owners are alive. After that, the
memory will be `free`d.
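
To watch the owner count change, a small sketch in current stable Rust can use `Rc::strong_count`, a standard-library function that the guide text in this commit does not mention. The function name `owner_counts` is hypothetical.

```rust
use std::rc::Rc;

/// Hypothetical helper: reports the number of owners at three points in time.
fn owner_counts() -> (usize, usize, usize) {
    let x = Rc::new(5);
    let one_owner = Rc::strong_count(&x);

    let y = Rc::clone(&x); // same effect as `x.clone()`: a second owner
    let two_owners = Rc::strong_count(&x);

    drop(y); // the second owner goes away; the value itself stays alive
    let back_to_one = Rc::strong_count(&x);

    (one_owner, two_owners, back_to_one)
}

fn main() {
    println!("{:?}", owner_counts()); // prints (1, 2, 1)
}
```

The value is freed only when the last remaining owner (`x` here) is dropped.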

**Arc** is an 'atomically reference counted' value, identical to `Rc` except
that ownership can be safely shared among multiple threads. Why two types?
`Arc` has more overhead, so if you're not in a multi-threaded scenario, you
don't have to pay the price.
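
As a rough illustration of sharing across threads, the sketch below uses `Arc` with `std::thread::spawn` from current stable Rust. The threading API shown is today's, not necessarily what existed when this commit was written, and `sum_across_threads` is a hypothetical name.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: each thread reads one element through its own `Arc`.
fn sum_across_threads() -> i32 {
    let data = Arc::new(vec![1, 2, 3]);

    let handles: Vec<_> = (0..3)
        .map(|i| {
            let shared = Arc::clone(&data); // one more owner per thread
            thread::spawn(move || shared[i] * 10)
        })
        .collect();

    // Wait for every thread and add up what each one returned.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("{}", sum_across_threads()); // prints 60
}
```

Swapping `Arc` for `Rc` here is rejected at compile time, because `Rc` cannot be sent to another thread.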

If you use `Rc<T>` or `Arc<T>`, you have to be careful about introducing
cycles. If you have two `Rc<T>`s that point to each other, the reference counts
will never drop to zero, and you'll have a memory leak. To learn more, check
out [the section on `Rc<T>` and `Arc<T>` in the pointers
guide](guide-pointers.html#rc-and-arc).
If you use `Rc` or `Arc`, you have to be careful about introducing cycles. If
you have two `Rc`s that point to each other, they will happily keep each other
alive forever, creating a memory leak. To learn more, check out [the section on
`Rc` and `Arc` in the pointers guide](guide-pointers.html#rc-and-arc).
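
One common way to avoid such a cycle in current stable Rust, not covered by the text in this commit, is to make one direction of the link a non-owning `Weak` pointer. The sketch below is an assumption-laden illustration: `Node`, its fields, and `build_family` are hypothetical, and `RefCell` is used only so the links can be set after construction.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical tree node: owns its children, only observes its parent.
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node(value: i32) -> Rc<Node> {
    Rc::new(Node {
        value,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

/// Builds a parent and a child that refer to each other without a cycle
/// of owners.
fn build_family() -> (Rc<Node>, Rc<Node>) {
    let parent = new_node(1);
    let child = new_node(2);
    parent.children.borrow_mut().push(Rc::clone(&child)); // owning link
    *child.parent.borrow_mut() = Rc::downgrade(&parent); // non-owning link
    (parent, child)
}

fn main() {
    let (parent, child) = build_family();
    println!("{}", Rc::strong_count(&parent)); // prints 1
    println!("{}", Rc::strong_count(&child)); // prints 2
    let seen = child.parent.borrow().upgrade().map(|p| p.value);
    println!("{:?}", seen); // prints Some(1)
}
```

Since the child's pointer to its parent does not count as an owner, dropping `parent` frees it, which in turn releases its hold on the child.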

# Patterns
