Reference: TRPL 5.8
- Your basic, economy-class references.
- Zero runtime cost; all checks are done at compile time.
- Not allowed to outlive their associated lifetime.
- Can introduce serious lifetime complexity if you're not careful!
- Use these unless you actually need something more complicated.
Borrow
/BorrowMut
: "a trait for borrowing data."¹
trait Borrow<Borrowed> { fn borrow(&self) -> &Borrowed; }
AsRef
/AsMut
: "a cheap, reference-to-reference conversion."²
trait AsRef<T> { fn as_ref(&self) -> &T; }
- So... they're exactly the same?
- No! While the have the same definition,
Borrow
carries additional connotations:- "If you are implementing Borrow and both Self and Borrowed implement Hash, Eq, and/or Ord, they must produce the same result."¹ ²
pub trait Borrow<Borrowed> where Borrowed: ?Sized {
fn borrow(&self) -> &Borrowed;
}
- Borrow lets you lend out references of other types, e.g.:
Vec<T>: Borrow<Vec<T>>
lets you borrow aVec<T>
as a&Vec<T>
Vec<T>: Borrow<[T]>
lets you borrow aVec<T>
as a&[T]
- This is how a
HashMap
'sget
method is defined:
impl<K, V, S> HashMap<K, V, S> where K: Eq + Hash
fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V>
where K: Borrow<Q>, Q: Hash + Eq
- e.g.
String
implementsBorrow<str>
// map: HashMap<String, i32>;
map.get("Foo"), Some(&42));
- Similar to
Deref
, which is the trait that lets you overload*
. - However,
Deref
uses an associated type. So a typeT
can have...- One implementation of
Deref { type Target; }
. - Many implentations of
Borrow<Borrowed>
.
- One implementation of
???
TODO: in the future, this can be moved into the Traits lecture (?)
pub trait AsRef<T> where T: ?Sized {
fn as_ref(&self) -> &T;
}
- Used when you want to explicitly convert between reference types.
- No other guarantees.
- e.g.
impl AsRef<[u8]> for str
impl AsRef<str> for String
impl AsRef<[u8]> for String
impl<T> AsRef<T> for Box<T>
- Borrow has a generic implementation:
impl<T> Borrow<T> for T
.- You can always convert
T
to&T
.
AsRef
actually has a different generic implementation:impl<'a, T, U> AsRef<U> for &'a T where T: AsRef<U>
impl<'a, T, U> AsRef<U> for &'a mut T where T: AsRef<U>
- That means
AsRef
has autodereferencing behavior. (foo.as_ref()
=(&foo).as_ref()
=(&mut foo).as_ref()
)
- Casting (
as
) cannot be overloaded; instead, you can implement these traits: trait From<T> { fn from(T) -> Self; }
Y::from(x)
trait Into<T> { fn into(self) -> T; }
x.into()
- Into comes with two generic implementations:
From<T> for U
impliesInto<U> for T
.- So you should implement
From
.
- So you should implement
- into() is reflexive, i.e.
Into<T> for T
for allT
.
Box<T>
is one of Rust's ways of allocating data on the heap.- A
Box<T>
is a pointer which owns aT
on the heap. - The allocated data is de-allocated when the pointer goes out of scope (via
Drop
). Box<T>
implementsDeref<Target=T>
, can be dereferenced with*
.- Create a
Box
withBox::new()
.
let boxed_five = Box::new(5);
- Pros:
- Easiest way to put something on the heap.
- Zero-cost abstraction for dynamic allocation.
- Shares typical borrowing and move semantics, including Deref.
- Automatic destruction.
- Cons:
- The
Box
binding is subject to general ownership rules.
- The
- In homeworks, you've used have noticed patterns like this:
match self.head.take() {
Some(boxed_node) => {
// boxed_node: Box<Node>
let unboxed_node = *boxed_node;
let a = unboxed_node.elem;
let b = unboxed_node.next;
}
None => (),
}
- But we said there's automatic dereferencing when you access struct elements?
match self.head.take() {
Some(boxed_node) => {
// boxed_node: Box<Node>
let unboxed_node = *boxed_node;
let a = unboxed_node.elem;
let b = unboxed_node.next;
}
None => (),
}
- You can't destructure Boxes (move only part of the struct) because that would leave the heap data only partially-valid.
- But you can move ownership to the local scope (by explicitly dereferencing).
- It's currently not possible to destructure the
Box
inside theOption
. :( - In Nightly Rust, it is, thanks to
box
syntax!
#![feature(box_syntax, box_patterns)]
let opt_box = Some(box 5);
match opt_box {
Some(box value) => assert_eq!(value, 5),
None => println!("None :("),
}
- The syntax might change, so it's behind a feature gate.
- Reference counting is a method of implementing garbage collection.
- Data is allocated on the heap.
- The data keeps track of how many pointers to it exists.
- Increments when it's copied, decrements it goes out of scope.
- When counter hits 0, data is de-allocated.
- Used by Python.
- A "Reference Counted" pointer.
- Call
clone()
on anRc
to get a reference.- Increments its reference count.
- No data gets copied!
- When the ref count drops to 0, the value is freed.
- The data can only be mutated when the reference count is 1.
fn get_mut(this: &mut Rc<T>) -> Option<&mut T>
use std::rc::Rc;
let mut shared = Rc::new(6);
{
println!("{:?}", Rc::get_mut(&mut shared)); // ==> Some(6)
}
let mut cloned = shared.clone();
{
println!("{:?}", Rc::get_mut(&mut shared)); // ==> None
println!("{:?}", Rc::get_mut(&mut cloned)); // ==> None
}
- Reference counting has weaknesses: if a cycle is ever created.
- Doubly-Linked List:
- Node 1 has an
Rc
pointing to Node 2 - Node 2 has an
Rc
pointing to Node 1
- Node 1 has an
==>
Memory leak!- This can be avoided with weak references.
- Weak refs don't increment the strong reference count, so they don't affect possible deallocation.
- But that means they aren't always valid!
- An
Rc
can be downgraded into aWeak
pointer.fn downgrade() -> Weak<T>
.
Weak
pointers can only be upgraded back intoRc
.fn upgrade() -> Option<Rc<T>>
- When do you use an
Rc
vs. aWeak
?- Generally, you probably want to use
Rc
.
- Generally, you probably want to use
- If you want possible access to data but no ownership, you might want to use
a
Weak
.- Always take into account the possibility of upgrade failing.
- Any structure with reference cycles may also need
Weak
, to avoid the leak. - Note:
Rc
cycles are actually difficult to create in Rust, because of mutability rules.
- Pros:
- Allows sharing ownership of data.
- Cons:
- Has a (small) runtime cost.
- Holds two reference counts (strong and weak).
- Must update and check reference counts when cloning, upgrading, or dropping.
- Reference cycles can potentially leak memory; must design data structure so it doesn't contain cycles.
- Has a (small) runtime cost.
???
So two more things about Rc.
- An Atomic Reference-Counted pointer
- Functionally the same as Rc, but uses thread-safe (atomic) reference counting.
- Note:
Rc
cycles are actually difficult to create in Rust, because of mutability rules.
struct Node {
next: Option<Rc<Node>>,
prev: Option<Rc<Node>>,
}
let mut n1 = Rc::new(Node { next: None, prev: None });
let mut n2 = Node { next: None, prev: Some(n.clone()) };
// Can no longer get a mutable reference to n1
// because n2 has a pointer to it.
assert_eq(None, Rc::get_mut(&mut n)));
- A way to wrap data to allow interior mutability.
- An immutable reference allows modifying the contained value!
- There are two types of cell:
Cell<T>
andRefCell<T>
.
struct Foo {
x: Cell<i32>,
y: RefCell<u32>,
}
- A wrapper type providing interior mutability for
Copy
types.Cell<T>
s cannot contain references.
- Get a copy of the value from a
Cell
withget()
. - Replace the value inside a
Cell
withset()
. - Limited (only for making copies), but safe and cheap.
let c = Cell::new(10);
c.set(20);
println!("{}", c.get()); // 20
- Pros:
- Interior mutability.
- No runtime cost!
- Small allocation cost.
- Cons:
Copy
types only.
- A wrapper type providing interior mutability for any type.
- Uses runtime borrow checking rules.
- Same rules as compiler borrowchecker.
- Borrow inner data via
borrow()
orborrow_mut()
.- Panics if the
RefCell
is already borrowed!
- Panics if the
use std::cell::RefCell;
let refc = RefCell::new(vec![12]);
let mut inner = refc.borrow_mut();
inner.push(24);
println!("{:?}", *inner); // ==> [12, 24]
let inner2 = refc.borrow();
// ==> Panics since there is an existing mutable borrow
- Notice this isn't thread-safe!
- There is no way (in stable Rust) to check if a borrow will panic before
executing it.
borrow_state(&self)
is an unstable way to do this.- Use a thread-safe Mutex instead.
- A common paradigm is putting a
RefCell
inside anRc
to allow shared mutability. - Implement under-the-hood mutability (internal state changes).
use std::cell::Cell;
struct Rc<T> {
ptr: *mut RcBox<T>
}
struct RcBox<T> {
value: T,
count: Cell<usize>
}
impl<T> Clone for Rc<T> {
fn clone(&self) -> Rc<T> {
unsafe {
(*self.ptr).count.set((*self.ptr).count.get() + 1);
Rc { ptr: self.ptr }
}
}
}
- Pros:
- Interior mutability for any type.
- Cons:
- Stores an additional borrow state variable.
- May panic at runtime.
- Not thread-safe.
- When you invoke
borrow()
on aRefCell<T>
, you actually getRef<T>
, not&T
.- Similarly,
borrow_mut()
gives you aRefMut<T>
.
- Similarly,
- These are pretty simple wrapper over
&T
, but define some extra methods.- Sadly, all of them are unstable pending the
cell_extras
feature 😞.
- Sadly, all of them are unstable pending the
- C-like raw pointers: they just point... somewhere in memory.
- No ownership rules.
- No lifetime rules.
- Zero-cost abstraction... because there is no abstraction.
- Requires
unsafe
to be dereferenced.- May eat your laundry if you're not careful.
- Use these if you're building a low-level structure like
Vec<T>
, but not in typical code.- Can be useful for manually avoiding runtime costs.
- Nothing unsafe about creating raw pointers:
let x = 5;
let raw_x: *const i32 = &x as *const i32;
- References can be implicitly coerced to raw pointers.
- They have the same runtime representation!
fn foo(x: *const i32) { }
foo(&x);
foo(raw_x);
- But you need an
unsafe
block to dereference them:
let x = 5;
let raw_x: *const i32 = &x as *const i32;
unsafe {
println!("{}", *raw_x);
}
- Raw pointers must be explicitly cast to references:
- By doing a dereference and reference operation (
&*
)
- By doing a dereference and reference operation (
let x = 5;
let raw_x: *const i32 = &x;
unsafe {
let ref_x = &*raw_x;
}
- Can also create a raw pointer with
Box::into_raw
.- Consumes the box and returns the raw pointer.
- Must remember to de-allocate the data.
let my_speed: Box<i32> = Box::new(88);
let my_speed: *mut i32 = Box::into_raw(my_speed);
// ...
unsafe {
Box::from_raw(my_speed);
}
- There are three things you can do in
unsafe
...- Update a
static mut
variable. - Dereference a raw pointer.
- Call unsafe functions.
- Update a
- That's it?
- Lots of things are implemented as
unsafe
functions:- FFI
- Compiler intrinsics
- Memory operations: memcopy, memset,
mem::transmute
slice::get_unchecked
- Atomics, raw arithmetic operations
- Compiler-internal flags such as getting type names, return address values.
- Memory operations: memcopy, memset,
- Memory Operations
- Allocate, deallocate memory.
???
Basically this + raw pointers = Vecs, Boxes, Rcs, etc.
- There are four types of
unsafe
blocks...
unsafe fn foo() {
}
unsafe {
}
unsafe trait Spooky {
}
unsafe impl Spooky for i32 {
}
- Unsafe functions can only be called in unsafe blocks.
- Functions which contain unsafe blocks can decide whether to expose a safe or unsafe API.
- _Unsafe traits can only be implemented in unsafe impl blocks.