# Thread safety

Data races are impossible in safe Rust.

Impossible means impossible. Rust is not a linter that catches common problems. It uses a sound analysis that rejects every program that could contain a data race.

A thread safe program:

In [3]:
let mut x = 0;
x += 2;
println!("{x}");

2


Single-threaded programs are trivially thread safe. Let's see a multi-threaded program that is also thread safe:

In [2]:
use std::thread;

let mut x = 0;
let join_handler = thread::spawn(|| {
    println!("Hello ");
});
println!("world");
join_handler.join();
println!("finished");

world
Hello 
finished


Now let's see a trivial data race and how the compiler rejects it:

In [3]:
let mut x = 0;
let join_handler = thread::spawn(|| {
    x += 2;
    println!("{x}");
});
x += 2;
println!("{x}");
join_handler.join();
println!("{x}");

Error: closure may outlive the current function, but it borrows `x`, which is owned by the current function

Error: cannot use `x` because it was mutably borrowed

Let's look at the compiler errors. Compiler errors in Rust (unlike C++) are usually meaningful and helpful.

The compiler error is:
```
closure may outlive the current function, but it
borrows `x`, which is owned by the current function
```
It means that the current function (and its thread) might finish before the spawned thread does, but the spawned thread still needs the memory of `x`, which lives on the stack of the current thread.

You: But we are joining the thread, so is this a false positive?

No, because we may never reach that line. For example, `println!` may panic (panics are similar to exceptions in C++) for various reasons, and then the stack frame holding `x` is unwound while the spawned thread is still running.

This is not the data race that we expected (strictly speaking it is a dangling reference rather than a race), but it is definitely a real bug! Let's fix that.

In [4]:
let mut x = 0;
thread::scope(|s| {
    s.spawn(|| {
        x += 2;
        println!("{x}");
    });
    x += 2;
    println!("{x}");
});
println!("{x}");

Error: cannot use `x` because it was mutably borrowed

Error: cannot assign to `x` because it is borrowed

This is equivalent to the code above, but `thread::scope` joins every scoped thread spawned inside it at the end of the scope (whether the scope ends normally or by a panic), so the previous problem is solved.

But the code still doesn't compile:
```
cannot use `x` because it was mutably borrowed
```
The code shouldn't compile (it has a data race), but the error might not be clear if you are not familiar with Rust terminology. So we need some background.
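
Before the background, it may help to see that the scope itself is not what the compiler objects to. The following is a minimal sketch (the function name `sum_in_two_threads` is made up for illustration) in which two scoped threads only read borrowed local data, and it compiles:

```rust
use std::thread;

/// Sums the two halves of `data` on two scoped threads that only read it.
fn sum_in_two_threads(data: &[i32]) -> i32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // Both closures only read borrowed data, so they may run concurrently.
        let left_handle = s.spawn(|| left.iter().sum::<i32>());
        let right_handle = s.spawn(|| right.iter().sum::<i32>());
        // `join` returns `Err` only if the thread panicked.
        left_handle.join().unwrap() + right_handle.join().unwrap()
    })
}

fn main() {
    let numbers = vec![1, 2, 3, 4, 5];
    let total = sum_in_two_threads(&numbers);
    // `numbers` is still usable: every scoped thread was joined when the scope ended.
    println!("{total} {}", numbers.len());
}
```

So the rejected code fails because of the mutation, not because of the threads.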

## Ownership and borrowing

In Rust, every value has an owner. For example, here `x` is the owner of the vector:

In [7]:
let x = vec![1, 2, 3];

You can move values (similar to move semantics in C++), which changes their owner:

In [9]:
let x = vec![1, 2, 3];
let y = x; // move by assignment
let z = f(y); // move by function call
fn f(input: Vec<i32>) -> Vec<i32> {
    let mut tmp = input; // move by assignment
    tmp.push(4);
    tmp // move by return
}
z

[1, 2, 3, 4]
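
One consequence worth spelling out: after a move, the old name can no longer be used, and the compiler rejects any attempt. If both the original and the result are needed, the value has to be cloned first. A small sketch (the helper `push_four` is a made-up name that mirrors `f` above):

```rust
/// Takes ownership of `input`, appends 4, and gives ownership back.
fn push_four(mut input: Vec<i32>) -> Vec<i32> {
    input.push(4);
    input
}

fn main() {
    let x = vec![1, 2, 3];
    // Cloning makes an independent copy, so `x` keeps owning its vector.
    let y = push_four(x.clone());
    println!("{x:?} {y:?}");
    // Here `x` itself is moved into the function.
    let z = push_four(x);
    // Using `x` after this point would be a compile error (use of moved value).
    println!("{z:?}");
}
```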

And you can borrow things that you own. Borrowing is the act of creating a pointer (called a reference in Rust, although it is closer to a pointer than to a C++ reference):

In [11]:
{ // blocks are due to jupyter (evcxr) limitations
    let x = 5;
    let y = &x; // borrow of `x` occurs here. `y` is a reference to `x`
    let z = *y; // we can read the original i32 here with dereferencing operator
    z
}

5

But in normal Rust code we rarely use the dereference operator explicitly. For example, method calls and field accesses (the `.` operator) dereference the receiver as many times as needed, so Rust doesn't need a `->` operator.

In [22]:
let x = vec![1, 2, 3, 4, 5];
{
    let y = &&&&&x; // a reference of reference of reference of ... of x
    // those are equal:
    println!("{}", y.len());
    println!("{}", (**y).len());
    println!("{}", (*****y).len());
};

5
5
5


The index operator `v[x]` behaves the same way:

In [24]:
{
    let y = &&&&&x;
    // those are equal:
    println!("{}", y[2]);
    println!("{}", (**y)[2]);
    println!("{}", (*****y)[2]);
};

3
3
3


The same holds for many other operations. A reference in Rust is just a handle for accessing a value, and its own identity rarely matters. Even comparing references with `==` dereferences them automatically and compares the values.

In [26]:
{
    let a = vec![1, 2, 3];
    let b = vec![1, 2, 3];
    let ref1_to_a = &a;
    let ref2_to_a = &a;
    let ref_to_b = &b;
    println!("{}", a == b); // a and b are equal    
    println!("{}", ref1_to_a == ref_to_b); // reference equality uses value semantics
    // but we can ask manually for pointer equality
    println!("{}", std::ptr::eq(ref1_to_a, ref_to_b));
    println!("{}", std::ptr::eq(ref1_to_a, ref2_to_a));
};

true
true
false
true


References are immutable by default, so you can't change the referred value through them:

In [5]:
{
    let x = 5;
    let y = &x;
    *y = 2;
    x
}

Error: cannot assign to `*y`, which is behind a `&` reference

The compiler error is clear and to the point. Let's fix that by making both the variable and the reference mutable.

In [15]:
{
    let mut x = 5;
    let y = &mut x;
    *y = 2;
    x
}

2

Now that we know what ownership, borrowing and references are, we can understand the rules of borrowing:
* At any given time, you can have either one mutable reference or any number of immutable references to a value.
* References must always be valid. That is, the owner must not be moved or dropped while there is a live borrow.

These simple rules enforce memory and thread safety in Rust.

Checking these rules is the job of the borrow checker (BC) inside the compiler.

We can see these rules being enforced in action:

No two mutable references:

In [6]:
{
    let mut x = 5;
    let y = &mut x;
    let z = &mut x;
    *y = *z + 5;
    *z
}

Error: cannot borrow `x` as mutable more than once at a time

No mutable and immutable references at the same time:

In [7]:
{
    let mut x = 5;
    let y = &mut x;
    let z = &x;
    *y = *z + 5;
    *z
}

Error: cannot borrow `x` as immutable because it is also borrowed as mutable

No ownership change while borrowing:

In [8]:
{
    let x = vec![1, 2, 3];
    let y = &x;
    let z = x;
    (y.len(), z.len())
}

Error: cannot move out of `x` because it is borrowed
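
For contrast with the three rejected snippets, here is a minimal sketch of code that satisfies both rules. It relies on the fact that a borrow ends at the last use of the reference, so shared and mutable borrows can follow each other as long as they do not overlap. The function name `borrow_in_turn` is made up for illustration.

```rust
fn borrow_in_turn() -> (i32, i32) {
    let mut x = 5;
    // Any number of shared (immutable) references may coexist.
    let a = &x;
    let b = &x;
    let sum_before = *a + *b; // last use of `a` and `b`: their borrows end here
    // No other reference is live any more, so one mutable reference is allowed.
    let m = &mut x;
    *m += 1; // last use of `m`
    // The owner is usable again once the mutable borrow has ended.
    (sum_before, x)
}

fn main() {
    println!("{:?}", borrow_in_turn()); // prints (10, 6)
}
```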

Now we are ready for our original code:

In [9]:
let mut x = 0;
thread::scope(|s| {
    s.spawn(|| {
        x += 2;
        println!("{x}");
    });
    x += 2;
    println!("{x}");
});
println!("{x}");

Error: cannot use `x` because it was mutably borrowed

Error: cannot assign to `x` because it is borrowed

Closures capture variables from their environment by reference by default (this can be changed to capturing by move), so the thread's closure mutably borrows `x`, and we cannot use `x` again while that borrow is live. Rust caught this trivial data race.
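
As an aside on the "by move" option: a `move` closure takes ownership of what it captures, which is also what lets `thread::spawn` accept a closure that uses local data. The sketch below uses made-up function names and is only meant to show the two typical outcomes: a `Vec` is really moved into the thread, while an `i32` is `Copy`, so the thread works on its own copy and the original stays unchanged.

```rust
use std::thread;

/// The vector is moved into the thread, so the thread cannot outlive its data.
fn moved_into_thread() -> usize {
    let data = vec![1, 2, 3];
    let handle = thread::spawn(move || data.len());
    // `data` is not usable here any more; the result comes back through `join`.
    handle.join().unwrap()
}

/// An `i32` is `Copy`, so `move` gives the thread its own copy of `x`.
fn copy_into_thread() -> (i32, i32) {
    let x = 0;
    let handle = thread::spawn(move || {
        let mut local = x;
        local += 2;
        local
    });
    let from_thread = handle.join().unwrap();
    // The original `x` was never touched by the thread.
    (x, from_thread)
}

fn main() {
    println!("{}", moved_into_thread()); // prints 3
    println!("{:?}", copy_into_thread()); // prints (0, 2)
}
```

Neither variant shares mutable state between threads, which is why both compile.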

Let's remove the second `x += 2` (the one in the parent thread) from the rejected code.

In [10]:
let mut x = 0;
thread::scope(|s| {
    s.spawn(|| {
        x += 2;
        println!("{x}");
    });
    println!("{x}");
});
println!("{x}");

Error: cannot borrow `x` as immutable because it is also borrowed as mutable

Rust doesn't even allow this. This is UB in C++ as well: some architectures may write an integer in several steps, and compiler optimizations may reorder or remove writes, leaving the reader with an invalid value. To make it clearer why this is wrong, we can use a vector:

In [11]:
let mut x = vec![1, 2, 3];
thread::scope(|s| {
    s.spawn(|| {
        x.push(4);
    });
    println!("{}", x[3]);
});
println!("{}", x.len());

Error: cannot borrow `x` as immutable because it is also borrowed as mutable

Here, `x.push(4)` is not a single atomic step: it may grow the vector's capacity (reallocating the buffer), it writes `4` into its slot, and it updates the length. If a context switch happens between those steps, the reader might read an uninitialized value (or freed memory) as `x[3]` and think it is the target value, which is very bad, and Rust prevents us from doing that.

You: So Rust is pretty restrictive, and we can never mutate a variable from two threads?

We can mutate a variable from two threads, but we have to do it correctly.

## Mutex

A mutex in Rust is similar to the one in C++, but it also owns the data it protects. Locking a mutex returns a guard that provides mutable access to the data and unlocks the mutex when it is dropped.

In [12]:
use std::sync::Mutex;

{
    let x = Mutex::new(vec![1, 2, 3]); // no `mut` needed: `lock` takes `&self`
    thread::scope(|s| {
        s.spawn(|| {
            let mut x_guard = x.lock().unwrap();
            x_guard.push(4);
        });
        let x_guard = x.lock().unwrap();
        // `x_guard[3]` is possible as well, but it panics if the element does not exist.
        // This prints `None` or `Some(4)`, depending on which thread takes the lock first.
        println!("{:?}", x_guard.get(3));
    });
    let x_guard = x.lock().unwrap();
    println!("{}", x_guard.len());
};

None
4
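
The guard behaviour described above (access through the guard, unlock on drop) can be observed in a single thread. This is a minimal sketch, with the made-up function name `guard_demo`, that uses `try_lock` to probe whether the mutex is currently held:

```rust
use std::sync::Mutex;

/// Returns (locked while the guard is alive, locked after it is dropped, final contents).
fn guard_demo() -> (bool, bool, Vec<i32>) {
    let data = Mutex::new(vec![1, 2, 3]);
    let locked_while_guard_alive;
    {
        let mut guard = data.lock().unwrap();
        guard.push(4); // the guard dereferences to the protected Vec
        // `try_lock` never blocks; it fails because `guard` still holds the lock.
        locked_while_guard_alive = data.try_lock().is_err();
    } // `guard` is dropped here, which unlocks the mutex
    let locked_after_drop = data.try_lock().is_err();
    // We own the mutex and nothing borrows it, so we can take the data out.
    let contents = data.into_inner().unwrap();
    (locked_while_guard_alive, locked_after_drop, contents)
}

fn main() {
    println!("{:?}", guard_demo()); // prints (true, false, [1, 2, 3, 4])
}
```

There is no `unlock` method to forget: the only way to release the lock is to let the guard go out of scope or to `drop` it explicitly.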


Now let's do something meaningful, like computing the sum of `1..10000` using 10 threads.

In [14]:
// single threaded
(1..10000).sum::<i32>()

49995000

In [15]:
use std::mem::drop;

let to_be_computed = Mutex::new(0);
let result = Mutex::new(0);
thread::scope(|s| {
    for thread_id in 0..10 {
        // we want to capture mutexes by reference, but thread_id by value, so we will create
        // these references and move them into the closure
        let result = &result;
        let to_be_computed = &to_be_computed;
        s.spawn(move || { // move closures capture everything by move
            let mut to_be_computed_guard = to_be_computed.lock().unwrap();
            let t = *to_be_computed_guard;
            *to_be_computed_guard = t + 1000;
            drop(to_be_computed_guard); // unlock the lock for other threads
            let mut tmp = 0;
            for i in t..t+1000 {
                tmp += i;
            } // or tmp = (t..t+1000).sum()
            println!("result for {t} from thread {thread_id} is {tmp}");
            let mut result_guard = result.lock().unwrap();
            *result_guard += tmp;
        });
    }
});
// Here, since we own the Mutex, and there is no live reference to it at this point, we
// are allowed to consume it and read its value without locking.
println!("final result is {}", result.into_inner().unwrap());

result for 0 from thread 0 is 499500
result for 1000 from thread 1 is 1499500
result for 2000 from thread 2 is 2499500
result for 3000 from thread 3 is 3499500
result for 4000 from thread 4 is 4499500
result for 5000 from thread 5 is 5499500
result for 6000 from thread 7 is 6499500
result for 7000 from thread 6 is 7499500
result for 8000 from thread 9 is 8499500
result for 9000 from thread 8 is 9499500
final result is 49995000


A mutex around an integer is overkill. We can use atomics for simplicity and better performance:

In [16]:
use std::sync::atomic::{AtomicI32, Ordering};

let to_be_computed = AtomicI32::new(0);
let result = AtomicI32::new(0);
thread::scope(|s| {
    for thread_id in 0..10 {
        // we want to capture the atomics by reference, but thread_id by value, so we will create
        // these references and move them into the closure
        let result = &result;
        let to_be_computed = &to_be_computed;
        s.spawn(move || { // move closures capture everything by move
            // fetch_add returns the value from before the addition
            let t = to_be_computed.fetch_add(1000, Ordering::SeqCst);
            let mut tmp = 0;
            for i in t..t+1000 {
                tmp += i;
            } // or tmp = (t..t+1000).sum()
            println!("result for {t} from thread {thread_id} is {tmp}");
            result.fetch_add(tmp, Ordering::Relaxed);
        });
    }
});
println!("final result is {}", result.into_inner());

result for 0 from thread 1 is 499500
result for 1000 from thread 0 is 1499500
result for 2000 from thread 2 is 2499500
result for 3000 from thread 3 is 3499500
result for 4000 from thread 5 is 4499500
result for 5000 from thread 4 is 5499500
result for 6000 from thread 6 is 6499500
result for 7000 from thread 7 is 7499500
result for 8000 from thread 9 is 8499500
result for 9000 from thread 8 is 9499500
final result is 49995000
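
The key property used above is that `fetch_add` adds to the atomic and returns the previous value in one indivisible step, so no two threads can ever claim the same chunk. A small sketch of just that part (the function name `claim_chunks` and the thread count of 4 are arbitrary choices for illustration):

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

/// Lets 4 threads each claim a chunk start; returns the sorted starts and the final counter.
fn claim_chunks() -> (Vec<i32>, i32) {
    let next = AtomicI32::new(0);
    let mut starts: Vec<i32> = thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..4 {
            // `fetch_add` takes `&self`, so a shared borrow of `next` is enough.
            handles.push(s.spawn(|| next.fetch_add(1000, Ordering::SeqCst)));
        }
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // Which thread got which chunk depends on scheduling, so sort before comparing.
    starts.sort();
    (starts, next.into_inner())
}

fn main() {
    let (starts, next) = claim_chunks();
    println!("{starts:?} {next}"); // prints [0, 1000, 2000, 3000] 4000
}
```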


We can compute the same sum even more easily with Rayon. Rayon plays a role in Rust similar to that of OpenMP in C and C++. By default it creates a thread pool with one thread per available CPU core and distributes the work automatically.

In [17]:
:dep rayon = "1.5"

In [18]:
use rayon::prelude::*;

(1..10000i32).into_par_iter().sum::<i32>()

49995000

Rust's thread safety guarantees hold in third-party crates as well. To see this, let's first calculate the sum of the even numbers in that range:

In [19]:
(1..10000i32).into_par_iter().filter(|x| x % 2 == 0).sum::<i32>()

24995000

Rayon automatically runs `x % 2 == 0` on different threads. Let's introduce a data race:

In [20]:
let mut some_variable = 1;
(1..10000i32)
    .into_par_iter()
    .filter(|x| {
        some_variable += 1;
        x % some_variable == 0
    })
    .sum::<i32>()

Error: cannot assign to `some_variable`, as it is a captured variable in a `Fn` closure

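In short, Rayon requires the closure to be `Fn`, and an `Fn` closure cannot mutate the variables it captures. Further details are omitted for now.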
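
To give a rough idea of how a library can enforce this without any special compiler support, here is a sketch of a hypothetical function `count_matches` (not Rayon's API, just a stand-in written with scoped threads) that calls its predicate from two threads and therefore asks for `Fn + Sync`. A closure that did `some_variable += 1` would only be `FnMut` and would be rejected by this bound, while a closure that counts through an atomic is accepted:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for a parallel library API: `predicate` is called
/// from two threads at once, so it must be `Fn` (callable through `&`) and `Sync`.
fn count_matches<P>(items: &[i32], predicate: P) -> usize
where
    P: Fn(&i32) -> bool + Sync,
{
    let (left, right) = items.split_at(items.len() / 2);
    let predicate = &predicate; // both threads share one reference to the closure
    thread::scope(|s| {
        let l = s.spawn(move || left.iter().filter(|x| predicate(*x)).count());
        let r = s.spawn(move || right.iter().filter(|x| predicate(*x)).count());
        l.join().unwrap() + r.join().unwrap()
    })
}

fn main() {
    let calls = AtomicUsize::new(0);
    let evens = count_matches(&[1, 2, 3, 4, 5, 6], |x| {
        // `fetch_add` only needs `&self`, so this closure is still `Fn`.
        calls.fetch_add(1, Ordering::Relaxed);
        x % 2 == 0
    });
    println!("{evens} even numbers, {} calls", calls.load(Ordering::Relaxed));
}
```

The thread safety check is thus expressed in the function signature, which is why it works the same way in third-party crates.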

### Things we learnt
* Data races are impossible in safe Rust
* Ownership, borrowing and references
* Mutex

### Things we have seen
* Atomics
* Rayon

### Things we didn't learn
* Channels
* Arc
* Send and Sync
