# Memory safety for 2275: Piloting Rust

## **Jed Brown**, CU Boulder


## 2025-02-18

# What does this function do?

```c
int table[4];
bool exists_in_table(int v) {
    for (int i = 0; i <= 4; i++) {
        if (table[i] == v) return true;
    }
    return false;
}
```
Compiles and runs cleanly with `-Wall -Wextra -fstack-protector`

---
* https://godbolt.org/z/64Yxsr31f
* https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=633

```asm
exists_in_table:
        mov     al, 1
        ret
```

That is, as if we had written

```c
bool exists_in_table(int v) {
    return true;
}
```
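
For contrast, here is a minimal Rust sketch of the same lookup. The names `TABLE`, `exists_in_table`, and `exists_in_table_buggy` are hypothetical counterparts of the C code, not from any library. The point is that slice indexing is bounds-checked, so the off-by-one becomes a defined panic rather than UB the optimizer can exploit.

```rust
// Hypothetical Rust counterpart of the C example; zero-initialized like the C global.
static TABLE: [i32; 4] = [0; 4];

/// Idiomatic search: no index arithmetic, so there is no off-by-one to get wrong.
fn exists_in_table(v: i32) -> bool {
    TABLE.contains(&v)
}

/// Direct transliteration of the buggy C loop (`i <= 4`).
/// Indexing is bounds-checked, so `TABLE[4]` panics instead of being UB.
fn exists_in_table_buggy(v: i32) -> bool {
    for i in 0..=4 {
        if TABLE[i] == v {
            return true;
        }
    }
    false
}

fn main() {
    println!("{}", exists_in_table(0)); // the table holds only zeros
    println!("{}", exists_in_table(7));
    // The transliterated loop reaches `TABLE[4]` and panics; catch it to keep going.
    let outcome = std::panic::catch_unwind(|| exists_in_table_buggy(7));
    println!("buggy loop panicked: {}", outcome.is_err());
}
```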

# What does this program print?

```c++
int main() {
    std::vector<int> v {10, 11, 12};
    if (coin_flip) v.pop_back();
    int &vref = v[1];
    v.push_back(13);
    std::cout << vref << std::endl;
    return 0;
}
```

---
* https://godbolt.org/z/aMs9fTKhG
* https://cacm.acm.org/research/safe-systems-programming-in-rust/

```
11
```

## Or

```
5
```
(or **any behavior at all!**)

# What about this?

```c++
int main () {
    std::string s = "Helloooooooooooooooooo ";
    std::string_view sv = s + "World\n";
    std::cout << sv;
}
```


* https://godbolt.org/z/E9ebTM9nq

```
���G�9�Ԇۮoooooo World
```

## Why?

* `s + "World\n"` creates a temporary `std::string` that is destroyed at the end of the full expression (the `;`), so `sv` is left dangling and printing it is a use-after-free.

* Clang's `-Wdangling-gsl` detects the simple case, but not trivially different code such as
```c++
    std::string_view sv;
    sv = s + "World\n";
```
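
A rough Rust analogue, as a sketch: `&str` plays the role of `std::string_view`, and the helper `greeting` is a hypothetical name introduced here. Borrowing from the temporary is rejected at compile time, so the accepted form has to give the owned string a name first.

```rust
/// Build the greeting as an owned `String`; the caller decides how long it lives.
fn greeting(s: &str) -> String {
    format!("{s}World\n")
}

fn main() {
    let s = String::from("Helloooooooooooooooooo ");

    // Rejected at compile time (E0716: temporary value dropped while borrowed):
    // let sv: &str = (s.clone() + "World\n").as_str();
    // print!("{sv}");

    // Accepted: bind the owned String to a name, then borrow a view of it.
    let owned = greeting(&s);
    let sv: &str = &owned;
    print!("{sv}");
}
```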

# Or this?

```c++
int main () {
    std::vector v { 11, 12, 13 };
    for (int i: v) {
        if (i % 2 == 0) v.push_back(i);
        std::cout << i << std::endl;
    }
}
```

* https://godbolt.org/z/5Yb4jE1xj

```
11
12
741750957
```
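
The same pattern in Rust, as a sketch of one possible reading of the loop's intent (append a copy of each even element). The function name `append_evens` is hypothetical. Mutating the vector while iterating over it does not compile, so the reads are finished before the writes begin.

```rust
/// Append a copy of every even element of `v` to its end.
fn append_evens(v: &mut Vec<i32>) {
    // Rejected at compile time (E0502): the iterator holds a shared borrow
    // of `v` while `push` needs a mutable one.
    // for i in v.iter() { if i % 2 == 0 { v.push(*i); } }

    // Accepted: finish reading before mutating.
    let evens: Vec<i32> = v.iter().copied().filter(|i| i % 2 == 0).collect();
    v.extend(evens);
}

fn main() {
    let mut v = vec![11, 12, 13];
    append_evens(&mut v);
    for i in &v {
        println!("{i}");
    }
}
```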

# Undefined Behavior (UB) is painful and costly

* UB is masked by abstraction
  * You're never looking at the whole context
* Debug/print statements can mask UB
* Reliably avoiding UB in code review and CI is intractable
  * tools help with some forms, but lack of detection is not lack of UB
* It's hard for new developers to "learn" the paranoia that seasoned developers have
  * Part of being burned is learning the arcane tools to debug
  * Even experts make these mistakes
* The cognitive load is a tax on your critical and creative thinking

<img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQt1XoSwUrAiUmN6tbntYLZ-IsBBV-e2aAKIKJJcavncM9t6IwD4LVlse0OSiA5ecs52_wkiaUml_9MoncUNOU8wxajv3dPonrtVlV31TJW6bKBs6mPNec7jb12rX18VRI0VwhETljd2QEp0kQ4oFQZBNq0pwoH-EedxhThqfwD73s0dqZALf_nGPkPMdK/s1600/graph.png" width="90%" />

[Google Project Zero](https://security.googleblog.com/2024/11/retrofitting-spatial-safety-to-hundreds.html) analysis of CVE exploits, attributed by class of memory safety bug:
* **spatial**: out-of-bounds indexing
* **temporal**: use-after-free
* **type**: invalid conversion (e.g., `bool` or `enum`)
* **initialization**: uninitialized variables/memory
* **data-race**: threads, devices, signal handler

<a href="https://media.defense.gov/2023/Dec/06/2003352724/-1/-1/0/THE-CASE-FOR-MEMORY-SAFE-ROADMAPS-TLP-CLEAR.PDF"><img src="figures/rust/case-for-memory-safe-roadmaps.png" width="90%" /></a>

## CISA: [Product Security Bad Practices](https://www.cisa.gov/resources-tools/resources/product-security-bad-practices)

> The development of new product lines for use in service of critical infrastructure or NCFs **in a memory-unsafe language (e.g., C or C++)** where readily available alternative memory-safe languages could be used **is dangerous and significantly elevates risk to national security, national economic security, and national public health and safety**.

### CISA: [Secure by Design](https://www.cisa.gov/securebydesign)

> **Prioritize the use of memory safe languages wherever possible.** [...] Some examples of modern memory safe languages include C#, Rust, Ruby, Java, Go, and Swift.


# Google

## [Secure by Design: Google's Perspective on Memory Safety](https://research.google/pubs/secure-by-design-googles-perspective-on-memory-safety/) (2024)
> We see no realistic path for an evolution of C++ into a language with rigorous memory safety guarantees that include temporal safety. As a consequence, we are considering a gradual transition of C++ code at Google towards other languages that are memory safe.

> Rust is the only mature, production-ready language that provides temporal safety without run-time mechanisms such as garbage collection or universally-applied refcounting, for large classes of code.

## [Safer with Google: Advancing Memory Safety](https://security.googleblog.com/2024/10/safer-with-google-advancing-memory.html) (2024)

<img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSRbsz3UFa32nAEek2cEOIN-IM5XN6df3vibnuP7nmzJoYLMAfkHgjlAcbCbjGmV0THU_CMtP9vgs3EHHe7zwRqeuXbQoxA_EGrqDMLDRJShnakXuMxesVqDJaq2xPWcpyqCcRpvW3-ZWJiZu2LXtyEs23CvI4jOBkw89T1iSVWHl-j4OYMsC0EN0E4dFh/s600/memory%20safety%20graphic.png" width="100%" />

# Microsoft

## [Microsoft Azure CTO](https://www.theregister.com/2022/09/20/rust_microsoft_c/) (2022)

> Speaking of languages, it's time to **halt starting any new projects in C/C++ and use Rust** for those scenarios where a non-garbage collected language is required. [...] For the sake of security and reliability, the industry should declare those languages as deprecated.

* Rust [is used](https://www.theregister.com/2024/11/04/windows_11_market_share/) in the Windows 11 24H2 kernel
* DWriteCore (text analysis, layout, rendering) is [mostly Rust](https://www.theregister.com/2023/04/27/microsoft_windows_rust/); similar for Windows graphics device interface (GDI)
* [Rust for Windows](https://learn.microsoft.com/en-us/windows/dev-environment/rust/rust-for-windows) is officially supported by Microsoft
* [Several](https://github.com/omarabid/rust-companies) Azure services

# Broad industry use

## Amazon/AWS
* Core language activities and leadership
* [Formal verification](https://aws.amazon.com/blogs/opensource/verify-the-safety-of-the-rust-standard-library/) of the Rust Standard Library
* Core products substantially or entirely [written in Rust](https://aws.amazon.com/blogs/devops/why-aws-is-the-best-place-to-run-rust/): S3, CloudFront, EC2, Nitro System, Lambda
  
## [Safety-Critical Rust Consortium](https://rustfoundation.org/safety-critical-rust-consortium/) <a href="https://ferrocene.dev/en/"><img src="https://ferrocene.dev/media/images/logo.svg" width="30%" align="right" /></a>

* ISO 26262 (ASIL D), IEC 61508 (SIL 4) and IEC 62304
* [Volvo](https://corrode.dev/podcast/s03e08-volvo/), [Woven by Toyota](https://rustfoundation.org/safety-critical-rust-consortium/), and other automotive

<img src="figures/rust/rust-foundation-members.png" width="100%" />

* Cloudflare, Hugging Face, Linux kernel, Discord, Mozilla

# Rust: a type-safe systems language

```rust
fn main() {
    let mut v = vec![10, 11, 12];
    let vref = &v[1];
    v.push(13);
    println!("{}", *vref);
}
```

**Type-safe/memory-safe**: cannot create undefined behavior without using the `unsafe` keyword

**Near-zero cost**: most safety enforced at compile-time; some dynamic checks (often optimized out)

**Expressive, low-level control**: unboxed, space-efficient, ergonomic

<pre><font color="#F66151"><b>error[E0502]</b></font><b>: cannot borrow `v` as mutable because it is also borrowed as immutable</b>
 <font color="#2A7BDE"><b>--&gt; </b></font>src/main.rs:4:5
  <font color="#2A7BDE"><b>|</b></font>
<font color="#2A7BDE"><b>3</b></font> <font color="#2A7BDE"><b>|</b></font>     let vref = &amp;v[1];
  <font color="#2A7BDE"><b>|</b></font>                 <font color="#2A7BDE"><b>-</b></font> <font color="#2A7BDE"><b>immutable borrow occurs here</b></font>
<font color="#2A7BDE"><b>4</b></font> <font color="#2A7BDE"><b>|</b></font>     v.push(13);
  <font color="#2A7BDE"><b>|</b></font>     <font color="#F66151"><b>^^^^^^^^^^</b></font> <font color="#F66151"><b>mutable borrow occurs here</b></font>
<font color="#2A7BDE"><b>5</b></font> <font color="#2A7BDE"><b>|</b></font>     println!(&quot;{}&quot;, *vref);
  <font color="#2A7BDE"><b>|</b></font>                    <font color="#2A7BDE"><b>-----</b></font> <font color="#2A7BDE"><b>immutable borrow later used here</b></font>

<b>For more information about this error, try `rustc --explain E0502`.</b></pre>
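
Two ways the example might be rewritten so that it compiles, sketched under the assumption that only the value of `v[1]` is needed. The helper name `second_then_push` is hypothetical.

```rust
/// Copy the element out, so no borrow of `v` is alive across the `push`.
fn second_then_push(v: &mut Vec<i32>) -> i32 {
    let second = v[1]; // `i32` is `Copy`: this is a value, not a reference
    v.push(13);
    second
}

fn main() {
    let mut v = vec![10, 11, 12];
    let second = second_then_push(&mut v);
    println!("{second}");

    // Alternatively, take the reference only after the mutation is finished.
    let vref = &v[1];
    println!("{}", *vref);
}
```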

# Rust ecosystem

## rustup
Cross-platform toolchain management

## Cargo

* `Cargo.toml`
```toml
[dependencies]
mpi = { version = "0.8.0", features = ["derive"] }
```
* Builds in parallel across your dependency graph

## `cargo run`

Rebuilds if necessary

## `cargo test`

Unit testing, doctests, integration tests, custom test harnesses, editor integration.
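
A minimal sketch of what a unit test looks like, assuming a binary crate whose `src/main.rs` contains a hypothetical function `exists_in`. Running `cargo test` compiles the `tests` module and runs each `#[test]` function.

```rust
// Sketch of `src/main.rs`; `exists_in` is a hypothetical function under test.
fn exists_in(table: &[i32], v: i32) -> bool {
    table.contains(&v)
}

fn main() {
    println!("{}", exists_in(&[10, 11, 12], 11));
}

// Compiled only for `cargo test`, so the tests add nothing to a normal build.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn finds_present_value() {
        assert!(exists_in(&[10, 11, 12], 11));
    }

    #[test]
    fn rejects_absent_value() {
        assert!(!exists_in(&[10, 11, 12], 5));
    }
}
```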

## `cargo doc`

Cross-referenced documentation including doctests; [docs.rs](https://docs.rs) integrated with [crates.io](https://crates.io)

## `cargo fix`
Automatically fix lint warnings

## `cargo fmt`
Format code consistently with `rustfmt`

## rust-analyzer

IDE integration, works for any project without setup steps


# LLMs and critical thinking

* [The Impact of Generative AI on Critical Thinking](https://www.microsoft.com/en-us/research/uploads/prod/2025/01/lee_2025_ai_critical_thinking_survey.pdf) (Microsoft Research)

> Specifically, higher confidence in GenAI is associated with less critical thinking, while higher self-confidence is associated with more critical thinking.

* [Google's DevOps Report](https://redmonk.com/rstephens/2024/11/26/dora2024/) shows a grave impact on stability

> * if AI adoption increases by 25%, time spent doing valuable work is estimated to decrease 2.6%
> * if AI adoption increases by 25%, estimated throughput delivery is expected to decrease by 1.5%
> * if AI adoption increases by 25%, estimated delivery stability is expected to decrease by 7.2%



* [Uplevel Data Lab](https://resources.uplevelteam.com/gen-ai-for-coding)

> Developers with Copilot access saw a [41%] higher bug rate while their issue throughput remained consistent. 

## Anti-patterns today

* Students today often complete assignments by getting them partly working and then poking at them with external sources such as StackOverflow and, increasingly, LLMs.
* C++ error messages are intimidating, pushing people to give up, pattern-match, or reach for LLMs.
* C++ environments have lots of incidental complexity and gotchas, with different orgs adopting mutually incompatible conventions.


# Quality diagnostics support learning

<img src="figures/rust/gankra-ekuber.jpg" width="100%" />

* This ethos permeates the ecosystem and is a central factor in language evolution

<!-- ## [Stability without stagnation](https://doc.rust-lang.org/book/appendix-07-nightly-rust.html)

Experimental features are available only on `nightly`, not the `stable` release channel. -->

## [Miri](https://github.com/rust-lang/miri): An interpreter for Rust's mid-level intermediate representation

* `cargo miri run`
* Detects and explains when `unsafe` code leads to UB
* Detects kinds of UB that Valgrind, stack protectors, and AddressSanitizer miss (e.g., aliasing violations and invalid values)
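
A small sketch of the kind of `unsafe` code Miri is meant to check. The function `read_raw` is a hypothetical example, written here so that it is sound as shown. The comments describe how it would become UB and what to expect from Miri in that case, without quoting its exact report.

```rust
/// Read element `i` through a raw pointer, checking the bound by hand.
fn read_raw(v: &[i32], i: usize) -> Option<i32> {
    if i < v.len() {
        // SAFETY: `i` is in bounds, so the pointer is valid for reads.
        Some(unsafe { *v.as_ptr().add(i) })
    } else {
        None
    }
}

fn main() {
    let v = vec![10, 11, 12];
    println!("{:?}", read_raw(&v, 2)); // in bounds
    println!("{:?}", read_raw(&v, 3)); // rejected by the manual check

    // Deleting the `i < v.len()` check would make `read_raw(&v, 3)` UB.
    // A normal build may print garbage or appear to work, whereas
    // `cargo miri run` is expected to stop with an out-of-bounds report.
}
```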

# Why CSCI-2275 to pilot a Rust version?

* Smaller class sizes while being representative of the core sequence
* It has a culture of modification
* No "language change" for 1300 -> 2270
* Later courses are not overly tied to C++ (e.g., 2400 has only one assignment that is tied to C/unsafe pointers)

## How can we evolve our curriculum to raise the professionalism of our graduates as software engineers in the 2030s?