<div align="center">
    <h1>DS-210: Programming for Data Science</h1>
    <h1>Lecture 7</h1>
</div>


# Enums (§6.1) and Pattern Matching, Structs (§5), and Introduction to Memory Management

So far we looked at:

* scalar datatypes like ints, floats, booleans, chars (lecture_04.ipynb, §3.2 -- Scalar types)
* String literals, string slices (`&str`) and `String` types (lecture_04.ipynb, §3.2, §4.3)
* Arrays (lecture_04.ipynb, §3.2 -- Compound types)
* Tuples (lecture_04.ipynb, §3.2 -- Compound types)

> §m.n refers to chapter m, section n in the [Rust book](https://doc.rust-lang.org/book/title-page.html).

# Enums

`enum` is short for "enumeration" and allows you to define a _type_ by enumerating
its possible _variants_.

The type you define can only take on one of the variants you have defined.

Enums allow you to encode meaning along with data.

Pattern matching using `match` and `if let` allows you to run different code depending on the value of the enum.

> Python doesn't have native support for `enum`, but it does have an
> [`enum` module](https://docs.python.org/3/library/enum.html) that lets you do something
> similar by subclassing an `Enum` class.

Let's start with a simple example:

In [2]:
// define the enum and its variants
enum Direction {
    North,
    East,
    South,
    West,
    SouthWest,
}

// create instances of the enum variants
let dir_1 = Direction::North;   // dir_1 is inferred to be of type Direction
let dir_2: Direction = Direction::South; // dir_2 is explicitly of type Direction

The `enum` declaration defines our new type, so now a type called `Direction` is in scope,
similar to `i32`, `f64`, `bool`, etc., but its instances can only be one of the variants we have defined.

The `let` declarations are creating instances of the `Direction` type.

You can bring the variants into scope using `use` statements.

In [3]:
// Bring the variant `East` into scope
use Direction::East;

// we didn't have to specify "Direction::"
let dir_3 = East;

In [4]:
// Bringing two options into the current scope
use Direction::{East,West};
let dir_4 = West;

In [5]:
// Bringing all options in
use Direction::*;
let dir_5 = South;

Why might we not always want to bring all the variants into scope?
<br><br>

In [6]:
let MyVar = "my string";

enum Prohibited {
    MyVar,
    YourVar,
}

let another_var = Prohibited::MyVar;

// what happens if we bring all the variants into scope?
// use Prohibited::*;

println!("{MyVar}");

my string


We can also define a function that takes our new type as an argument.

```Rust
fn turn(dir: Direction) { ... }
```
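
As a minimal sketch of such a function, the block below uses a hypothetical `turn_right` (the name and its behavior are illustrative assumptions, not part of the lecture code). It takes a `Direction` and returns the direction 90 degrees clockwise, using `match`, which is covered in the next section.

```rust
// Hypothetical example using the four-variant Direction from later cells.
// The derive line lets us print and compare values; both traits come up later.
#[derive(Debug, PartialEq)]
enum Direction {
    North,
    East,
    South,
    West,
}

// Takes a Direction by value and returns the direction 90 degrees clockwise.
fn turn_right(dir: Direction) -> Direction {
    match dir {
        Direction::North => Direction::East,
        Direction::East => Direction::South,
        Direction::South => Direction::West,
        Direction::West => Direction::North,
    }
}

fn main() {
    let dir = Direction::North;
    println!("{:?}", turn_right(dir)); // prints "East"
}
```

Because the parameter type is `Direction`, the compiler rejects any call that passes something other than one of the four variants.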


## Enums: Control Flow with `match`

The `match` statement is used to control flow based on the value of an enum.



In [16]:
enum Direction {
    North,
    East,
    South,
    West,
}
let dir = Direction::East;

// print the direction
match dir {
    Direction::North => println!("N"),
    Direction::South => println!("S"),
    Direction::West => {  // can do more than one thing
        println!("Go west!");
        println!("W")
    }
    Direction::East => println!("E"),
};

E


`match` is exhaustive, so we must cover all the variants.

In [8]:
let dir_2: Direction = Direction::South;

// won't work 
match dir_2 {
    North => println!("N"),
    South => println!("S"),
    // East and West not covered
};

Error: non-exhaustive patterns: `Direction::East` and `Direction::West` not covered

But there is a way to match anything left.

In [11]:
let dir_2: Direction = Direction::North;

match dir_2 {
    North => println!("N"),
    South => println!("S"),
    
    // match anything left
    _ => (),  // covers all the other variants but doesn't do anything
}

N


()

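**WARNING!!** Arms are tried from top to bottom, so a `_` arm placed first matches everything and the arms below it never run.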

In [12]:

match dir_2 {
    _ => println!("anything else"),
    
    // will never get here!!
    North => println!("N"),
    South => println!("S"),
}

anything else


()

### Recap of `match`
* A type of switch statement, as in C/C++ (Python has offered a comparable `match` statement since version 3.10)
* Must be exhaustive, though there is a way to specify a default (`_ =>`)
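
The switch-like use of `match` is not limited to enums. The sketch below matches on an integer; the function `describe_roll` is a hypothetical helper introduced only for illustration, and the `|` syntax for combining several values in one arm goes slightly beyond what the cells above show.

```rust
// Hypothetical helper: classify a die roll with `match`.
fn describe_roll(roll: u32) -> &'static str {
    match roll {
        1 => "lowest",
        6 => "highest",
        2 | 3 | 4 | 5 => "middle", // several values can share one arm
        _ => "not a valid roll",   // default arm, placed last
    }
}

fn main() {
    for roll in [1, 4, 6, 9] {
        println!("{} -> {}", roll, describe_roll(roll));
    }
}
```

Without the `_` arm this would not compile, since a `u32` has many values that the other arms do not cover.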

## Putting Data in an Enum Variant

* Each variant can come with additional information

In [19]:
#[derive(Debug)]   // allows us to print the enum by having Rust automatically implement the Debug trait (more later)
enum DivisionResult {
    Ok(u32),    // This variant has an associated value of type u32
    DivisionByZero,
}

// This function returns a DivisionResult which can handle the case where the division is by zero
fn divide(x:u32, y:u32) -> DivisionResult {
    if y == 0 {
        return DivisionResult::DivisionByZero;
    } else {
        return DivisionResult::Ok(x / y); // Provide a value with the variant
    }
}

let (a,b) = (9,3);  // this is just short hand for let a = 9; let b = 3;

// we can call `divide` and handle the result
match divide(a,b) {
    DivisionResult::Ok(result)  // assign the variant value to result
        => println!("the result is {}",result),
    DivisionResult::DivisionByZero
        => println!("noooooo!!!!"),
};

// we can also call `divide`, store the result and print it
let z = divide(5, 4);
println!("The result is {:?}", z);

the result is 3
The result is Ok(1)


We can have more than one associated value in a variant.

In [3]:
enum DivisionResultWithRemainder {
    Ok(u32,u32),  // Store the result of the integer division and the remainder
    DivisionByZero,
}

fn divide_with_remainder(x:u32, y:u32) -> DivisionResultWithRemainder {
    if y == 0 {
        DivisionResultWithRemainder::DivisionByZero
    } else {
        DivisionResultWithRemainder::Ok(x / y, x % y) // Return the integer division and the remainder
    }
}

let (a,b) = (9,4);
match divide_with_remainder(a,b) {
    DivisionResultWithRemainder::Ok(result,remainder) => {
            println!("the result is {}",result);
            println!("the remainder is {}",remainder);
    }
    DivisionResultWithRemainder::DivisionByZero
        => println!("noooooo!!!!"),
};

the result is 2
the remainder is 1


## A Note on the Memory Size of Enums

The size of the enum is related to the size of its largest variant, not the sum of the sizes.


In [16]:
use std::mem;

enum SuperSimpleEnum {
    First,
    Second,
    Third
}

enum SimpleEnum {
    A,           // No data
    B(i32),      // Contains an i32
    C(i32, i32), // Contains two i32s
    D(i64)       // Contains an i64
}

fn main() {
    println!("Size of SuperSimpleEnum: {} bytes\n", mem::size_of::<SuperSimpleEnum>());

    println!("Size of SimpleEnum: {} bytes", mem::size_of::<SimpleEnum>());
    println!("Size of i32: {} bytes", mem::size_of::<i32>());
    println!("Size of (i32, i32): {} bytes", mem::size_of::<(i32, i32)>());
    println!("Size of (i64, i64): {} bytes", mem::size_of::<(i64, i64)>());
}

main();


Size of SuperSimpleEnum: 1 bytes

Size of SimpleEnum: 16 bytes
Size of i32: 4 bytes
Size of (i32, i32): 8 bytes
Size of (i64, i64): 16 bytes


## Displaying enums

By default Rust doesn't know how to display a new enum type.

Here we try to debug print the `Direction` enum.

In [20]:
let dir = Direction::North;
println!("{:?}",dir);

Error: `Direction` doesn't implement `Debug`

Or equivalently:

In [21]:
let dir = Direction::North;

dir

Error: `Direction` doesn't implement `Debug`

Adding the `#[derive(Debug)]` attribute to the enum definition allows Rust to automatically implement the `Debug` trait.

In [18]:
#[derive(Debug)]
enum Direction {
    North,
    East,
    South,
    West,
}

use Direction::*;
let dir = Direction::North;
println!("{:?}",dir);

North


### Optional: More on #[derive(Debug)]

* A simple way to tell Rust to generate code that allows a complex type to be printed
* Here's the equivalent manual implementation of the `Debug` trait
* more on traits and `impl` later

```Rust
use std::fmt;

impl fmt::Debug for Direction {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
           match *self {
               Direction::North => write!(f, "North"),
               Direction::East => write!(f, "East"),
               Direction::South => write!(f, "South"),
               Direction::West => write!(f, "West"),               
           }
    }
}
```

In [23]:
let dir = Direction::North;
dir

North

In [24]:
let dir = Direction::North;
println!("{:?}",dir);

North


In [25]:
// Example of how to make a complex datatype printable directly with {} (by implementing Display instead of deriving Debug)
use std::fmt;

impl fmt::Display for Direction {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
           match *self {
               Direction::North => write!(f, "North"),
               Direction::East => write!(f, "East"),
               Direction::South => write!(f, "South"),
               Direction::West => write!(f, "West"),               
           }
    }
}
println!("{}", dir);

North


## `match` as expression

The result of a `match` can be used as an expression.

Each branch (arm) returns a value.

In [19]:
// swap East and West; anything else becomes West
let mut dir_4 = North;
println!("{:?}", dir_4);

dir_4 = match dir_4 {
    East => West,
    West => {
        println!("Switching West to East");
        East
    }
    // wildcard matching anything else
    _ => West,

};

println!("{:?}", dir_4);

North
West


## Simplified matching with `if let`

Consider the following example (in which we want to use just one branch):

In [20]:
#[derive(Debug)]
enum DivisionResult {
    Ok(u32,u32),
    DivisionByZero,
}

fn divide(x:u32, y:u32) -> DivisionResult {
    if y == 0 {
        DivisionResult::DivisionByZero
    } else {
        DivisionResult::Ok(x / y, x % y)
    }
}


match divide(8,3) {
    DivisionResult::Ok(result,remainder) => println!("{} (remainder {})",result,remainder),
    _ => (), // <--- how to avoid this?
};


2 (remainder 2)


This is a common enough pattern that Rust provides a shortcut for it.

`if let` allows for matching just one branch (arm)

In [28]:
if let DivisionResult::Ok(result,remainder) = divide(8,7) { 
    println!("{} (remainder {})",result,remainder);
};

1 (remainder 1)


In `if let`, the single `=` is not an assignment; it separates the pattern (on the left) from the value being matched (on the right).

In [29]:
use Direction::*;
let dir = North;
if let North = dir {
    println!("North");
};

North


You can use `else` to match anything else.

In [30]:
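// It is important to put the pattern (the enum variant) on the left-hand
// side and the value being tested on the right:
// if let West = dir {
// Below the two are swapped: `dir` becomes a new variable that matches
// anything, so the first branch always runs.
if let dir = West {
    println!("North");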
} else {
    println!("Something else");
};

North
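
With the variant pattern on the left of `=` and the value on the right, `else` behaves as expected. A minimal sketch, where `is_it_west` is a hypothetical helper introduced only for this illustration:

```rust
// Only some variants are constructed below, so silence unused-variant warnings.
#[allow(dead_code)]
enum Direction {
    North,
    East,
    South,
    West,
}

// Hypothetical helper: the variant pattern goes on the left of `=`,
// the value being tested goes on the right.
fn is_it_west(dir: Direction) -> &'static str {
    if let Direction::West = dir {
        "West"
    } else {
        "Something else"
    }
}

fn main() {
    println!("{}", is_it_west(Direction::West));  // prints "West"
    println!("{}", is_it_west(Direction::North)); // prints "Something else"
}
```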


Remember to use the single `=` for pattern matching, not the double `==` for equality.

In [31]:
// Don't do this.
if dir == North {
    println!("North");
}

Error: an implementation of `PartialEq` might be missing for `Direction`
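
The error message hints at one way out: if you do want `==` on an enum, deriving `PartialEq` asks the compiler to generate the comparison. The sketch below redefines `Direction` with that derive; `is_north` is a hypothetical helper used only for illustration.

```rust
// Deriving PartialEq asks the compiler to generate `==` and `!=` for the enum.
#[derive(Debug, PartialEq)]
#[allow(dead_code)] // not every variant is constructed in this short example
enum Direction {
    North,
    East,
    South,
    West,
}

// Hypothetical helper comparing against one variant with `==`.
fn is_north(dir: &Direction) -> bool {
    *dir == Direction::North
}

fn main() {
    let dir = Direction::North;
    if dir == Direction::North {
        println!("North");
    }
    println!("{}", is_north(&Direction::East)); // prints "false"
}
```

`if let` remains the tool of choice when a variant carries data that you want to extract, as with `DivisionResult::Ok(result, remainder)` above.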

# Structs

Previously we saw tuples, e.g., `(12, 1.7, true)`, where we can mix different types of data.

Structs compared to tuples:

* **Similar:** can hold items of different types
* **Different:** the items have names

In [46]:
// Definition: list items (called fields)
//             and their types

struct Person {
    name: String,
    year_born: u16,
    time_100m: f64,
    likes_ice_cream: bool,
}

In [47]:
// Instantiation: replace types with values

let mut cartoon_character: Person = Person {
    name: String::from("Tasmanian Devil"),
    year_born: 1954,
    time_100m: 7.52,
    likes_ice_cream: true,
};


In [48]:
// Accessing fields: use ".field_name"
println!("{} was born in {}",
    cartoon_character.name,
    cartoon_character.year_born);
    
cartoon_character.year_born = 2022;
println!("{} was born in {}",
    cartoon_character.name,
    cartoon_character.year_born);

Tasmanian Devil was born in 1954


Tasmanian Devil was born in 2022


## Tuple structs

Tuple structs are named tuples: the name adds meaning and makes each one a distinct type.

Example: both of the following can be stored as `(f64,f64,f64)`:

* box size (e.g., height $\times$ width $\times$ depth)
* Euclidean coordinates of a point in 3D

In [49]:
struct BoxSize(f64,f64,f64);
struct Point2(f64,f64,f64);

In [50]:
let mut my_box = BoxSize(3.2,6.0,2.0);
let mut p : Point2 = Point2(-1.3,2.1,0.0);

In [51]:
// won't work
my_box = p;

// Impossible to accidentally confuse different
// types of triples.
// No runtime penalty! Verified at compilation.

Error: mismatched types

In [52]:
// Accessing via index
println!("{} {} {}",p.0,p.1,p.2);
p.0 = 17.2;

// Destructuring
let Point2(first,second,third) = p;
println!("{} {} {}", first, second, third);

-1.3 2.1 0


17.2 2.1 0


## Named structs in enums

Enum variants can hold named fields in braces, like a struct; such variants are interchangeable with tuple-style variants in many places.

In [53]:
enum LPSolution {
    None,
    Point{x:f64,y:f64}
}

let example = LPSolution::Point{x:1.2, y:4.2};

In [54]:
if let LPSolution::Point{x:first,y:second} = example {
    println!("coordinates: {} {}", first, second);
};

coordinates: 1.2 4.2
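
A full `match` handles both variants of such an enum. The sketch below repeats the `LPSolution` definition so that it is self-contained; `describe` is a hypothetical helper, and the field shorthand `Point { x, y }` is an addition not shown in the cells above.

```rust
enum LPSolution {
    None,
    Point { x: f64, y: f64 },
}

// Hypothetical helper that formats a solution as text.
fn describe(solution: &LPSolution) -> String {
    match solution {
        // `Point { x, y }` is shorthand for `Point { x: x, y: y }`
        LPSolution::Point { x, y } => format!("coordinates: {} {}", x, y),
        LPSolution::None => String::from("no solution"),
    }
}

fn main() {
    println!("{}", describe(&LPSolution::Point { x: 1.2, y: 4.2 }));
    println!("{}", describe(&LPSolution::None));
}
```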


## Method Syntax

Brings aspects of object-oriented programming to Rust: _combine properties and methods in one object_.

Methods are _functions that are defined within the context of a struct_.

The first parameter is always `self`, which refers to the instance of the
`struct` the method is being called on.

Use an `impl` (implementation) block on the struct to define methods.

In [59]:
struct Point {
    x: f64,
    y: f64,
}

struct Rectangle {
    p1: Point,
    p2: Point,
}

impl Rectangle {
    // This is a method
    fn area(&self) -> f64 {
        // `self` gives access to the struct fields via the dot operator
        let Point { x: x1, y: y1 } = self.p1;
        let Point { x: x2, y: y2 } = self.p2;

        // `abs` is a `f64` method that returns the absolute value of the
        // caller
        ((x1 - x2) * (y1 - y2)).abs()
    }

    fn perimeter(&self) -> f64 {
        let Point { x: x1, y: y1 } = self.p1;
        let Point { x: x2, y: y2 } = self.p2;

        2.0 * ((x1 - x2).abs() + (y1 - y2).abs())
    }
}

let rectangle = Rectangle {
    p1: Point{x:0.0, y:0.0},
    p2: Point{x:3.0, y:4.0},
};

println!("Rectangle perimeter: {}", rectangle.perimeter());
println!("Rectangle area: {}", rectangle.area());


Rectangle perimeter: 14


Rectangle area: 12


### Associated Functions without `self` parameter

Useful as constructors.

You can have more than one `impl` block on the same struct.


In [60]:
impl Rectangle {
    fn new(p1: Point, p2: Point) -> Rectangle {
        Rectangle { p1, p2 }
    }
}

let rect = Rectangle::new(Point{x:0.0, y:0.0}, Point{x:3.0, y:4.0});
println!("Rectangle area: {}", rect.area());


Rectangle area: 12
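
The methods above only read from the struct through `&self`. As a hedged sketch of the other common case, a method can take `&mut self` to modify the instance. The `translate` method and the `Point::new` constructor below are illustrative assumptions, not part of the lecture code.

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // Associated function used as a constructor (no `self` parameter).
    fn new(x: f64, y: f64) -> Point {
        Point { x, y }
    }

    // `&mut self` lets the method modify the instance it is called on.
    fn translate(&mut self, dx: f64, dy: f64) {
        self.x += dx;
        self.y += dy;
    }
}

fn main() {
    let mut p = Point::new(0.0, 0.0); // must be `mut` to call `translate`
    p.translate(3.0, 4.0);
    println!("{} {}", p.x, p.y); // prints "3 4"
}
```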


# Memory Management: Stack vs. Heap

We talked about stack and heap generically. Let's look at an example based on Rust.

* Two different places where space for data can be allocated
* We will discuss them one by one


# Stack

* LIFO (last in first out) memory allocation
* Stores current local variables and additional information such as:
  - function arguments
  - function output
  - where to continue when a function terminates
* Fast memory allocation
* Usually a small fraction of the total memory
* Often: the size of the allocated memory has to be known in advance (at compile time)

Almost everything you have seen so far is allocated on the stack
* Exception: the data in a `String` is allocated on the heap

## Stack example (idealized)

In [2]:
fn main() {
    let mut x = 3;
    let mut y = 8;
    println!("x = {}, y = {}",x,y);
    x = add_or_subtract(x,y,true); // x = x + y
    y = add_or_subtract(x,y,false); // y = x - y
    x = add_or_subtract(x,y,false); // x = x - y
    println!("x = {}, y = {}",x,y);
}

fn add_or_subtract(x:i32, y:i32, add:bool) -> i32 {
    let second_arg = if add {y} else {negate(y)};
    x + second_arg
}

fn negate(x:i32) -> i32 {
    -x
}

main();


x = 3, y = 8
x = 8, y = 3


---

### Step 1: call `main`
* `x` and `y` allocated on stack and initialized
* Stack: `main` (`x`, `y`)

| Idealized Stack |
| -- |
| y (main) |
| x (main) |

---

### Step 2: call `add_or_subtract` (1st time)
* arguments for `add_or_subtract` put on stack
* space for solution allocated on stack
* space for `second_arg` allocated as well
* Stack: `main` (`x`, `y`), `add_or_subtract` (all the above + auxiliary information)

| Idealized Stack |
| -- |
| second_arg (add_or_subtract) |
| retval (add_or_subtract) |
| true(bool) (add_or_subtract) |
| y (add_or_subtract) |
| x (add_or_subtract) |
| y (main) |
| x (main) |

---

### Step 3: `add_or_subtract` terminates
* process and remove all information about the call
* Stack: `main` (`x`, `y`)

| Idealized Stack |
| -- |
| y (main) |
| x (main) |

---

### Step 4: call `add_or_subtract` (2nd time)
* arguments for `add_or_subtract` put on stack
* space for solution allocated on stack
* space for `second_arg` allocated as well
* Stack: `main` (`x`, `y`), `add_or_subtract` (all the above + auxiliary information)

| Idealized Stack |
| -- |
| second_arg (add_or_subtract) |
| retval (add_or_subtract) |
| false(bool) (add_or_subtract) |
| y (add_or_subtract) |
| x (add_or_subtract) |
| y (main) |
| x (main) |

---

### Step 5: call `negate` (1st time)
* the argument for `negate` put on stack
* space for solution allocated on stack
* Stack: `main` (`x`, `y`), `add_or_subtract` (...), `negate` (all of the above + auxiliary information)

| Idealized Stack |
| -- |
| retval (negate) |
| y (negate) |
| second_arg (add_or_subtract) |
| retval (add_or_subtract) |
| false(bool) (add_or_subtract) |
| y (add_or_subtract) |
| x (add_or_subtract) |
| y (main) |
| x (main) |

---

### Step 6: `negate` terminates
* process and remove all information about the call
* Stack: `main` (`x`, `y`), `add_or_subtract` (...)

| Idealized Stack |
| -- |
| second_arg (add_or_subtract) |
| retval (add_or_subtract) |
| false(bool) (add_or_subtract) |
| y (add_or_subtract) |
| x (add_or_subtract) |
| y (main) |
| x (main) |

---

### Step 7: `add_or_subtract` terminates
* [...]
* Stack: `main` (`x`, `y`)

| Idealized Stack |
| -- |
| y (main) |
| x (main) |

---

### Step 8: call `add_or_subtract` (3rd time)
* [...]
* Stack: `main` (`x`, `y`), `add_or_subtract` (...)

| Idealized Stack |
| -- |
| second_arg (add_or_subtract) |
| retval (add_or_subtract) |
| false(bool) (add_or_subtract) |
| y (add_or_subtract) |
| x (add_or_subtract) |
| y (main) |
| x (main) |

---

## Limited space on stack!

In [35]:

fn same_number(x:u32) -> u32 {
    match x {
        0 => 0,
        _ => 1 + same_number(x - 1),
    }
}

In [36]:
same_number(7)

7

In [37]:
same_number(123_456)

123456

In [38]:
same_number(1_000_000)

Error: Subprocess terminated with status: signal: 11 (SIGSEGV)

## Using too much memory on stack: *stack overflow*

This is where the name of the popular website for asking questions about programming comes from!<br>

<div align="center">
    <img src="chucknorris.png" alt="[screenshot of stackoverflow.com]">
</div>
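
One common way to avoid this kind of overflow is to replace the recursion with a loop, so that the work no longer needs one stack frame per step. The function `same_number_iterative` below is an illustrative rewrite of `same_number`, not part of the lecture code.

```rust
// Iterative variant (an illustrative rewrite, not from the lecture):
// the loop reuses a single stack frame no matter how large `x` is.
fn same_number_iterative(x: u32) -> u32 {
    let mut count = 0;
    for _ in 0..x {
        count += 1;
    }
    count
}

fn main() {
    println!("{}", same_number_iterative(1_000_000)); // prints 1000000
}
```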

# Heap

* Memory allocated and freed in arbitrary order
* Arbitrary amount allocated
* The application holds a *pointer*: the address of the allocated memory


Pros:
* Arbitrary amount of data
* No copying to pass data around
  * Just share the pointer!


Cons:
* Slower allocation:
  * Possible request for more space to the operating system
* Possible memory fragmentation
* Slower access:
  * Have to follow the pointer to get to data


## Stack vs. heap in Python

* In CPython, every value is an object allocated on the heap, including integers, floats, and Boolean values

* The stack holds only references to those objects

<br><br>
<div align="center">
    <h2>[Switch to the Python notebook]</h2>
</div>

## Heap management

**Memory allocation:**
* ask for a given amount of space
* receive a pointer to it (or an out-of-memory error)

**Freeing memory:**
* classical manual: explicitly return it
  * more complicated
* automatic: garbage collection
  * comes with additional costs



C: `malloc` / `free`

C++: `new` / `delete` (plus `malloc` / `free` inherited from C)

**Pitfalls of manual memory management:**

 * leaks: unused memory is never returned

 * use after free: attempting to use a pointer to memory that was already deallocated

 * double free: returning memory that was already deallocated

<div align="center">
    <b>How does Rust deal with these problems?</b>
</div>

## Allocating on the heap in Rust

* `String` is a type of collection in Rust.
* Its contents, like those of other collections, are allocated on the heap.
* When you allocate a collection, a data structure (usually a `struct`) is created on the _stack_ which holds data about the collection.

<div style="text-align: center;">
  <img src="string-data-structures.png" alt="String data structures" style="width: 40%;">
</div>
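
The split between the stack-side data structure and the heap-side data can be observed from code. In the sketch below, `handle_size` is a hypothetical helper introduced for illustration; it measures only the `String` structure itself, not the text it points to. The printed sizes and addresses depend on the platform, so no expected output is shown.

```rust
use std::mem;

// Hypothetical helper: size in bytes of the String structure itself
// (the part described above as living on the stack), not of its text.
fn handle_size(s: &String) -> usize {
    mem::size_of_val(s)
}

fn main() {
    let short = String::from("hi");
    let long = String::from("a much, much longer piece of text");

    // The structure has the same size for both strings...
    println!("handle sizes: {} and {}", handle_size(&short), handle_size(&long));
    // ...while the heap data differs in length.
    println!("lengths: {} and {}", short.len(), long.len());
    println!("capacity of long: {}", long.capacity());
    println!("heap address of long's data: {:p}", long.as_ptr());
}
```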

### `Vec` Collection

Another useful collection is the `Vec` (§8.1).

* Allows you to store multiple values in a single structure
* All adjacent in memory
* Like array, can only store values of the same type.
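
The "adjacent in memory" property can be checked by comparing element addresses. This is a minimal sketch; `gap_in_bytes` is a hypothetical helper, and the casts to `usize` are used only to do arithmetic on the addresses.

```rust
use std::mem;

// Hypothetical helper: distance in bytes between the addresses of
// two neighboring elements of a vector.
fn gap_in_bytes(v: &Vec<i32>, i: usize) -> usize {
    let first = &v[i] as *const i32 as usize;
    let second = &v[i + 1] as *const i32 as usize;
    second - first
}

fn main() {
    let mut v = vec![10, 20, 30];
    v.push(40); // unlike an array, a Vec can grow

    println!("element size: {} bytes", mem::size_of::<i32>());
    println!("gap between v[0] and v[1]: {} bytes", gap_in_bytes(&v, 0));
    println!("gap between v[2] and v[3]: {} bytes", gap_in_bytes(&v, 2));
}
```

Each gap equals the size of one `i32`, which is what lets a `Vec` be indexed as cheaply as an array.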

There are other useful collections (e.g. Hash Map §8.3, Box §15.1, etc.), some we'll cover later.

The complete list is in [Module collections](https://doc.rust-lang.org/std/collections/index.html).

In [62]:
// placing integers on the heap
let mut v = vec![1, 2, 3];

In [63]:
println!("{:?}", v);

[1, 2, 3]


Let's look at the address of the data structure describing the vector.

In [64]:
println!("{:p}", &v);

0x16ce394d8


Let's look at the pointer in the `Vec` data structure pointing to the heap.

In [65]:
println!("{:p}", v.as_ptr());


0x1460041d0


Just for comparison, let's look at the pointers for an immutable array.

In [74]:
let m = [4, 5, 6];
println!("{:?}", m);
println!("{:p}", &m);
println!("{:p}", m.as_ptr());


[4, 5, 6]
0x16ce3946c
0x16ce3946c


Pointers are the same!

We'll go into more depth on `Vec` and collections later.

## In Class Poll

https://piazza.com/class/m5qyw6267j12cj/post/132

