<div align="center">
    <h1>DS-210 B1: Programming for Data Science</h1>
    <h2>Lecture 4</h2>
</div>


# Introduction to Rust

* Compiling. 
* Basic types and variables. 
* Project manager (`cargo`).


# VSCode Rust Environment config

**Extremely Important** You must always connect via
https://rhods-dashboard-redhat-ods-applications.apps.shift.nerc.mghpcc.org/notebookController/spawner

**Don't use other links or the navigation on the left side of the browser window, e.g. Data Science Projects.**

Make sure VS Code has the following configuration in the environment:

<div style="text-align: center;">
  <img src="server-variables.png" alt="GitHub Image" style="width: 70%;">
</div>

I'll show a quick demo of how to:

1. create a new `projects` or `source` folder
2. reopen VS Code to a particular project
3. open a terminal window

# Rust: Compiling. Basic types and variables. Project manager (`cargo`).

## Source of Truth

<div align="center">
    <img src="./rust-book-title.png" alt="Rust Book" style="width: 70%;">

https://doc.rust-lang.org/book/
</div>


## Rust and Jupyter Notebooks

A few comments on using Rust as a kernel in Jupyter notebooks and how it differs from normal Rust development.

Reminder: Rust is a compiled language and, as we discussed earlier, the typical flow is:

1. Write code in an editor (VS Code, IntelliJ, etc.)
2. Compile the code with `rustc` or `cargo`
3. Run the compiled code

```mermaid
graph LR;
    A[Write code\n in an editor] --> B[Compile the code\n with `rustc` or `cargo`]
    B --> C[Run the\n compiled code]
    C --> D[Check the\n output]
    D --> A
```

* In Jupyter notebooks, code is put into cells and executed one at a time.
* Functions and variables defined in a cell are maintained in the notebook's kernel context.
* This means that if you define a function in one cell, you can use it in another cell without redefining it.
* However, if you restart the kernel or close the notebook, all variables and functions will be lost.


Again, this is different from normal Rust development, where you compile the code and then run the compiled binary.

In normal Rust development, the function `main` is the entry point and runs automatically.
In a Jupyter notebook, you have to call `main` explicitly to run it.

Finally, you will see cells with Rust statements that are not wrapped in a `fn main() { ... }` block.
These live in the notebook's "global context" and are executed when the cell is run.

This is all made possible by the [Evcxr](https://github.com/evcxr/evcxr/) project
(**Ev**aluation **C**onte**X**t for **R**ust).

## Write and compile simple Rust program

Generally you would create a project directory for all your projects and then
a subdirectory for each project.

```bash
$ mkdir ~/projects
$ cd ~/projects
$ mkdir hello_world
$ cd hello_world
```

All Rust source files have the extension `.rs`.

Create a file called `main.rs` and add the following code:


In [3]:
fn main() {
    println!("Hello, world!");
}

If you created that file on the command line, then you compile and run the
program with the following commands:

```bash
$ rustc main.rs    # compile with rustc which creates an executable
$ ./main           # run the executable
Hello, world!
```


```rust
fn main() { ... }
```
is how you define a function in Rust.

The function named `main` is special: it is the entry point of every executable Rust program.

Again, unlike in normal Rust development, here in the notebook we need to 
explicitly call the `main` function to execute it.

In [4]:
main();

Hello, world!


Let's look at the single line of code in the main function:

```rust
    println!("Hello, world!");
```

Rust convention is to indent with 4 spaces -- never use tabs!!

`println!` is a macro, which is indicated by the `!` suffix. A macro call looks like a function call, but it is expanded into ordinary code at compile time.
The string `"Hello, world!"` is passed as an argument to the macro.

The line ends with a `;`, which marks the end of the statement.


Let's look at a program that prints in a bunch of different ways.

In [5]:
// A bunch of the output routines
fn main() {
    let x = 9;
    let y = 16;
    
    print!("Hello, DS210!\n");       // Need to include the newline character
    println!("Hello, DS210!\n");     // println! already adds a newline, so this \n prints an extra blank line

    println!("{} plus {} is {}", x, y, x+y);  // print with formatting placeholders
    //println!("{x} plus {y} is {x+y}");      // error: cannot use `x+y` in a format string
    println!("{x} plus {y} is {}\n", x+y);      // but you can put variable names in the format string
    
    println!("{:?} plus {:?} is {:?}\n", x, y, x+y);  // {:?} format specifier for debug

    println!("Hexadecimal: 0x{:X} plus 0x{:X} is 0x{:X}", x, y, x+y);  // {:X} format specifier for uppercase hexadecimal
    println!("Octal: 0o{:o} plus 0o{:o} is 0o{:o}", x, y, x+y);  // {:o} format specifier for octal
    println!("Binary: {:b} plus {:b} is {:b}\n", x, y, x+y);  // {:b} format specifier for binary
    
    println!("pointer to x: {:p}", &x);   // {:p} format specifier for pointer
    println!("pointer to y: {:p}", &y);
    println!("pointer to x+y: {:p}\n", &(x+y));
    
    let z = format!("{} plus {} is {}\n", x, y, x+y);  // format! is a macro that returns a string
    println!("{}", z);  
    
    eprint!("E {} plus {} is {}\n", x, y, x+y);      // eprint! is a macro that prints to the standard error stream
    eprintln!("E {} plus {} is {}\n", x, y, x+y);    // eprintln! is a macro that prints to the standard error stream and adds a newline character
}

More on `println!`:

- first parameter is a format string
- `{}` are replaced by the following parameters

`print!` is similar to `println!` but does not add a newline at the end.

To dig deeper on formatting strings:

* [`fmt` module](https://doc.rust-lang.org/std/fmt/index.html)
* Format strings [syntax](https://doc.rust-lang.org/std/fmt/index.html#syntax)
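
A minimal sketch of a few more features from the format-string syntax linked above. It uses `format!` so the result can be inspected as a string. The helper name `describe` and the sample values are made up for illustration.

```rust
// Hypothetical helper: builds one line using several format-string features.
fn describe(name: &str, score: f64, rank: u32) -> String {
    // {name}      inlines the variable `name`
    // {score:.2}  prints two digits after the decimal point
    // {rank:>3}   right-aligns the value in a field of width 3
    format!("{name} scored {score:.2} (rank {rank:>3})")
}

fn main() {
    println!("{}", describe("Ada", 91.5, 7));
    // Positional arguments can be reused by index.
    println!("{0} + {0} = {1}", 2, 2 + 2);
    // Zero-padded binary in a field of width 8.
    println!("{:08b}", 5);
}
```

Only plain variable names can be inlined inside the braces; expressions such as `x+y` still have to be passed as separate arguments.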


In [6]:
main();

Hello, DS210!
Hello, DS210!

9 plus 16 is 25
9 plus 16 is 25

9 plus 16 is 25

Hexadecimal: 0x9 plus 0x10 is 0x19
Octal: 0o11 plus 0o20 is 0o31
Binary: 1001 plus 10000 is 11001

pointer to x: 0x7ff7b3972c70
pointer to y: 0x7ff7b3972c6c
pointer to x+y: 0x7ff7b3972c74

9 plus 16 is 25



E 9 plus 16 is 25
E 9 plus 16 is 25



```rust
// And some input routines
// Unfortunately, Jupyter notebooks do not support reading from the terminal with Rust at this point,
// so this is for demo purposes only.
use std::io;
use std::io::Write;

fn main() {
    let mut user_input = String::new();
    print!("What's your name? ");
    io::stdout().flush().expect("Error flushing");  // flush the output and print error if it fails
    let _ = io::stdin().read_line(&mut user_input);  // read a line into user_input; the returned Result is deliberately ignored
    println!("Hello, {}!", user_input.trim());
}
```
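
`read_line` keeps the trailing newline that the user typed, which is why the demo calls `trim()` before printing. Here is a small sketch of that step in isolation. The helper name `greeting` is hypothetical, and the input is passed in as a string so the code also runs where no terminal is available.

```rust
// Hypothetical helper: turns a raw input line into a greeting.
fn greeting(raw_line: &str) -> String {
    // trim() removes leading and trailing whitespace, including "\n" or "\r\n".
    format!("Hello, {}!", raw_line.trim())
}

fn main() {
    // Simulates what read_line would store after the user types "Ferris" and Enter.
    let raw_line = "Ferris\n";
    println!("{}", greeting(raw_line));
}
```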

**Simplest way to compile:**
  * put the content in file `hello.rs`
  * command line:
    - navigate to this folder
    - `rustc hello.rs`
    - run `./hello` (Linux/macOS) or `hello.exe` (Windows)

## Variable definitions

### By default immutable!

In [7]:
let x = 3;
x = x + 1; // <== error here
x

Error: cannot assign twice to immutable variable `x`

### Use `mut` to make them mutable

In [8]:
// mutable variable
let mut x = 3;
x = x + 1;
x

4

In [9]:
// mutable variable
let mut x = 3;
x = x + 1;
x = 9.5;   // what happens here??
x

Error: mismatched types

### Variable shadowing: new variable with the same name

In [16]:
let solution = "0.1";
let solution : i32 = solution.parse()
                     .expect("Not a number!");
let solution = solution * (solution - 1) / 2;
println!("solution = {}",solution);
let solution = "This is a string";
println!("solution = {}", solution);

thread '<unnamed>' panicked at src/lib.rs:128:23:
Not a number!: ParseIntError { kind: InvalidDigit }
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: _run_user_code_13
   4: evcxr::runtime::Runtime::run_loop
   5: evcxr::runtime::runtime_hook
   6: evcxr_jupyter::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
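
The cell above panics because `"0.1"` is not a valid `i32`, so the later shadowing steps never run. Below is a sketch of the same sequence with an input that does parse. The value `"4"` and the function name `shadowing_demo` are assumptions chosen for illustration.

```rust
// Hypothetical helper that repeats the shadowing steps and returns the number.
fn shadowing_demo(text: &str) -> i32 {
    let solution = text;                          // &str
    let solution: i32 = solution.parse()          // shadowed: now an i32
        .expect("Not a number!");
    let solution = solution * (solution - 1) / 2; // shadowed again, still an i32
    solution
}

fn main() {
    let solution = shadowing_demo("4");
    println!("solution = {}", solution);
    // Shadowing can also change the type, which `mut` alone does not allow.
    let solution = "This is a string";
    println!("solution = {}", solution);
}
```

Each `let` creates a new variable that happens to reuse the name, so the type is free to change from `&str` to `i32` and back.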


### You can gloss over this one for now; we will revisit it later

```rust
    a: &T      // immutable binding of immutable reference
mut a: &T      // mutable binding of immutable reference
    a: &mut T  // immutable binding of mutable reference
mut a: &mut T  // mutable binding of mutable reference
```
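
A hedged sketch of the four combinations with `T = i32`. The function and variable names are invented for illustration, and references themselves are covered properly later.

```rust
// Hypothetical demo: exercises each combination and returns the observed values.
fn binding_demo() -> (i32, i32, i32, i32) {
    let first = 1;
    let second = 2;

    // immutable binding of immutable reference: read-only, cannot be re-pointed
    let a: &i32 = &first;

    // mutable binding of immutable reference: can be re-pointed, target stays read-only
    let mut b: &i32 = &first;
    let mut sum = *a + *b; // 1 + 1
    b = &second;
    sum += *b;             // + 2

    // immutable binding of mutable reference: target can change, `c` cannot be re-pointed
    let mut x = 10;
    let c: &mut i32 = &mut x;
    *c += 1;

    // mutable binding of mutable reference: both the target and the binding can change
    let mut y = 20;
    let mut z = 30;
    let mut d: &mut i32 = &mut y;
    *d += 1;
    d = &mut z;
    *d += 1;

    (sum, x, y, z)
}

fn main() {
    println!("{:?}", binding_demo());
}
```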

## Basic types: integers and floats

* unsigned integers: `u8`, `u16`, `u32`, `u64`, `u128`, `usize` (architecture-specific size)
   - from $0$ to $2^n-1$ for an $n$-bit type
* signed integers: `i8`, `i16`, `i32` (default), `i64`, `i128`, `isize` (architecture-specific size)
   - from $-2^{n-1}$ to $2^{n-1}-1$ for an $n$-bit type

> if you need to convert, use the `as` operator

> `i128` and `u128` are useful for cryptography

| Number literals | Example |
| :-: | :-: |
| Decimal | `98_222` |
| Hex | `0xff` |
| Octal | `0o77` |
| Binary | `0b1111_0000` |
| Byte (`u8` only) | `b'A'` |

In [19]:
let s1 = 2_55i32;
let s2 = 0xf_f;
let s3 = 0o3_77;
let s4 = 0b1111_1111;
println!("{} {} {} {}", s1, s2, s3, s4);
println!("{} 0x{:X} 0o{:o} 0b{:b}", s1, s2, s3, s4);

255 255 255 255
255 0xFF 0o377 0b11111111


In [12]:
println!("U8 min is {} max is {}", u8::MIN, u8::MAX);
println!("I8 min is {} max is {}", i8::MIN, i8::MAX);
println!("U16 min is {} max is {}", u16::MIN, u16::MAX);
println!("I16 min is {} max is {}", i16::MIN, i16::MAX);
println!("U32 min is {} max is {}", u32::MIN, u32::MAX);
println!("I32 min is {} max is {}", i32::MIN, i32::MAX);
println!("U64 min is {} max is {}", u64::MIN, u64::MAX);
println!("I64 min is {} max is {}", i64::MIN, i64::MAX);
println!("U128 min is {} max is {}", u128::MIN, u128::MAX);
println!("I128 min is {} max is {}", i128::MIN, i128::MAX);
println!("USIZE min is {} max is {}", usize::MIN, usize::MAX);
println!("ISIZE min is {} max is {}", isize::MIN, isize::MAX);

U8 min is 0 max is 255
I8 min is -128 max is 127
U16 min is 0 max is 65535
I16 min is -32768 max is 32767
U32 min is 0 max is 4294967295
I32 min is -2147483648 max is 2147483647
U64 min is 0 max is 18446744073709551615
I64 min is -9223372036854775808 max is 9223372036854775807
U128 min is 0 max is 340282366920938463463374607431768211455
I128 min is -170141183460469231731687303715884105728 max is 170141183460469231731687303715884105727
USIZE min is 0 max is 18446744073709551615
ISIZE min is -9223372036854775808 max is 9223372036854775807


In [20]:
let x : i16 = 13;
let y : i32 = -17;
// won't work without the conversion
println!("{}", x * y);   // will not work
println!("{}", (x as i32)* y);

Error: mismatched types

Error: cannot multiply `i16` by `i32`
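
`as` never fails, but for narrowing conversions it silently drops information, so it is worth knowing what it does. Below is a short sketch contrasting it with the standard library's `From` and `TryFrom` conversions. The helper name `multiply_mixed` is hypothetical.

```rust
use std::convert::TryFrom; // already in scope by default in the 2021 edition

// Hypothetical helper: widens the i16 losslessly, then multiplies.
fn multiply_mixed(x: i16, y: i32) -> i32 {
    // Same result as (x as i32) * y, but `from` only exists for lossless conversions.
    i32::from(x) * y
}

fn main() {
    println!("{}", multiply_mixed(13, -17)); // widening i16 -> i32 is always safe

    // Narrowing with `as` keeps only the low bits.
    println!("{}", 300i32 as u8);   // 300 = 256 + 44
    println!("{}", -1i32 as u8);    // all eight bits set
    println!("{}", 3.99f64 as i32); // the fractional part is dropped

    // TryFrom reports values that do not fit instead of truncating.
    println!("{}", u8::try_from(300i32).is_err());
}
```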

* floats: `f32` and `f64` (default)
* There is talk about adding f128 to the language but it is not as useful as u128/i128.

In [21]:
let x:f32 = 4.0;
//let y:f32 = 4; // Will not work.  It will not autoconvert for you.

let z = 1.25; // a float literal defaults to f64, but here it is inferred as f32 because it is multiplied by x

println!("{:.1}", x * z);

//println!("{:.1}", (x as f64) * z);


5.0


In [22]:
println!("F32 min is {:e} max is {:e}", f32::MIN, f32::MAX);
println!("F64 min is {:e} max is {:e}", f64::MIN, f64::MAX);


F32 min is -3.4028235e38 max is 3.4028235e38
F64 min is -1.7976931348623157e308 max is 1.7976931348623157e308


## Basic types: Booleans, characters, and strings

### Logical operators

* `bool` uses one byte of memory

In [23]:
let x = true;
let y: bool = false;

// on bool, & and | give the same results as && and ||, but always evaluate both operands
println!("{}", x & y);  // bitwise and
println!("{}", x | y);  // bitwise or

println!("{}", x && y); // logical and
println!("{}", x || y); // logical or
println!("{}", !y);    // logical not


false
true
false
true
true
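
On `bool`, `&`/`|` and `&&`/`||` produce the same truth values. The difference is that `&&` and `||` short-circuit: they skip the right-hand side when the left-hand side already decides the result. A minimal sketch that counts how often the right-hand side runs (the helper `rhs_evaluations` is made up for this example):

```rust
// Hypothetical helper: returns how many times the right-hand side was evaluated.
fn rhs_evaluations(short_circuit: bool) -> u32 {
    let mut calls = 0;
    let mut rhs = || {
        calls += 1;
        true
    };
    let lhs = false;
    let _result = if short_circuit {
        lhs && rhs() // rhs() is skipped because lhs is already false
    } else {
        lhs & rhs() // rhs() always runs
    };
    calls
}

fn main() {
    println!("&& evaluated the right-hand side {} time(s)", rhs_evaluations(true));
    println!("&  evaluated the right-hand side {} time(s)", rhs_evaluations(false));
}
```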


### Bitwise operators

In [24]:
let x = 10;
let y = 7;
println!("{x:04b} & {y:04b} = {:04b}", x & y);
println!("{x:04b} | {y:04b} = {:04b}", x | y);
// println!("{}", x && y);
// println!("{}", x || y);
println!("!{y:04b} = {:04b} or {0}", !y);



1010 & 0111 = 0010
1010 | 0111 = 1111
!0111 = 11111111111111111111111111111000 or -8


What's going on with that last line?

`y` is an `i32`, so let's display all 32 bits.

In [25]:
let y = 7;
println!("{:032b}", y);

00000000000000000000000000000111


So when we do `!y` we get the bitwise negation of `y`.

In [26]:
println!("{:032b}", !y);

11111111111111111111111111111000


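But signed integers are stored in **two's complement format**, where:

* if the number is non-negative, the first bit is 0
* if the number is negative, the first bit is 1

To negate a number in two's complement, we flip all the bits and add 1. Flipping the bits without adding 1, which is what `!y` does, therefore gives $-y - 1$; that is why `!7` is displayed as `-8`.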


In [27]:
// binary representation of 7 and -7
println!("{:032b}", 7);
println!("{:032b}", -7);

00000000000000000000000000000111
11111111111111111111111111111001
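
The flip-and-add-one rule can be checked directly. This is a small sketch with a hypothetical helper `negate_by_hand`. It uses `wrapping_add` so that `i32::MIN`, the one value without a positive counterpart, wraps around instead of panicking.

```rust
// Hypothetical helper: negates a number by flipping all bits and adding 1.
fn negate_by_hand(x: i32) -> i32 {
    (!x).wrapping_add(1)
}

fn main() {
    let y = 7;
    println!("{:032b}  ({})", y, y);
    // Flipping the bits alone gives -y - 1.
    println!("{:032b}  ({})", !y, !y);
    // Flipping the bits and adding 1 gives -y.
    println!("{:032b}  ({})", negate_by_hand(y), negate_by_hand(y));
}
```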


### Characters

* `char` defined via single quote, uses four bytes of memory (Unicode scalar value)
* For a complete list of Unicode characters and their UTF-8 encodings, check https://www.fileformat.info/info/charset/UTF-8/list.htm

> Note that on Mac, you can insert an emoji by typing `Control-Command-Space` and then typing the emoji name, e.g. 😜.

> On Windows, you can insert an emoji by typing `Windows-Key + .` or `Windows-Key + ;` and then typing the emoji name, e.g. 😜.

In [28]:
let x: char = 'a';
let y = '🚦';
let z = '🦕';

println!("{} {} {}", x, y, z);

a 🚦 🦕
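
A quick sketch of what "four bytes" means in practice: a `char` always occupies four bytes in memory, while the same character takes between one and four bytes once it is encoded as UTF-8 inside a string. The helper name `utf8_lengths` is hypothetical.

```rust
use std::mem::size_of;

// Hypothetical helper: UTF-8 length in bytes of each character in `text`.
fn utf8_lengths(text: &str) -> Vec<usize> {
    text.chars().map(|c| c.len_utf8()).collect()
}

fn main() {
    // In memory, every char has the same size.
    println!("size of char: {} bytes", size_of::<char>());

    // Encoded as UTF-8 inside a string, the size depends on the character.
    println!("{:?}", utf8_lengths("aη🚦🦕"));
}
```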


### Strings

* string slice defined via double quotes (not so basic actually!)

In [29]:
fn testme() {
    let s1 = "Hello! How are you, 🦕?";  // type is immutable borrowed reference to a string slice: `&str`
    let s2 : &str = "Καλημέρα από την Βοστώνη και την DS210";  // here we make the type explicit
    
    println!("{}", s1);
    println!("{}\n", s2);

    // This doesn't work.  You can't do String = &str
    //let s3: String = "Does this work?";
    
    let s3: String = "Does this work?".to_string();
    println!("{}", s3);

    let s4: String = String::from("How about this?");
    println!("{}\n", s4);

    let s5: &str = &s3;
    println!("str reference to a String reference: {}\n", s5);
    
    // This won't work.  You can't index directly into a string slice. Why???
    // println!("{}", s1[3]);
    // println!("{}", s2[3]);

    // But you can index this way.
    println!("4th character of s1: {}", s1.chars().nth(3).unwrap());
    println!("4th character of s2: {}", s2.chars().nth(3).unwrap());
    println!("3rd character of s4: {}", s4.chars().nth(2).unwrap());
}

testme();

Hello! How are you, 🦕?
Καλημέρα από την Βοστώνη και την DS210

Does this work?
How about this?

str reference to a String reference: Does this work?

4th character of s1: l
4th character of s2: η
3rd character of s4: w
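
Why indexing is not allowed: a `&str` is stored as UTF-8 bytes, and each character takes between one and four bytes, so byte position `3` is generally not the fourth character. Below is a sketch of the difference. The helper `nth_char` is hypothetical; it walks the string from the start, so it takes time proportional to `n`.

```rust
// Hypothetical helper: returns the n-th character (0-based), if there is one.
fn nth_char(text: &str, n: usize) -> Option<char> {
    text.chars().nth(n)
}

fn main() {
    let s = "a🦕b";
    // The byte count and the character count differ.
    println!("bytes: {}, chars: {}", s.len(), s.chars().count());

    println!("{:?}", nth_char(s, 2)); // third character
    println!("{:?}", s.get(0..1));    // byte range that ends on a character boundary
    println!("{:?}", s.get(1..3));    // byte range that cuts the dinosaur in half
}
```

Slicing by byte range with `get` returns `None` when a boundary falls inside a character, instead of producing an invalid string.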


# Project manager: `cargo`

* create a project: `cargo new PROJECT-NAME`
* main file will be `PROJECT-NAME/src/main.rs`

* to run: `cargo run`
* to just build: `cargo build`

Add `--release` to create a "fully optimized" version:
 * longer compilation
 * faster execution
 * some runtime checks not included (e.g., integer overflow)
 * debugging information not included
 * the executable is placed in a different folder (`target/release/` instead of `target/debug/`)
 * Demo fibonacci on the terminal
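
A hedged sketch of a Fibonacci program that could serve for such a demo; it is not necessarily the program shown in class. The function name `fib_checked` and the use of `checked_add` are choices made here so that overflow is reported the same way in both build modes. With a plain `+`, an overflow panics in a debug build and wraps around in a release build.

```rust
// Hypothetical helper: n-th Fibonacci number (F(0) = 0, F(1) = 1),
// or None if the result does not fit in a u64.
fn fib_checked(n: u32) -> Option<u64> {
    if n == 0 {
        return Some(0);
    }
    let (mut prev, mut curr) = (0u64, 1u64); // F(0), F(1)
    for _ in 1..n {
        let next = prev.checked_add(curr)?; // returns None on overflow
        prev = curr;
        curr = next;
    }
    Some(curr)
}

fn main() {
    for n in [10, 50, 93, 94] {
        match fib_checked(n) {
            Some(value) => println!("F({}) = {}", n, value),
            None => println!("F({}) does not fit in a u64", n),
        }
    }
}
```

Put this in `src/main.rs` of a cargo project and run it with `cargo run` and with `cargo run --release` to compare the two build modes.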

If you just want to **check** if your current version compiles: `cargo check`
  * Much faster for big projects

## Read book chapter 1 and sections 3.1 and 3.2

## Piazza Poll

Will publish in class.

https://piazza.com/class/m5qyw6267j12cj/post/43
