# Introduction

We have already seen some of Rust's compound data types, such as arrays and tuples. However, their size is fixed at compile time and they typically live on the stack.  
<span style="color:lightgreen">In this chapter, we will learn about **vectors**, **strings** and **hash maps**. The data these collections point to is **stored on the heap**, which means the amount of data does not need to be known at compile time and **can grow or shrink as the program runs**.</span>

- A vector allows you to store a variable number of values next to each other.
- A string is a collection of characters. We’ve mentioned the `String` type previously, but in this chapter we’ll talk about it in depth.
- A hash map allows you to associate a value with a particular key. It’s a particular implementation of the more general data structure called a `map`.

# Vectors: Storing Lists of Values with `Vec<T>`

## Creating, Reading and Updating Vectors

In [12]:
// Creating a new vector (no initial values)
let mut v: Vec<i32> = Vec::new();

// Creating a new vector with initial values and inferred type
let v2 = vec![1, 2, 3];

// Updating the vector v (it needs to be mutable)
println!("v before updating is {:?}", v);
v.push(5);
v.push(6);
v.push(7);
v.push(8);
println!("v after updating is {:?}", v);


v before updating is []
v after updating is [5, 6, 7, 8]


In [21]:
// Reading elements of vectors: either using reference with &, or .get() method
fn main() {
    let v = vec![1, 2, 3, 4, 5];

    let third: &i32 = &v[2];
    println!("The third element is {third}");

    let third: Option<&i32> = v.get(2);  // get returns the Option<T> enum
    match third {
        Some(third) => println!("The third element is {third}"),
        None => println!("There is no third element."),
    }

    let seventh: Option<&i32> = v.get(6);  // what if the index is out of range?
    match seventh {
        Some(seventh) => println!("The seventh element is {seventh}"),
        None => println!("There is no seventh element."),
    }
}

main()

The third element is 3
The third element is 3
There is no seventh element.


()

- When using `&v[index]`, the program will panic if `index` is out of range.
- When using `v.get(index)`, the program gets `None` back without panicking.

## Vector's Ownership and Borrowing Rules

<span style="color:orange">*When the program has a valid reference to an element of a vector, the borrow checker enforces the ownership and borrowing rules to ensure this reference and any other references to the contents of the vector remain valid*.</span>     
We can’t have a mutable reference while an immutable reference to the same data is still in use. For example, in the code below we hold an immutable reference to the first element in a vector and try to add an element to the end, which won’t work if we also try to refer to that element later in the function:

In [29]:
fn main() {
    let mut v = vec![1, 2, 3, 4, 5];
    let first = &v[0];
    v.push(6);
    println!("The first element is: {first}");
}

main()

Error: cannot borrow `v` as mutable because it is also borrowed as immutable

<span style="color:purple">*This error is due to the way vectors work: because vectors put the values next to each other in memory, adding a new element onto the end of the vector might require allocating new memory and copying the old elements to the new space, if there isn’t enough room to put all the elements next to each other where the vector is currently stored. In that case, the reference to the first element would be pointing to deallocated memory. The borrowing rules prevent programs from ending up in that situation*</span>
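One way to make the example above compile, as a minimal sketch: copy the value out of the vector instead of holding a reference to it across the `push`. This relies on `i32` being `Copy`; `first_then_push` is a hypothetical helper name, not something from the standard library.

```rust
// Hypothetical helper: reads the first element, then grows the vector.
// Panics if the vector is empty, just like `v[0]` does.
fn first_then_push(v: &mut Vec<i32>, new_value: i32) -> i32 {
    // `v[0]` copies the i32 out, so no reference into the vector stays alive.
    let first = v[0];
    // The mutable borrow that `push` needs is now allowed.
    v.push(new_value);
    first
}

fn main() {
    let mut v = vec![1, 2, 3, 4, 5];
    let first = first_then_push(&mut v, 6);
    println!("The first element is: {first}");
    println!("v is now {:?}", v);
}
```

For element types that are not `Copy`, the same idea would need a `clone()`, or the `push` would have to happen before the reference is taken.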

## Iterating over Vectors

Using a `for` loop with immutable references:

In [32]:
let v = vec![100, 32, 57];
for i in &v {
    println!("{i}");
}

100
32
57


()

Using a `for` loop with mutable references to make changes:

In [36]:
let mut v = vec![100, 32, 57];
for i in &mut v {
    *i += 50;
}
println!("{:?}", v)

[150, 82, 107]


()

> **📓 Note: Iterating and Ownership**  
> *Iterating over a vector, whether immutably or mutably, is safe because of the borrow checker's rules. If we attempted to insert or remove items in the `for` loop bodies, we would get a compiler error. The reference to the vector that the `for` loop holds prevents simultaneous modification of the whole vector.*
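
Since items can't be removed from inside the `for` loop body, a common alternative is to let the vector do the filtering itself with `Vec::retain`. The sketch below assumes we want to drop odd numbers; `keep_even` is a hypothetical helper name.

```rust
// Hypothetical helper: removes every odd number from the vector in place.
fn keep_even(v: &mut Vec<i32>) {
    // `retain` visits each element once and keeps those for which
    // the closure returns true.
    v.retain(|x| *x % 2 == 0);
}

fn main() {
    let mut v = vec![100, 33, 57, 8];
    keep_even(&mut v);
    println!("{:?}", v);
}
```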

## Using an `Enum` to Store Multiple Types

*Rust needs to know what types will be in the vector at compile time so it knows exactly how much memory on the heap will be needed to store each element. Using an enum plus a match expression means that Rust will ensure at compile time that every possible case is handled. If you don’t know the exhaustive set of types a program will get at runtime to store in a vector, the enum technique won’t work. Instead, you can use a trait object.*

In [44]:
fn main() {
    #[derive(Debug)]
    enum SpreadsheetCell {
        Int(i32),
        Float(f64),
        Text(String),
    }

    let row = vec![
        SpreadsheetCell::Int(3),
        SpreadsheetCell::Text(String::from("blue")),
        SpreadsheetCell::Float(10.12),
    ];
}

main()

()
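
The "enum plus `match`" part of the explanation can be sketched as follows. The `describe` function is a hypothetical helper added for illustration; the point is that the compiler rejects the `match` if any variant is left unhandled.

```rust
#[derive(Debug)]
enum SpreadsheetCell {
    Int(i32),
    Float(f64),
    Text(String),
}

// Hypothetical helper: turns any cell into a short description.
// Removing one of the arms below would be a compile-time error.
fn describe(cell: &SpreadsheetCell) -> String {
    match cell {
        SpreadsheetCell::Int(i) => format!("integer {i}"),
        SpreadsheetCell::Float(x) => format!("float {x}"),
        SpreadsheetCell::Text(s) => format!("text {s}"),
    }
}

fn main() {
    let row = vec![
        SpreadsheetCell::Int(3),
        SpreadsheetCell::Text(String::from("blue")),
        SpreadsheetCell::Float(10.12),
    ];
    // Every element has the same type (the enum), so the vector is homogeneous.
    for cell in &row {
        println!("{}", describe(cell));
    }
}
```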

## Dropping a Vector Drops Its Elements

Like any other `struct`, a vector is freed when it goes out of scope, and its elements are dropped along with it:

In [49]:
fn main() {
    {
        let v = vec![1, 2, 3, 4];
        // do stuff with v
    } // <- v goes out of scope and is freed here
    println!("{:?}", v);
}

Error: cannot find value `v` in this scope
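
To see that the elements themselves are dropped together with the vector, here is a small sketch that counts `drop` calls. `Noisy`, `drops` and `count_drops` are hypothetical names introduced only for this illustration.

```rust
use std::cell::Cell;

// Hypothetical element type that records when it is dropped.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        // Runs once per element when the owning vector is freed.
        self.drops.set(self.drops.get() + 1);
    }
}

// Builds a vector of `n` elements in an inner scope and reports
// how many elements were dropped when that scope ended.
fn count_drops(n: usize) -> u32 {
    let drops = Cell::new(0);
    {
        let _v: Vec<Noisy<'_>> = (0..n).map(|_| Noisy { drops: &drops }).collect();
        // `_v` and all of its elements are still alive here.
    } // <- `_v` goes out of scope: each element's `drop` runs
    drops.get()
}

fn main() {
    println!("elements dropped: {}", count_drops(4));
}
```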

# Strings: Storing UTF-8 Encoded Text

We discuss strings in the context of collections because <span style="color:lightgreen">*strings are implemented as a collection of bytes, together with some methods to provide useful functionality when those bytes are interpreted as text.*</span>

In this section, we’ll talk about the operations on `String` that every collection type has, such as creating, updating, and reading. We’ll also discuss the ways in which `String` is different from the other collections, namely how indexing into a `String` is complicated by the differences between how people and computers interpret `String` data.

## String vs string slice
Even though Rust's `String` and string slice (`str` which is usually seen in its borrowed form `&str`) are both UTF-8 encoded, they have some fundamental differences:
- **Ownership**: `String` is an owned type with its own memory allocation, whereas `&str` is a borrowed reference to an existing string.
- **Mutability**: `String` is mutable, while `&str` is an immutable reference.
- **Memory Allocation**: `String` is heap-allocated, but `&str` can point to memory in various locations, including the program's read-only static memory (if it's a reference to a string literal) or the heap (if it's a reference to part of a `String`).
- **Use Case**:
    - Use `String` when you need to own and modify string data.
    - Use `&str` when you just need to read or pass string data without taking ownership, which is more efficient, especially for function arguments.
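
The use-case advice above can be sketched with a function that borrows a `&str` and returns a new owned `String`. The name `shout` is hypothetical; the same function accepts string literals, whole `String`s and slices of a `String`.

```rust
// Hypothetical helper: borrows any string data and returns a new owned String.
fn shout(text: &str) -> String {
    let mut owned = String::from(text); // own a copy so we can modify it
    owned.push('!');
    owned
}

fn main() {
    let literal = "hello";
    let owned = String::from("goodbye");

    println!("{}", shout(literal));       // a string literal is already a &str
    println!("{}", shout(&owned));        // &String is coerced to &str
    println!("{}", shout(&owned[0..4]));  // a slice of a String is a &str
    println!("owned is still usable: {owned}");
}
```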

## Creating a New `String`
- Use `String::new()` to create an empty `String`
- Use `to_string()` on types that implement the `Display` trait, e.g. string literals
- Use `String::from` to create a `String` from a string literal

In [55]:
let mut s = String::new();
println!("{s}");

let s2 = "initial contents".to_string();
println!("{s2}");

let s3 = String::from("initial contents");
println!("{s3}");




initial contents
initial contents


In [57]:
let hello = String::from("안녕하세요");
println!("{hello}");

안녕하세요


## Updating a `String`

### Appending to a `String` with `push_str` and `push`

Note that `push_str` takes a string slice because we don’t necessarily want to take ownership of the parameter passed to it.

In [60]:
let mut s1 = String::from("foo");
let s2 = "bar";
s1.push_str(s2);
println!("we can still use s2, which is '{s2}'");

we can still use s2, which is 'bar'


The `push` method takes a single character as a parameter and adds it to the `String`.

In [67]:
let mut s = String::from("lo");
s.push('l');
s

"lol"

### Concatenation with the `+` Operator or the `format!` Macro

The `+` operator uses the `add` method, whose signature looks something like this 
```rust
fn add(self, s: &str) -> String {
```
Hence we have to give a reference to the `String` after `+`

In [73]:
let s1 = String::from("Hello, ");
let s2 = String::from("world!");
let s3 = s1 + &s2; // note s1 has been moved here and can no longer be used
s3

"Hello, world!"

However, note that the type of `&s2` is `&String`, not `&str` as the `add` signature requires. The code above still compiles because <span style="color:lightgreen">*the compiler can coerce the `&String` argument into a `&str`*</span>.  
Also, `add` takes `self` as an argument, not `&self`, so it takes ownership of `self`.
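
If the left-hand string has to stay usable after the concatenation, one option is to build a fresh owned `String` for the left operand first. This is only a sketch, and `concat` is a hypothetical helper name.

```rust
// Hypothetical helper: concatenates two borrowed strings without moving either.
fn concat(left: &str, right: &str) -> String {
    // `+` needs an owned `String` on its left, so make one from the borrowed slice.
    left.to_owned() + right
}

fn main() {
    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");
    let s3 = concat(&s1, &s2);
    println!("{s3}");
    println!("s1 is still usable: {s1}");
}
```

The price is one extra allocation for the copy of `left`.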

Concatenating multiple strings with `+` works, but gets messy:

In [78]:
let s1 = String::from("tic");
let s2 = String::from("tac");
let s3 = String::from("toe");

let s = s1 + "-" + &s2 + "-" + &s3;  // s1 is moved here and can't be used
s

"tic-tac-toe"

Hence we have the `format!` macro

In [82]:
let s1 = String::from("tic");
let s2 = String::from("tac");
let s3 = String::from("toe");

let s = format!("{s1}-{s2}-{s3}");  // we can use s1, s2 and s3 after this
s

"tic-tac-toe"

The `format!` macro uses references so that it doesn’t take ownership of any of its parameters.

## Indexing into `String`s

<span style="color:orange">*In many other programming languages, accessing individual characters in a string by referencing them by index is a valid and common operation. However, **if you try to access parts of a String using indexing syntax in Rust, you’ll get an error**.*</span>

In [83]:
let s1 = String::from("hello");
let h = s1[0];

Error: consider importing one of these items

Error: the type `String` cannot be indexed by `{integer}`

The error and the note tell the story: Rust strings don’t support indexing. But why not? To answer that question, we need to discuss how Rust stores strings in memory

### Internal `String` Representation

<span style="color:skyblue">***A `String` is a wrapper over a `Vec<u8>`***</span>

For example
```rust
let hello1 = String::from("Hola");
```
*In this case, `hello1`'s len will be 4, which means the vector storing the string “Hola” is 4 bytes long. Each of these letters takes 1 byte when encoded in UTF-8.*
```rust
let hello2 = String::from("Здра");
```
*However, `hello2`'s len will be 8, since each Unicode scalar value in `hello2` takes 2 bytes of storage.*

<span style="color:orange">***Therefore, an index into the string’s bytes will not always correlate to a valid Unicode scalar value.***</span>
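
The two lengths above can be checked directly. In this sketch, `byte_and_char_counts` is a hypothetical helper that compares `len()` (bytes) with `chars().count()` (Unicode scalar values).

```rust
// Hypothetical helper: returns (number of bytes, number of Unicode scalar values).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    for s in ["Hola", "Здра"] {
        let (bytes, chars) = byte_and_char_counts(s);
        println!("{s}: {bytes} bytes, {chars} chars");
    }
}
```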

### Bytes and Scalar Values and Grapheme Clusters! Oh My!

Another point about UTF-8 is that there are actually three relevant ways to look at strings from Rust’s perspective: as bytes, scalar values, and grapheme clusters (the closest thing to what we would call letters).

For example, the नमस्ते word in Hindi can be seen as
1. A vector of `u8` values: `[224, 164, 168, 224, 164, 174, 224, 164, 184, 224, 165, 141, 224, 164, 164, 224, 165, 135]`
2. Scalar values (Rust's `char`): `['न', 'म', 'स', '्', 'त', 'े']` where the 4th and 6th chars are diacritics
3. As grapheme clusters: `["न", "म", "स्", "ते"]`

<span style="color:lightgreen">*Rust provides different ways of interpreting the raw string data that computers store so that each program can choose the interpretation it needs, no matter what human language the data is in.*</span>

<span style="color:skyblue">*A final reason Rust does not support `String` indexing is that indexing operations are expected to always take constant time, `O(1)`. Rust cannot guarantee that for a `String`, because it would have to walk through the contents from the beginning to the index to determine how many valid characters there were.*</span>
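
When the n-th character really is needed, the walk has to be made explicitly. A minimal sketch, with `nth_char` as a hypothetical helper name:

```rust
// Hypothetical helper: returns the n-th Unicode scalar value (0-based), if any.
// This is O(n): the iterator decodes the string from the start.
fn nth_char(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n)
}

fn main() {
    let hello = "Здравствуйте";
    println!("{:?}", nth_char(hello, 1));
    println!("{:?}", nth_char(hello, 100));
}
```

Returning an `Option` mirrors `Vec::get`: an out-of-range position gives `None` instead of a panic.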

### Slicing `String`s

Rather than indexing using `[]` with a single number, you can use `[]` with a `range` to create **a string slice containing particular bytes**

In [91]:
let hello = "Здравствуйте";
let hello2 = "Hola";

let s = &hello[0..8];  // get the first 8 bytes
let s2 = &hello2[0..4];  // get the first 4 bytes

println!("s = {s}");
println!("s2 = {s2}");

s = Здра
s2 = Hola


We can't slice only part of a character’s bytes with something like `&hello[0..1]`, since each Unicode scalar value in `hello = "Здравствуйте"` takes 2 bytes. Doing so panics at runtime:

In [96]:
let s = &hello[0..1];

thread '<unnamed>' panicked at src/lib.rs:121:15:
byte index 1 is not a char boundary; it is inside 'З' (bytes 0..2) of `Здравствуйте`
stack backtrace:
   0: rust_begin_unwind
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/panicking.rs:597:5
   1: core::panicking::panic_fmt
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/panicking.rs:72:14
   2: core::str::slice_error_fail_rt
   3: core::str::slice_error_fail
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/str/mod.rs:87:9
   4: <unknown>
   5: evcxr::runtime::Runtime::run_loop
   6: evcxr::runtime::runtime_hook
   7: evcxr_jupyter::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
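
A panic like the one above can be avoided by slicing with `str::get`, which returns `None` when the range does not fall on character boundaries, or by checking `is_char_boundary` first. The sketch below uses a hypothetical helper named `safe_prefix`.

```rust
// Hypothetical helper: returns the first `end` bytes of `s` as a slice,
// or `None` if `end` is out of range or falls inside a multi-byte character.
fn safe_prefix(s: &str, end: usize) -> Option<&str> {
    s.get(0..end)
}

fn main() {
    let hello = "Здравствуйте";
    println!("{:?}", safe_prefix(hello, 1)); // inside 'З', so no slice
    println!("{:?}", safe_prefix(hello, 8)); // ends on a boundary
    println!("{}", hello.is_char_boundary(1));
}
```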


By contrast, `&hello2[0..1]` works, since each Unicode scalar value in `hello2` takes 1 byte:

In [95]:
let s2 = &hello2[0..1];
s2

"H"

## Iterating Over Strings

<span style="color:purple">***The best way to operate on pieces of strings is to be explicit about whether you want characters or bytes.***</span>

In [97]:
for c in "Зд".chars() {  // only print 2 chars
    println!("{c}");
}

З
д


()

In [98]:
for b in "Зд".bytes() {  // will print 4 bytes
    println!("{b}");
}

208
151
208
180


()
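
When both views are needed at once, `char_indices` yields each character together with its byte offset. As a sketch, the hypothetical helper `first_n_chars` uses those offsets to take the first `n` characters without risking a slice in the middle of a character.

```rust
// Hypothetical helper: returns the slice holding the first `n` chars of `s`.
fn first_n_chars(s: &str, n: usize) -> &str {
    // `char_indices` yields (byte_offset, char) pairs. The offset of the
    // char at position `n` is a valid boundary, so slicing there cannot panic.
    match s.char_indices().nth(n) {
        Some((byte_offset, _)) => &s[..byte_offset],
        None => s, // the string has `n` chars or fewer: return all of it
    }
}

fn main() {
    for (offset, c) in "Зд".char_indices() {
        println!("byte offset {offset}: {c}");
    }
    println!("{}", first_n_chars("Здравствуйте", 4));
}
```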

## Strings Are Not Simple

*To summarize, strings are complicated. Different programming languages make different choices about how to present this complexity to the programmer. Rust has chosen to make the correct handling of String data the default behavior for all Rust programs, which means programmers have to put more thought into handling UTF-8 data upfront.*

# `HashMap`s: Storing Keys with Associated Values
The type `HashMap<K, V>` stores a mapping of keys of type `K` to values of type `V` using a hashing function, which determines how it places these keys and values into memory. Hash maps are useful when you want to look up data not by using an index, as you can with vectors, but by using a key that can be of any type.

## Creating a New Hash Map
Using `new` and `insert` 

In [4]:
use std::collections::HashMap;

let mut scores: HashMap<String, i32> = HashMap::new();

scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);

println!("{:?}", scores);

{"Yellow": 50, "Blue": 10}


## Accessing Values in a Hash Map

In [5]:
use std::collections::HashMap;

let mut scores = HashMap::new();

scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);

let team_name = String::from("Blue");
let score = scores.get(&team_name).copied().unwrap_or(0);
score


10

`HashMap`'s `get` method returns an `Option<&V>`. The line
```rust
let score = scores.get(&team_name).copied().unwrap_or(0);
```
handles the `Option` by calling `copied` to get an `Option<i32>` rather than an `Option<&i32>`, then `unwrap_or` to set `score` to `0` if `scores` doesn't have an entry for the key.
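
The same chain can be wrapped in a small function to see both outcomes, a present key and a missing one. This is a sketch; `score_of` is a hypothetical helper name.

```rust
use std::collections::HashMap;

// Hypothetical helper: looks up a team's score, falling back to 0.
fn score_of(scores: &HashMap<String, i32>, team: &str) -> i32 {
    // `get` returns Option<&i32>; `copied` turns it into Option<i32>.
    scores.get(team).copied().unwrap_or(0)
}

fn main() {
    let mut scores = HashMap::new();
    scores.insert(String::from("Blue"), 10);
    scores.insert(String::from("Yellow"), 50);

    println!("Blue: {}", score_of(&scores, "Blue"));
    println!("Green: {}", score_of(&scores, "Green")); // no entry for this key
}
```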

## Iterating over Hash Maps

In [6]:
use std::collections::HashMap;

let mut scores = HashMap::new();

scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);

for (key, value) in &scores {
    println!("{key}: {value}");
}

Blue: 10
Yellow: 50


()
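
Note that a `HashMap` iterates in arbitrary order, so the order of the printed lines can differ between runs. If a stable order is needed, one option is to collect the entries into a vector and sort them. A sketch, with `sorted_entries` as a hypothetical helper name:

```rust
use std::collections::HashMap;

// Hypothetical helper: returns the map's entries sorted by key.
fn sorted_entries(scores: &HashMap<String, i32>) -> Vec<(&str, i32)> {
    let mut entries: Vec<(&str, i32)> = scores
        .iter()
        .map(|(key, value)| (key.as_str(), *value))
        .collect();
    // Tuples compare element by element, so this sorts by key first.
    entries.sort();
    entries
}

fn main() {
    let mut scores = HashMap::new();
    scores.insert(String::from("Yellow"), 50);
    scores.insert(String::from("Blue"), 10);

    for (key, value) in sorted_entries(&scores) {
        println!("{key}: {value}");
    }
}
```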

## Hash Maps and Ownership
- <span style="color:lightgreen">***For types that implement the `Copy` trait, like `i32`, the values are copied into the hash map.***</span>  
 
- <span style="color:lightgreen">***For owned values like `String`, the values will be moved and the hash map will be the owner of those values***</span>

In [7]:
use std::collections::HashMap;

let field_name = String::from("Favorite color");
let field_value = String::from("Blue");

let mut map = HashMap::new();
map.insert(field_name, field_value);  // field_name and field_value are moved into the map here,
                                      // so using them afterwards is a compile error:
println!("{field_name}");


Error: borrow of moved value: `field_name`

<span style="color:orange">*If we insert references to values into the hash map, the values won’t be moved into the hash map. **The values that the references point to must be valid for at least as long as the hash map is valid**. We’ll talk more about these issues in the “Validating References with Lifetimes” section in Chapter 10.*</span>
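
A minimal sketch of the reference case: the map below stores `&str` keys and values, so the original `String`s are only borrowed and remain usable. `build_map` is a hypothetical helper, and the lifetime `'a` expresses that the map cannot outlive the strings it points to.

```rust
use std::collections::HashMap;

// Hypothetical helper: the returned map borrows from `name` and `value`,
// so it is only valid while both of them are.
fn build_map<'a>(name: &'a str, value: &'a str) -> HashMap<&'a str, &'a str> {
    let mut map = HashMap::new();
    map.insert(name, value);
    map
}

fn main() {
    let field_name = String::from("Favorite color");
    let field_value = String::from("Blue");

    let map = build_map(&field_name, &field_value);

    // Nothing was moved: both strings can still be used.
    println!("{field_name}: {field_value}");
    println!("{:?}", map.get("Favorite color"));
}
```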

## Updating a Hash Map
When you want to change the data in a hash map, you have to decide how to handle the case when a key already has a value assigned.

- You could replace the old value with the new value, completely disregarding the old value. 
- You could keep the old value and ignore the new value, only adding the new value if the key doesn’t already have a value. 
- Or you could combine the old value and the new value.

### Overwriting a Value

In [8]:
use std::collections::HashMap;

let mut scores = HashMap::new();

scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Blue"), 25);

println!("{:?}", scores);


{"Blue": 25}


### Adding a Key and Value Only If a Key Isn’t Present (Otherwise Keep the Old Value)
For this, we use the `entry` method. The return value of the `entry` method is an enum called `Entry` that represents a value that might or might not exist.

The `or_insert` method on `Entry` is defined to return a mutable reference to the value for the corresponding `Entry` key if that key exists, and if not, it inserts the parameter as the new value for this key and returns a mutable reference to the new value.


In [9]:
use std::collections::HashMap;

let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);

scores.entry(String::from("Yellow")).or_insert(50);  // entry returns an `Entry` enum
scores.entry(String::from("Blue")).or_insert(50);

println!("{:?}", scores);


{"Yellow": 50, "Blue": 10}


### Updating a Value Based on the Old Value

In [11]:
use std::collections::HashMap;

let text = "hello world wonderful world";

let mut map = HashMap::new();  // keeps the number of occurrences of each word in text

for word in text.split_whitespace() {
    println!("{:?}", word);
    let count = map.entry(word).or_insert(0);
    *count += 1;
}

println!("{:?}", map);

"hello"
"world"
"wonderful"
"world"
{"world": 2, "wonderful": 1, "hello": 1}


The `split_whitespace` method returns an iterator over sub-slices, separated by whitespace, of the value in `text`. The `or_insert` method returns a mutable reference (`&mut V`) to the value for the specified key. Here we store that mutable reference in the `count` variable, so in order to assign to that value, we must first dereference `count` using the asterisk (`*`). The mutable reference goes out of scope at the end of each iteration of the `for` loop, so all of these changes are safe and allowed by the borrowing rules.
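
The same word count can also be written with `Entry::and_modify`, which updates an existing value and leaves insertion of the initial value to `or_insert`. This is an alternative sketch rather than a replacement for the code above; `word_counts` is a hypothetical helper name.

```rust
use std::collections::HashMap;

// Hypothetical helper: counts how often each word occurs in `text`.
fn word_counts(text: &str) -> HashMap<&str, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        counts
            .entry(word)
            .and_modify(|count| *count += 1) // key present: bump the old value
            .or_insert(1);                   // key absent: start at 1
    }
    counts
}

fn main() {
    let counts = word_counts("hello world wonderful world");
    println!("{:?}", counts.get("world"));
}
```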

## Hashing Functions
By default, `HashMap` uses a hashing function called `SipHash` that can provide resistance to Denial of Service (DoS) attacks involving hash tables.
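
The hasher is selected through the third type parameter of `HashMap<K, V, S>`, which defaults to `RandomState`. As a sketch of how a different hasher builder can be plugged in, the code below uses the standard library's `BuildHasherDefault` with `DefaultHasher`. The alias `FixedState` and the function `build_scores` are hypothetical names. Note that this builder is not randomly seeded, so it presumably gives up the DoS resistance mentioned above and is shown here only to illustrate the mechanism.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

// Hypothetical alias: a hasher builder that always starts from the same state.
type FixedState = BuildHasherDefault<DefaultHasher>;

// Hypothetical helper: builds a map that uses `FixedState` instead of `RandomState`.
fn build_scores() -> HashMap<String, i32, FixedState> {
    // `HashMap::new()` is only available for the default hasher,
    // so a map with a custom hasher is created with `default()`.
    let mut scores: HashMap<String, i32, FixedState> = HashMap::default();
    scores.insert(String::from("Blue"), 10);
    scores.insert(String::from("Yellow"), 50);
    scores
}

fn main() {
    let scores = build_scores();
    println!("{:?}", scores.get("Blue"));
}
```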