# Data Structures

**Data structures** let us store and organize data so that we can process it efficiently.

We want to build up a mental toolbox of data structures that we can apply to solve our problems. We want to be able to pick the right data structure for any given job.

Different structures have different strengths and weaknesses: their operations have different runtimes.

When designing a program, we consider the operations we need to perform and pick the data structures that support those operations most efficiently.

All data structures have a handful of common operations:
- Add an element
- Remove an element
- Access an element
- Search for an element
- etc.

There are many types of data structures:
- arrays
- dictionaries
- sets
- linked lists
- trees
- graphs
- stacks
- queues

## Arrays

Under the hood, Python lists are implemented as arrays.

An **array** is stored in consecutive memory locations. 

All variables in our program are stored in RAM. To the computer, RAM is just a large block of bytes to store data in, addressed from byte 0 up to byte `max_byte_number`.

To create an array, we need to know what type of data we want to store and how many elements we want to store.

Why? Different types of data take up different amounts of space. The OS needs to reserve space for the entire array when it is created.

Different types of data could be: ints, floats, or characters.

In C, we have different data types for whole numbers. Their exact sizes depend on the platform; typical sizes on a 64-bit Linux system are:
- short: 2 bytes
- int: 4 bytes
- long: 8 bytes

If we want to create an array to store 10 ints, how many bytes are needed? With 4-byte ints, we need $10 \times 4 = 40$ bytes.

When we create an array to hold 10 ints, it can never hold more than 10 ints. The size of an array is fixed when it is created.
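
As a rough check of this arithmetic, Python's standard `ctypes` module can report C type sizes on the current machine and build a genuinely fixed-size array. This is only a sketch: the sizes are platform-dependent, so a 4-byte `int` is a common case rather than a guarantee, and the names `TenInts`, `numbers`, and `grew` are made up for illustration.

```python
import ctypes

# Size of one C int on this machine (commonly 4 bytes, but platform-dependent).
int_size = ctypes.sizeof(ctypes.c_int)

# A fixed-size array type holding exactly 10 C ints.
TenInts = ctypes.c_int * 10
numbers = TenInts()  # every element starts at 0

# Space for all 10 elements is reserved up front: 10 * int_size bytes.
total_bytes = ctypes.sizeof(numbers)
print(int_size, total_bytes)

# The array cannot grow: index 10 is out of range.
try:
    numbers[10] = 1
    grew = True
except IndexError:
    grew = False
print(grew)
```

On a machine with 4-byte ints, `total_bytes` is 40.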

## Operations on an Array

- Access an element at an index
- Insert into the beginning of the array
- Append an element to the array

### Element access

Accessing an element by index is $O(1)$ because the memory location of any element is trivial to calculate: it is the array's starting address plus the index times the size of one element.
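
The address calculation can be sketched as a one-line function. The function name, the starting address of 1000, and the 4-byte element size are all hypothetical values chosen for illustration.

```python
def element_address(base_address, index, element_size):
    """Return the address of the element at `index`.

    One multiplication and one addition, no matter how long the array is,
    which is why index access is O(1).
    """
    return base_address + index * element_size


# Hypothetical array of 4-byte ints that starts at address 1000.
print(element_address(1000, 0, 4))  # 1000
print(element_address(1000, 3, 4))  # 1012
```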

### Insertion at the beginning of an array

Suppose we have an array that can hold 10 elements:

```
-------------------------------
|  |  |  |  |  |  |  |  |  |  |
-------------------------------
 0  1  2  3  4  5  6  7  8  9
```

Suppose it currently holds these five values:

```
[3, 5, 7, 11, 13]
```

```
---------------------------------------------------
| 3  | 5  | 7  | 11 | 13 |    |    |    |    |    |
---------------------------------------------------
  0    1    2    3    4    5    6    7    8    9
```

To insert an element at index 0, we first have to shift every element one position to the right.

```
---------------------------------------------------
|    | 3  | 5  | 7  | 11 | 13 |    |    |    |    |
---------------------------------------------------
  0    1    2    3    4    5    6    7    8    9
```

Then we can insert the value:

```
---------------------------------------------------
| 2  | 3  | 5  | 7  | 11 | 13 |    |    |    |    |
---------------------------------------------------
  0    1    2    3    4    5    6    7    8    9
```

The runtime depends on the number of elements in the array.

If we have `n` elements in the array, we have to perform `n` copies.

The runtime is $O(n)$.
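
The shifting steps above can be sketched in Python. A Python list of fixed length stands in for the block of memory here, with `None` marking an empty slot; the function name and its parameters are assumptions made for this example.

```python
def insert_at_front(slots, count, value):
    """Insert `value` at index 0 of a fixed-capacity array.

    `slots` is a fixed-length list (None marks an empty slot) and
    `count` is the number of slots in use. Returns the new count.
    """
    if count == len(slots):
        raise OverflowError("array is full")
    # Shift right, starting from the last element so nothing is overwritten.
    # This loop runs `count` times, which is where the O(n) comes from.
    for i in range(count, 0, -1):
        slots[i] = slots[i - 1]
    slots[0] = value
    return count + 1


slots = [3, 5, 7, 11, 13, None, None, None, None, None]
count = insert_at_front(slots, 5, 2)
print(slots[:count])  # [2, 3, 5, 7, 11, 13]
```

Note that the loop walks from right to left. Shifting from left to right would overwrite each element before it had been copied.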

### Append an element to the array

```
---------------------------------------------------
| 2  | 3  | 5  | 7  | 11 | 13 |    |    |    |    |
---------------------------------------------------
  0    1    2    3    4    5    6    7    8    9
```

To append an element to an array that is not full, we can just assign the value to the next free slot.

This is $O(1)$.

What if the array is full?

```
---------------------------------------------------
| 2  | 3  | 5  | 7  | 11 | 13 | 17 | 19 | 23 | 29 |
---------------------------------------------------
  0    1    2    3    4    5    6    7    8    9
```

We can't insert into a full array. To append a new element, we need to create a new larger array, copy everything into it, then append to that new array.

**The original array:**
```
---------------------------------------------------
| 2  | 3  | 5  | 7  | 11 | 13 | 17 | 19 | 23 | 29 |
---------------------------------------------------
  0    1    2    3    4    5    6    7    8    9
```

**Create new larger array:**
```
----------------------------------------------------------------------------
|    |    |    |    |    |    |    |    |    |    |    |    |    |    |    |
----------------------------------------------------------------------------
  0    1    2    3    4    5    6    7    8    9    10   11   12   13   14
```

**Copy everything into it:**
```
---------------------------------------------------
| 2  | 3  | 5  | 7  | 11 | 13 | 17 | 19 | 23 | 29 |
---------------------------------------------------
  |    |    |    |    |    |    |    |    |    |
  v    v    v    v    v    v    v    v    v    v
----------------------------------------------------------------------------
| 2  | 3  | 5  | 7  | 11 | 13 | 17 | 19 | 23 | 29 |    |    |    |    |    |
----------------------------------------------------------------------------
  0    1    2    3    4    5    6    7    8    9    10   11   12   13   14
```

**Append the next value:**
```
----------------------------------------------------------------------------
| 2  | 3  | 5  | 7  | 11 | 13 | 17 | 19 | 23 | 29 | 31 |    |    |    |    |
----------------------------------------------------------------------------
  0    1    2    3    4    5    6    7    8    9    10   11   12   13   14
```

Since we have to copy every element over, the runtime is $O(n)$.
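
Both append cases can be sketched with a small class. This is an illustrative model, not how any particular language implements its arrays: the class name `DynamicArray`, its attributes, and the choice to grow by 50% (which matches the 10 to 15 slot diagrams above) are assumptions.

```python
class DynamicArray:
    """Fixed-capacity storage that is replaced by a larger one when full."""

    def __init__(self, capacity=10):
        self.capacity = capacity
        self.count = 0
        self.slots = [None] * capacity  # stands in for a block of memory

    def append(self, value):
        if self.count == self.capacity:
            self._grow()                # O(n): copy into a larger array
        self.slots[self.count] = value  # O(1): write into the next free slot
        self.count += 1

    def _grow(self):
        # Assumption: grow by 50%, as in the 10 -> 15 diagrams.
        new_capacity = self.capacity + max(1, self.capacity // 2)
        new_slots = [None] * new_capacity
        for i in range(self.count):     # copy every existing element
            new_slots[i] = self.slots[i]
        self.slots = new_slots
        self.capacity = new_capacity


arr = DynamicArray(capacity=10)
for prime in [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]:
    arr.append(prime)   # each of these is a plain O(1) assignment
arr.append(31)          # the array is full, so this one triggers a copy
print(arr.capacity, arr.slots[:arr.count])
```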

## Runtime Summary

- Access an element at an index
    - $O(1)$
- Insert into the beginning of the array
    - $O(n)$
- Append an element to the array
    - $O(1)$ if the array has free space
    - $O(n)$ if the array is full and must be copied into a larger one
- Insert element into arbitrary index
    - $O(n)$
    - The arbitrary index may be 0

## Linked Lists

A linked list is NOT an array and is not stored in a set of consecutive memory locations.

A **Singly Linked List** is a set of **nodes** where each node contains a single element of the list and a pointer to the next node in the list. 

In a **Doubly Linked List**, each node also has a pointer to the previous node.

### Diagram

```
[2] -> [3] -> [5] -> [7] -> [11]
 ^                           ^
 H                           T
```

The first node is the **Head** and the last is the **Tail**.

A Linked List is dynamic. It grows and shrinks as we add/remove elements.

## Operations

- Append element
- Insert to beginning
- Access by index
- Insert at arbitrary index

### Implementing a Linked List

We can create a `Node` class and then create a Linked List class that uses nodes.

### Class Node

Attributes:
- the element it contains
- a reference to the next node
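
A minimal sketch of such a node for a singly linked list might look like the following. The attribute names `value` and `next`, and the parameter name `next_node`, are assumptions chosen for this example.

```python
class Node:
    """One node of a singly linked list."""

    def __init__(self, value, next_node=None):
        self.value = value     # the element this node contains
        self.next = next_node  # reference to the next node, or None at the tail


# Build the first three nodes of the diagram by hand: [2] -> [3] -> [5]
tail = Node(5)
middle = Node(3, tail)
head = Node(2, middle)

# Walk the list from the head by following each node's `next` reference.
values = []
current = head
while current is not None:
    values.append(current.value)
    current = current.next
print(values)  # [2, 3, 5]
```

Unlike an array, nothing here requires the nodes to sit next to each other in memory; each node only needs to know where the next one is.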