# Data Structures

## Heaps

Supports:
- Insertion $O(\log{n})$ 
- Extract Min or Max $O(\log{n})$
- Heapify $O(n)$ (builds a heap from $n$ items in one batch, instead of $n$ separate inserts)
- Delete $O(\log{n})$

A container for objects that are "ordered" by keys
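
As a minimal sketch of the operations listed above, the block below uses Python's standard `heapq` module, which provides a min-heap on top of a plain list. The sample values and the `heap_sort` helper are illustrative assumptions, not part of the original notes.

```python
import heapq

heap = [7, 2, 9, 4, 1]
heapq.heapify(heap)              # batched build, O(n)
heapq.heappush(heap, 3)          # insertion, O(log n)
smallest = heapq.heappop(heap)   # extract-min, O(log n)


def heap_sort(items):
    """Sort by heapifying once and extracting the minimum n times."""
    pending = list(items)        # copy so the caller's list is untouched
    heapq.heapify(pending)
    return [heapq.heappop(pending) for _ in range(len(pending))]
```

`heapq` only offers a min-heap; a common way to get extract-max for numeric keys is to store negated values.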

Applications:
- Heap sort: selection sort runs in $O(n^2)$ because it rescans for the minimum each time; extracting the minimum from a heap instead gives $O(n\log{n})$
- Priority Queue
- Event Manager
- Median Maintenance

    Given a sequence of numbers $x_1, x_2, \cdots, x_n$ arriving one at a time, keep track of the median of the numbers seen so far.

    Solution:

    Maintain two heaps,
    $$
    H_{low} : \text{supports extract max} \\
    H_{high} : \text{supports extract min}
    $$
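    Maintain the invariant that approximately half of the elements are in each of the two heaps, with $H_{low}$ holding the smaller half and $H_{high}$ the larger half.

    The median will be either the max of $H_{low}$ or the min of $H_{high}$. When a new element comes in, we compare it to this median and add it to either the high or the low heap as necessary. If the heap sizes then differ by more than one, we move the root of the larger heap to the other heap to keep the two balanced.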
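
The two-heap scheme for median maintenance can be sketched as follows. This is one possible arrangement, assuming numeric inputs (the max-heap is simulated by negating values in `heapq`) and assuming the convention that the low heap holds the extra element, so the reported median is the lower median when the count is even. The class name `MedianTracker` is hypothetical.

```python
import heapq


class MedianTracker:
    def __init__(self):
        self._low = []    # max-heap of the smaller half (values stored negated)
        self._high = []   # min-heap of the larger half

    def add(self, x):
        # Compare against the current median to pick a heap.
        if self._low and x > -self._low[0]:
            heapq.heappush(self._high, x)
        else:
            heapq.heappush(self._low, -x)
        # Rebalance so that len(low) is len(high) or len(high) + 1.
        if len(self._low) > len(self._high) + 1:
            heapq.heappush(self._high, -heapq.heappop(self._low))
        elif len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low:
            raise ValueError("no elements yet")
        return -self._low[0]   # max of the low heap
```

Each `add` does a constant number of heap operations, so it costs $O(\log{n})$, and `median` is $O(1)$.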


## Search Trees

Static sorted arrays support the following operations

|Operation|Running Time|
|---|---|
|Search|$O(\log{n})$|
|Select|$O(1)$|
|Min/Max|$O(1)$|
|Pred/Succ|$O(1)$|
|Rank (# of keys less than given value)| $O(\log{n})$|
|Output in sorted order|$O(n)$|
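
A rough sketch of these operations on a sorted Python list, using the standard `bisect` module for the binary searches. The function names are illustrative assumptions; `predecessor` takes a position rather than a key, which is what makes it $O(1)$.

```python
from bisect import bisect_left


def search(keys, x):
    """Index of x in the sorted list keys, or None if absent (binary search)."""
    i = bisect_left(keys, x)
    return i if i < len(keys) and keys[i] == x else None


def rank(keys, x):
    """Number of keys strictly less than x."""
    return bisect_left(keys, x)


def select(keys, i):
    """The i-th smallest key, counting from 1."""
    return keys[i - 1]


def predecessor(keys, pos):
    """Key just before position pos, or None at the left end."""
    return keys[pos - 1] if pos > 0 else None
```

Min and max are simply `keys[0]` and `keys[-1]`. Inserting into the list would cost $O(n)$ because elements must be shifted, which is the gap search trees fill.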

A search tree provides sorted-array-like behaviour while also allowing insertions and deletions. The running times below assume the tree is balanced.

|Operation|Running Time|
|---|---|
|Search|$O(\log{n})$|
|Select|$O(\log{n})$|
|Min/Max|$O(\log{n})$|
|Pred/Succ|$O(\log{n})$|
|Rank (# of keys less than given value)| $O(\log{n})$|
|Output in sorted order|$O(n)$|
|Insertion|$O(\log{n})$|
|Deletion|$O(\log{n})$|

### Binary Search Tree Structure

Each node contains a key and a pointer to its data. The tree has a root node at the top and leaves at the bottom, and each node has three pointers:
1. left child
2. right child
3. parent

The search tree property states that, for each node, all of the nodes stored under the left child have keys less than the node's key, and all of the nodes stored under the right child have keys greater than the node's key.

The height/depth of such a search tree is defined as the maximum number of generations (levels) in the tree. For $n$ nodes this can be at most $n$ (a chain) and at least about $\log_2{n}$ (a perfectly balanced tree).
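
To make the node layout and the two height extremes concrete, here is a small sketch. The helpers `attach`, `build_chain`, and `build_balanced` are hypothetical names introduced only to construct the two shapes; both assume their input keys are already sorted and distinct.

```python
class Node:
    def __init__(self, key, data=None):
        self.key = key
        self.data = data
        self.left = None
        self.right = None
        self.parent = None


def attach(parent, child, side):
    """Hang child under parent on the given side ("left" or "right")."""
    setattr(parent, side, child)
    if child is not None:
        child.parent = parent


def build_chain(sorted_keys):
    """Every node has only a right child: the tallest possible shape."""
    root = prev = None
    for key in sorted_keys:
        node = Node(key)
        if prev is None:
            root = node
        else:
            attach(prev, node, "right")
        prev = node
    return root


def build_balanced(sorted_keys):
    """Put the middle key at the root and recurse on both halves."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    root = Node(sorted_keys[mid])
    attach(root, build_balanced(sorted_keys[:mid]), "left")
    attach(root, build_balanced(sorted_keys[mid + 1:]), "right")
    return root


def height(node):
    """Number of levels in the subtree rooted at node (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))
```

For 15 keys, the chain has 15 levels while the balanced tree has 4, and both satisfy the search tree property.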

1. Searching and Inserting

    We can do a binary-search-style recursion: compare the search key to the key at the current node, then recurse on the left or right subtree accordingly.

    In order to insert, we follow the same search procedure until we reach a NULL node, then we can insert it there.

2. Min and Max

    The minimum element is the leftmost node in the entire tree (keep following left child pointers from the root), and likewise the maximum element is the rightmost node in the tree.

3. Predecessor or Successor Elements

    The predecessor is the largest element less than a given element, so when the element has a left subtree it is the rightmost node in that subtree. Likewise, the successor is the leftmost node in the element's right subtree.

    If the left subtree is empty, we follow parent pointers until we find the first ancestor whose key is less than the element's own. Or more succinctly, the first time we make a "left" turn going up. The successor case is symmetric.

4. In-Order Traversal
    ```
    Let r = root of the search tree, with subtrees T_L and T_R

    If the tree is empty, return

    Recurse on T_L

    yield r

    Recurse on T_R
    ```

5. Deletion
    
    If the node has no children, just delete it

    If the node has one child, that child can replace the node

    If the node has two children, we need to identify a replacement for it. First we compute its predecessor, which is the rightmost node of its left subtree. Then we swap the node with its predecessor. The node to delete now sits at the predecessor's old position, where it is guaranteed to have no right child, so it has at most one (left) child and can be deleted using one of the two cases above.

6. Select and Rank

    We store $size(x)$ = # of tree nodes in the subtree rooted at $x$. We can compute this recursively,

    for node $x$ with children $y$ and $z$
    $$
    size(x) = size(y) + size(z) + 1
    $$

    These sizes have to be maintained during insertion and deletion. For insertion, as we go down the search path, we increment the subtree size of every node on the path by 1.

    Then, if the root's left subtree has size $i-1$, the root is the $i^{th}$ order statistic. Otherwise we recurse into the left or right subtree, adjusting $i$ when we go right.

    ```
    Select(x, i):    // start with x = root

        let a = size(left child of x)    // 0 if there is no left child

        if a = i - 1:
            return x

        if a >= i:
            return Select(left child of x, i)

        if a < i - 1:
            return Select(right child of x, i - a - 1)
    ```
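
The operations in this list can be sketched together in Python as below. This is a simplified sketch for an unbalanced tree. It assumes distinct keys and leaves out deletion. The function names and the `size` field layout are illustrative choices, not a reference implementation.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None
        self.size = 1   # number of nodes in the subtree rooted here


def insert(root, key):
    """Insert key and return the root; sizes grow by 1 along the search path."""
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:
        cur.size += 1
        side = "left" if key < cur.key else "right"
        child = getattr(cur, side)
        if child is None:            # reached a NULL slot: insert here
            setattr(cur, side, new)
            new.parent = cur
            return root
        cur = child


def search(node, key):
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node


def tree_min(node):
    while node.left is not None:
        node = node.left
    return node


def tree_max(node):
    while node.right is not None:
        node = node.right
    return node


def predecessor(node):
    if node.left is not None:
        return tree_max(node.left)
    # Climb while we are a left child; the first ancestor reached from its
    # right side has a smaller key.
    while node.parent is not None and node is node.parent.left:
        node = node.parent
    return node.parent               # None if node held the minimum


def in_order(node):
    if node is not None:
        yield from in_order(node.left)
        yield node.key
        yield from in_order(node.right)


def select(node, i):
    """Node holding the i-th smallest key (1-indexed), or None if out of range."""
    while node is not None:
        a = node.left.size if node.left else 0
        if a == i - 1:
            return node
        if a >= i:
            node = node.left
        else:
            i -= a + 1
            node = node.right
    return None
```

Every function except `in_order` follows a single root-to-leaf or leaf-to-root path, so its cost is proportional to the height of the tree.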

## Red-Black Trees

Since the running time of search tree operations depends on the tree's height, we want to implement a balanced search tree such that the height is $O(\log{n})$.

See also
- [AVL Trees](https://en.wikipedia.org/wiki/AVL_tree)
- [Splay Trees](https://en.wikipedia.org/wiki/Splay_tree)
- [B Trees](https://en.wikipedia.org/wiki/B-tree)

A red-black tree guarantees a balanced tree with logarithmic height by enforcing additional invariants.

1. Each Node is either red or black
2. Root is black
3. No 2 Reds in a row i.e the children of a red node must be black
4. Every root - NULL path has the same number of black nodes

These four invariants guarantee that the height of a tree with $n$ nodes is at most $2\log_2(n+1)$.

Proof:

Observation: if every root-NULL path has $\geq k$ nodes, then the top $k$ levels of the tree are completely full.

Size of the search tree,
$$
n \geq 2^{k} - 1 \quad \text{where} \quad k = \text{minimum number of nodes on a root-NULL path} \\
\implies k \leq \log_2(n + 1)
$$

Thus, in a red-black tree with $n$ nodes there is a root-NULL path with at most $\log_2(n + 1)$ nodes, and hence at most $\log_2(n + 1)$ black nodes.

By the $4^{th}$ invariant, every root-NULL path therefore has at most $\log_2(n+1)$ black nodes.

By the $3^{rd}$ invariant, red nodes never appear twice in a row, so at least half of the nodes on any path are black. In the worst case the colors alternate, i.e., black, red, black, red, etc.

This gives that every root-NULL path has $\leq 2 \log_2(n+1)$ nodes.
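
The invariants and the height bound can be checked mechanically on a small hand-built tree. The sketch below is only a checker under assumed names (`black_height`, `is_red_black`). It verifies the four coloring invariants but does not verify the search tree ordering.

```python
import math

RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Black nodes on each path from node down to NULL.

    Raises ValueError if invariant 1, 3 or 4 fails in this subtree.
    """
    if node is None:
        return 0
    if node.color not in (RED, BLACK):
        raise ValueError("invariant 1: node is neither red nor black")
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("invariant 3: two reds in a row")
    left, right = black_height(node.left), black_height(node.right)
    if left != right:
        raise ValueError("invariant 4: unequal black counts")
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    if root is None:
        return True
    if root.color != BLACK:          # invariant 2
        return False
    try:
        black_height(root)
    except ValueError:
        return False
    return True


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def count(node):
    return 0 if node is None else 1 + count(node.left) + count(node.right)


tree = Node(10, BLACK,
            Node(5, RED, Node(3, BLACK), Node(7, BLACK)),
            Node(15, BLACK, None, Node(20, RED)))
bound = 2 * math.log2(count(tree) + 1)   # 2 * log2(n + 1)
```

Here `is_red_black(tree)` holds and `height(tree)` stays below `bound`, as the proof predicts.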


## Rotations

A rotation is a key primitive that does constant work $O(1)$ that helps to locally rebalance subtrees at a node. 

Rotations are invoked upon a parent - child pair. When the child is a right child, the rotation is defined as a "left" rotation, and likewise for a left child.

WLOG let's discuss a left rotation:

Let the parent be $x$ and the right child $y$.

Further, let the left subtree of $x$ be $x_L$, and the left and right subtrees of $y$ be $y_L$ and $y_R$ respectively.

By the search tree property, we have
$$
x < y \\
x_L < x \\
y_R > y \\
x < y_L < y
$$

The goal of the rotation is to invert the relationship between $x$ and $y$ such that $y$ becomes the parent.

After the rotation, $y$ is the parent, and the ordering $x_L < x < y_L < y < y_R$ is preserved:

Left Child: $x$
- Left Child: $x_L$
- Right Child: $y_L$

Right Child: $y_R$
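
A left rotation on nodes with parent pointers might look like the sketch below. The `Node` constructor that wires up parent pointers is an assumption made to keep the example short. The function returns $y$ so the caller can update a root reference if $x$ was the root.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def left_rotate(x):
    """Rotate x with its right child y and return y, the new subtree root."""
    y = x.right
    if y is None:
        raise ValueError("a left rotation needs a right child")
    # y_L moves across to become x's right subtree.
    x.right = y.left
    if y.left is not None:
        y.left.parent = x
    # y takes x's place under x's old parent, if there is one.
    y.parent = x.parent
    if x.parent is not None:
        if x.parent.left is x:
            x.parent.left = y
        else:
            x.parent.right = y
    # x becomes y's left child.
    y.left = x
    x.parent = y
    return y
```

Only a constant number of pointers change, which is why the rotation is $O(1)$ regardless of subtree sizes. A right rotation is the mirror image with `left` and `right` swapped.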

## Insertion into a Red-Black Tree

The general idea is to insert as in an ordinary search tree, and then recolor or perform rotations until the invariants are restored.

```
Insert(x):
    Insert x as usual

    Remember the parent of x, call it y
    
    Color x red

    If y is black:
        return
    
    // else y is red:

    y has a black parent, call it w
    y has a sibling, call it z (a NULL sibling counts as black)

    If z is red:
        // this step maintains the 4th invariant
        recolor y, z as black and w as red

        // this might propagate the double red upward to w and w's parent
        // can only happen a maximum of O(log n) times
        // if we reach the root node, we color it black

    else z is black:
        // 2 - 3 rotations + recolorings are sufficient to restore the invariants
```
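
The pseudocode above leaves the black-sibling case open. The sketch below fills it in with the standard textbook case analysis (rotate an "inner" grandchild outward, then rotate at $w$ and swap colors), which is an assumption about the intended method rather than something the notes spell out. Variable names `x`, `y`, `w`, `z` follow the pseudocode. The class layout and the `height` helper are illustrative.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, parent=None):
        self.key, self.color, self.parent = key, RED, parent
        self.left = self.right = None


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, left):
        """Left rotation at x if left is True, otherwise a right rotation."""
        y = x.right if left else x.left
        inner = y.left if left else y.right
        if left:
            x.right = inner
        else:
            x.left = inner
        if inner is not None:
            inner.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        if left:
            y.left = x
        else:
            y.right = x
        x.parent = y

    def insert(self, key):
        # Insert as usual, remembering the parent; new nodes start red.
        parent, cur = None, self.root
        while cur is not None:
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        x = Node(key, parent)
        if parent is None:
            self.root = x
        elif key < parent.key:
            parent.left = x
        else:
            parent.right = x
        # Repair double reds; a red y is never the root, so w exists.
        while x.parent is not None and x.parent.color == RED:
            y = x.parent
            w = y.parent
            y_is_left = y is w.left
            z = w.right if y_is_left else w.left
            if z is not None and z.color == RED:
                y.color = z.color = BLACK     # recolor, push the problem up
                w.color = RED
                x = w
            else:
                if x is (y.right if y_is_left else y.left):
                    x = y                     # inner grandchild: rotate outward
                    self._rotate(x, left=y_is_left)
                    y = x.parent
                y.color, w.color = BLACK, RED
                self._rotate(w, left=not y_is_left)
        self.root.color = BLACK


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))
```

In this arrangement the black-sibling branch performs at most two rotations and then the loop ends, while the red-sibling branch only recolors and moves two levels up, so an insertion does $O(\log{n})$ work overall.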