# Binary Search Trees

### Example Problems
*Examples of a Local Search*
* Find all words that start with some given string
* Find all emails received in a given period
* Find the person in your class whose height is closest to yours

## Local Search
**Definition:** A **Local Search Data Structure** stores a number of elements, each with a **key** coming from an ordered set.

#### Operations for a Local Search Data Structure
| Operation | Output | Time (Hash Table) | Time (Array) | Time (Sorted Array) | Time (Linked List) | Time (Binary Search Tree) |
| :- | :- | :-: | :-: | :-: | :-: | :-: |
| `RangeSearch(x,y)` | Return all elements with keys between `x` and `y` | Impossible | $O(n)$ | $O(\log n)$ | $O(n)$ | $O(\log n)$ |
| `NearestNeighbour(z)` | Return the elements with keys on either side of `z` | Impossible | $O(n)$ | $O(\log n)$ | $O(n)$ | $O(\log n)$ |
| `Insert(z)` | Insert an element with key `z` | $O(1)$ | $O(1)$ | $O(n)$ | $O(1)$ | $O(\log n)$ |
| `Delete(z)` | Delete the element with key `z` | $O(1)$ | $O(1)$ | $O(n)$ | $O(1)$ | $O(\log n)$ |

The `RangeSearch` times do not count the size of the output, and the binary search tree times assume the tree is kept balanced.
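
As a rough illustration of the sorted-array column, the sketch below answers both queries with binary search from Python's standard `bisect` module. The function names `range_search` and `nearest_neighbours` and the sample heights are invented for this example.

```python
from bisect import bisect_left, bisect_right

def range_search(keys, x, y):
    """Return all keys k in the sorted list `keys` with x <= k <= y."""
    lo = bisect_left(keys, x)    # first index with keys[i] >= x
    hi = bisect_right(keys, y)   # first index with keys[i] > y
    return keys[lo:hi]

def nearest_neighbours(keys, z):
    """Return (largest key < z, smallest key > z); None where no such key exists."""
    lo = bisect_left(keys, z)    # keys[:lo] are all < z
    hi = bisect_right(keys, z)   # keys[hi:] are all > z
    below = keys[lo - 1] if lo > 0 else None
    above = keys[hi] if hi < len(keys) else None
    return below, above

heights = [150, 158, 163, 171, 171, 180, 192]
print(range_search(heights, 160, 175))     # [163, 171, 171]
print(nearest_neighbours(heights, 171))    # (163, 180)
print(nearest_neighbours(heights, 140))    # (None, 150)
```

Locating the two boundaries costs $O(\log n)$; copying out the slice costs time proportional to the number of results. Inserting into the sorted list would still cost $O(n)$, which is the weakness the tree is meant to address.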

### Binary Search Tree

**Search Tree Property:** $X$'s key is larger than the key of any descendant of its left child, and smaller than the key of any descendant of its right child.  
In the figure below, tree *B* satisfies the property; trees *A* and *C* do not.

<p align="center">
    <img src="images\search_trees.png" width="800" style="display: inline-block; margin-right: 0px;">
</p>
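
One way to make the property concrete is to check it mechanically. The sketch below assumes a hypothetical `Node` class whose fields (`Key`, `Left`, `Right`, `Parent`) mirror the names used in the pseudocode of these notes; `is_search_tree` is an illustrative helper, not part of the original material.

```python
class Node:
    """Hypothetical node type; field names follow the notation of these notes."""
    def __init__(self, key, left=None, right=None):
        self.Key = key
        self.Left = left
        self.Right = right
        self.Parent = None
        for child in (left, right):
            if child is not None:
                child.Parent = self

def is_search_tree(N, low=None, high=None):
    """Check that every key in N's subtree lies strictly between low and high."""
    if N is None:
        return True
    if low is not None and N.Key <= low:
        return False
    if high is not None and N.Key >= high:
        return False
    # Keys on the left must stay below N.Key, keys on the right above it.
    return (is_search_tree(N.Left, low, N.Key)
            and is_search_tree(N.Right, N.Key, high))

good = Node(5, Node(3, Node(1), Node(4)), Node(8))
bad = Node(5, Node(3, Node(1), Node(6)), Node(8))   # 6 sits in the left subtree of 5

print(is_search_tree(good))   # True
print(is_search_tree(bad))    # False
```

Note that `bad` passes a check that only compares each node with its own children, which is why the bounds are passed down the whole subtree.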

### Problem: Find

**Input:** Key $k$ and Root $R$.

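**Output:** The node in the tree of $R$ with key $k$, or, if no node has key $k$, the last node reached by the search (the node under which $k$ would be inserted).

In [1]:
def Find(k, R):
    """Return the node with key k, or the last node on the search path if k is absent."""
    # Nodes are assumed to expose Key, Left and Right attributes.
    if R.Key == k:
        return R
    elif R.Key > k:
        if R.Left is not None:
            return Find(k, R.Left)
        return R
    else:
        if R.Right is not None:
            return Find(k, R.Right)
        return R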

### Problem: Next

**Input:** Node $N$.

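**Output:** The node in the tree with the next largest key, i.e. the smallest key greater than $N$'s (none if $N$ holds the largest key).

In [3]:
def Next(N):
    """Return the node with the smallest key larger than N.Key, or None."""
    if N.Right is not None:
        return LeftDescendent(N.Right)
    else:
        return RightAncestor(N)

def LeftDescendent(N):
    """Follow left children down to the smallest key in N's subtree."""
    if N.Left is None:
        return N
    else:
        return LeftDescendent(N.Left)

def RightAncestor(N):
    """Climb until an ancestor has a larger key; None if N holds the largest key."""
    if N.Parent is None:
        return None
    if N.Key < N.Parent.Key:
        return N.Parent
    else:
        return RightAncestor(N.Parent)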

### Problem: Range Search

**Input:** Numbers $x$, $y$, root $R$.

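**Output:** A list of the keys between $x$ and $y$ (inclusive), in increasing order.

In [5]:
def RangeSearch(x, y, R):
    """Collect the keys in [x, y] by walking the tree in key order from Find(x, R)."""
    L = []
    N = Find(x, R)
    # Next is expected to return None once the largest key has been passed.
    while N is not None and N.Key <= y:
        if N.Key >= x:
            L.append(N.Key)
        N = Next(N)
    return L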
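
For comparison, here is a sketch of a different way to answer the same query: a recursive traversal that prunes subtrees which cannot contain keys in the range. It is not the `Find`/`Next` method described above and needs no parent pointers. The `Node` class and the name `range_search_recursive` are assumptions made for this example.

```python
class Node:
    """Hypothetical node type with the Key/Left/Right fields used in these notes."""
    def __init__(self, key, left=None, right=None):
        self.Key, self.Left, self.Right = key, left, right

def range_search_recursive(x, y, R, out=None):
    """Collect all keys k with x <= k <= y in R's subtree, in increasing order."""
    if out is None:
        out = []
    if R is None:
        return out
    if x < R.Key:                 # the left subtree may still hold keys >= x
        range_search_recursive(x, y, R.Left, out)
    if x <= R.Key <= y:
        out.append(R.Key)
    if R.Key < y:                 # the right subtree may still hold keys <= y
        range_search_recursive(x, y, R.Right, out)
    return out

root = Node(8, Node(4, Node(2), Node(6)), Node(12, Node(10), Node(14)))
print(range_search_recursive(5, 12, root))    # [6, 8, 10, 12]
print(range_search_recursive(20, 30, root))   # []
```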

### Problem: Insert

**Input:** Key $k$ and Root $R$.

**Output:** Adds node with key $k$ to the tree.

In [7]:
def Insert(k, R):
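    # P = Find(k, R)   # the node where the search for k stops
    # if k < P.Key: add a new node with key k as the left child of P
    # elif k > P.Key: add a new node with key k as the right child of P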
    ...
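
The two steps above can be sketched as runnable code. This is a minimal version under some assumptions that the notes do not state: a hypothetical `Node` class with a `Parent` field, an `Insert` that returns the root so that an empty tree can be handled, and duplicate keys being ignored. `Find` is written iteratively here only to keep the block short.

```python
class Node:
    """Hypothetical node type; field names follow the pseudocode in these notes."""
    def __init__(self, key, parent=None):
        self.Key = key
        self.Parent = parent
        self.Left = None
        self.Right = None

def Find(k, R):
    """Return the node with key k, or the last node on the search path."""
    while True:
        child = R.Left if k < R.Key else R.Right
        if R.Key == k or child is None:
            return R
        R = child

def Insert(k, R):
    """Insert key k into the tree rooted at R and return the root."""
    if R is None:                     # empty tree: the new node becomes the root
        return Node(k)
    P = Find(k, R)
    if k < P.Key:
        P.Left = Node(k, parent=P)
    elif k > P.Key:
        P.Right = Node(k, parent=P)
    # k == P.Key: the key is already present, so nothing is added (an assumption)
    return R

def in_order(R):
    """List the keys of R's subtree in increasing order."""
    return [] if R is None else in_order(R.Left) + [R.Key] + in_order(R.Right)

root = None
for key in [8, 4, 12, 2, 6, 10, 14, 6]:
    root = Insert(key, root)
print(in_order(root))   # [2, 4, 6, 8, 10, 12, 14]
```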

### Problem: Delete

**Input:** Node $N$.

**Output:** Remove node $N$ from the tree.

In [8]:
def Delete(N):
    # if N.Right is None:
    #     remove N and promote N.Left to be the child of N's parent
    # else:
    #     X = Next(N)   # X.Left is None: X is the leftmost node of N's right subtree
    #     replace N with X and promote X.Right to take X's old place
    ...
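
A possible runnable rendering of these two cases is sketched below. It departs from the signature above in one respect, flagged as an assumption: `Delete` also receives the root `R` and returns the (possibly new) root, since deleting the root changes it. The helpers `set_child`, `replace_in_parent`, `insert` and `in_order` are invented for this example.

```python
class Node:
    """Hypothetical node type with Key, Parent, Left and Right fields."""
    def __init__(self, key):
        self.Key = key
        self.Parent = self.Left = self.Right = None

def set_child(P, side, C):
    """Attach C (possibly None) as P's "Left" or "Right" child."""
    setattr(P, side, C)
    if C is not None:
        C.Parent = P

def replace_in_parent(N, C):
    """Hang C (possibly None) where N currently hangs under N's parent."""
    P = N.Parent
    if C is not None:
        C.Parent = P
    if P is not None:
        if P.Left is N:
            P.Left = C
        else:
            P.Right = C

def Delete(N, R):
    """Remove node N from the tree rooted at R; return the (possibly new) root."""
    if N.Right is None:
        replacement = N.Left
        replace_in_parent(N, N.Left)      # promote N.Left
    else:
        X = N.Right                       # X = Next(N): leftmost node of N.Right
        while X.Left is not None:
            X = X.Left
        replace_in_parent(X, X.Right)     # promote X.Right into X's old place
        replace_in_parent(N, X)           # X takes N's place ...
        set_child(X, "Left", N.Left)      # ... and adopts N's remaining children
        set_child(X, "Right", N.Right)
        replacement = X
    return replacement if N is R else R

def insert(R, k):
    """Plain BST insert, used only to build the demo tree."""
    if R is None:
        return Node(k)
    side = "Left" if k < R.Key else "Right"
    set_child(R, side, insert(getattr(R, side), k))
    return R

def in_order(R):
    return [] if R is None else in_order(R.Left) + [R.Key] + in_order(R.Right)

root = None
for k in [8, 4, 12, 2, 6, 10, 14]:
    root = insert(root, k)
root = Delete(root, root)               # two children: Next(root) is the node 10
print(root.Key, in_order(root))         # 10 [2, 4, 6, 10, 12, 14]
root = Delete(root.Right.Right, root)   # the node 14 has no right child
print(in_order(root))                   # [2, 4, 6, 10, 12]
```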

## Balance

* Want left and right subtrees to have approximately the same size.
* Suppose perfectly balanced:
    * Each subtree half the size of its parent.
    * After $\log_2(n)$ levels, subtree of size 1.
    * Operations run in $O(\log n)$ time.

**Problem**  
Insertions and deletions can destroy balance!
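
A small experiment makes the problem visible. The sketch below inserts the same 1000 keys into an unbalanced tree twice, once in sorted order and once shuffled, and compares the heights. The `Node` class and the helper names are assumptions for this example, and both loops are iterative so that the degenerate tree does not hit Python's recursion limit.

```python
import random

class Node:
    """Hypothetical node type with Key, Left and Right fields."""
    def __init__(self, key):
        self.Key, self.Left, self.Right = key, None, None

def insert(R, k):
    """Unbalanced BST insert; returns the root."""
    if R is None:
        return Node(k)
    N = R
    while True:
        side = "Left" if k < N.Key else "Right"
        child = getattr(N, side)
        if child is None:
            setattr(N, side, Node(k))
            return R
        N = child

def height(R):
    """Number of nodes on the longest root-to-leaf path."""
    best = 0
    stack = [(R, 1)] if R is not None else []
    while stack:
        N, depth = stack.pop()
        best = max(best, depth)
        for child in (N.Left, N.Right):
            if child is not None:
                stack.append((child, depth + 1))
    return best

def build(keys):
    R = None
    for k in keys:
        R = insert(R, k)
    return R

keys = list(range(1000))
sorted_height = height(build(keys))     # 1000: every node has only a right child
random.Random(0).shuffle(keys)
shuffled_height = height(build(keys))   # far smaller, but with no guarantee
print(sorted_height, shuffled_height)
```

With sorted input the tree degenerates into a linked list, so every operation costs $O(n)$ rather than $O(\log n)$.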

**Solution**  
Rebalancing via **Rotations**:

<p align="center">
    <img src="images\binary_tree_rebalance.png" width="600" style="display: inline-block; margin-right: 0px;">
</p>
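
A right rotation can be sketched in a few pointer updates. The version below assumes a hypothetical `Node` class with `Parent` pointers; `RotateRight(X)` lifts `X.Left` above `X` and returns the new subtree root. A left rotation is the mirror image.

```python
class Node:
    """Hypothetical node type with Key, Left, Right and Parent fields."""
    def __init__(self, key, left=None, right=None):
        self.Key, self.Left, self.Right, self.Parent = key, left, right, None
        for child in (left, right):
            if child is not None:
                child.Parent = self

def RotateRight(X):
    """Lift Y = X.Left above X, so that X becomes Y's right child. Returns Y."""
    P, Y = X.Parent, X.Left
    B = Y.Right                  # middle subtree: keys between Y.Key and X.Key
    Y.Parent = P                 # Y takes X's place under P
    if P is not None:
        if P.Left is X:
            P.Left = Y
        else:
            P.Right = Y
    Y.Right, X.Parent = X, Y     # X moves below Y
    X.Left = B                   # B changes sides but stays between Y and X
    if B is not None:
        B.Parent = X
    return Y

def in_order(R):
    return [] if R is None else in_order(R.Left) + [R.Key] + in_order(R.Right)

X = Node(5, Node(3, Node(2), Node(4)), Node(6))
Y = RotateRight(X)
print(Y.Key, in_order(Y))   # 3 [2, 3, 4, 5, 6]
```

The in-order sequence of keys is unchanged, so the search tree property survives the rotation; only the shape (and therefore the heights) changes.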

* Height is a rough measure of subtree size.
* Want size of subtrees roughly the same.
* Force heights to be roughly the same.

### AVL Property
**Definition:** AVL trees maintain the following property:

For all nodes $N$, `|N.Left.Height - N.Right.Height|` $\leq$ `1`, where a missing child counts as height 0.

**LEMMA**  
AVL property $\implies$ height $h = O(\log n)$

*Equivalently*,  
Large height $\implies$ many nodes.

*i.e.*  
**THEOREM**  
Let $N$ be a node of a binary tree satisfying the AVL property.  
Let $h = N.\text{Height}$.  
Then the subtree of $N$ has size at least the Fibonacci Number $F_h$.

Therefore, since $F_h \geq 2^{h/2}$ for $h \geq 6$,
* a node of height $h \geq 6$ has a subtree of size at least $2^{h/2}$
* $n \geq 2^{h/2} \implies h \leq 2 \log_2 n = O(\log n)$.

So,  
if you can maintain the AVL property, you can perform operations in $O(\log n)$ time.
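
The bounds can be checked numerically. The sketch below assumes the convention that a single node has height 1 and uses the fact that the smallest AVL tree of height $h$ consists of a root plus smallest trees of heights $h-1$ and $h-2$. The names `min_avl_size` and `fib` are chosen for this example.

```python
def min_avl_size(h):
    """Fewest nodes in an AVL tree of height h (a single node has height 1)."""
    if h <= 0:
        return 0
    small, big = 0, 1                 # minimum sizes for heights 0 and 1
    for _ in range(h - 1):
        small, big = big, small + big + 1
    return big

def fib(h):
    """Fibonacci numbers with F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(h):
        a, b = b, a + b
    return a

print([min_avl_size(h) for h in range(1, 6)])   # [1, 2, 4, 7, 12]
print([fib(h) for h in range(1, 6)])            # [1, 1, 2, 3, 5]

# Subtree size is at least F_h, and F_h >= 2^(h/2) once h >= 6.
print(all(min_avl_size(h) >= fib(h) for h in range(1, 41)))    # True
print(all(fib(h) >= 2 ** (h / 2) for h in range(6, 41)))       # True
```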

### How to Maintain AVL Property
* Need a new Insertion algorithm (that involves rebalancing)
* Need a new Deletion algorithm (that involves rebalancing)

In [10]:
def AVLInsert(k, R):
    # Insert(k, R)
    # N = Find(k, R)
    # Rebalance(N)
    ...

def Rebalance(N):
    # P = N.Parent
    # if N.Left.Height > N.Right.Height + 1:
    #     RebalanceRight(N)
    # elif N.Right.Height > N.Left.Height + 1:
    #     RebalanceLeft(N)
    # AdjustHeight(N)
    # if P is not None:
    #     Rebalance(P)
    ...

def AdjustHeight(N):
    # N.Height = 1 + max(N.Left.Height, N.Right.Height)
    ...

def RebalanceRight(N):
    # M = N.Left
    # if M.Left.Height < M.Right.Height:
    #     RotateLeft(M)
    # RotateRight(N)
    # AdjustHeight on affected nodes
    ...

def AVLDelete(N):
    # Delete(N)
    # M = Parent of node replacing N
    # Rebalance(M)
    ...
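
For a version that actually runs, here is a compact sketch of AVL insertion. It is a recursive variant, not the parent-pointer walk outlined above: each call rebalances its node on the way back up the recursion and returns the new subtree root. The `Node` class, the `height` helper (a missing child has height 0) and the `is_avl` checker are assumptions for this example; deletion is not shown.

```python
class Node:
    """Hypothetical node type; a freshly created leaf has Height 1."""
    def __init__(self, key):
        self.Key, self.Left, self.Right, self.Height = key, None, None, 1

def height(N):
    return N.Height if N is not None else 0    # a missing child has height 0

def AdjustHeight(N):
    N.Height = 1 + max(height(N.Left), height(N.Right))

def RotateRight(N):
    M = N.Left
    N.Left, M.Right = M.Right, N
    AdjustHeight(N)
    AdjustHeight(M)
    return M

def RotateLeft(N):
    M = N.Right
    N.Right, M.Left = M.Left, N
    AdjustHeight(N)
    AdjustHeight(M)
    return M

def Rebalance(N):
    """Restore the AVL property at N; return the root of the rebalanced subtree."""
    if height(N.Left) > height(N.Right) + 1:           # left-heavy
        if height(N.Left.Left) < height(N.Left.Right):
            N.Left = RotateLeft(N.Left)
        return RotateRight(N)
    if height(N.Right) > height(N.Left) + 1:           # right-heavy
        if height(N.Right.Right) < height(N.Right.Left):
            N.Right = RotateRight(N.Right)
        return RotateLeft(N)
    AdjustHeight(N)
    return N

def AVLInsert(k, R):
    """Insert k below R, rebalancing on the way back up; return the new root."""
    if R is None:
        return Node(k)
    if k < R.Key:
        R.Left = AVLInsert(k, R.Left)
    elif k > R.Key:
        R.Right = AVLInsert(k, R.Right)
    return Rebalance(R)

def is_avl(N):
    """Check stored heights and the AVL property throughout N's subtree."""
    if N is None:
        return True
    return (abs(height(N.Left) - height(N.Right)) <= 1
            and N.Height == 1 + max(height(N.Left), height(N.Right))
            and is_avl(N.Left) and is_avl(N.Right))

def in_order(R):
    return [] if R is None else in_order(R.Left) + [R.Key] + in_order(R.Right)

R = None
for k in range(1, 1001):          # sorted input, the worst case for a plain BST
    R = AVLInsert(k, R)
print(R.Height, is_avl(R))
```

Even with sorted input the height stays logarithmic, in contrast to the chain of 1000 nodes a plain insertion would produce.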

## Split & Merge
* **Merge:** Combines two binary search trees into a single one.
* **Split:** Breaks one binary search tree into two.

### Merge
In general, merging two sorted lists takes $O(n)$ time. However, it is faster when they are separated, i.e., when all the keys in one tree are larger (or smaller) than all the keys in the other tree.

**Input:** Roots $R_1$ and $R_2$ of trees with all keys in $R_1$'s tree smaller than those in $R_2$'s.

**Output:** The root of a new tree with all the elements of both trees.

In [11]:
def MergeWithRoot(R1, R2, new_root):
    """Time O(1)."""
    ...

def Merge(R1, R2):
    """Time O(h)."""
    ...

def AVLTreeMergeWithRoot(R1, R2, new_root):
    """Time O(|R1.Height - R2.Height| + 1)."""
    ...

**Time:** `AVLTreeMergeWithRoot` takes $O(|R_1.\text{Height} - R_2.\text{Height}| + 1)$, so merging two AVL trees takes $O(\log n)$ overall.
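
The first two functions can be sketched for a plain (unbalanced) tree as follows; the AVL variant, which also descends into the taller tree and rebalances, is not shown. The `Node` class and the helper `remove_max` are assumptions for this example: `Merge` uses the largest node of the first tree as the new root.

```python
class Node:
    """Hypothetical node type with Key, Left and Right fields."""
    def __init__(self, key, left=None, right=None):
        self.Key, self.Left, self.Right = key, left, right

def MergeWithRoot(R1, R2, T):
    """O(1): hang R1 and R2 under T, whose key separates the two trees."""
    T.Left, T.Right = R1, R2
    return T

def remove_max(R):
    """Detach the largest node of R's subtree: (remaining root, largest node). O(h)."""
    if R.Right is None:
        return R.Left, R
    R.Right, largest = remove_max(R.Right)
    return R, largest

def Merge(R1, R2):
    """O(h): merge two trees where every key of R1 is smaller than every key of R2."""
    if R1 is None:
        return R2
    R1, T = remove_max(R1)            # T holds the largest key of R1
    return MergeWithRoot(R1, R2, T)

def in_order(R):
    return [] if R is None else in_order(R.Left) + [R.Key] + in_order(R.Right)

small = Node(2, Node(1), Node(3))
large = Node(10, None, Node(11))
root = Merge(small, large)
print(root.Key, in_order(root))   # 3 [1, 2, 3, 10, 11]
```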

### Split
*Idea*  
Search for $x$, merge subtrees.

**Input:** Root $R$ of a tree, key $x$.

**Output:** Two trees, one with elements $< x$, one with elements $> x$.

In [None]:
def Split(R, x):
    """Time O(log n) on an AVL tree."""
    ...

**Time:** Each step of the search performs one merge, so the total is $\sum_i O(|R_i.\text{Height} - R_{i+1}.\text{Height}| + 1) = O(h_{\max}) = O(\log n)$.
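
The idea of searching for $x$ and merging subtrees on the way back can be sketched recursively. This is a plain (unbalanced) version built on the constant-time `MergeWithRoot`; an AVL version would call `AVLTreeMergeWithRoot` instead. The `Node` class is hypothetical, and sending a key equal to $x$ to the second tree is a choice made for this example.

```python
class Node:
    """Hypothetical node type with Key, Left and Right fields."""
    def __init__(self, key, left=None, right=None):
        self.Key, self.Left, self.Right = key, left, right

def MergeWithRoot(R1, R2, T):
    """Hang R1 and R2 under T, whose key separates the two trees."""
    T.Left, T.Right = R1, R2
    return T

def Split(R, x):
    """Return (root of the keys < x, root of the keys >= x)."""
    if R is None:
        return None, None
    if x <= R.Key:
        # R and its right subtree belong to the second tree; split the left side.
        R1, R2 = Split(R.Left, x)
        return R1, MergeWithRoot(R2, R.Right, R)
    # R and its left subtree belong to the first tree; split the right side.
    R1, R2 = Split(R.Right, x)
    return MergeWithRoot(R.Left, R1, R), R2

def in_order(R):
    return [] if R is None else in_order(R.Left) + [R.Key] + in_order(R.Right)

root = Node(8, Node(4, Node(2), Node(6)), Node(12, Node(10), Node(14)))
low, high = Split(root, 7)
print(in_order(low), in_order(high))   # [2, 4, 6] [8, 10, 12, 14]
```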

## Applications of Binary Search Trees

Things you might want to do:
* Return the 7th largest element.
* Return the median element.
* Return the 25th percentile element.

### Problem 1: Order Statistic

**Input:** The root of a tree $T$ and a number $k$.

**Output:** The $k^{th}$ smallest element in $T$.

* Need to know which subtree to look in.
* Need to know how many elements are in left subtree. (More or less than $k$?)

Easy fix: Add a **new field**.

$N.\text{Size}$ returns the number of elements in the subtree of $N$. Should satisfy:
$$
N.\text{Size} = N.\text{Left.Size} + N.\text{Right.Size} + 1,
$$
where null nodes have zero size.

In [None]:
def RecomputeSize(N):
    ...

def Rotate():
    ...

def OrderStatistic(R, k):
    ...

**Time:** $O(h)$.
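
A minimal sketch of the lookup, assuming a hypothetical `Node` class that stores `Size` and a `size` helper that treats a missing child as size 0. Ranks are 1-based here, so $k = 1$ asks for the minimum. Keeping `Size` correct under rotations is not shown.

```python
def size(N):
    return N.Size if N is not None else 0     # null nodes have zero size

def RecomputeSize(N):
    N.Size = size(N.Left) + size(N.Right) + 1

class Node:
    """Hypothetical node type with Key, Left, Right and Size fields."""
    def __init__(self, key, left=None, right=None):
        self.Key, self.Left, self.Right = key, left, right
        RecomputeSize(self)

def OrderStatistic(R, k):
    """Return the node holding the k-th smallest key, for 1 <= k <= R.Size."""
    if not 1 <= k <= size(R):
        raise IndexError("k is out of range")
    s = size(R.Left)                  # number of keys smaller than R.Key
    if k == s + 1:
        return R
    if k <= s:
        return OrderStatistic(R.Left, k)
    return OrderStatistic(R.Right, k - s - 1)   # skip the left subtree and R

root = Node(8, Node(4, Node(2), Node(6)), Node(12, Node(10), Node(14)))
print([OrderStatistic(root, k).Key for k in range(1, 8)])   # [2, 4, 6, 8, 10, 12, 14]
print(OrderStatistic(root, (root.Size + 1) // 2).Key)       # median: 8
```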

### Problem 2: Color Flips

* Array of squares.
* Each black or white.
* Want to be able to flip colors of all squares after index $x$.

<p align="center">
    <img src="images\color_flips.png" width="600" style="display: inline-block; margin-right: 0px;">
</p>

#### Operations for Color Flips
| Operation | Output | Time (Array) | Time (Binary Search Tree) |
| :- | :- | :-: | :-: |
| `NewArray(n)` | Create an array with `n` white squares. | $O(1)$ | $O(n)$ |
| `Color(m)` | Return color of `m`$^{th}$ square. | $O(1)$ | $O(\log n)$ |
| `Flip(x)` | Flip the color of all squares of index $> x$. | $O(n)$ | $O(\log n)$ |

**Trees can be used for more than searching.**  
**Trees can be used to store lists.**

<p align="center">
    <img src="images\color_flips_2.png" width="300" style="display: inline-block; margin-right: 0px;">
</p>

In [13]:
def NewArray(n):
    ...

def Color(m):
    ...

def Flip(x):
    ...
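
One way these three operations might be realised is sketched below. It keeps two position-ordered trees, one with the current colors and one with the opposite colors, and implements `Flip(x)` by splitting both after position `x` and swapping the right-hand parts. This is an assumption about the intended design, and every name here (`Squares`, `build`, `split`, `merge`, `remove_last`) is invented for the example. The trees are not rebalanced, so each operation costs $O(h)$; reaching $O(\log n)$ would require AVL versions of split and merge.

```python
def size(N):
    return N.Size if N is not None else 0

class Node:
    """Hypothetical node: one square, ordered by position rather than by key."""
    def __init__(self, color, left=None, right=None):
        self.Color, self.Left, self.Right = color, left, right
        self.Size = size(left) + size(right) + 1

def MergeWithRoot(R1, R2, T):
    T.Left, T.Right = R1, R2
    T.Size = size(R1) + size(R2) + 1
    return T

def build(color, n):
    """Balanced tree of n squares of a single color."""
    if n == 0:
        return None
    left = (n - 1) // 2
    return Node(color, build(color, left), build(color, n - 1 - left))

def split(R, x):
    """Split by position: (the first x squares, the remaining squares)."""
    if R is None:
        return None, None
    s = size(R.Left)
    if x <= s:
        A, B = split(R.Left, x)
        return A, MergeWithRoot(B, R.Right, R)
    A, B = split(R.Right, x - s - 1)
    return MergeWithRoot(R.Left, A, R), B

def remove_last(R):
    """Detach the last square: (remaining root, last node)."""
    if R.Right is None:
        return R.Left, R
    rest, last = remove_last(R.Right)
    return MergeWithRoot(R.Left, rest, R), last

def merge(R1, R2):
    """Concatenate two position-ordered trees."""
    if R1 is None:
        return R2
    R1, T = remove_last(R1)
    return MergeWithRoot(R1, R2, T)

class Squares:
    def __init__(self, n):                # NewArray(n)
        self.T1 = build("white", n)       # current colors
        self.T2 = build("black", n)       # opposite colors

    def Color(self, m):
        """Color of the m-th square (1-based), found like an order statistic."""
        N = self.T1
        while True:
            s = size(N.Left)
            if m == s + 1:
                return N.Color
            if m <= s:
                N = N.Left
            else:
                N, m = N.Right, m - s - 1

    def Flip(self, x):
        """Flip every square of index > x by swapping the two right-hand parts."""
        L1, R1 = split(self.T1, x)
        L2, R2 = split(self.T2, x)
        self.T1 = merge(L1, R2)
        self.T2 = merge(L2, R1)

squares = Squares(6)
squares.Flip(2)
squares.Flip(4)
print([squares.Color(m) for m in range(1, 7)])
# ['white', 'white', 'black', 'black', 'white', 'white']
```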

## Splay Trees...

Another type of binary search tree structure...

Consider different structures for different problems.