# AVL Trees

A dictionary implemented with a hash table has a worst-case running time of $\Theta(n)$ per operation, although in practice the worst case is rare. If you cannot afford to hit that worst case, an AVL tree provides a $\Theta(\lg n)$ implementation.

## Binary Tree

In this example we will have integer keys and string values. As long as the keys are comparable, any types can be used for the key and values.

`insertItem` works as follows. If the current root is null, we have walked past a leaf node and the `key`/`value` pair is inserted at that position; otherwise we compare the key with the root's key and continue into the left or right subtree.

## Binary Search Tree

Before we consider AVL trees, let's look at a `findElement` implementation using binary search trees.

```
Algorithm findElement(k)
    if isEmpty(T) then
        return NO_SUCH_KEY
    else
        u <- root
        while (u is not null) and (u.key != k) do
            if k < u.key then
                u <- u.left
            else
                u <- u.right
        end while
        if (u is not null) and (u.key = k) then
            return u.element
        else
            return NO_SUCH_KEY
```

In the worst case, the while loop steps through the longest chain of nodes from the root to a leaf at the bottom of the tree. The length of the longest such path is the height of the tree, so the running time is $O(h)$, where $h$ is the height of the tree. In fact, `insertItem` and `removeItem` implementations are also $O(h)$.
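The pseudocode above translates almost line for line into Python. This is a sketch under assumptions: `BSTNode`, `find_element` and the `NO_SUCH_KEY` sentinel object are illustrative names chosen here, with `element` mirroring the field name used in the pseudocode.

```python
NO_SUCH_KEY = object()  # sentinel standing in for the pseudocode's NO_SUCH_KEY


class BSTNode:
    """Hypothetical minimal node with the fields the pseudocode relies on."""

    def __init__(self, key, element, left=None, right=None):
        self.key = key
        self.element = element
        self.left = left
        self.right = right


def find_element(root, k):
    """Return the element stored under key k, or NO_SUCH_KEY if k is absent."""
    u = root
    # Each iteration moves one level down, so the loop runs at most h times.
    while u is not None and u.key != k:
        u = u.left if k < u.key else u.right
    return NO_SUCH_KEY if u is None else u.element


tree = BSTNode(5, "five", BSTNode(2, "two"), BSTNode(8, "eight"))
found = find_element(tree, 8)
missing = find_element(tree, 7)
```

An empty tree is simply `None` here, so the `isEmpty(T)` check folds into the loop condition.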

In [1]:
from random import randrange
import ipywidgets as widgets
from IPython.display import display, clear_output


class Node():
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.left = None
        self.right = None
        self.parent = None
        
    def height(self):
        if self.left is None and self.right is None:
            return 1
        elif self.left is None:
            return 1 + self.right.height()
        elif self.right is None:
            return 1 + self.left.height()
        else:
            return 1 + max(self.left.height(), self.right.height())
        
    def insert(self, node):
        if node.key > self.key:
            self.right = node
            node.parent = self
        else:
            self.left = node
            node.parent = self
        
    def __str__(self, level=0, prefix=""):
        ret = "\t"*level + "{}{}".format(prefix, repr(self.value))
        ret += "\n"
        if self.left:
            ret += self.left.__str__(level+1, "l:")
        if self.right:
            ret += self.right.__str__(level+1, "r:")
        return ret
        
def insertItem(root, key, value):
    u = root
    while True:
        if key > u.key:
            if u.right is None:
                break
            u = u.right
        else:
            if u.left is None:
                break
            u = u.left
    
    u.insert(Node(key, value))
        
root = Node(500, "ROOT")

button = widgets.Button(description="Add random node")
out = widgets.Output()

def insertRandom():
    value = randrange(1000)
    insertItem(root, value, str(value))
    with out:
        clear_output()
        print(root)
        print("Height: ", root.height())
        
# Seed the tree with five random nodes before the button is shown.
for _ in range(5):
    insertRandom()
button.on_click(lambda b: insertRandom())
widgets.VBox([button, out])

When $h$ is much smaller than $n$, the binary search tree looks like an improvement on the hash table's worst case. However, the tree can be _unbalanced_: in the worst case a tree storing $n$ items has height $n$ (essentially a linked list), for example when the keys are inserted in sorted order.

In [None]:
root2 = Node(1, "1")
# Inserting keys in increasing order sends every new node to the right.
for i in range(2, 6):
    insertItem(root2, i, str(i))
print(root2)

## Balancing

AVL trees are an extension of binary search trees that keep the tree balanced by maintaining the following property at every vertex:

> A vertex is balanced if the heights of its children differ by at most one.

With this constraint, the height of an $n$ element AVL tree is $O(\lg n)$.
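The balance property can be checked in a single traversal. The sketch below is illustrative only: the class `N` and the functions `check_height` and `is_avl_balanced` are assumed names, and heights follow the convention used in this notebook, where a single node has height 1 and an empty subtree has height 0.

```python
class N:
    """Hypothetical bare node: only a key and two children."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def check_height(node):
    """Return the height of node's subtree, or -1 if some vertex in it is unbalanced."""
    if node is None:
        return 0
    lh = check_height(node.left)
    rh = check_height(node.right)
    # Propagate a failure from below, or report one at this vertex.
    if lh < 0 or rh < 0 or abs(lh - rh) > 1:
        return -1
    return 1 + max(lh, rh)


def is_avl_balanced(root):
    """True if every vertex satisfies the balance property."""
    return check_height(root) >= 0


balanced = N(2, N(1), N(3))
chain = N(1, None, N(2, None, N(3)))  # degenerate, linked-list shaped tree
```

Here `is_avl_balanced(balanced)` holds, while `chain` fails at its root, whose children have heights 0 and 2.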

For $h \in \mathbb{N}$, let $n(h)$ denote the minimum number of items stored in an AVL tree of height $h$. By induction we will show that $n(h) > 2^{h/2} - 1$.

$n(1) = 1 > \sqrt{2} - 1$ and $n(2) = 2 > 2 - 1$, so the base cases hold. Suppose $h \geq 3$ and that the bound holds for heights $h-1$ and $h-2$. We observe that

$$
n(h) \geq 1 + n(h-1) + n(h-2)
$$

That is, the minimum number of items stored in a tree of height $h$ is at least the minimum number stored in its two subtrees (whose heights, by the balancing property, cannot differ by more than 1) plus 1 for the root node. By the inductive hypothesis:

$$
\begin{align}
n(h) &> 1 + (2^{\frac{h-1}{2}} - 1) + (2^{\frac{h-2}{2}} - 1) \\
&= 2^{\frac{h-1}{2}} + 2^{\frac{h-2}{2}} - 1 \\
&= (2^{-\frac{1}{2}} + 2^{-1}) 2^{h/2} - 1 \\
&> 2^{h/2} - 1
\end{align}
$$

The last step holds because $2^{-\frac{1}{2}} + 2^{-1} \approx 1.207 > 1$.

Hence every AVL tree of height $h$ storing $n$ items satisfies $n \geq n(h) > 2^{h/2} - 1$, so:

$$
\begin{align}
n &> 2^{h/2} - 1 \\
n + 1 &> 2^{h/2} \\
\lg (n + 1) &> \frac{h}{2} \\
h &< 2 \lg (n + 1) \\
&= O(\lg n)
\end{align}
$$
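The bound can also be checked numerically for small heights. The sketch below assumes the recurrence holds with equality for the sparsest trees, that is $n(h) = 1 + n(h-1) + n(h-2)$ with $n(1) = 1$ and $n(2) = 2$; the function name `min_items` is illustrative.

```python
from math import log2


def min_items(h):
    """Minimum number of items in an AVL tree of height h >= 1."""
    a, b = 1, 2  # n(1), n(2)
    if h == 1:
        return a
    for _ in range(h - 2):
        # Assumed equality case of the recurrence: root plus two sparsest subtrees.
        a, b = b, 1 + a + b
    return b


for h in range(1, 31):
    n = min_items(h)
    assert n > 2 ** (h / 2) - 1   # the inductive bound
    assert h < 2 * log2(n + 1)    # the resulting height bound
```

The first few values are $n(3) = 4$, $n(4) = 7$ and $n(5) = 12$, each comfortably above $2^{h/2} - 1$.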

## Insertion

Insertion into an AVL tree is not as simple as insertion into a binary search tree, since the tree must be rebalanced afterwards. To insert a node we start with normal binary search tree insertion, which is $O(h) = O(\lg n)$, then repair the tree if it is no longer balanced. Any imbalance must lie on the path between the new node and the root, so we walk back up the tree from the new node looking for an unbalanced vertex.

Let $z$ be the first unbalanced vertex found on the way up. Since $z$ is not balanced, the heights of its children differ by 2 or more. Let $y$ be the child of $z$ with the greater height. If both children of $y$ had the same height, then the height of $y$ would not have changed and $z$ would still be balanced, which is a contradiction. Thus the new node is in the taller subtree of $y$, whose root we call $x$. To repair the tree we perform one of four different rotation operations, depending on where $y$ and $x$ sit relative to $z$.

### Left-Left heavy
- The left sub-tree of the left child grew.
- To balance: Right rotation around the root.

![](res/left-left-heavy.png)

### Left-Right heavy
- The right sub-tree of the left child grew.
- To balance: Left rotation around child, then right rotation around root.

![](res/left-right-heavy.png)

### Right-Left heavy
- The left sub-tree of the right child grew.
- To balance: Right rotation around child, then left rotation around root.
    
![](res/right-left-heavy.png)

### Right-Right heavy
- The right sub-tree of the right child grew.
- To balance: Left rotation around root.

![](res/right-right-heavy.png)
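The four cases above can be sketched with two rotation helpers and one rebalancing step. This is an illustrative sketch, not the notebook's own implementation: `AVLNode`, `rotate_left`, `rotate_right`, `rebalance` and `insert` are assumed names, and caching the height in each node is a design assumption made here to keep the balance test cheap.

```python
class AVLNode:
    """Hypothetical node that caches its subtree height (a leaf has height 1)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(z):
    """Right rotation around z: z's left child y becomes the subtree root."""
    y = z.left
    z.left, y.right = y.right, z
    update(z)
    update(y)
    return y


def rotate_left(z):
    """Left rotation around z: z's right child y becomes the subtree root."""
    y = z.right
    z.right, y.left = y.left, z
    update(z)
    update(y)
    return y


def rebalance(z):
    """Repair z if it is unbalanced and return the root of the repaired subtree."""
    update(z)
    balance = height(z.left) - height(z.right)
    if balance > 1:
        if height(z.left.left) < height(z.left.right):
            z.left = rotate_left(z.left)    # left-right heavy: rotate child first
        return rotate_right(z)              # left-left heavy
    if balance < -1:
        if height(z.right.right) < height(z.right.left):
            z.right = rotate_right(z.right)  # right-left heavy: rotate child first
        return rotate_left(z)                # right-right heavy
    return z


def insert(root, key):
    """Binary search tree insertion followed by rebalancing on the way back up."""
    if root is None:
        return AVLNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return rebalance(root)


demo = None
for k in range(1, 8):  # sorted input, the worst case for a plain BST
    demo = insert(demo, k)
```

Inserting the keys 1 to 7 in order would give a plain binary search tree of height 7; with rebalancing, `demo` ends up with key 4 at the root and height 3.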

In [None]:
from copy import copy

class BalancedNode(Node):
    def rr_rot(self):
        # Left rotation around self.parent: its right child moves up.
        temp = copy(self.parent.right)
        self.parent.right = temp.left
        temp.left = self.parent  # was the undefined name `parent`