## Today's Agenda
- Binary Search Trees
- AVL Trees

## Objectives
- To understand the structure of Binary Search Trees and AVL Trees.

# Binary Search Tree (BST)

A BST has two properties:
### Structure property (binary tree)
- Each node has ≤ 2 children

### Order property
- All keys in the left subtree are smaller than the node’s key.
- All keys in the right subtree are larger than the node’s key.
     - Result: it is easy to find any given key.
        
<img src="images/week-04/binary_search_tree.png" width="300">

For example,
<img src="images/week-04/binary_search_tree2.png"  width="300">
is a BST, but

<img src="images/week-04/binary_search_tree3.png"  width="320">
is not a BST.

Note that the second tree is nevertheless a binary tree.

**A binary search tree is a type of binary tree (but not all binary trees are binary search trees!)**
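The order property can be checked mechanically. The following is a minimal sketch, not part of the original notes: `TreeNode` and `is_bst` are hypothetical names introduced only for this illustration, and keys are assumed to be distinct and mutually comparable.

```python
class TreeNode:
    """Hypothetical minimal node used only for this illustration."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_bst(node, low=None, high=None):
    """Return True if every key in the subtree lies strictly between low and high."""
    if node is None:
        return True
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    # Keys on the left must stay below node.key, keys on the right above it.
    return is_bst(node.left, low, node.key) and is_bst(node.right, node.key, high)


valid = TreeNode(8, TreeNode(3, TreeNode(1), TreeNode(6)), TreeNode(10))
# 9 sits in the left subtree of 8, so the order property fails at the root.
invalid = TreeNode(8, TreeNode(3, TreeNode(1), TreeNode(9)), TreeNode(10))

print(is_bst(valid), is_bst(invalid))  # True False
```

Both example trees are binary trees, but only the first satisfies the order property at every node.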

# BST ADT
- find(key): returns TRUE if key is in the tree.
- findMax(): returns the maximum key in the tree, or nothing if the tree is empty.
- findMin(): returns the minimum key in the tree, or nothing if the tree is empty.
- Here is the pseudocode for findMin:

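    Node findMin(Node root) {
        if (root == null)
            return null;            // empty tree: there is no minimum
        if (root.left == null)
            return root;            // no smaller key exists below this node
        return findMin(root.left);  // the minimum is always in the left subtree
    }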
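The three ADT operations can be sketched in Python as follows. This is an illustrative, iterative version under stated assumptions: `TreeNode`, `find`, `find_min`, and `find_max` are hypothetical names, and nodes are assumed to expose `key`, `left`, and `right` attributes.

```python
class TreeNode:
    """Hypothetical minimal node used only for this illustration."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def find(root, key):
    """Return True if key is in the tree, following one root-to-leaf path."""
    while root is not None:
        if key == root.key:
            return True
        # The order property says which subtree can contain the key.
        root = root.left if key < root.key else root.right
    return False


def find_min(root):
    """Return the node with the smallest key, or None for an empty tree."""
    if root is None:
        return None
    while root.left is not None:
        root = root.left
    return root


def find_max(root):
    """Return the node with the largest key, or None for an empty tree."""
    if root is None:
        return None
    while root.right is not None:
        root = root.right
    return root


tree = TreeNode(8, TreeNode(3, TreeNode(1), TreeNode(6)), TreeNode(10, None, TreeNode(14)))
print(find(tree, 6), find(tree, 7))              # True False
print(find_min(tree).key, find_max(tree).key)    # 1 14
```

Each function visits at most one node per level, so its cost is proportional to the depth of the tree.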

In [1]:
class Node:
    """A class for creating a binary tree node and inserting elements.

       Attributes:
       -----------
       key : int, str
            The value stored at this node of the tree, e.g. tree = Node(4) initializes a tree
            whose root holds the integer 4.

       Methods: 
       --------   
       insert(self, key) : Inserts a new element into the tree; duplicate keys are ignored.
       display(self) : Prints a text drawing of the tree.
    """

    def __init__(self, key):
        self.key = key
        self.right = None
        self.left = None


    def insert(self, key):
        if self.key == key:
            return
        elif self.key < key:
            if self.right is None:
                self.right = Node(key)
            else:
                self.right.insert(key)
        else: # self.key > key
            if self.left is None:
                self.left = Node(key)
            else:
                self.left.insert(key)
                
 

    def display(self):
        lines, _, _, _ = self._display_aux()
        for line in lines:
            print(line)


    def _display_aux(self):
        """Returns list of strings, width, height, and horizontal coordinate of the root. This is 
           a utility function used by the display() method to build a text 
           visualization of the binary tree. """

        # No child exists.
        if self.right is None and self.left is None:
            line = '%s' % self.key
            width = len(line)
            height = 1
            middle = width // 2
            return [line], width, height, middle

        # Only left child exists.
        if self.right is None:
            lines, n, p, x = self.left._display_aux()
            s = '%s' % self.key
            u = len(s)
            first_line = (x + 1) * ' ' + (n - x - 1) * '_' + s
            second_line = x * ' ' + '/' + (n - x - 1 + u) * ' '
            shifted_lines = [line + u * ' ' for line in lines]
            return [first_line, second_line] + shifted_lines, n + u, p + 2, n + u // 2

        # Only right child exists.
        if self.left is None:
            lines, n, p, x = self.right._display_aux()
            s = '%s' % self.key
            u = len(s)
            first_line = s + x * '_' + (n - x) * ' '
            second_line = (u + x) * ' ' + '\\' + (n - x - 1) * ' '
            shifted_lines = [u * ' ' + line for line in lines]
            return [first_line, second_line] + shifted_lines, n + u, p + 2, u // 2

        # Two children exist.
        left, n, p, x = self.left._display_aux()
        right, m, q, y = self.right._display_aux()
        s = '%s' % self.key
        u = len(s)
        first_line = (x + 1) * ' ' + (n - x - 1) * '_' + s + y * '_' + (m - y) * ' '
        second_line = x * ' ' + '/' + (n - x - 1 + u + y) * ' ' + '\\' + (m - y - 1) * ' '

        if p < q:
            left += [n * ' '] * (q - p)
        elif q < p:
            right += [m * ' '] * (p - q)
            
        zipped_lines = zip(left, right)
        lines = [first_line, second_line] + [a + u * ' ' + b for a, b in zipped_lines]
        return lines, n + m + u, max(p, q) + 2, n + u // 2

In [2]:
bst3 = Node(10)
keys = [9,8,7,6,5,4,3,2,1,11,12,13,14,15,16,17,18,19]
for key in keys:
    bst3.insert(key)

In [3]:
bst = Node(13)

In [4]:
bst.display()

13


In [5]:
bst.insert(7)

In [6]:
bst.display()

 13
/  
7  


In [None]:
bst.insert(20)

In [None]:
bst.display()

In [None]:
keys = [5,10,15,24]
for key in keys:
    bst.insert(key)

In [None]:
bst.display()

In [None]:
bst2 = Node(1)
keys = [2,3,4,5]
for key in keys:
    bst2.insert(key)
bst2.display()

In [None]:
ubt = Node('D')

keys = ['C', 'B', 'A']
for key in keys:
    ubt.insert(key)

ubt.display()

In [None]:
def inorder(tree):
    # Returns an array of tree elements using inorder traversal.
    visited = []
    if tree:
        visited = inorder(tree.left)
        visited.append(tree.key)
        visited = visited + inorder(tree.right)
    return visited

In [None]:
inorder(bst2)

In [None]:
inorder(bst)

In [None]:
def postorder(tree): #exercise
    # Returns a list of tree elements using postorder traversal (left, right, node).
    visited = []
    if tree:
        visited = postorder(tree.left) + postorder(tree.right)
        visited.append(tree.key)
    return visited

In [None]:
postorder(bst)

In [None]:
def preorder(tree): #exercise
    pass

## Binary Search Tree Analysis

- How fast are BST operations?
     - Given a tree, what is the worst-case node to find/remove?
- What is the best-case tree?
     - a balanced tree
     
     <img src="images/week-04/BSTbalanced.png"  width="300">
     
- What is the worst-case tree?
     - a completely unbalanced tree
     
     <img src="images/week-04/BSTunbalanced.png"  width="200">
     
**Problem**: operations may be inefficient if BST is unbalanced.

In [None]:
def get_height(tree):
    '''
    Returns the height of the tree: the number of edges on the longest
    path from the root down to a leaf. An empty tree has height -1.
    '''
    if tree is None:
        return -1
    return 1 + max(get_height(tree.left), get_height(tree.right))

In [None]:
def is_balanced(tree):
    '''
    Determines whether a binary tree is height-balanced.

    A binary tree is balanced if it is empty, or if:
        - the left subtree is balanced,
        - the right subtree is balanced, and
        - the heights of the left and right subtrees differ by at most 1.

    Parameters:
    ____________
    tree : the node object, below which the definition of 'balanced' will be applied.    
    '''
    if tree is None: 
        return True
    return is_balanced(tree.right) and is_balanced(tree.left) and abs(get_height(tree.left) - get_height(tree.right)) <= 1   

In [None]:
bst.display()

In [None]:
get_height(bst)

In [None]:
bst2.display()

In [None]:
get_height(bst2)

In [None]:
is_balanced(bst)

In [None]:
is_balanced(bst2)

In [None]:
bst3.display()

In [None]:
get_height(bst3)

In [None]:
is_balanced(bst3)

In [None]:
bst4 = Node(3)
keys = [2,1]
for key in keys:
    bst4.insert(key)

In [None]:
get_height(bst4)

In [None]:
bst4.display()

In [None]:
is_balanced(bst4)

# AVL Trees

## Motivation
- All BST operations are  $\mathcal{O}(d)$, where $d$ is tree depth
- The minimum depth of a binary tree with $n$ nodes is $d=\lfloor\log_2{n}\rfloor$.
    - What is the best case tree?
    - What is the worst case tree?
- So, best case running time of BST operations is $\mathcal{O}(\log{n})$.
- Worst case running time is $\mathcal{O}(n)$
    - What happens when you insert elements in ascending order?
        - Insert: 1, 2, 3, 4, 5, 6, 7 into an empty BST
    - **Problem:** Lack of “balance”:
         - compare depths of left and right subtree
    - Unbalanced degenerate tree
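The effect of insertion order can be seen in a small experiment. This is a sketch under stated assumptions: `TreeNode`, `insert`, `height`, and `build` are hypothetical helpers that mirror a plain BST insert without any rebalancing, and the height of an empty tree is taken to be -1.

```python
class TreeNode:
    """Hypothetical minimal node used only for this illustration."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain BST insert without rebalancing; returns the root of the subtree."""
    if root is None:
        return TreeNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored


def height(root):
    """Number of edges on the longest downward path; -1 for an empty tree."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))


def build(keys):
    """Insert the keys one by one into an initially empty tree."""
    root = None
    for key in keys:
        root = insert(root, key)
    return root


chain = build([1, 2, 3, 4, 5, 6, 7])   # ascending order: every node has only a right child
bushy = build([4, 2, 6, 1, 3, 5, 7])   # same keys, inserted level by level

print(height(chain), height(bushy))  # 6 2
```

The same seven keys give height $n-1=6$ in ascending order and height $\lfloor\log_2{7}\rfloor=2$ in the second order, which is the gap between $\mathcal{O}(n)$ and $\mathcal{O}(\log{n})$ operations.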
        
<img src="images/week-04/binary_search_tree.png" width="250">

Recall that a Binary Search Tree (BST) has two properties:

**Structure property (binary tree)**
- Each node has ≤ 2 children
    - Result: keeps operations simple

**Order property**
- All keys in the left subtree are smaller than the node’s key.
- All keys in the right subtree are larger than the node’s key.
     - Result: it is easy to find any given key.
     
**Problem:** operations may be inefficient if BST is unbalanced
- Find, insert, and delete
    - $\mathcal{O}(n)$ in the worst case

**Observation**
- BST: the shallower the better!

**Solution:** Require and maintain a *Balance Condition* that ensures depth is always $\mathcal{O}(\log{n})$.
- The catch: the condition must be restored after every insert and delete operation.

What is the definition of balance?

<img src="images/week-05/treeunbalanced1.png" width="300">

- Left and right subtrees of the root have equal number of nodes

<img src="images/week-05/treeunbalanced.png" width="300">

- Left and right subtrees of the root have equal height 

<img src="images/week-05/treebalanced.png" width="300">
- Left and right subtrees of **every node** have equal number of nodes

- Left and right subtrees of **every node** have equal height.

## Balancing Binary Search Trees

- Many algorithms exist for keeping binary search trees balanced
     - Adelson-Velskii and Landis (AVL) trees (height-balanced trees)
     - Red-black trees
     - Splay trees and other self-adjusting trees
     - B-trees and other (e.g. 2-4 trees) multiway search trees

     
## Perfect Balance

- Want a complete tree after every operation
     - tree is full except possibly in the lower right
- This is expensive
     - For example, insert 2 in the tree on the left and then rebuild as a complete tree
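To see why this is expensive, consider what a rebuild involves. The sketch below is an assumption-laden illustration rather than the method shown in the figure: `TreeNode`, `build_balanced`, `inorder`, and `height` are hypothetical helpers, and the rebuild produces a minimum-height tree from the sorted keys, which is not necessarily filled left to right on its last level.

```python
class TreeNode:
    """Hypothetical minimal node used only for this illustration."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def build_balanced(sorted_keys, lo=0, hi=None):
    """Build a minimum-height BST from sorted_keys[lo:hi]; touches every key once."""
    if hi is None:
        hi = len(sorted_keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    node = TreeNode(sorted_keys[mid])          # middle key becomes the subtree root
    node.left = build_balanced(sorted_keys, lo, mid)
    node.right = build_balanced(sorted_keys, mid + 1, hi)
    return node


def inorder(node):
    """Keys in ascending order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def height(node):
    """Number of edges on the longest downward path; -1 for an empty tree."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


keys = [1, 3, 4, 5, 6, 7, 8]        # hypothetical existing keys
tree = build_balanced(keys)
# Inserting 2 and restoring perfect balance means rebuilding from all n keys.
tree = build_balanced(sorted(inorder(tree) + [2]))

print(height(tree), inorder(tree))
```

Every such rebuild visits all $n$ nodes, so keeping perfect balance costs $\mathcal{O}(n)$ per insert, compared with the $\mathcal{O}(\log{n})$ that a weaker balance condition can achieve.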
     
     <img src="images/week-05/balancedPerfect.png" width="500">
     

     
## Tree Rotations
- Re-balance unbalanced trees with tree rotations

<img src="images/week-05/treeRotation.png" width="500">

Note that inorder traversal is **preserved**.

<img src="images/week-05/treeRotationOrder.png" width="320">
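A rotation only re-links three pointers. The following minimal sketch shows both directions and checks that the inorder sequence is unchanged; `TreeNode`, `rotate_right`, `rotate_left`, and `inorder` are hypothetical names chosen for this illustration.

```python
class TreeNode:
    """Hypothetical minimal node used only for this illustration."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_right(j):
    """Rotate the subtree rooted at j to the right; return the new subtree root."""
    k = j.left
    j.left = k.right   # k's right subtree moves under j
    k.right = j        # j becomes k's right child
    return k


def rotate_left(k):
    """Mirror image of rotate_right; return the new subtree root."""
    j = k.right
    k.right = j.left   # j's left subtree moves under k
    j.left = k         # k becomes j's left child
    return j


def inorder(node):
    """Keys in ascending order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = TreeNode(5, TreeNode(3, TreeNode(2), TreeNode(4)), TreeNode(6))
before = inorder(root)
root = rotate_right(root)
after = inorder(root)

print(root.key, before == after)  # 3 True
```

Because the caller must re-attach the returned node in place of the old subtree root, both functions return the new root instead of modifying the parent.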

# AVL Trees

- Named after their inventors, the mathematicians Georgii Adelson-Velsky and Evgenii Mikhailovich Landis (AVL)
    - Invented in 1962
    
-  AVL trees are height-balanced binary search trees

- **Balance factor** of a node
     - height(left subtree) - height(right subtree)
- For an AVL tree, balance factor is calculated at every node
    - For every node, heights of left and right subtree can differ by no more than 1
    - Store current heights in each node

## The AVL Balance Condition
- Left and right subtrees of **every node** have heights differing by at most 1

**Definition:** balance(node) = height(node.left) - height(node.right)

**AVL property:** for every node x, -1 ≤ balance(x) ≤ 1
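The definition translates directly into code. This is a sketch under stated assumptions: `TreeNode`, `height`, `balance`, and `satisfies_avl_balance` are hypothetical names, heights are recomputed on every call for clarity instead of being stored in the nodes, and only the balance condition is checked (not the order property).

```python
class TreeNode:
    """Hypothetical minimal node used only for this illustration."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    """Number of edges on the longest downward path; -1 for an empty tree."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def balance(node):
    """balance(node) = height(node.left) - height(node.right)."""
    return height(node.left) - height(node.right)


def satisfies_avl_balance(node):
    """True if -1 <= balance(x) <= 1 holds for every node x in the subtree."""
    if node is None:
        return True
    return (abs(balance(node)) <= 1
            and satisfies_avl_balance(node.left)
            and satisfies_avl_balance(node.right))


chain = TreeNode(3, TreeNode(2, TreeNode(1)))   # 3 -> 2 -> 1, all left children
small = TreeNode(2, TreeNode(1), TreeNode(3))

print(balance(chain), satisfies_avl_balance(chain))  # 2 False
print(balance(small), satisfies_avl_balance(small))  # 0 True
```

Recomputing heights makes each `balance` call linear in the subtree size, which is one reason to store the current height in each node.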

## The AVL Tree Data Structure
- An AVL tree is a self-balancing binary search tree.

### Structural properties
- Binary tree property (same as BST)
- Order property (same as for BST)

ALSO
- **Balance property:**
    - balance of every node is between -1 and 1
- Result: Worst-case depth is O(log n)

### AVL tree height

Suppose $N(h)$ is the minimum number of nodes in an AVL tree of height $h$. Then,
- **Base:**
\begin{equation}
N(-1)=0,\quad N(0)=1,\quad N(1)=2
\end{equation}
<img src="images/week-04/binary_tree.png" width="350">
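- **Induction step:** a smallest AVL tree of height $h$ consists of a root, one smallest subtree of height $h-1$, and one smallest subtree of height $h-2$, so
\begin{equation}
N(h)=N(h-1)+N(h-2)+1
\end{equation}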
<img src="images/week-05/AVL1.png" width="150">
- As a result, 
\begin{equation}
N(h)\ge\phi^{h}, \quad(\phi\approx 1.62) \implies h\le \frac{\log_2{N(h)}}{\log_2{\phi}}\implies h\le 1.44\log_2{N(h)}
\end{equation}
- That is, an AVL tree with $n$ nodes has height at most about $1.44\log_2{n}$, so AVL trees are always reasonably well balanced.
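The recurrence and the bound can be checked numerically. This is a small sketch, assuming an empty tree (height -1) contributes 0 nodes; `min_nodes` and `PHI` are hypothetical names for $N(h)$ and $\phi$.

```python
import math


def min_nodes(h):
    """Minimum number of nodes in an AVL tree of height h, via N(h) = N(h-1) + N(h-2) + 1."""
    if h < 0:
        return 0           # assumption: the empty tree has height -1 and 0 nodes
    prev, curr = 0, 1      # N(-1), N(0)
    for _ in range(h):
        prev, curr = curr, prev + curr + 1
    return curr


PHI = (1 + math.sqrt(5)) / 2   # golden ratio, about 1.62

for h in range(6):
    n = min_nodes(h)
    # Columns: height, N(h), whether N(h) >= phi^h holds
    print(h, n, n >= PHI ** h)
```

The first values are 1, 2, 4, 7, 12, 20: the minimum node count grows exponentially with the height, which is why the height grows only logarithmically with the number of nodes.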

### Node heights

- Recall that 
    - The height of a node is the number of edges on the longest downward path from that node to a leaf.
    
    <img src="images/week-05/AVL2.png" width="500">
    
Suppose that $h_{left}$ and $h_{right}$ are the heights of the left and right subtrees of a node.
- Then, the balance factor for the node is
\begin{equation}
h_{left}-h_{right}
\end{equation}
- The height of an empty tree is -1.

- Note that BOTH of the above trees are AVL trees.

### An AVL tree is a self-balancing binary search tree with the properties

- **Structural property**- Binary tree
- **Order property** - Same as for BST

ALSO
- **Balance property:**
    - balance factor of every node is between -1 and 1
    
    <img src="images/week-05/AVL7generalAVL.png" width="250">

### After inserting 7
- An AVL tree is a self-balancing binary search tree.
<img src="images/week-05/AVL3.png" width="250">
<img src="images/week-05/AVL4.png" width="250">

## Insert and Rotation in AVL Trees
- An insert operation may cause the balance factor to become 2 or -2 for some node
    - Only nodes on the path from the insertion point to the root can have changed in height
    - So after the insert, walk back up to the root node by node, updating heights
    - If a new balance factor (the difference $h_{left}-h_{right}$) is 2 or -2, adjust the tree by a rotation around that node

## Single Rotation in an AVL Tree

<img src="images/week-05/AVL5.png" width="500">

## Double Rotation in an AVL Tree
<img src="images/week-05/AVL6.png" width="500">

## Insertions in AVL Trees

After an insertion, the AVL property may be violated. There are 4 cases to consider when restoring it. Let $\alpha$ be the node that needs rebalancing.

- Outside Cases (require single rotation) :
    - Insertion into left subtree of left child of $\alpha$.
    - Insertion into right subtree of right child of $\alpha$.
- Inside Cases (require double rotation) :
    - Insertion into right subtree of left child of $\alpha$.
    - Insertion into left subtree of right child of $\alpha$.

The rebalancing is performed through four separate rotation algorithms.
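The four cases can be combined into one recursive insert. The following is a sketch under stated assumptions, not a definitive implementation: `AVLNode`, `update`, `balance`, and `avl_insert` are hypothetical names, each node stores its own height, duplicate keys are ignored, and the rebalancing happens on the way back up the recursion.

```python
class AVLNode:
    """Hypothetical node that stores its own height."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 0  # a leaf has height 0; an empty subtree counts as -1


def height(node):
    return -1 if node is None else node.height


def update(node):
    """Recompute the stored height from the children's stored heights."""
    node.height = 1 + max(height(node.left), height(node.right))


def balance(node):
    return height(node.left) - height(node.right)


def rotate_right(j):
    k = j.left
    j.left = k.right
    k.right = j
    update(j)   # j is now below k, so its height must be refreshed first
    update(k)
    return k


def rotate_left(k):
    j = k.right
    k.right = j.left
    j.left = k
    update(k)
    update(j)
    return j


def avl_insert(node, key):
    """Insert key below node and return the (possibly new) subtree root."""
    if node is None:
        return AVLNode(key)
    if key < node.key:
        node.left = avl_insert(node.left, key)
    elif key > node.key:
        node.right = avl_insert(node.right, key)
    else:
        return node  # duplicate keys are ignored
    update(node)
    b = balance(node)
    if b > 1:                                   # left subtree too tall
        if key > node.left.key:                 # inside case: left-right
            node.left = rotate_left(node.left)
        return rotate_right(node)               # outside case: left-left
    if b < -1:                                  # right subtree too tall
        if key < node.right.key:                # inside case: right-left
            node.right = rotate_right(node.right)
        return rotate_left(node)                # outside case: right-right
    return node


root = None
for key in [1, 2, 3, 4, 5, 6, 7]:
    root = avl_insert(root, key)

print(root.key, root.height)  # 4 2
```

Inserting 1 through 7 in ascending order, which degenerates a plain BST into a chain, here ends with 4 at the root and height 2.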

## AVL Insertion: Outside Case

Consider a valid AVL **subtree**.

<img src="images/week-05/AVL7generalAVL.png" width="250">
Inserting into $X$ destroys the AVL property at node $j$.

<img src="images/week-05/AVL8.png" width="250">

- Do a **right rotation**.

<img src="images/week-05/AVL9.png" width="250">

- Break connections

<img src="images/week-05/AVL10.png" width="250">

After the right rotation, the AVL property has been restored.

<img src="images/week-05/AVL11.png" width="250">

- The mirror-image case is resolved by a **left rotation**.

## AVL Insertion: Inside Case
Consider the following valid AVL **subtree**.
<img src="images/week-05/AVL7generalAVL.png" width="250">

- Inserting into $Y$ destroys the AVL property at node $j$.

<img src="images/week-05/AVL12.png" width="250">

- Observe that a single right rotation does NOT work.
- Instead, consider the structure of the subtree $Y$ and expand it to obtain

<img src="images/week-05/AVL13.png" width="250">

- Now two rotations are needed: a left rotation (solid ellipse) followed by a right rotation (dashed ellipse). This is also known as a left-right rotation or double rotation.

<img src="images/week-05/AVL14.png" width="250">

- After the left rotation is completed

<img src="images/week-05/AVL15.png" width="250">

- Do the right rotation

<img src="images/week-05/AVL16.png" width="250">

- After the right rotation is completed

<img src="images/week-05/AVL17.png" width="250">

- Balance has been restored.
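The smallest instance of the inside case uses the keys 3, 1, 2. The sketch below compares a single right rotation with the left-right double rotation on that tree; `TreeNode`, `make_inside_case`, and `is_height_balanced` are hypothetical names, and heights are recomputed for clarity instead of being stored.

```python
class TreeNode:
    """Hypothetical minimal node used only for this illustration."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def is_height_balanced(node):
    if node is None:
        return True
    return (abs(height(node.left) - height(node.right)) <= 1
            and is_height_balanced(node.left)
            and is_height_balanced(node.right))


def rotate_right(j):
    k = j.left
    j.left = k.right
    k.right = j
    return k


def rotate_left(k):
    j = k.right
    k.right = j.left
    j.left = k
    return j


def make_inside_case():
    """3 with left child 1, whose right child is 2 (insertion into the inner subtree)."""
    return TreeNode(3, TreeNode(1, None, TreeNode(2)))


# A single right rotation just moves the imbalance to the other side.
single = rotate_right(make_inside_case())

# Double rotation: rotate the left child to the left, then the node itself to the right.
j = make_inside_case()
j.left = rotate_left(j.left)
double = rotate_right(j)

print(single.key, is_height_balanced(single))  # 1 False
print(double.key, is_height_balanced(double))  # 2 True
```

After the single rotation the tree is 1 with right child 3 and grandchild 2, which is still unbalanced; the double rotation lifts the middle key 2 to the root.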