# Balanced Search Trees

**Balanced search trees** are symbol table implementations (for comparable keys) that guarantee efficient search, insert, delete, max, min, rank, floor, ceiling, and select operations.

## 2-3 Search Trees

Recall from the previous section on Elementary Symbol Tables that the goal for symbol table implementations was $\lg N$ performance for all operations. **2-3 trees** are an old data structure that achieves this (the left-leaning red-black BSTs described later are a way to implement them). Each node holds 1 or 2 keys, so a node is either a **2-node** (one key, two children) or a **3-node** (two keys, three children). A 2-node has two links: one to keys less than the node's key and one to keys greater. A 3-node has three links: one to keys less than the smaller key, one to keys between the two keys, and one to keys greater than the larger key.

2-3 trees also have **perfect balance**: every path from the root to a null link has the same length. They also have **symmetric order**, so an in-order traversal (visit each subtree and key from left to right) yields the keys in ascending order.
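As a rough illustration of the node layout and the two properties above, here is a minimal sketch. The class name `TwoThreeNode` and the helpers `search` and `in_order` are hypothetical names chosen for this example, and the tree is built by hand rather than by insertion.

```python
class TwoThreeNode:
    """A 2-node holds one key and two links; a 3-node holds two keys and three links."""

    def __init__(self, keys, children=None):
        assert len(keys) in (1, 2) and list(keys) == sorted(keys)
        self.keys = list(keys)
        # A node at the bottom has only null links, stored here as None.
        self.children = children if children is not None else [None] * (len(keys) + 1)
        assert len(self.children) == len(self.keys) + 1


def search(node, key):
    """Return True if key is in the 2-3 tree rooted at node."""
    while node is not None:
        if key in node.keys:
            return True
        # Follow the link whose key interval contains the search key.
        index = sum(1 for k in node.keys if key > k)
        node = node.children[index]
    return False


def in_order(node):
    """Yield the keys in ascending order (symmetric order)."""
    if node is None:
        return
    for i, key in enumerate(node.keys):
        yield from in_order(node.children[i])
        yield key
    yield from in_order(node.children[-1])


# A hand-built tree: a 3-node root over a 2-node, a 3-node, and a 2-node.
root = TwoThreeNode([10, 20], [TwoThreeNode([5]), TwoThreeNode([12, 15]), TwoThreeNode([30])])
print(search(root, 15))      # True
print(list(in_order(root)))  # [5, 10, 12, 15, 20, 30]
```

Note that a 3-node may need two compares per level, which is one of the costs of working with 2-3 trees directly.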

To **insert**, you first search for the key. The easy case is when the search ends at a 2-node at the bottom: replace that 2-node with a 3-node containing both its old key and the new key, and add a null link for the third child. To insert into a 3-node at the bottom, first create a temporary 4-node (three keys), then move its middle key into the parent. The parent becomes a 3-node, and the two remaining keys of the 4-node are split into two 2-nodes that are re-linked to the parent (if the 4-node was the parent's right child, the 2-node with the smaller key becomes the parent's middle child and the 2-node with the larger key becomes its right child). If the parent was already a 3-node, it becomes a temporary 4-node and the process propagates up the tree. The height of a 2-3 tree grows only when the root was a 3-node and the process reaches it, so the root has to split.

Splitting a 4-node is a **local** transformation - there are a constant number of operations and they don't touch the subtrees, no matter how many keys are below where the split happens. Each transformation maintains symmetric order and perfect balance.
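The split can be sketched in a few lines to show why it is local. This is only an illustration, assuming nodes are plain dictionaries with `keys` and `children` lists; the function names `split_four_node` and `absorb_into_parent` are made up for this example.

```python
def split_four_node(keys, children):
    """Split a temporary 4-node (3 keys, 4 links) into two 2-nodes.

    Returns (left, middle_key, right). The subtrees in `children` are
    only re-linked, never visited, so the cost is constant.
    """
    assert len(keys) == 3 and len(children) == 4
    small, middle, large = keys
    left = {"keys": [small], "children": children[:2]}
    right = {"keys": [large], "children": children[2:]}
    return left, middle, right


def absorb_into_parent(parent, child_index, left, middle, right):
    """Move the middle key into the parent and re-link the two halves.

    child_index is the position of the 4-node among the parent's links.
    """
    parent["keys"].insert(child_index, middle)
    parent["children"][child_index:child_index + 1] = [left, right]
    return parent


# A 2-node parent (key 50) whose right child has become a temporary 4-node.
four_node = {"keys": [60, 70, 80], "children": [None] * 4}
parent = {"keys": [50], "children": [{"keys": [40], "children": [None, None]}, four_node]}

left, middle, right = split_four_node(four_node["keys"], four_node["children"])
absorb_into_parent(parent, 1, left, middle, right)
print(parent["keys"])                            # [50, 70]
print([c["keys"] for c in parent["children"]])   # [[40], [60], [80]]
```

If the parent ends up with three keys, it is itself a temporary 4-node and the same two steps would be repeated one level up.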

**Tree height** worst case is $\lg N$ (with all 2-nodes), or best case $\log_{3} N \approx 0.631 \lg N$ (with all 3-nodes). This guarantees **logarithmic** performance for search and insert.
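To get a feel for these bounds, the two logarithms can be evaluated for a concrete $N$. The helper name `height_bounds` is hypothetical, and the values are the approximations $\log_3 N$ and $\lg N$ from the text, not exact heights.

```python
import math


def height_bounds(n):
    """Return (best, worst) approximate heights of a 2-3 tree with n keys."""
    best = math.log(n, 3)   # all 3-nodes
    worst = math.log2(n)    # all 2-nodes
    return best, worst


best, worst = height_bounds(10**6)
print(round(best, 1), round(worst, 1))  # roughly 12.6 and 19.9
print(round(best / worst, 3))           # 0.631, the ratio quoted above
```

So a 2-3 tree holding a million keys has a height somewhere between about 12 and 20.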

**Implementation** is complicated (see the red-black BST option below instead):
- Maintaining multiple node types is cumbersome
- Need multiple compares to move down the tree
- Need to move back up the tree to split 4-nodes
- Large number of cases for splitting

**Example of inserting in a 2-3 search tree:**

![Wikipedia 2-3 search tree example](https://upload.wikimedia.org/wikipedia/commons/thumb/4/44/2-3_insertion.svg/581px-2-3_insertion.svg.png)

Source: Wikipedia

## Red-Black BSTs

Red-black BSTs are simple data structures that help implement 2-3 trees with very little extra code beyond the basic binary search tree. The idea is to represent a 2-3 tree as a binary search tree, and use "internal" left-leaning links as "glue" for 3-nodes. So the larger of the two keys in a 3-node will be the root in this subtree - its right link goes to keys larger than it and its left link (colored red) connects to the smaller of the two original keys. That key is now a 2-node with a left link to keys smaller and right link to keys that were between the original two keys.

**Black links** connect 2-nodes and 3-nodes, **red links** "glue" nodes within a 3-node.

Some characteristics:
- No node has two red links connected to it
- Every path from the root to a null link has the same number of black links (**perfect black balance**)
- Red links lean left

There's a 1-1 correspondence between left-leaning red-black (LLRB) BSTs and 2-3 trees (think of the red links as horizontal, and the tree looks like a 2-3 tree). You can use the same search code as for an elementary BST and just ignore the color. It actually runs faster because of better balance. Most other operations (ceiling, selection) are also identical.

Because each node is pointed to by precisely one link (the one from its parent), you can encode the color of that link as data in the node (e.g. `node.left.color == 'RED'` or `node.right.color == 'BLACK'`). Null links are treated as black.
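As a small sketch of both points (search that ignores color, and color stored in the node), assuming a simplified node and the hypothetical helper names `is_red` and `get`:

```python
class Node:
    def __init__(self, key, val, color='BLACK'):
        self.key, self.val = key, val
        self.color = color  # color of the link from this node's parent
        self.left = self.right = None


def is_red(node):
    """A null link has no node to hold a color, so it counts as black."""
    return node is not None and node.color == 'RED'


def get(node, key):
    """Ordinary BST search; the color field is never consulted."""
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            node = node.right
        else:
            return node.val
    return None


# The 3-node [10, 20]: the larger key is the root, glued to the smaller by a red left link.
root = Node(20, 'b')
root.left = Node(10, 'a', color='RED')
print(get(root, 10))        # a
print(is_red(root.left))    # True
print(is_red(root.right))   # False (null link)
```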

In [1]:
# Updated Node object
class Node:
    def __init__(self, key, val, left=None, right=None, count=1, color='BLACK'):
        self.key = key
        self.val = val
        self.left = left
        self.right = right
        self.count = count  # Number of nodes in subtree including self
        self.color = color
    
    def has_left_child(self):
        return self.left is not None
    
    def has_right_child(self):
        return self.right is not None
    
    def is_leaf(self):
        return self.left is None and self.right is None
    
    def has_any_children(self):
        return self.left is not None or self.right is not None

    def has_two_children(self):
        return self.left is not None and self.right is not None

    def update_val(self, key, val, lc, rc):
        self.key = key
        self.val = val
        self.left = lc
        self.right = rc
        if self.has_left_child():
            self.left.parent = self
        if self.has_right_child():
            self.right.parent = self


### Red-Black BST Operations

A new operation for red-black BSTs is a **rotation**. During an insertion operation, sometimes you end up with a right-leaning red link (the wrong direction). A rotation will re-orient the link so it leans to the left. Rotations maintain symmetric order and perfect black balance.

![Example of a right-to-left rotation. Source: Princeton.edu](https://algs4.cs.princeton.edu/33balanced/images/redblack-left-rotate.png)

Sometimes during an insertion, you'll also need a right rotation, which takes a left-leaning red link and temporarily makes it lean right. Its implementation is the mirror image of the left rotation.

In [1]:
# Helpers used by the rotations and by _put
def isRed(node):
    return node is not None and node.color == 'RED'  # Null links are black

def size(node):
    return node.count if node is not None else 0

# Rotate left: make a right-leaning red link lean left
def rotate_left(node_h):
    assert isRed(node_h.right)
    node_x = node_h.right
    node_h.right = node_x.left  # Keys between h and x become h's right subtree
    node_x.left = node_h
    node_x.color = node_h.color  # Assign h's original color to x
    node_h.color = 'RED'
    node_x.count = node_h.count  # Move h's size to x, its new parent
    node_h.count = 1 + size(node_h.left) + size(node_h.right)
    return node_x

# Rotate right: make a left-leaning red link lean right (mirror image of rotate_left)
def rotate_right(node_h):
    assert isRed(node_h.left)
    node_x = node_h.left
    node_h.left = node_x.right  # Keys between x and h become h's left subtree
    node_x.right = node_h
    node_x.color = node_h.color  # Assign h's original color to x
    node_h.color = 'RED'
    node_x.count = node_h.count  # Move h's size to x, its new parent
    node_h.count = 1 + size(node_h.left) + size(node_h.right)
    return node_x

Another operation is called a **color flip**, which you use to re-color the local links to split a temporary 4-node. You don't need to change any links, but the parent key will have two red links (both left and right), and will be black itself. You flip the colors so the parent is red and both its links are black.

![Example of a red-black BST color flip. Source: princeton.edu](https://algs4.cs.princeton.edu/33balanced/images/color-flip.png)

In [2]:
# Color flip: split a temporary 4-node by passing the red link up to the parent
def flip_colors(node_h):
    node_h.color = 'RED'  # Assignment, not comparison
    node_h.left.color = 'BLACK'
    node_h.right.color = 'BLACK'

**Insertions** use these three operations (left rotation, right rotation, and color flip) to maintain a legal red-black BST with 1-1 correspondence to a 2-3 tree. Here are the main scenarios:

1. Insert into a tree with exactly 1 node
    - **Left:** search ends at the left null link, create a red link to the new node (converts a 2-node into a 3-node)
    - **Right:** search ends at the right null link, so you attach the new node with a red link on the right, then rotate left to make a legal 3-node
    - **Generalization:** (insert into a 2-node at the bottom) you do a standard BST insert and color the new link red. If it's on the right, rotate left

2. Insert into a tree with 2 nodes (see image). The generalization is to insert into a 3-node at the bottom
    - Do standard BST insert, color the new link red
    - Rotate to balance the 4-node (if needed)
    - Flip colors to pass the red link up one level
    - Rotate to make left-leaning (if needed)

![Example inserting into a tree with 2 nodes](https://x-wei.github.io/images/algoI_week5_1/pasted_image026.png)

The same code handles all cases:
- Right child red, left child black -> rotate left
- Left child red and left-left grandchild red -> rotate right
- Both children red -> flip colors

In [4]:
# New put method for BST object
def _put(node_h, key, val):
    if node_h is None:
        return Node(key, val, color='RED')  # New nodes are attached with a red link
    if key < node_h.key:
        node_h.left = _put(node_h.left, key, val)
    elif key > node_h.key:
        node_h.right = _put(node_h.right, key, val)
    else:
        node_h.val = val
    
    if isRed(node_h.right) and not isRed(node_h.left):
        node_h = rotate_left(node_h)
    if isRed(node_h.left) and isRed(node_h.left.left):
        node_h = rotate_right(node_h)
    if isRed(node_h.left) and isRed(node_h.right):
        flip_colors(node_h)

    node_h.count = 1 + size(node_h.left) + size(node_h.right)
    return node_h  # The caller re-links this subtree; the root should be re-colored 'BLACK'
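
To see the pieces working together, here is a compact, self-contained sketch of LLRB insertion. It is a condensed illustration rather than the notebook's exact classes: it drops the subtree counts, uses a hypothetical public wrapper `put` that re-colors the root black, and adds the helpers `height` and `keys_in_order` purely for checking the result.

```python
import math

RED, BLACK = 'RED', 'BLACK'


class Node:
    def __init__(self, key, val, color=RED):  # new nodes hang from a red link
        self.key, self.val, self.color = key, val, color
        self.left = self.right = None


def is_red(node):
    return node is not None and node.color == RED  # null links are black


def rotate_left(h):
    x = h.right
    h.right, x.left = x.left, h
    x.color, h.color = h.color, RED
    return x


def rotate_right(h):
    x = h.left
    h.left, x.right = x.right, h
    x.color, h.color = h.color, RED
    return x


def flip_colors(h):
    h.color = RED
    h.left.color = h.right.color = BLACK


def _put(h, key, val):
    if h is None:
        return Node(key, val)
    if key < h.key:
        h.left = _put(h.left, key, val)
    elif key > h.key:
        h.right = _put(h.right, key, val)
    else:
        h.val = val
    if is_red(h.right) and not is_red(h.left):
        h = rotate_left(h)
    if is_red(h.left) and is_red(h.left.left):
        h = rotate_right(h)
    if is_red(h.left) and is_red(h.right):
        flip_colors(h)
    return h


def put(root, key, val):
    root = _put(root, key, val)
    root.color = BLACK  # the root has no parent link, so keep it black
    return root


def height(node):
    """Number of nodes on the longest root-to-null path."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def keys_in_order(node):
    return [] if node is None else keys_in_order(node.left) + [node.key] + keys_in_order(node.right)


# Sorted inserts are the worst case for a plain BST (height N).
root = None
n = 255
for k in range(1, n + 1):
    root = put(root, k, str(k))

print(keys_in_order(root) == list(range(1, n + 1)))  # True
print(height(root) <= 2 * math.log2(n + 1))          # True
```

Inserting keys in ascending order would give a plain BST of height 255 here, while the red-black tree stays within twice the logarithm of the number of keys.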

## Summary

The worst case (WC) is the cost after $N$ inserts, and the average case (AC) is the cost after $N$ random inserts. In the 2-3 tree row, $c$ is a constant that depends on the implementation.

| Implementation | WC Search | WC Insert | WC Delete | AC Search | AC Insert | AC Delete | Ordered Iteration? |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sequential Search (unordered list) | $N$ | $N$ | $N$ | $N/2$ | $N$ | $N/2$ | No |
| Binary Search (ordered array) | $\lg N$ | $N$ | $N$ | $\lg N$ | $N/2$ | $N/2$ | Yes |
| Binary Search Tree (BST) | $N$ | $N$ | $N$ | $1.39 \lg N$ | $1.39 \lg N$ | ? | Yes |
| 2-3 Tree | $c \lg N$ | $c \lg N$ | $c \lg N$ | $c \lg N$ | $c \lg N$ | $c \lg N$ | Yes |
