# Chapter 29: Advanced Data Structures

> *"Advanced data structures push the boundaries of what we can efficiently represent and query. They are the specialized tools in a programmer's arsenal."* — Anonymous

---

## 29.1 Introduction to Advanced Data Structures

In previous chapters, we covered fundamental data structures like arrays, linked lists, trees, and heaps, as well as more specialized ones like segment trees and Fenwick trees. This chapter delves into **advanced data structures** that offer unique capabilities:

- **Treaps:** A randomized binary search tree combining BST and heap properties.
- **Splay Trees:** A self-adjusting BST with amortized logarithmic operations.
- **Link-Cut Trees:** A structure for dynamic trees that supports link, cut, and path queries.
- **Ordered Statistics Trees:** Augmented BSTs for rank and select operations.
- **Sparse Table:** A static data structure for range minimum queries in O(1) after O(n log n) preprocessing.

These structures are used in specialized contexts, such as network flow algorithms, dynamic connectivity, and problems requiring order statistics.

### 29.1.1 Why These Structures Matter

```
┌─────────────────────────────────────────────────────────────────────┐
│                      IMPORTANCE OF ADVANCED DS                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  1. RANDOMIZED DATA STRUCTURES: Treaps offer simple implementation  │
│     with expected O(log n) performance.                             │
│                                                                     │
│  2. SELF-ADJUSTING STRUCTURES: Splay trees provide amortized        │
│     efficiency and adapt to access patterns.                        │
│                                                                     │
│  3. DYNAMIC TREES: Link-cut trees enable operations on forests      │
│     with changes over time (edge additions/removals).               │
│                                                                     │
│  4. ORDER STATISTICS: Finding the k-th smallest element efficiently │
│     is crucial in many algorithms.                                  │
│                                                                     │
│  5. STATIC QUERIES: Sparse tables provide O(1) queries on static    │
│     arrays, ideal for RMQ problems.                                 │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```

---

## 29.2 Treaps (Randomized Binary Search Trees)

A **treap** is a combination of a binary search tree (BST) and a heap. Each node has a key (maintaining BST order) and a randomly assigned priority (maintaining heap order: parent priority > child priorities). This randomization ensures the tree remains balanced with high probability, leading to expected O(log n) operations.

### 29.2.1 Definition and Properties

- **BST property:** For any node, left subtree keys < node.key < right subtree keys (keys are assumed distinct).
- **Heap property:** For any node, priority(node) > priority(left child) and > priority(right child) (max-heap).
- Priorities are assigned randomly (e.g., using a random number generator) upon insertion.

Because priorities are random, the expected height of a treap is O(log n), and operations are efficient in expectation.
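The two properties can be checked mechanically. The sketch below is illustrative only: the `Node` dataclass and the `is_treap` helper are hypothetical names introduced here, and equal priorities are tolerated rather than rejected.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    key: int
    priority: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def is_treap(node, lo=None, hi=None):
    """Return True if the subtree satisfies both the BST and max-heap properties."""
    if node is None:
        return True
    # BST property: the key must lie strictly inside the open interval (lo, hi).
    if lo is not None and node.key <= lo:
        return False
    if hi is not None and node.key >= hi:
        return False
    # Heap property: a child must not have a higher priority than its parent.
    for child in (node.left, node.right):
        if child is not None and child.priority > node.priority:
            return False
    return is_treap(node.left, lo, node.key) and is_treap(node.right, node.key, hi)

# Keys 1 < 2 < 3 with the highest priority at the root.
good = Node(2, 0.9, Node(1, 0.5), Node(3, 0.7))
# Same keys, but the left child outranks the root, so the heap property fails.
bad = Node(2, 0.4, Node(1, 0.5), Node(3, 0.1))
print(is_treap(good), is_treap(bad))  # True False
```

A checker like this is mainly useful in tests: run it after every insertion or deletion to confirm that the rotations preserved both invariants.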

### 29.2.2 Operations

#### Insertion
1. Insert node as in a normal BST.
2. Then rotate it up (like in a heap) until heap property is restored.

#### Deletion
1. Find the node.
2. While it has two children, rotate it down by rotating up the child with the higher priority, which keeps the heap property intact among the remaining nodes.
3. Once it has at most one child, replace it with that child (or with nothing if it is a leaf).

#### Search
Standard BST search.

### 29.2.3 Implementation

```python
import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()  # or random.randint(0, 2**30)
        self.left = None
        self.right = None

def rotate_right(y):
    x = y.left
    T2 = x.right
    x.right = y
    y.left = T2
    return x

def rotate_left(x):
    y = x.right
    T2 = y.left
    y.left = x
    x.right = T2
    return y

def treap_insert(root, key):
    if root is None:
        return TreapNode(key)
    if key < root.key:
        root.left = treap_insert(root.left, key)
        if root.left.priority > root.priority:
            root = rotate_right(root)
    else:
        root.right = treap_insert(root.right, key)
        if root.right.priority > root.priority:
            root = rotate_left(root)
    return root

def treap_delete(root, key):
    if root is None:
        return None
    if key < root.key:
        root.left = treap_delete(root.left, key)
    elif key > root.key:
        root.right = treap_delete(root.right, key)
    else:
        # found node to delete
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # two children: rotate with child of higher priority
        if root.left.priority > root.right.priority:
            root = rotate_right(root)
            root.right = treap_delete(root.right, key)
        else:
            root = rotate_left(root)
            root.left = treap_delete(root.left, key)
    return root

def treap_search(root, key):
    if root is None or root.key == key:
        return root
    if key < root.key:
        return treap_search(root.left, key)
    else:
        return treap_search(root.right, key)
```

### 29.2.4 Analysis and Applications

- **Expected time per operation:** O(log n) due to random priorities.
- **Space:** O(n).
- **Applications:** When a simple balanced BST is needed and randomization is acceptable; also used as a basis for ordered statistics trees (by augmenting size).

---

## 29.3 Splay Trees

A **splay tree** is a self-adjusting binary search tree that moves recently accessed elements to the root via **splaying** operations. It provides amortized O(log n) time for all operations.

### 29.3.1 Splay Operations

Splaying is a series of rotations that bring a node to the root. There are three cases:

- **Zig:** The node is a child of the root: a single rotation.
- **Zig-Zig:** The node and its parent are both left children (or both right children): rotate the grandparent first, then the parent, in the same direction.
- **Zig-Zag:** The node is a left child and its parent a right child (or vice versa): rotate the parent, then the grandparent, in opposite directions.

After splaying, the node becomes the new root.

### 29.3.2 Implementation

```python
class SplayNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None

class SplayTree:
    def __init__(self):
        self.root = None

    def _rotate_left(self, x):
        y = x.right
        x.right = y.left
        if y.left:
            y.left.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x == x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        y.left = x
        x.parent = y

    def _rotate_right(self, x):
        y = x.left
        x.left = y.right
        if y.right:
            y.right.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x == x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        y.right = x
        x.parent = y

    def _splay(self, x):
        while x.parent:
            if x.parent.parent is None:
                # Zig case
                if x == x.parent.left:
                    self._rotate_right(x.parent)
                else:
                    self._rotate_left(x.parent)
            elif x == x.parent.left and x.parent == x.parent.parent.left:
                # Zig-Zig (left-left)
                self._rotate_right(x.parent.parent)
                self._rotate_right(x.parent)
            elif x == x.parent.right and x.parent == x.parent.parent.right:
                # Zig-Zig (right-right)
                self._rotate_left(x.parent.parent)
                self._rotate_left(x.parent)
            else:
                # Zig-Zag
                if x == x.parent.left:
                    # right-left
                    self._rotate_right(x.parent)
                    self._rotate_left(x.parent)
                else:
                    # left-right
                    self._rotate_left(x.parent)
                    self._rotate_right(x.parent)

    def insert(self, key):
        node = SplayNode(key)
        if self.root is None:
            self.root = node
            return
        curr = self.root
        while True:
            if key < curr.key:
                if curr.left is None:
                    curr.left = node
                    node.parent = curr
                    break
                curr = curr.left
            elif key > curr.key:
                if curr.right is None:
                    curr.right = node
                    node.parent = curr
                    break
                curr = curr.right
            else:
                # duplicate keys: we can either ignore or handle (here ignore)
                return
        self._splay(node)

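    def search(self, key):
        curr = self.root
        last = None
        while curr:
            last = curr
            if key == curr.key:
                break
            elif key < curr.key:
                curr = curr.left
            else:
                curr = curr.right
        # Splay the last node visited, even on a miss: the amortized
        # bound relies on every access being followed by a splay
        if last:
            self._splay(last)
        return curr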

    def delete(self, key):
        node = self.search(key)  # search() splays the node to the root if found
        if node is None:
            return
        left, right = node.left, node.right
        # Detach both subtrees from the node being removed
        if left:
            left.parent = None
        if right:
            right.parent = None
        if left is None:
            self.root = right
        elif right is None:
            self.root = left
        else:
            # Splay the maximum of the (now detached) left subtree to its root.
            # It then has no right child, so the right subtree attaches there.
            max_left = left
            while max_left.right:
                max_left = max_left.right
            self.root = left
            self._splay(max_left)
            max_left.right = right
            right.parent = max_left
```

### 29.3.3 Amortized Analysis (Overview)

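Splay trees achieve amortized O(log n) time per operation, which is proved with the **potential method**. The potential of a tree is the sum, over all nodes, of the logarithm of the node's subtree size. With this potential, each splay has amortized cost O(log n). For the full proof, see Sleator and Tarjan (1985), listed in Section 29.9.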
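For orientation, the quantities involved can be written out as follows. This is the standard formulation from Sleator and Tarjan's analysis; the symbols $s$, $r$, and $\Phi$ are the customary notation and are not used elsewhere in this chapter.

$$
s(x) = \text{number of nodes in the subtree rooted at } x, \qquad
r(x) = \log_2 s(x), \qquad
\Phi(T) = \sum_{x \in T} r(x)
$$

The access lemma then bounds the amortized cost of splaying a node $x$ in a tree with root $t$:

$$
\text{amortized cost} \;\le\; 3\,\bigl(r(t) - r(x)\bigr) + 1 \;\le\; 3 \log_2 n + 1 = O(\log n)
$$

The second inequality holds because $s(t) = n$ and $s(x) \ge 1$.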

### 29.3.4 Applications

- **Cache implementation:** Frequently accessed items move to root.
- **Garbage collection:** Used in some algorithms for memory management.
- **Link-Cut Trees:** Use splay trees as auxiliary structures.

---

## 29.4 Link-Cut Trees (Dynamic Trees)

**Link-Cut Trees** (also known as **dynamic trees**) maintain a forest of rooted trees that can be modified by linking and cutting edges. They support operations like:

- `link(u, v)`: Add an edge between u and v, making u a child of v (assuming they are in different trees).
- `cut(u, v)`: Remove edge between u and v.
- `findRoot(u)`: Find the root of u's tree.
- `pathAggregate(u, v)`: Query aggregate (e.g., sum, min, max) along the path between u and v.

They are used in network flow algorithms (e.g., to speed up Dinic's algorithm), dynamic connectivity on forests, and various other graph algorithms.

### 29.4.1 Structure

Link-cut trees represent each tree in the forest as a set of **preferred paths** (like heavy-light decomposition but dynamic). Each path is stored as a splay tree keyed by depth. The overall structure consists of:

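- Each node stores `left` and `right` child pointers within its auxiliary splay tree.
- Each node stores a single `parent` pointer. For a non-root node of an auxiliary tree it is the splay parent. For the root of an auxiliary tree it acts as a *path-parent* pointer to the node in the represented tree from which the top of this preferred path hangs.
- Within an auxiliary tree, nodes are ordered by depth: an in-order traversal visits the path from its shallowest node to its deepest.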

### 29.4.2 Key Operations

- **access(v):** Makes the path from root to v the preferred path, bringing v to the root of its auxiliary tree.
- **makeRoot(v):** Reverses the path from root to v, making v the new root (by performing `access(v)` and then flipping the path).
- **findRoot(v):** `access(v)`, then walk to the leftmost node of the auxiliary tree, pushing down pending reversal flags along the way.
- **link(u, v):** `makeRoot(u)`, then set parent of u to v.
- **cut(u, v):** `makeRoot(u)`, `access(v)`. If the edge exists, v's left child is u and u has no children of its own; in that case detach v's left child.
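The intended semantics of these operations can be pinned down with a deliberately naive reference that uses plain parent pointers. This is a sketch for testing and intuition, not a link-cut tree: the class name `NaiveForest` is hypothetical, nodes are assumed to be integers `0..n-1`, and every operation costs time proportional to the depth of the tree.

```python
class NaiveForest:
    """Reference semantics for link/cut/find_root using plain parent pointers.

    Every operation costs O(depth); a link-cut tree supports the same
    operations in amortized O(log n).
    """

    def __init__(self, n):
        self.parent = [None] * n  # parent[v] is None when v is a root

    def find_root(self, v):
        while self.parent[v] is not None:
            v = self.parent[v]
        return v

    def make_root(self, v):
        # Reverse every parent pointer on the path from v up to the old root.
        prev = None
        while v is not None:
            nxt = self.parent[v]
            self.parent[v] = prev
            prev, v = v, nxt

    def link(self, u, v):
        # Attach u's tree below v, unless that would close a cycle.
        self.make_root(u)
        if self.find_root(v) != u:
            self.parent[u] = v

    def cut(self, u, v):
        # Remove the edge (u, v) if it exists, whichever endpoint is the parent.
        self.make_root(u)
        if self.parent[v] == u:
            self.parent[v] = None

forest = NaiveForest(4)
forest.link(0, 1)  # edge 0-1
forest.link(1, 2)  # edge 1-2
forest.link(3, 2)  # edge 2-3
print(forest.find_root(0) == forest.find_root(3))  # True: same tree
forest.cut(1, 2)
print(forest.find_root(0) == forest.find_root(3))  # False: the path is broken
```

Running the same sequence of operations against both this reference and a link-cut tree, then comparing connectivity answers, is a simple way to test the faster structure.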

### 29.4.3 Implementation Sketch

Implementing a full link-cut tree is complex; here's a simplified structure using splay trees for auxiliary paths.

```python
class LCTNode:
    def __init__(self, id):
        self.id = id
        self.left = None
        self.right = None
        self.parent = None
        self.rev = False  # flag for path reversal

def _is_root(x):
    return not x.parent or (x.parent.left != x and x.parent.right != x)

def _push(x):
    if x.rev:
        x.rev = False
        x.left, x.right = x.right, x.left
        if x.left:
            x.left.rev ^= True
        if x.right:
            x.right.rev ^= True

def _rotate(x):
    y = x.parent
    z = y.parent
    if y.left == x:
        y.left = x.right
        if x.right:
            x.right.parent = y
        x.right = y
    else:
        y.right = x.left
        if x.left:
            x.left.parent = y
        x.left = y
    y.parent = x
    x.parent = z
    if z:
        if z.left == y:
            z.left = x
        elif z.right == y:
            z.right = x

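def _splay(x):
    # Collect the path from x up to the root of its auxiliary tree, then
    # push reversal flags top-down so rotations see up-to-date children.
    # x itself must be included, otherwise its pending flag is never applied.
    stack = [x]
    y = x
    while not _is_root(y):
        y = y.parent
        stack.append(y)
    while stack:
        _push(stack.pop())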
    while not _is_root(x):
        y = x.parent
        if not _is_root(y):
            if (y.left == x) == (y.parent.left == y):
                _rotate(y)
            else:
                _rotate(x)
        _rotate(x)

def access(x):
    last = None
    while x:
        _splay(x)
        x.right = last
        last = x
        x = x.parent
    return last

def make_root(x):
    access(x)
    _splay(x)
    x.rev ^= True

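def find_root(x):
    access(x)
    _splay(x)
    # Push pending reversals while descending, or "left" may be stale
    _push(x)
    while x.left:
        x = x.left
        _push(x)
    _splay(x)  # splay the root to keep the amortized bound
    return x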

def link(u, v):
    make_root(u)
    if find_root(v) != u:
        u.parent = v

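def cut(u, v):
    make_root(u)
    access(v)
    _splay(v)
    # The edge (u, v) exists only if the path u..v holds exactly these two
    # nodes, i.e. u is v's left child and u has no children of its own
    if v.left == u and u.left is None and u.right is None:
        v.left.parent = None
        v.left = None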
```

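**Note:** This is a minimal implementation. It omits `pathAggregate`; supporting it requires storing an aggregate in each node and recomputing it whenever a node's children change (for example in `_rotate` and `access`).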

### 29.4.4 Applications

- **Dynamic connectivity in forests** (edge additions and removals, as long as the graph stays acyclic).
- **Maximum flow** (Dinic's algorithm with dynamic trees runs in O(VE log V)).
- **Tree path queries** with edge updates.

---

## 29.5 Policy-Based Data Structures (Ordered Statistics Trees)

**Ordered statistics trees** (more commonly called *order-statistic trees*) are BSTs augmented with subtree sizes to support:

- **select(k):** Find the k-th smallest element.
- **rank(x):** Find the order (index) of element x.

These operations are efficiently implemented by augmenting each node with the size of its subtree. Any balanced BST can be augmented; treaps or splay trees are common choices.

### 29.5.1 Augmented BST

We augment a treap with a `size` field. During rotations, we update sizes.

```python
import random

class OSTNode:
    def __init__(self, key):
        self.key = key
        self.prio = random.random()
        self.left = None
        self.right = None
        self.size = 1

def update_size(node):
    if node:
        node.size = 1
        if node.left:
            node.size += node.left.size
        if node.right:
            node.size += node.right.size

def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update_size(y)
    update_size(x)
    return x

def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update_size(x)
    update_size(y)
    return y

def insert(root, key):
    if root is None:
        return OSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
        if root.left.prio > root.prio:
            root = rotate_right(root)
    else:
        root.right = insert(root.right, key)
        if root.right.prio > root.prio:
            root = rotate_left(root)
    update_size(root)
    return root

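def kth_smallest(root, k):
    """Return the k-th smallest key (1-indexed), or None if k is out of range."""
    if root is None or k < 1 or k > root.size:
        return None
    left_size = root.left.size if root.left else 0
    if k <= left_size:
        return kth_smallest(root.left, k)
    elif k == left_size + 1:
        return root.key
    else:
        return kth_smallest(root.right, k - left_size - 1)

def rank(root, key):
    """Return the 1-indexed rank of key if it is present.

    If key is absent, the result is the number of keys smaller than key.
    Assumes distinct keys.
    """
    if root is None:
        return 0
    if key < root.key:
        return rank(root.left, key)
    elif key > root.key:
        left_size = root.left.size if root.left else 0
        return left_size + 1 + rank(root.right, key)
    else:
        return (root.left.size if root.left else 0) + 1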
```

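### 29.5.2 C++ `__gnu_pbds` (GNU Extension)

In C++, the GNU Policy-Based Data Structures library (`__gnu_pbds`, a libstdc++ extension rather than part of the standard library) provides a `tree` container that supports order statistics when instantiated with the `tree_order_statistics_node_update` policy. It is conventionally aliased as `ordered_set` and exposes `find_by_order` (select) and `order_of_key` (rank). The underlying structure is typically a red-black tree (`rb_tree_tag`) whose nodes store subtree sizes.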

### 29.5.3 Applications

- **Median maintenance** (find k-th smallest on the fly).
- **Inversion count** (using order statistics).
- **Range queries** (e.g., number of elements in [L, R]) by rank differences.
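The rank-based counting behind the last two applications can be illustrated without a tree. The sketch below uses Python's `bisect` module on a sorted list as a stand-in for `rank`; the function names are hypothetical, and `insort` costs O(n) per insertion, so this shows the logic rather than the O(log n) performance an order-statistic tree would give.

```python
from bisect import bisect_left, bisect_right, insort

def count_in_range(sorted_keys, lo, hi):
    """Number of keys k with lo <= k <= hi, as a difference of two ranks."""
    return bisect_right(sorted_keys, hi) - bisect_left(sorted_keys, lo)

def count_inversions(values):
    """Count pairs i < j with values[i] > values[j]."""
    seen = []       # sorted list of the values processed so far
    inversions = 0
    for v in values:
        # Elements already seen that are strictly greater than v.
        inversions += len(seen) - bisect_right(seen, v)
        insort(seen, v)
    return inversions

print(count_in_range([1, 3, 4, 7, 9], 3, 7))  # 3 (the keys 3, 4, 7)
print(count_inversions([3, 1, 2]))            # 2 (the pairs (3,1) and (3,2))
```

Replacing the sorted list with an order-statistic tree keeps the same structure: `bisect_right` becomes a rank query and `insort` becomes a tree insertion.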

---

## 29.6 Sparse Table for RMQ

The **sparse table** is a static data structure for answering **range minimum (or maximum) queries** on an immutable array. After O(n log n) preprocessing, it answers queries in O(1) time.

### 29.6.1 Idea

We precompute `st[i][j]` = minimum in the range starting at i of length 2^j. Then any query [l, r] can be covered by two overlapping intervals of length 2^k, where k = floor(log2(r-l+1)). The answer is min(st[l][k], st[r-2^k+1][k]).
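As a small worked instance of the covering argument (the indices are chosen purely for illustration, and $a_i$ denotes the array elements), take $l = 2$ and $r = 7$:

$$
\begin{aligned}
r - l + 1 &= 6, \qquad k = \lfloor \log_2 6 \rfloor = 2, \qquad 2^k = 4, \\
[l,\; l + 2^k - 1] &= [2, 5], \qquad [r - 2^k + 1,\; r] = [4, 7], \\
\min_{2 \le i \le 7} a_i &= \min\bigl(\mathrm{st}[2][2],\; \mathrm{st}[4][2]\bigr).
\end{aligned}
$$

The two blocks overlap on indices 4 and 5, which is harmless because taking the minimum over the same element twice does not change the result.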

### 29.6.2 Implementation

```python
class SparseTable:
    def __init__(self, arr):
        self.n = len(arr)
        # Number of levels: floor(log2(n)) + 1, computed with integer
        # arithmetic (avoids floating-point log2 and also handles n == 0)
        self.k = self.n.bit_length()
        self.st = [[0] * self.k for _ in range(self.n)]
        for i in range(self.n):
            self.st[i][0] = arr[i]
        j = 1
        while (1 << j) <= self.n:
            i = 0
            while i + (1 << j) <= self.n:
                self.st[i][j] = min(self.st[i][j-1], self.st[i + (1 << (j-1))][j-1])
                i += 1
            j += 1

    def query(self, l, r):
        """Minimum of arr[l..r], 0-indexed and inclusive; requires 0 <= l <= r < n."""
        j = (r - l + 1).bit_length() - 1  # floor(log2(range length))
        return min(self.st[l][j], self.st[r - (1 << j) + 1][j])
```

### 29.6.3 Limitations

- **Static array:** Cannot handle updates efficiently.
- **Memory:** O(n log n), which can be large for huge n.
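- **Other aggregates:** The O(1) query relies on overlapping intervals, so it needs an idempotent function (min, max, gcd, etc.). It does not work for sum, where elements in the overlap would be counted twice (use prefix sums or a segment tree instead).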
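The idempotence requirement is easy to see in a small experiment. The sketch below builds a sparse table for an arbitrary binary function; `build` and `query_overlapping` are hypothetical helpers, and the table is laid out as `table[j][i]` (level first), unlike the `st[i][j]` layout used in Section 29.6.2.

```python
from math import gcd

def build(arr, func):
    """table[j][i] holds func applied over arr[i : i + 2**j]."""
    table = [list(arr)]
    j = 1
    while (1 << j) <= len(arr):
        prev, half = table[-1], 1 << (j - 1)
        table.append([func(prev[i], prev[i + half])
                      for i in range(len(arr) - (1 << j) + 1)])
        j += 1
    return table

def query_overlapping(table, func, l, r):
    """Combine two (possibly overlapping) blocks that cover [l, r], inclusive."""
    j = (r - l + 1).bit_length() - 1
    return func(table[j][l], table[j][r - (1 << j) + 1])

def add(a, b):
    return a + b

arr = [12, 18, 6, 9, 30]
gcd_table = build(arr, gcd)
sum_table = build(arr, add)

print(query_overlapping(gcd_table, gcd, 0, 2))  # 6, the correct gcd of 12, 18, 6
print(query_overlapping(sum_table, add, 0, 2))  # 54, not 36: arr[1] is counted twice
```

For the range [0, 2] the two blocks are [0, 1] and [1, 2]. The gcd is unaffected by seeing `arr[1]` twice, while the sum is inflated by exactly that element.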

### 29.6.4 Comparison with Segment Trees and Fenwick Trees

| Structure    | Preprocessing | Query    | Update   | Use Case                   |
|--------------|---------------|----------|----------|----------------------------|
| Sparse Table | O(n log n)    | O(1)     | N/A      | Static array, many queries |
| Segment Tree | O(n)          | O(log n) | O(log n) | Dynamic array              |
| Fenwick Tree | O(n)          | O(log n) | O(log n) | Prefix sums, dynamic       |

---

## 29.7 Summary

```
┌────────────────────────────┬────────────────────────────────────────┐
│ Data Structure             │ Key Features & Use Cases               │
├────────────────────────────┼────────────────────────────────────────┤
│ Treap                      │ Randomized BST with heap property;     │
│                            │ simple, expected O(log n).             │
├────────────────────────────┼────────────────────────────────────────┤
│ Splay Tree                 │ Self-adjusting; amortized O(log n);    │
│                            │ good for access patterns.              │
├────────────────────────────┼────────────────────────────────────────┤
│ Link-Cut Tree              │ Dynamic trees; supports link/cut,      │
│                            │ path queries; used in flow, dynamic    │
│                            │ connectivity.                          │
├────────────────────────────┼────────────────────────────────────────┤
│ Ordered Statistics Tree    │ Augmented BST for kth/rank queries;    │
│                            │ often built on treap or balanced tree. │
├────────────────────────────┼────────────────────────────────────────┤
│ Sparse Table               │ Static RMQ in O(1) with O(n log n)     │
│                            │ preprocessing; no updates.             │
└────────────────────────────┴────────────────────────────────────────┘
```

---

## 29.8 Practice Problems

### Treaps
1. **Implement a treap** with insert, delete, search.
2. **Merge two treaps** (assuming every key in the first is less than every key in the second).
3. **Split a treap by key** (into two treaps with keys ≤ x and > x).

### Splay Trees
4. **Implement a splay tree** and compare its performance with an AVL tree on different access patterns.
5. **Use splay tree as a cache** for recently used items.

### Link-Cut Trees
6. **Dynamic connectivity** – simulate adding and removing edges and query connectivity.
7. **Path sum queries** with updates on tree edges.

### Ordered Statistics Trees
8. **Find k-th smallest in a stream** (online) using order-statistic treap.
9. **Count inversions** using order-statistic tree (insert and count greater).

### Sparse Table
10. **Range minimum queries** on static array.
11. **Range GCD queries** using sparse table.

---

## 29.9 Further Reading

1. **"Introduction to Algorithms" (CLRS)** – Chapters 13 (Red-Black Trees), 14 (Augmenting Data Structures), and 21 (Disjoint Sets), using third-edition numbering. Splay trees and link-cut trees do not have chapters of their own in CLRS.
2. **"The Art of Computer Programming, Vol 3"** by Donald Knuth – Sorting and Searching.
3. **"Data Structures and Algorithms in C++"** by Mark Allen Weiss – Chapters on advanced trees.
4. **"Algorithm Design"** by Kleinberg & Tardos – Not extensive on these, but covers applications.
5. **Original Papers**:
   - Aragon, C. R., & Seidel, R. (1989) – "Randomized Search Trees" (Treaps)
   - Sleator, D. D., & Tarjan, R. E. (1985) – "Self-adjusting binary search trees" (Splay Trees)
   - Sleator, D. D., & Tarjan, R. E. (1983) – "A data structure for dynamic trees" (Link-Cut Trees)
   - Bender, M. A., & Farach-Colton, M. (2000) – "The LCA Problem Revisited" (Sparse Table for RMQ)

---

> **Coming in Chapter 30**: **Two Pointers and Sliding Window** – We'll explore these essential problem-solving patterns.

---

**End of Chapter 29**