# Chapter 18: Minimum Spanning Trees

> *"A minimum spanning tree connects all points at minimum cost—it's the skeleton of efficiency in network design."* — Anonymous

---

## 18.1 Introduction to Minimum Spanning Trees

A **minimum spanning tree (MST)** of a connected, undirected, weighted graph is a subset of edges that connects all vertices together without cycles and with the minimum possible total edge weight. MSTs are fundamental in network design, clustering, and approximation algorithms.

### 18.1.1 Formal Definition

Given a connected, undirected graph G = (V, E) with a weight function w: E → ℝ, a **spanning tree** T ⊆ E is a set of |V|-1 edges that connects all vertices without cycles. A **minimum spanning tree** is a spanning tree with the smallest possible sum of edge weights.
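The definition can be checked directly on a tiny graph. The sketch below is a brute-force illustration only (exponential time, and not one of the algorithms in this chapter): it enumerates every set of |V|-1 edges, keeps those that form a spanning tree, and returns the smallest total weight. The helper names `is_spanning_tree` and `brute_force_mst_weight` are made up for this example, and the graph is assumed to be connected.

```python
from itertools import combinations
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (u, v, weight)

def is_spanning_tree(n: int, chosen: Tuple[Edge, ...]) -> bool:
    """Return True if `chosen` has n-1 edges and contains no cycle."""
    if len(chosen) != n - 1:
        return False
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, _ in chosen:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True  # n-1 acyclic edges on n vertices connect everything

def brute_force_mst_weight(n: int, edges: List[Edge]) -> int:
    """Minimum total weight over all spanning trees (tiny connected graphs only)."""
    return min(
        sum(w for _, _, w in subset)
        for subset in combinations(edges, n - 1)
        if is_spanning_tree(n, subset)
    )

# The four-vertex graph used in the worked examples later in this chapter
example_edges = [(0, 1, 1), (0, 2, 3), (0, 3, 4), (1, 2, 2), (2, 3, 5)]
print(brute_force_mst_weight(4, example_edges))  # prints 7
```

The three lightest edges (weights 1, 2, 3) form the cycle 0-1-2, so they are rejected; the best valid choice is 1 + 2 + 4 = 7.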

### 18.1.2 Why MSTs Matter

```
┌─────────────────────────────────────────────────────────────────────┐
│                    IMPORTANCE OF MSTs                                │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  1. NETWORK DESIGN: Laying cables, pipelines, or roads to connect   │
│     all locations with minimum cost                                  │
│                                                                      │
│  2. CLUSTERING: Construct MST of data points, then remove heavy     │
│     edges to form clusters                                           │
│                                                                      │
│  3. APPROXIMATION ALGORITHMS: MST is used in approximation for      │
│     NP-hard problems like metric TSP (within a factor of 2)         │
│                                                                      │
│  4. IMAGE SEGMENTATION: Graph-based segmentation uses MST           │
│                                                                      │
│  5. NETWORK ROUTING: Multicast routing (Steiner tree approximations)│
│                                                                      │
│  6. TAXONOMY: Phylogenetic tree construction in biology             │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
```

### 18.1.3 Key Properties

- **Cut Property:** For any cut (partition of vertices), the minimum-weight edge crossing the cut belongs to some MST.
- **Cycle Property:** For any cycle, an edge that is strictly heavier than every other edge in the cycle cannot be in any MST. If several edges tie for the maximum, each of them is excluded by at least one MST.
- **Uniqueness:** If all edge weights are distinct, the MST is unique.
- **Number of edges:** Exactly |V|-1 edges.
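As a sanity check of the cut property, the following sketch enumerates every cut of a small graph and confirms that the lightest crossing edge is in the MST. It assumes distinct weights (so the MST is unique) and hard-codes the MST of the four-vertex example graph used later in the chapter; `min_crossing_edge` is a helper name introduced here for illustration.

```python
from itertools import combinations
from typing import List, Set, Tuple

Edge = Tuple[int, int, int]  # (u, v, weight)

def min_crossing_edge(edges: List[Edge], side: Set[int]) -> Edge:
    """Lightest edge with exactly one endpoint in `side`."""
    crossing = [e for e in edges if (e[0] in side) != (e[1] in side)]
    return min(crossing, key=lambda e: e[2])

edges = [(0, 1, 1), (0, 2, 3), (0, 3, 4), (1, 2, 2), (2, 3, 5)]
mst = {(0, 1, 1), (1, 2, 2), (0, 3, 4)}  # MST of this graph, total weight 7

# Enumerate each cut once by always placing vertex 0 on the `side` half.
others = [1, 2, 3]
violations = []
for r in range(len(others)):  # r < 3 keeps the opposite side non-empty
    for extra in combinations(others, r):
        side = {0, *extra}
        if min_crossing_edge(edges, side) not in mst:
            violations.append(side)

print(violations)  # an empty list means the property held for every cut
```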

---

## 18.2 Kruskal's Algorithm

Kruskal's algorithm is a greedy algorithm that builds the MST by repeatedly adding the smallest edge that does not create a cycle.

### 18.2.1 Intuition

Sort edges by weight, then iterate through them. For each edge (u, v), if u and v are already connected (in the same component), skip it; otherwise, add it to the MST and union the components.

### 18.2.2 Data Structures

Kruskal relies on a **Union-Find (Disjoint Set Union)** data structure to efficiently check connectivity and merge components.

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
    
    def find(self, x):
        # Path compression
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        xr, yr = self.find(x), self.find(y)
        if xr == yr:
            return False
        # Union by rank
        if self.rank[xr] < self.rank[yr]:
            self.parent[xr] = yr
        elif self.rank[xr] > self.rank[yr]:
            self.parent[yr] = xr
        else:
            self.parent[yr] = xr
            self.rank[xr] += 1
        return True
```

### 18.2.3 Algorithm

```python
from typing import List, Tuple

def kruskal(n: int, edges: List[Tuple[int, int, int]]) -> Tuple[List[Tuple[int, int, int]], int]:
    """
    Kruskal's MST algorithm.

    Args:
        n: number of vertices (0-indexed)
        edges: list of (u, v, weight)

    Returns:
        mst: list of edges (u, v, weight) in the MST
        total_weight: sum of the MST edge weights
    """
    uf = UnionFind(n)
    mst = []
    total_weight = 0

    # Process edges in non-decreasing weight order;
    # sorted() leaves the caller's list untouched
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if uf.union(u, v):  # True only if u and v were in different components
            mst.append((u, v, w))
            total_weight += w
            if len(mst) == n - 1:
                break  # a spanning tree has exactly n - 1 edges
    return mst, total_weight
```

### 18.2.4 Complexity

- **Time:** O(E log E) for sorting + O(E α(V)) for union-find operations, where α is the inverse Ackermann function (essentially constant). Sorting dominates, so the total is O(E log E), which is the same as O(E log V) because E ≤ V² in a simple graph.
- **Space:** O(V) for union-find + O(E) for storing edges.

### 18.2.5 Example

Graph:
```
Vertices: 0,1,2,3
Edges:
0-1 weight 1
0-2 weight 3
0-3 weight 4
1-2 weight 2
2-3 weight 5
```

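Sorted edges: (0-1,1), (1-2,2), (0-2,3), (0-3,4), (2-3,5)

Processing:
- Add 0-1 (components: {0,1}, {2}, {3})
- Add 1-2 (components: {0,1,2}, {3})
- Skip 0-2 (0 and 2 are already in the same component, so the edge would create a cycle)
- Add 0-3 (connects {0,1,2} and {3}). The tree now has |V|-1 = 3 edges, so the MST is complete and 2-3 is never examined.

MST edges: (0-1,1), (1-2,2), (0-3,4), total weight 7.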
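The trace for this graph can be reproduced with a short script. This is a minimal sketch that inlines a tiny union-find instead of using the `UnionFind` class, so that it runs on its own; `kruskal_trace` is a name chosen for this illustration.

```python
from typing import List, Tuple

def kruskal_trace(n: int, edges: List[Tuple[int, int, int]]) -> List[str]:
    """Run Kruskal's algorithm and record the decision made for each edge."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    log: List[str] = []
    accepted = 0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if accepted == n - 1:
            break  # spanning tree complete; remaining edges are not examined
        ru, rv = find(u), find(v)
        if ru == rv:
            log.append(f"skip {u}-{v} ({w})")  # would create a cycle
        else:
            parent[ru] = rv
            accepted += 1
            log.append(f"add {u}-{v} ({w})")
    return log

edges = [(0, 1, 1), (0, 2, 3), (0, 3, 4), (1, 2, 2), (2, 3, 5)]
for line in kruskal_trace(4, edges):
    print(line)
# add 0-1 (1)
# add 1-2 (2)
# skip 0-2 (3)
# add 0-3 (4)
```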

### 18.2.6 Correctness Proof Sketch

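Kruskal's algorithm maintains the invariant that the set of selected edges is a subset of some MST. Suppose the algorithm adds an edge (u, v) that joins two different components, and consider the cut separating u's component from the rest of the graph. Every lighter edge was examined earlier and either was added or had both endpoints inside one component, so no lighter edge crosses this cut. By the cut property, (u, v) can therefore be added while preserving the invariant. The algorithm never creates a cycle and, on a connected graph, stops with |V|-1 edges, so the final set is a minimum spanning tree.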

---

## 18.3 Prim's Algorithm

Prim's algorithm grows the MST from a starting vertex, always adding the cheapest edge from the current tree to a vertex outside the tree.

### 18.3.1 Intuition

Start with a single vertex, then repeatedly add the smallest edge that connects a vertex in the tree to a vertex outside. This is similar to Dijkstra's algorithm but minimizes total tree weight rather than path distances.

### 18.3.2 Data Structures

Prim's can be implemented with a priority queue (min-heap) keyed by the minimum weight edge connecting each vertex to the current tree. We maintain:
- `in_mst`: boolean array marking vertices already in tree.
- `key[v]`: minimum weight edge from v to any vertex in tree.
- `parent[v]`: the vertex in tree that gives that minimum edge.

### 18.3.3 Algorithm (Lazy Version)

```python
import heapq
from typing import List, Tuple

def prim_lazy(graph: List[List[Tuple[int, int]]], start: int = 0) -> Tuple[List[Tuple[int, int, int]], int]:
    """
    Prim's MST algorithm (lazy version).
    
    Args:
        graph: adjacency list where graph[u] = [(v, weight), ...]
        start: starting vertex
    
    Returns:
        mst: list of edges (u, v, weight) in MST
        total_weight: sum of weights
    """
    n = len(graph)
    in_mst = [False] * n
    mst_edges = []
    total_weight = 0
    
    # Priority queue stores (weight, u, v): u is already in the tree,
    # v was outside the tree when the entry was pushed
    pq = []
    in_mst[start] = True
    for v, w in graph[start]:
        heapq.heappush(pq, (w, start, v))
    
    while pq and len(mst_edges) < n - 1:
        w, u, v = heapq.heappop(pq)
        if in_mst[v]:
            continue  # both ends already in tree
        # Add edge to MST
        in_mst[v] = True
        mst_edges.append((u, v, w))
        total_weight += w
        # Add new edges from v to vertices not yet in tree
        for to, weight in graph[v]:
            if not in_mst[to]:
                heapq.heappush(pq, (weight, v, to))
    
    return mst_edges, total_weight
```

### 18.3.4 Complexity (Lazy)

- **Time:** O(E log E). Each edge is pushed onto the heap at most once (when its first endpoint joins the tree), and every push or pop costs O(log E).
- **Space:** O(E) for heap in worst case.

### 18.3.5 Algorithm (Eager Version with Decrease-Key)

We can maintain a priority queue where keys are the minimum edge weight connecting each vertex to the tree, and update them when we find a better edge.

```python
def prim_eager(graph: List[List[Tuple[int, int]]], start: int = 0) -> Tuple[List[Tuple[int, int, int]], int]:
    n = len(graph)
    in_mst = [False] * n
    key = [float('inf')] * n
    parent = [-1] * n
    key[start] = 0
    pq = [(0, start)]  # (key, vertex)
    total_weight = 0
    mst_edges = []
    
    while pq:
        k, u = heapq.heappop(pq)
        if in_mst[u]:
            continue
        in_mst[u] = True
        if parent[u] != -1:
            mst_edges.append((parent[u], u, k))
            total_weight += k
        for v, w in graph[u]:
            if not in_mst[v] and w < key[v]:
                key[v] = w
                parent[v] = u
                heapq.heappush(pq, (w, v))
    return mst_edges, total_weight
```

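**Complexity:** O(E log V) with a binary heap. Python's `heapq` has no decrease-key operation, so the code pushes a new entry whenever a key improves and skips stale entries when they are popped; the heap holds at most O(E) entries, and log E = O(log V). With a Fibonacci heap and a true decrease-key, the bound is O(E + V log V).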

### 18.3.6 Example

Using the same graph as before, start at vertex 0. Heap entries are written (key, vertex).
- Pop (0,0): add 0 to the tree and set key[1]=1 via 0-1, key[2]=3 via 0-2, key[3]=4 via 0-3. PQ: (1,1), (3,2), (4,3).
- Pop (1,1): add edge 0-1. Edge 1-2 has weight 2 < key[2]=3, so set key[2]=2, parent[2]=1 and push (2,2); vertex 1 has no edge to 3. PQ: (2,2), (3,2) stale, (4,3).
- Pop (2,2): add edge 1-2. Edge 2-3 has weight 5 > key[3]=4, so nothing changes. PQ: (3,2) stale, (4,3).
- Pop (3,2): skip, because 2 is already in the tree.
- Pop (4,3): add edge 0-3 (parent[3] is still 0). The MST is complete, with total weight 7.

### 18.3.7 Correctness Proof Sketch

Prim's algorithm maintains the invariant that the tree built so far is a subset of some MST. At each step, we add the minimum-weight edge crossing the cut (tree vertices vs outside). By the cut property, this edge belongs to some MST. Therefore final tree is an MST.

---

## 18.4 Borůvka's Algorithm

Borůvka's algorithm (also known as Sollin's algorithm) is the oldest MST algorithm, dating to 1926. It is particularly suited for parallel implementation.

### 18.4.1 Intuition

Borůvka's algorithm proceeds in phases. In each phase, for each component, we select the cheapest edge that connects it to another component. Then we add all these edges (contracting components) and repeat until one component remains.

### 18.4.2 Algorithm

```python
def boruvka(n: int, edges: List[Tuple[int, int, int]]) -> Tuple[List[Tuple[int, int, int]], int]:
    """
    Borůvka's MST algorithm.
    
    Args:
        n: number of vertices
        edges: list of (u, v, weight)
    
    Returns:
        mst edges, total weight
    """
    # Union-find tracks which component each vertex belongs to
    uf = UnionFind(n)
    mst = []
    total_weight = 0

    # Each phase at least halves the number of components,
    # so a connected graph needs at most O(log V) phases.
    num_components = n
    while num_components > 1:
        # cheapest[root] = index into `edges` of the cheapest edge leaving
        # the component whose union-find root is `root` (-1 if none found)
        cheapest = [-1] * n

        # Scan every edge and record the cheapest outgoing edge per component.
        # The strict `<` keeps the lowest-index edge on ties, which gives a
        # consistent tie-break across components.
        for i, (u, v, w) in enumerate(edges):
            ru = uf.find(u)
            rv = uf.find(v)
            if ru == rv:
                continue  # edge is internal to one component
            if cheapest[ru] == -1 or w < edges[cheapest[ru]][2]:
                cheapest[ru] = i
            if cheapest[rv] == -1 or w < edges[cheapest[rv]][2]:
                cheapest[rv] = i

        # Add the selected edges; union() returns False if an earlier edge
        # in this phase already merged the two endpoints.
        added = False
        for comp in range(n):
            if cheapest[comp] == -1:
                continue
            u, v, w = edges[cheapest[comp]]
            if uf.union(u, v):
                mst.append((u, v, w))
                total_weight += w
                num_components -= 1
                added = True
        if not added:
            break  # no crossing edges left: the graph is disconnected

    return mst, total_weight
```

### 18.4.3 Complexity

- Each phase: O(E) to scan edges and update cheapest per component.
- Number of phases: O(log V) because each phase at least halves the number of components.
- Total: O(E log V).

### 18.4.4 Advantages

- Naturally parallelizable: each component can find its cheapest edge independently.
- Works well in distributed settings and external memory.

### 18.4.5 Example

Start with each vertex as its own component, using the same graph as before.

Phase 1:
- Component {0}: cheapest edge 0-1 (1)
- {1}: also 0-1 (1)
- {2}: cheapest 1-2 (2)
- {3}: cheapest 0-3 (4)

Add the distinct selected edges 0-1, 1-2, and 0-3. These three edges merge all four vertices into the single component {0,1,2,3}, so the algorithm stops after one phase. The MST is the same as before, with total weight 7.
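To see how many phases this four-vertex graph actually needs, here is a compact sketch that counts them. It uses a dictionary keyed by component root and an inlined union-find rather than the `UnionFind` class, purely so the snippet is self-contained; `boruvka_phases` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

def boruvka_phases(n: int, edges: List[Tuple[int, int, int]]) -> Tuple[int, int]:
    """Return (number of phases, total MST weight) for a connected graph."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components, phases, total = n, 0, 0
    while components > 1:
        # component root -> index of its cheapest outgoing edge
        cheapest: Dict[int, int] = {}
        for i, (u, v, w) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for root in (ru, rv):
                if root not in cheapest or w < edges[cheapest[root]][2]:
                    cheapest[root] = i
        if not cheapest:
            break  # no crossing edges: the graph is disconnected
        phases += 1
        for i in sorted(set(cheapest.values())):
            u, v, w = edges[i]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                total += w
                components -= 1
    return phases, total

edges = [(0, 1, 1), (0, 2, 3), (0, 3, 4), (1, 2, 2), (2, 3, 5)]
print(boruvka_phases(4, edges))  # prints (1, 7)
```

The three edges selected in the first phase (0-1, 1-2, 0-3) already connect all four vertices, so a single phase is enough here.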

---

## 18.5 Comparison of MST Algorithms

| Algorithm   | Time Complexity (typical) | Space   | Pros                                  | Cons                                  |
|-------------|---------------------------|---------|---------------------------------------|---------------------------------------|
| Kruskal     | O(E log E)                | O(E+V)  | Simple, works on edge list            | Requires sorting all edges            |
| Prim (binary heap) | O(E log V)        | O(V+E)  | Good for dense graphs                 | Needs adjacency list                  |
| Prim (Fibonacci)   | O(E + V log V)    | O(V+E)  | Optimal for dense graphs              | Complex to implement                  |
| Borůvka     | O(E log V)                | O(E+V)  | Parallelizable, good for dense too    | Multiple passes over edges             |

**Choosing the right algorithm:**
- If the graph is dense (E ≈ V²), Prim's with a Fibonacci heap is asymptotically best at O(E + V log V) = O(V²). In practice, a simple array-based Prim's that runs in O(V²) or the binary-heap version is usually sufficient.
- If the graph is sparse, both Kruskal's and Prim's with a binary heap run in O(E log V) and either is a good choice.
- For distributed or parallel settings, Borůvka's is the natural fit.
- For simplicity, Kruskal's is often the easiest to implement correctly.

---

## 18.6 Applications of MSTs

### 18.6.1 Network Design

- **Telecommunication:** Laying fiber optic cables to connect cities with minimum cost.
- **Electrical grids:** Connecting substations with minimum wiring.
- **Road/rail networks:** Building roads connecting towns at minimum cost.

### 18.6.2 Clustering

Build MST of data points (using Euclidean distance as weight), then remove the k-1 heaviest edges to obtain k clusters. This is **single-linkage clustering**.
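A minimal sketch of this idea: because Kruskal's algorithm adds MST edges in increasing weight order, stopping after n-k merges gives the same components as building the full MST and deleting its k-1 heaviest edges. The function name `single_linkage` and the sample points are illustrative assumptions; the snippet uses `math.dist` (Python 3.8+) and an O(n²) edge list, so it only suits small inputs.

```python
import math
from typing import Dict, List, Tuple

def single_linkage(points: List[Tuple[float, float]], k: int) -> List[int]:
    """Cluster points into k groups by stopping Kruskal after n-k merges."""
    n = len(points)
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # All pairwise Euclidean distances, lightest first
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    merges_left = n - k
    for _, i, j in edges:
        if merges_left == 0:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            merges_left -= 1

    # Relabel component roots as 0, 1, 2, ... in order of first appearance
    ids: Dict[int, int] = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
print(single_linkage(pts, 3))  # prints [0, 0, 0, 1, 1, 2]
```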

### 18.6.3 Approximation Algorithms

- **Traveling Salesman Problem (TSP):** For metric TSP (triangle inequality), an MST-based algorithm (double the tree, then shortcut) gives a 2-approximation.
- **Steiner tree problem:** An MST of the metric closure restricted to the terminals gives a 2-approximation.
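The "double the tree, then shortcut" step for metric TSP amounts to visiting the MST's vertices in preorder. The sketch below assumes points in the Euclidean plane (which satisfies the triangle inequality) and builds the MST with a simple array-based Prim's; `mst_preorder_tour` and `tour_length` are names introduced for this example only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def mst_preorder_tour(points: List[Point]) -> Tuple[List[int], float]:
    """Return (tour, mst_weight): a preorder walk of an MST rooted at vertex 0."""
    n = len(points)
    if n == 0:
        return [], 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from each vertex into the tree
    parent = [-1] * n
    children: List[List[int]] = [[] for _ in range(n)]
    best[0] = 0.0
    mst_weight = 0.0

    # Array-based Prim on the complete Euclidean graph, O(n^2)
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        mst_weight += best[u]
        if parent[u] != -1:
            children[parent[u]].append(u)
        for v in range(n):
            d = math.dist(points[u], points[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u

    # Preorder DFS: the same as walking the doubled tree and skipping
    # vertices that were already visited (the "shortcut" step)
    tour: List[int] = []
    stack = [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour, mst_weight

def tour_length(points: List[Point], tour: List[int]) -> float:
    """Length of the closed tour that returns to its starting vertex."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

cities = [(0, 0), (0, 3), (4, 0), (4, 3), (2, 6)]
tour, mst_weight = mst_preorder_tour(cities)
print(tour, tour_length(cities, tour), "<=", 2 * mst_weight)
```

Since any optimal tour minus one edge is a spanning tree, the MST weight is a lower bound on the optimum, which is why a tour of length at most twice the MST weight is a 2-approximation.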

### 18.6.4 Image Segmentation

In graph-based image segmentation, pixels are vertices and edges connect neighboring pixels, weighted by their intensity difference. Building an MST and removing its high-weight edges splits the image into connected components, which serve as segments.

### 18.6.5 Phylogenetics

Construct evolutionary trees from genetic distance data using MST (or more sophisticated methods).

### 18.6.6 Maze Generation

Randomized Prim's algorithm can generate perfect mazes.
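One way this might look in code, as a sketch rather than a canonical implementation: treat grid cells as vertices, keep a list of frontier edges leaving the visited region, and repeatedly carve a randomly chosen one. The function name `random_prim_maze`, the `seed` parameter, and the representation of the maze as a set of carved passages are all choices made for this illustration.

```python
import random
from typing import Iterator, List, Optional, Set, Tuple

Cell = Tuple[int, int]

def random_prim_maze(rows: int, cols: int, seed: Optional[int] = None) -> Set[Tuple[Cell, Cell]]:
    """Return the carved passages (pairs of adjacent cells) of a perfect maze."""
    rng = random.Random(seed)

    def neighbors(cell: Cell) -> Iterator[Cell]:
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                yield (nr, nc)

    start = (0, 0)
    visited: Set[Cell] = {start}
    frontier: List[Tuple[Cell, Cell]] = [(start, nb) for nb in neighbors(start)]
    passages: Set[Tuple[Cell, Cell]] = set()

    while frontier:
        # Remove a uniformly random frontier edge in O(1) via swap-and-pop
        i = rng.randrange(len(frontier))
        frontier[i], frontier[-1] = frontier[-1], frontier[i]
        inside, outside = frontier.pop()
        if outside in visited:
            continue  # stale entry: the cell was reached another way
        visited.add(outside)
        passages.add((inside, outside))
        frontier.extend(
            (outside, nb) for nb in neighbors(outside) if nb not in visited
        )
    return passages

maze = random_prim_maze(4, 5, seed=1)
print(len(maze))  # prints 19: a spanning tree on 20 cells has 19 passages
```

Every carved passage joins a visited cell to a new one, so the passages form a spanning tree of the grid: exactly one path between any two cells, which is what makes the maze "perfect".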

---

## 18.7 Variants and Related Problems

### 18.7.1 Maximum Spanning Tree

Negate every weight and run any MST algorithm unchanged, or keep the weights and reverse the comparison: sort edges in descending order in Kruskal's, or use a max-heap in Prim's.
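A short sketch of the negation approach, using a compact Kruskal's written inline so the snippet is self-contained; `minimum_spanning_tree` and `maximum_spanning_tree` are illustrative names, not functions defined elsewhere in this chapter.

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (u, v, weight)

def minimum_spanning_tree(n: int, edges: List[Edge]) -> List[Edge]:
    """Compact Kruskal's algorithm returning the chosen edges."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree: List[Edge] = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

def maximum_spanning_tree(n: int, edges: List[Edge]) -> List[Edge]:
    """Negate the weights, take an MST, then restore the original weights."""
    negated = [(u, v, -w) for u, v, w in edges]
    return [(u, v, -w) for u, v, w in minimum_spanning_tree(n, negated)]

edges = [(0, 1, 1), (0, 2, 3), (0, 3, 4), (1, 2, 2), (2, 3, 5)]
tree = maximum_spanning_tree(4, edges)
print(tree, sum(w for _, _, w in tree))
# prints [(2, 3, 5), (0, 3, 4), (1, 2, 2)] 11
```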

### 18.7.2 Minimum Bottleneck Spanning Tree

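A spanning tree that minimizes the maximum edge weight. Any MST is also a minimum bottleneck spanning tree: the heaviest edge of an MST is no heavier than the heaviest edge of any other spanning tree. The converse does not hold, since a minimum bottleneck spanning tree need not have minimum total weight.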

### 18.7.3 Steiner Tree

Find a minimum-weight tree connecting a subset of vertices (terminals), possibly using additional vertices (Steiner points). This is NP-hard, but MST gives a 2-approximation in metric spaces.

### 18.7.4 k-Minimum Spanning Tree (k-MST)

Find minimum-weight tree that spans exactly k vertices. NP-hard.

---

## 18.8 Practice Problems

### Problem 1: Connecting Cities With Minimum Cost (LeetCode 1135)
Given n cities and a list of possible roads with costs, find the minimum cost to connect all cities. If it is impossible, return -1.

**Hint:** Build an MST; if it ends up with fewer than n-1 edges, the graph is disconnected, so return -1.
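A possible solution sketch along the lines of the hint, using Kruskal's algorithm and counting accepted edges. It assumes cities are numbered 1..n and each road is given as `[city_a, city_b, cost]`; the function name `minimum_cost` is chosen for this example.

```python
from typing import List

def minimum_cost(n: int, connections: List[List[int]]) -> int:
    """Minimum cost to connect cities 1..n, or -1 if they cannot all be connected."""
    parent = list(range(n + 1))  # index 0 is unused because cities are 1-indexed

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total = 0
    used = 0
    for a, b, cost in sorted(connections, key=lambda c: c[2]):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            total += cost
            used += 1
            if used == n - 1:
                break  # spanning tree complete
    return total if used == n - 1 else -1

print(minimum_cost(3, [[1, 2, 5], [1, 3, 6], [2, 3, 1]]))  # prints 6
print(minimum_cost(4, [[1, 2, 3], [3, 4, 4]]))             # prints -1
```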

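### Problem 2: Min Cost to Connect All Points (LeetCode 1584)
Given points (x, y), where the cost of connecting two points is their Manhattan distance, find the minimum total cost to connect all points.

**Hint:** Treat the input as a complete graph with Manhattan distances as weights and find its MST. The graph has O(n²) edges, so an array-based Prim's (no heap) runs in O(n²) time and avoids storing the edges.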
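The O(n²) variant of Prim's mentioned in the hint replaces the heap with a linear scan for the closest vertex outside the tree. A sketch, with `min_cost_connect_points` as an illustrative function name and edge weights computed on the fly:

```python
from typing import List, Tuple

def min_cost_connect_points(points: List[Tuple[int, int]]) -> int:
    """Weight of an MST of the complete graph under Manhattan distance."""
    n = len(points)
    if n == 0:
        return 0
    in_tree = [False] * n
    dist = [float("inf")] * n  # cheapest known connection into the tree
    dist[0] = 0
    total = 0

    for _ in range(n):
        # Linear scan for the closest vertex not yet in the tree
        u = -1
        for v in range(n):
            if not in_tree[v] and (u == -1 or dist[v] < dist[u]):
                u = v
        in_tree[u] = True
        total += dist[u]

        # Relax distances using edges out of the newly added vertex
        ux, uy = points[u]
        for v in range(n):
            if not in_tree[v]:
                d = abs(ux - points[v][0]) + abs(uy - points[v][1])
                if d < dist[v]:
                    dist[v] = d
    return total

print(min_cost_connect_points([(0, 0), (2, 2), (3, 10), (5, 2), (7, 0)]))  # prints 20
```

No edge list is ever materialized, so the extra space is O(n) even though the implicit graph has O(n²) edges.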

### Problem 3: Critical Connections in a Network (LeetCode 1192)
Find bridges in an undirected graph. (This is not MST but related to graph connectivity.)

### Problem 4: Find if an Edge is in Some MST
Given a graph and an edge e, determine if e belongs to some MST.

**Hint:** By the cut property, e = (u, v) is in some MST exactly when it is a minimum-weight edge across some cut. Equivalently, by the cycle property, u and v must not already be connected by edges strictly lighter than e.
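One way to turn the hint into a test: union the endpoints of every edge strictly lighter than e, then check whether e's endpoints ended up in the same component. If they did, e is the unique heaviest edge on some cycle and is in no MST; otherwise e is a lightest edge across the cut around its endpoint's component. The sketch assumes e is an edge of a connected graph, and `in_some_mst` is a hypothetical helper name.

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (u, v, weight)

def in_some_mst(n: int, edges: List[Edge], e: Edge) -> bool:
    """Return True if edge `e` belongs to at least one MST of the graph."""
    u, v, w = e
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge components using only edges strictly lighter than e
    for a, b, c in edges:
        if c < w:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb

    # e is usable exactly when lighter edges do not already connect u and v
    return find(u) != find(v)

edges = [(0, 1, 1), (0, 2, 3), (0, 3, 4), (1, 2, 2), (2, 3, 5)]
print(in_some_mst(4, edges, (0, 3, 4)))  # prints True
print(in_some_mst(4, edges, (0, 2, 3)))  # prints False
```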

### Problem 5: Second Best Minimum Spanning Tree
Find spanning tree with second smallest total weight.

**Hint:** Compute the MST. For each edge (u, v) not in the MST, adding it creates exactly one cycle; remove the heaviest tree edge on the u-v tree path and record the new total weight. The smallest such total is the answer.
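A straightforward O(V·E) sketch of the swap idea, finding the heaviest edge on each tree path with a DFS. It returns the best tree that differs from the MST by one edge swap, which can have the same weight as the MST when weights tie. The name `second_best_mst` is an assumption for this example, and the graph is assumed to be connected.

```python
import math
from typing import List, Optional, Tuple

Edge = Tuple[int, int, int]  # (u, v, weight)

def second_best_mst(n: int, edges: List[Edge]) -> Tuple[int, Optional[int]]:
    """Return (mst_weight, best weight of a tree one edge swap away, or None)."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Kruskal's algorithm, also recording tree edges in an adjacency list
    in_tree = [False] * len(edges)
    adj: List[List[Tuple[int, int]]] = [[] for _ in range(n)]
    mst_weight = 0
    for i in sorted(range(len(edges)), key=lambda i: edges[i][2]):
        u, v, w = edges[i]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            in_tree[i] = True
            mst_weight += w
            adj[u].append((v, w))
            adj[v].append((u, w))

    def heaviest_on_path(src: int, dst: int) -> float:
        """Heaviest edge weight on the unique tree path from src to dst."""
        stack = [(src, -1, -math.inf)]
        while stack:
            node, prev, heaviest = stack.pop()
            if node == dst:
                return heaviest
            for nxt, w in adj[node]:
                if nxt != prev:
                    stack.append((nxt, node, max(heaviest, w)))
        raise ValueError("graph is not connected")

    best: Optional[int] = None
    for i, (u, v, w) in enumerate(edges):
        if in_tree[i] or u == v:
            continue  # skip tree edges and self-loops
        candidate = mst_weight - heaviest_on_path(u, v) + w
        if best is None or candidate < best:
            best = candidate
    return mst_weight, best

edges = [(0, 1, 1), (0, 2, 3), (0, 3, 4), (1, 2, 2), (2, 3, 5)]
print(second_best_mst(4, edges))  # prints (7, 8)
```

Faster versions precompute path maxima (for example with binary lifting) instead of running a DFS per non-tree edge.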

### Problem 6: Minimum Spanning Tree for Directed Graphs (Arborescence)
For directed graphs, the equivalent is the **minimum spanning arborescence** (rooted directed tree). Use Chu-Liu/Edmonds algorithm.

### Problem 7: Network Design with Redundancy
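Find the minimum cost to connect all cities so that every pair has at least two edge-disjoint paths between them (a 2-edge-connected network). An MST is not enough here, because removing any single tree edge disconnects it, and the general problem is NP-hard.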

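### Problem 8: Optimize Water Distribution in a Village (LeetCode 1168)
Given the cost of building a well in each house and the cost of laying pipes between houses, find the minimum cost to supply water to all houses. This is equivalent to an MST on a graph with a virtual node representing the water source, joined to each house by an edge whose weight is that house's well cost.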
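The virtual-node reduction can be sketched as follows. Node 0 plays the role of the water source, houses are assumed to be numbered 1..n with `wells[i-1]` the well cost for house i, and pipes are given as `[house_a, house_b, cost]`; the function name `min_cost_to_supply_water` is chosen for illustration.

```python
from typing import List

def min_cost_to_supply_water(n: int, wells: List[int], pipes: List[List[int]]) -> int:
    """MST weight of the house graph extended with a virtual source node 0."""
    # Building a well in house h becomes an edge between the source (0) and h
    edges = [(cost, 0, house) for house, cost in enumerate(wells, start=1)]
    edges += [(cost, a, b) for a, b, cost in pipes]
    edges.sort()

    parent = list(range(n + 1))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total = 0
    for cost, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            total += cost
    return total

print(min_cost_to_supply_water(3, [1, 2, 2], [[1, 2, 1], [2, 3, 1]]))  # prints 3
```

Because every house has an edge to the source, the extended graph is always connected, so no -1 case is needed.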

---

## 18.9 Summary

```
┌─────────────────────────────────────────────────────────────────────┐
│                    MINIMUM SPANNING TREE SUMMARY                     │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Key Properties: Cut property, Cycle property, Uniqueness if distinct│
│                                                                      │
│  Algorithms:                                                         │
│    • Kruskal: Sort edges, union-find, O(E log E)                    │
│    • Prim: Grow tree from start, priority queue, O(E log V)         │
│    • Borůvka: Merge components via cheapest edge, O(E log V)        │
│                                                                      │
│  Applications: Network design, clustering, approximation algorithms │
│                                                                      │
│  Variants: Maximum ST, Bottleneck ST, Steiner tree                  │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
```

---

## 18.10 Further Reading

1. **"Introduction to Algorithms" (CLRS)** – Chapter 23 (Minimum Spanning Trees)
2. **"Algorithms"** by Robert Sedgewick and Kevin Wayne – Chapter 4 (Graphs), Section 4.3 (Minimum Spanning Trees)
3. **"The Algorithm Design Manual"** by Steven Skiena – Chapter 6 (Weighted Graph Algorithms)
4. **Original Papers**:
   - Kruskal, J. B. (1956) – "On the shortest spanning subtree of a graph and the traveling salesman problem"
   - Prim, R. C. (1957) – "Shortest connection networks and some generalizations"
   - Borůvka, O. (1926) – "O jistém problému minimálním" (in Czech)

---

> **Coming in Chapter 19**: **Advanced Graph Algorithms** – Network flow, matching, and advanced decompositions.

---

**End of Chapter 18**

