# Minimum Cost Spanning Trees (Kruskal's Algorithm)

### Minimum Cost Spanning Tree (MCST)

* Weighted undirected graph, $G = (V, E), W:E \rightarrow \mathbb{R}$
  - $G$ assumed to be connected
* Find a minimum cost spanning tree
  - Tree connecting all the vertices in $V$
* Strategy #2
  - Start with $n$ components, each a single vertex
  - Process edges in ascending order of cost
  - Include edge if it does not create a cycle

**Example**

![Graph](https://firebasestorage.googleapis.com/v0/b/fb-sandbox-25.appspot.com/o/mcspt-2.png?alt=media&token=2634c518-541d-4e6e-b2c9-c993a2041cf2)

* Start with the smallest edge $(1, 3)$
* Add next smallest edge $(2, 4)$
* Add the next smallest edge $(0, 1)$
* Can't add $(0, 3)$, forms a cycle
* Add the next smallest edge $(1, 2)$

### Kruskal's Algorithm

* $G = (V, E), W:E \rightarrow \mathbb{R}$
* Let $E = \{e_0, e_1, \ldots, e_{m - 1}\}$ be the edges sorted in ascending order by their weight
* Let $TE \subseteq E$ be the set of tree edges already added to the MCST
* Initially, $TE = \emptyset$
* Scan $E$ from $e_0$ to $e_{m - 1}$
  - If adding $e_i$ to $TE$ creates a cycle, skip it
  - Otherwise, add $e_i$ to $TE$

![Graph](https://firebasestorage.googleapis.com/v0/b/fb-sandbox-25.appspot.com/o/ka-1.png?alt=media&token=6b28fd09-3b46-41a7-9919-d1982f739205)

### Correctness of Kruskal's Algorithm

**Minimum Separator Lemma**
* Let $V$ be partitioned into 2 non-empty sets $U$ and $W = V \setminus U$
* Let $e = (u, w)$ be the minimum cost edge with $u \in U, w \in W$
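* Every MCST must include $e$ (assuming edge costs are distinct, so that $e$ is the unique minimum cost edge between $U$ and $W$)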
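
A sketch of why the lemma holds, using the standard exchange argument. It assumes edge costs are distinct, so $e$ is strictly cheaper than every other edge between $U$ and $W$. The names $T$, $T'$, $f$ and $\mathit{cost}(\cdot)$ (the total weight of a set of edges) are introduced here for illustration; $W(\cdot)$ applied to an edge is the weight function from the problem statement, not the set $W$.

* Suppose some MCST $T$ does not include $e$
* $T$ contains a path from $u$ to $w$; since $u \in U$ and $w \in W$, this path must use some edge $f = (u', w')$ with $u' \in U, w' \in W$ and $f \neq e$
* Removing $f$ splits $T$ into two pieces, one containing $u$ and the other containing $w$, so adding $e$ reconnects them: $T' = (T \setminus \{f\}) \cup \{e\}$ is again a spanning tree
* Since $W(e) < W(f)$, $T'$ is cheaper than $T$, which contradicts the assumption that $T$ has minimum cost

$$
\mathit{cost}(T') = \mathit{cost}(T) - W(f) + W(e) < \mathit{cost}(T)
$$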

-------------------------------------------------------------------------------
* Edges in $TE$ partition vertices into connected components
  - Initially, each vertex is a separate component
* Adding $e = (u, w)$ merges components of $u$ and $w$
  - If $u$ and $w$ are in the same component, $e$ forms a cycle and is discarded
* Let $U$ be the component of $u$, and let $W$ be $V \setminus U$
  - $U, W$ form a partition of $V$ with $u \in U$ and $w \in W$
  - Since we are scanning edges in ascending order of cost, $e$ is the minimum cost edge connecting $U$ and $W$, so it must be part of any MCST

### Implementing Kruskal's Algorithm

* Collect edges in a list as `(d, u, v)`
  - Weight as first component for easy sorting
* Main challenge is to keep track of connected components
  - Dictionary to record component of each vertex
  - Initially, each vertex is an isolated component
  - When we add an edge `(u, v)`, merge the components of `u` and `v`

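```python
def kruskal(WList):
  # WList: adjacency list, WList[u] is a list of (v, d) pairs
  (edges, component, TE) = ([], {}, [])

  # Collect edges as (d, u, v) so that sorting orders them by weight
  for u in WList.keys():
    edges.extend([(d, u, v) for (v, d) in WList[u]])
    component[u] = u  # each vertex starts as its own component
  edges.sort()

  for (d, u, v) in edges:
    if component[u] != component[v]:
      TE.append((u, v))
      c = component[u]

      # Relabel every vertex in u's component with v's component
      for w in WList.keys():
        if component[w] == c:
          component[w] = component[v]

  return TE
```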

**Analysis**
* Sorting the edges is $O(m \log m)$
  - Since $m$ is at most $n^2$, equivalently $O(m \log n)$
* Outer loop runs $O(m)$ times
  - Each time we add a tree edge, we have to merge components, which is an $O(n)$ scan
  - There are $n - 1$ tree edges, so this is done $O(n)$ times
* Overall, merging costs $O(n^2)$, in addition to the $O(m \log n)$ sort

-------------------------------------------------------------------------------
* Complexity of the naive implementation is $O(m \log n + n^2)$
* Bottleneck is the naive strategy to label and merge components
* Components **partition** vertices
  - Collection of disjoint sets
* Data structure to maintain collection of disjoint sets
  - `find(v)` - return the set containing `v`
  - `union(u, v)` - merge sets of `u, v`
* Efficient **union-find** brings complexity down to $O(m \log n)$
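
As a rough illustration of the `find`/`union` interface described above, here is a minimal sketch of Kruskal's algorithm on top of a simple union-find. It is one possible design (union by size with path halving), not necessarily the one these notes develop later. The function name `kruskal_union_find`, the dictionaries `parent` and `size`, and the example graph with its weights are all assumptions made for illustration. The weights are chosen so that the edges are picked in the same order as in the example earlier.

```python
def kruskal_union_find(WList):
    # WList: adjacency list, WList[u] is a list of (v, d) pairs
    parent = {v: v for v in WList}  # each vertex starts as the root of its own set
    size = {v: 1 for v in WList}    # number of vertices in the set rooted at v

    def find(v):
        # Return the root (representative) of the set containing v,
        # shortening the path on the way up (path halving)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(u, v):
        # Merge the sets of u and v; return False if they were already the same set
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru  # attach the smaller set below the larger one
        size[ru] += size[rv]
        return True

    # Weight first, so sorting orders edges by cost
    edges = sorted((d, u, v) for u in WList for (v, d) in WList[u])

    TE = []
    for (d, u, v) in edges:
        if union(u, v):  # u and v were in different components
            TE.append((u, v))
    return TE


# Hypothetical example graph (weights assumed, not taken from the figure)
example = {
    0: [(1, 10), (3, 18)],
    1: [(0, 10), (2, 20), (3, 6)],
    2: [(1, 20), (4, 8)],
    3: [(0, 18), (1, 6), (4, 70)],
    4: [(2, 8), (3, 70)],
}
print(kruskal_union_find(example))
```

With this structure, each `find` and `union` is cheap enough that sorting the edges dominates the running time.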

### Summary

* Kruskal's algorithm builds an MCST bottom up
  - Start with $n$ components, each an isolated vertex
  - Scan the edges in ascending order of cost
  - Whenever an edge merges disjoint components, add it to the MCST
* Correctness follows from Minimum Separator Lemma
* Naive handling of components adds $O(n^2)$ to the $O(m \log n)$ cost of sorting
  - We will see how to improve the overall cost to $O(m \log n)$
* If edge weights repeat, the MCST need not be unique
* "Choose minimum cost edge" will allow choices
  - Consider a triangle on 3 vertices with all edges of equal cost
* Different choices lead to different spanning trees
* In general, there may be a very large number of minimum cost spanning trees
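
The triangle case can be checked by brute force. The sketch below enumerates every subset of $n - 1$ edges and keeps those that connect all the vertices, which are exactly the spanning trees. The helper name `spanning_trees` and the choice of cost 1 for every edge are assumptions made for illustration, and the approach is only practical for very small graphs.

```python
from itertools import combinations

def spanning_trees(vertices, edges):
    # Yield every subset of n - 1 edges that connects all the vertices
    n = len(vertices)
    for subset in combinations(edges, n - 1):
        component = {v: v for v in vertices}
        for (d, u, v) in subset:
            cu, cv = component[u], component[v]
            if cu != cv:
                # Merge the component of u into the component of v
                for w in vertices:
                    if component[w] == cu:
                        component[w] = cv
        if len(set(component.values())) == 1:
            yield subset

# Triangle on 3 vertices, all edges with the same (assumed) cost 1
triangle = [(1, 0, 1), (1, 1, 2), (1, 0, 2)]
trees = list(spanning_trees([0, 1, 2], triangle))
costs = [sum(d for (d, _, _) in t) for t in trees]
print(len(trees), costs)
```

All three two-edge subsets are spanning trees of the same total cost, so any of them is a valid MCST.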