# Minimum Cost Spanning Trees (Prim's Algorithm)

### Minimum Cost Spanning Tree (MCST)

* Weighted undirected graph, $G = (V, E), W: E \rightarrow \mathbb{R}$
  - $G$ assumed to be connected
* Find a minimum cost spanning tree
  - Tree connecting all the vertices in $V$
* Strategy
  - Incrementally grow the minimum cost spanning tree
  - Start with a smallest weight edge overall
  - Extend the current tree by adding the smallest edge from the tree to a vertex not yet in the tree

**Example**

![Graph](https://firebasestorage.googleapis.com/v0/b/fb-sandbox-25.appspot.com/o/mcspt-2.png?alt=media&token=2634c518-541d-4e6e-b2c9-c993a2041cf2)

* Start with the smallest edge $(1, 3)$
* Extend the tree with $(1, 0)$
* Can't add $(0, 3)$, forms a cycle
* Instead, extend the tree with $(1, 2)$
* Extend the tree with $(2, 4)$

### Prim's Algorithm

* $G = (V, E), W: E \rightarrow \mathbb{R}$
* Incrementally build an MCST
  - $TV \subseteq V$: tree vertices, already added to MCST
  - $TE \subseteq E$: tree edges, already added to MCST
* Initially, $TV = TE = \emptyset$
* Choose minimum weight edge $e = (i, j)$
  - Set $TV = \{i, j\}, TE = \{e\}$
* Repeat $n - 2$ times
  - Choose minimum weight edge $f = (u, v)$ such that $u \in TV, v \notin TV$
  - Add $v$ to $TV$, $f$ to $TE$
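
The steps above can be sketched directly with Python sets. This is a minimal illustration rather than an efficient implementation: the function name `prim_sets`, the `(u, v, weight)` edge format, and the sample weights are assumptions made for this example and are not read off the figure.

```python
def prim_sets(n, weighted_edges):
    """Grow an MCST on vertices 0..n-1 from a list of (u, v, weight) edges."""
    # Start with the overall minimum weight edge.
    (i, j, w) = min(weighted_edges, key=lambda e: e[2])
    TV = {i, j}
    TE = [(i, j, w)]

    # Each of the remaining n - 2 rounds adds one new vertex.
    for _ in range(n - 2):
        # Edges with exactly one endpoint inside the tree.
        crossing = [(u, v, w) for (u, v, w) in weighted_edges
                    if (u in TV) != (v in TV)]
        if not crossing:
            break  # graph is not connected
        (u, v, w) = min(crossing, key=lambda e: e[2])
        TV.add(v if u in TV else u)
        TE.append((u, v, w))
    return TV, TE


# Hypothetical 5-vertex graph; the weights are invented for illustration.
edges = [(0, 1, 10), (0, 3, 18), (1, 2, 20), (1, 3, 6), (2, 4, 8), (3, 4, 70)]
TV, TE = prim_sets(5, edges)
print(TE)  # [(1, 3, 6), (0, 1, 10), (1, 2, 20), (2, 4, 8)]
```

Edge $(0, 3)$ is never chosen here because both of its endpoints are already in $TV$ by the time it would be the cheapest candidate.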

**Example**

![Graph](https://firebasestorage.googleapis.com/v0/b/fb-sandbox-25.appspot.com/o/mcspt-2.png?alt=media&token=2634c518-541d-4e6e-b2c9-c993a2041cf2)

$TV = \{1, 3, 0, 2, 4\}$

$TE = \{(1, 3), (1, 0), (1, 2), (2, 4)\}$

### Correctness of Prim's Algorithm

**Minimum Separator Lemma**
* Let $V$ be partitioned into 2 non-empty sets $U$ and $W = V \setminus U$
* Let $e = (u, w)$ be the minimum cost edge with $u \in U, w \in W$
* Every MCST must include $e$

-------------------------------------------------------------------------------
* Assume for now, that all the edge weights are distinct
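* Suppose, for contradiction, that $T$ is an MCST with $e \notin T$
* $T$ contains a path $p$ from $u$ to $w$
  - $p$ starts in $U$, ends in $W$
  - Let $f = (u', w')$ be the first edge on $p$ crossing from $U$ to $W$
  - Since $e$ is the minimum cost edge across the partition and weights are distinct, $W(e) \lt W(f)$
  - Drop $f$, add $e$ to get a cheaper spanning tree, contradicting the choice of $T$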
-------------------------------------------------------------------------------
* Now, what if 2 edges have the same weight?
* Assign each edge a unique index from $0$ to $m - 1$
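* Define $(e, i) \lt (f, j)$ if $W(e) \lt W(f)$, or $W(e) = W(f)$ and $i \lt j$
  - Under this ordering no two edges compare as equal, so the argument for distinct weights applies unchanged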
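
The tie-breaking rule maps naturally onto Python's lexicographic tuple comparison. The sketch below finds the minimum edge across a partition, using each edge's position in the list as its index; the helper name `min_crossing_edge` and the sample graph are hypothetical.

```python
def min_crossing_edge(edges, U):
    """Return the smallest edge with one endpoint in U and the other outside.

    edges is a list of (u, v, weight); the list position acts as the unique
    edge index, so ties on weight are broken by the smaller index.
    """
    crossing = [(w, idx, (u, v)) for idx, (u, v, w) in enumerate(edges)
                if (u in U) != (v in U)]
    # Tuples compare lexicographically: weight first, then index.
    (w, idx, e) = min(crossing)
    return e, w, idx


# Hypothetical graph with two crossing edges of equal weight 5.
edges = [(0, 1, 7), (1, 2, 5), (0, 2, 5), (2, 3, 9)]
print(min_crossing_edge(edges, {0, 1}))  # ((1, 2), 5, 1)
```

Both $(1, 2)$ and $(0, 2)$ cross the partition with weight 5, and the edge with the smaller index wins. The function raises `ValueError` if no edge crosses the partition.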

![Graph](https://firebasestorage.googleapis.com/v0/b/fb-sandbox-25.appspot.com/o/prim-1.png?alt=media&token=31134ba9-0f43-4101-af35-903d7f66e358)

### Correctness of Prim's Algorithm

* In Prim's Algorithm, $TV$ and $W = V \setminus TV$ partition $V$
* Algorithm picks the smallest edge connecting $TV$ and $W$, which must belong to every MCST
* In fact, for any $v \in V, \{v\}$ and $V \setminus \{v\}$ form a partition
* The smallest weight edge leaving any vertex must belong to every MCST
* We started with the overall minimum cost edge
* Instead, can start at any vertex $v$, with $TV = \{v\}$ and $TE = \emptyset$
* First iteration will pick minimum cost edge from $v$
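
The claim that the smallest edge leaving any vertex belongs to the MCST can be checked by brute force on a small graph. The sketch below assumes distinct weights and is exponential in the number of edges, so it is only a sanity check; the helper names and the sample graph are hypothetical.

```python
from itertools import combinations


def is_spanning_tree(n, tree_edges):
    """True if the n - 1 given edges connect vertices 0..n-1 without a cycle."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for (u, v, _) in tree_edges:
        (ru, rv) = (find(u), find(v))
        if ru == rv:
            return False  # this edge closes a cycle
        parent[ru] = rv
    return True  # n - 1 acyclic edges on n vertices form a spanning tree


def brute_force_mcst(n, edges):
    """Try every set of n - 1 edges and keep the cheapest spanning tree."""
    trees = [t for t in combinations(edges, n - 1) if is_spanning_tree(n, t)]
    return min(trees, key=lambda t: sum(w for (_, _, w) in t))


# Hypothetical graph with distinct weights.
edges = [(0, 1, 10), (0, 3, 18), (1, 2, 20), (1, 3, 6), (2, 4, 8), (3, 4, 70)]
mcst = brute_force_mcst(5, edges)
for v in range(5):
    # Cheapest edge with v as one of its endpoints.
    lightest = min((e for e in edges if v in e[:2]), key=lambda e: e[2])
    print(v, lightest, lightest in mcst)
```

For this graph every vertex's lightest edge appears in the tree found by exhaustive search, as the lemma predicts.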

### Implementation

* Keep track of
  - `visited[v]` - is `v` in the spanning tree?
  - `distance[v]` - shortest distance from `v` to the tree
  - `TreeEdges` - edges in the current spanning tree
* Initialize `visited[v]` to `False`, `distance[v]` to `infinity`
* First add vertex `0` to tree
* Find edge `(u, v)` leaving the tree where `distance[v]` is minimum, add it to the tree, update `distance[w]` of neighbours

```python
def prim_list(WList):
    # Any value larger than every edge weight serves as "infinity"
    infinity = 1 + max([d for u in WList.keys()
                        for (v, d) in WList[u]])
    (visited, distance, TreeEdges) = ({}, {}, [])

    for v in WList.keys():
        (visited[v], distance[v]) = (False, infinity)

    # Start the tree at vertex 0
    visited[0] = True
    for (v, d) in WList[0]:
        distance[v] = d

    for i in WList.keys():
        (mindist, nextv, nexte) = (infinity, None, None)

        # Scan every edge for the cheapest one leaving the tree: O(m)
        for u in WList.keys():
            for (v, d) in WList[u]:
                if visited[u] and (not visited[v]) and d < mindist:
                    (mindist, nextv, nexte) = (d, v, (u, v))

        # No edge leaves the tree: every reachable vertex has been added
        if nextv is None:
            break

        visited[nextv] = True
        TreeEdges.append(nexte)

        # Record the distance of the new vertex's neighbours to the tree
        for (v, d) in WList[nextv]:
            if not visited[v]:
                distance[v] = min(distance[v], d)

    return TreeEdges
```
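
`prim_list` reads the graph as a dictionary that maps each vertex to a list of `(neighbour, weight)` pairs, with every undirected edge stored in both directions. A minimal sketch of building that structure from an edge list follows; the helper name `build_wlist` and the sample edges are hypothetical.

```python
def build_wlist(n, edges):
    """Build an adjacency list {u: [(v, weight), ...]} for vertices 0..n-1."""
    WList = {u: [] for u in range(n)}
    for (u, v, d) in edges:
        # Undirected graph: record the edge in both directions.
        WList[u].append((v, d))
        WList[v].append((u, d))
    return WList


# Hypothetical edge list of (u, v, weight) triples.
edges = [(0, 1, 10), (0, 3, 18), (1, 2, 20), (1, 3, 6), (2, 4, 8), (3, 4, 70)]
WList = build_wlist(5, edges)
print(WList[1])  # [(0, 10), (2, 20), (3, 6)]
```

Calling `prim_list(WList)` on this graph should return `[(0, 1), (1, 3), (1, 2), (2, 4)]`, since the tree is grown from vertex `0`.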

### Complexity

* Initialization takes $O(n)$
* Loop to add nodes to the tree runs $O(n)$ times
* Each iteration takes $O(m)$ time to find a node to add
* Overall time is $O(mn)$, which could be $O(n^3)$ for dense graphs!
* Can we do better than this?
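
The $O(n^3)$ worst case follows from the maximum number of edges in a simple undirected graph on $n$ vertices:

$$
m \le \binom{n}{2} = \frac{n(n-1)}{2} = O(n^2)
\quad\Longrightarrow\quad
O(mn) = O(n^2 \cdot n) = O(n^3)
$$

For a sparse graph with $m = O(n)$, the same bound gives $O(n^2)$.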

### Improved Implementation

* For each `v`, keep track of its nearest neighbour in the tree
  - `visited[v]` -  is `v` in the spanning tree?
  - `distance[v]` - shortest distance from `v` to the tree
  - `nbr[v]` - nearest neighbour of `v` in tree
* Scan all the non-tree vertices to find `nextv` with minimum distance
* Then `(nbr[nextv], nextv)` is the tree edge to add
* Update `distance[v]` and `nbr[v]` for all the neighbours of `nextv`

```python
def prim_list2(WList):
    infinity = 1 + max([d for u in WList.keys()
                        for (v, d) in WList[u]])
    (visited, distance, nbr) = ({}, {}, {})

    for v in WList.keys():
        (visited[v], distance[v], nbr[v]) = (False, infinity, -1)

    # Start the tree at vertex 0
    visited[0] = True
    for (v, d) in WList[0]:
        (distance[v], nbr[v]) = (d, 0)

    for i in range(1, len(WList.keys())):
        # Scan the non-tree vertices for the one closest to the tree: O(n)
        nextd = min([distance[v] for v in WList.keys()
                     if not visited[v]])

        # No remaining vertex is reachable from the tree
        if nextd == infinity:
            break

        nextvlist = [v for v in WList.keys()
                     if (not visited[v]) and distance[v] == nextd]
        nextv = min(nextvlist)
        visited[nextv] = True

        # Update a neighbour only when nextv is strictly closer to it
        for (v, d) in WList[nextv]:
            if (not visited[v]) and d < distance[v]:
                (distance[v], nbr[v]) = (d, nextv)

    return nbr
```
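
`prim_list2` returns the `nbr` dictionary rather than a list of edges. Since `(nbr[v], v)` is the tree edge that brought `v` into the tree, the edge list can be recovered as sketched below; the helper name `tree_edges_from_nbr` and the sample dictionary are hypothetical.

```python
def tree_edges_from_nbr(nbr):
    """Turn {v: nearest tree neighbour} into a list of (neighbour, v) edges."""
    # -1 marks the start vertex (or a vertex never reached), which has no edge.
    return [(u, v) for (v, u) in sorted(nbr.items()) if u != -1]


# Hypothetical nbr dictionary of the shape returned by prim_list2.
nbr = {0: -1, 1: 0, 2: 1, 3: 1, 4: 2}
print(tree_edges_from_nbr(nbr))  # [(0, 1), (1, 2), (1, 3), (2, 4)]
```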

### Improved Implementation - Complexity

* Now the scan to find the next vertex to add is $O(n)$
* Very similar to Dijkstra's algorithm, except for the update rule for distance
* Like Dijkstra's algorithm, this is still $O(n^2)$ even for adjacency lists
* With a more clever data structure to extract the minimum, we can do better
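
As one possible illustration of such a data structure, the sketch below uses a binary heap from Python's standard `heapq` module. This is an assumption about what "better" could look like, not the implementation these notes go on to develop. The function name `prim_heap` and the sample graph are hypothetical, and vertices are assumed to be comparable so that ties on weight can be resolved by tuple comparison.

```python
import heapq


def prim_heap(WList, start=0):
    """Prim's algorithm with a binary heap; returns the list of tree edges."""
    visited = {v: False for v in WList}
    visited[start] = True
    TreeEdges = []

    # Heap entries are (weight, u, v) for edges leaving the tree at u.
    heap = [(d, start, v) for (v, d) in WList[start]]
    heapq.heapify(heap)

    while heap and len(TreeEdges) < len(WList) - 1:
        (d, u, v) = heapq.heappop(heap)
        if visited[v]:
            continue  # stale entry: v joined the tree earlier
        visited[v] = True
        TreeEdges.append((u, v))
        for (w, dw) in WList[v]:
            if not visited[w]:
                heapq.heappush(heap, (dw, v, w))
    return TreeEdges


# Hypothetical graph as {u: [(v, weight), ...]}, edges in both directions.
WList = {0: [(1, 10), (3, 18)],
         1: [(0, 10), (2, 20), (3, 6)],
         2: [(1, 20), (4, 8)],
         3: [(0, 18), (1, 6), (4, 70)],
         4: [(2, 8), (3, 70)]}
print(prim_heap(WList))  # [(0, 1), (1, 3), (1, 2), (2, 4)]
```

Each edge is pushed at most twice, once from each endpoint, so the heap operations cost $O(m \log n)$ in total. Entries for vertices that are already in the tree are skipped when popped instead of being removed eagerly.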

### Summary

* Prim's algorithm grows an MCST starting with any vertex
* At each step, connect one more vertex to the tree using minimum cost edge from inside the tree to outside the tree
* Correctness follows from Minimum Separator Lemma
* Implementation similar to Dijkstra's algorithm
  - Update rule for distance is different
* Complexity is $O(n^2)$
  - Even with adjacency lists
  - Bottleneck is identifying unvisited vertex with minimum distance
  - Need a better data structure to identify and remove minimum (or maximum) from a collection