In [1]:
# automatically reload dependent notebooks
%load_ext autoreload
%autoreload 2
import import_ipynb

# Minimum Spanning Trees

Finding a minimum spanning tree (MST) is one of the fundamental graph problems. Given a connected, undirected graph $G = (V, E)$, an MST algorithm discovers an acyclic subset of edges $T \subseteq E$ that connects all the vertices and has the minimum total weight $w(T) = \sum_{(u, v) \in T}{w(u, v)}$, where $w : E \rightarrow \mathbb{R}$ is the weight function that associates each edge with a weight value. See CLRS 4ed Chapter 21 *Minimum Spanning Trees* p.607.

CLRS begins by presenting a generic MST algorithm on p.587. This algorithm, as the name suggests, is just a conceptual description. The core idea is to maintain a set $A$ of edges that is always a subset of some MST. A *safe edge* is one that can be added to $A$ without violating the condition that $A \subseteq T$, where $T$ is an MST of the graph. The algorithm incrementally adds safe edges to $A$, one at a time, until $A = T$.
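The generic loop described above can be sketched in a few lines. This is a minimal illustration, not the CLRS pseudocode: the names `generic_mst` and `WEdge` are hypothetical, edges are plain `(u, v, weight)` tuples rather than the notebook's `Edge` objects, and the graph is assumed to be connected. To find a safe edge, the sketch relies on the cut property: if no edge of $A$ crosses a cut, the lightest edge crossing that cut is safe. Here the cut is always taken around the tree containing the first vertex, so this particular choice behaves much like Prim's algorithm.

```python
from typing import Dict, List, Set, Tuple

WEdge = Tuple[str, str, float]  # hypothetical (u, v, weight) triple

def generic_mst(vertices: List[str], edges: List[WEdge]) -> List[WEdge]:
    """Grow A one safe edge at a time; assumes a connected, undirected graph."""
    a: List[WEdge] = []  # the edge set A
    # comp[v] is the set of vertices in v's tree within the forest (V, A)
    comp: Dict[str, Set[str]] = {v: {v} for v in vertices}
    while len(a) < len(vertices) - 1:
        # pick one tree S of the forest; no edge of A crosses the cut (S, V - S)
        s = comp[vertices[0]]
        # edges with exactly one endpoint in S cross the cut
        crossing = [e for e in edges if (e[0] in s) != (e[1] in s)]
        # the lightest crossing edge is safe for A
        u, v, w = min(crossing, key=lambda e: e[2])
        a.append((u, v, w))
        merged = comp[u] | comp[v]
        for x in merged:
            comp[x] = merged
    return a

vs = ["a", "b", "c", "d"]
es = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3), ("c", "d", 4), ("b", "d", 5)]
tree = generic_mst(vs, es)
print(sum(w for _, _, w in tree))  # 7
```

Scanning every edge on each iteration is deliberately naive. Kruskal's and Prim's algorithms differ mainly in how they find the next safe edge efficiently.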

There are two actual implementations of the generic algorithm: Kruskal's algorithm and Prim's algorithm, named after their inventors. Both are greedy algorithms in the sense that, at each step, they add the lightest edge their rule allows: Kruskal's picks the minimum-weight edge joining two different trees of the forest, and Prim's picks the minimum-weight edge joining the growing tree to a vertex outside it. We will use the structure of the generic MST algorithm and customise it in accordance with Kruskal's and Prim's specifications to obtain the actual implementations.

By the way, many algorithms in CS are named after their inventors. In academia, naming your invention after yourself is poor form, but others naming it after you is high praise. In this instance, the CS community named these two important algorithms after their inventors to honour them.

## MST graph with weighted edges

MST algorithms use weighted edges. So, we define the weighted edge `WgtEdge` and `MSTGraph` which uses this new edge.

In [2]:
from graph import *
from util import *

class WgtEdge(Edge):
    def __init__(self, u: Vertex, v: Vertex, wgt: float=Infinity):
        super().__init__(u, v)
        self.wgt: float = wgt

    def __str__(self) -> str: return f"{self.tag}: {self.showWeight()}"
    def show(self) -> str: return f"{self.showWeight()}"
    def showWeight(self) -> str: return str(self.wgt) if self.wgt != Infinity else ""

class MSTGraph(Graph):
    def __init__(self, tag: Tag):
        super().__init__(tag)
    
    def makeVEw(self, vs: List[Tag], es: Dict[Tag, List[Tag]], ew: Dict[Tag, float]) -> None:
        self.makeV(vs)
        self.makeEw(es, ew)
    def makeEw(self, es: Dict[Tag, List[Tag]], ew: Dict[Tag, float]) -> None:
        # es maps each vertex tag to its adjacent vertex tags; ew maps each edge tag to its weight
        for utag, vtags in es.items():
            for vtag in vtags:
                u = self.getV(utag)
                v = self.getV(vtag)
                e = WgtEdge(u, v, ew[makeEtag(u, v)])
                self.ee[e.tag] = e



## Kruskal's algorithm

Kruskal's algorithm starts out with a forest of single-vertex trees and connects them with safe edges until a single tree spans the graph. Initially, each vertex is its own tree. These trees are tracked in the `ds` disjoint set, an instance of `DSet`, which is defined in the `util` notebook. `Graph` and `Tree`, of course, come from the `graph` notebook. And because we assign weights as edge attributes during graph construction, we do not need to pass the $w : E \rightarrow \mathbb{R}$ weight function into `mstKruskal()`.
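For readers without the `util` notebook at hand, the three disjoint-set operations that Kruskal's algorithm needs can be sketched as follows. `SimpleDSet` is a hypothetical stand-in written for this illustration and is not the notebook's `DSet`. It uses the common union-by-rank and path-compression heuristics, which the real class may or may not use.

```python
from typing import Dict, Hashable

class SimpleDSet:
    """Hypothetical stand-in for a disjoint set: union by rank with path compression."""
    def __init__(self) -> None:
        self.par: Dict[Hashable, Hashable] = {}  # parent pointer of each element
        self.rank: Dict[Hashable, int] = {}      # upper bound on the height of each root

    def make_set(self, x: Hashable) -> None:
        # x starts out as the sole member, and representative, of its own set
        self.par[x] = x
        self.rank[x] = 0

    def find_set(self, x: Hashable) -> Hashable:
        # return the representative of x's set, compressing the path on the way back
        if self.par[x] != x:
            self.par[x] = self.find_set(self.par[x])
        return self.par[x]

    def union(self, x: Hashable, y: Hashable) -> None:
        rx, ry = self.find_set(x), self.find_set(y)
        if rx == ry:
            return  # already in the same set
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx  # make rx the root with the larger rank
        self.par[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

ds = SimpleDSet()
for x in "abcd":
    ds.make_set(x)
ds.union("a", "b")
ds.union("c", "d")
print(ds.find_set("a") == ds.find_set("b"))  # True
print(ds.find_set("a") == ds.find_set("c"))  # False
```

In Kruskal's algorithm, two endpoints with different representatives lie in different trees, so the edge between them cannot close a cycle.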

In [3]:
def mstKruskal(g: MSTGraph) -> Tree:
    # initialize
    a: ESet = {} # edge set of MST
    ds = DSet(attr=lambda e: e.wgt) # forests disjoint set
    for u in g.getVV(): ds.makeSet(u)
    # discover MST in graph g
    for e in sorted(g.getEE(), key=lambda e: e.wgt): # edges ascending sorted by their weights
        if ds.findSet(e.u) != ds.findSet(e.v):
            a[e.tag] = e
            ds.union(e.u, e.v)
    # extract MST t from graph g using tree edge set a
    t = Tree(f"{g.tag}†")
    for e in a.values():
        t.insE(e)
        t.insV(e.u)
        t.insV(e.v)
    return t

## Prim's algorithm

Prim's algorithm starts out with a single root vertex and collects safe edges, one at a time, until a spanning tree forms. Again, since we use the `wgt` attribute of `WgtEdge`, we need not supply the $w : E \rightarrow \mathbb{R}$ weight function to `mstPrim()`. The argument `r` is the root vertex chosen arbitrarily from the vertex set of the connected, undirected graph `g`. Prim's algorithm employs a min-priority queue to keep track of vertices that are not yet part of the MST. For that, we use the `PriorityQueue` class from Python's standard `queue` module. See §6.5 *Priority queues* p.172. CLRS gives each vertex a priority attribute called $v.key$, which is the minimum weight of any edge connecting $v$ to a vertex in the growing tree $A$; in our code, this attribute is `pri`. So, we define the prioritised vertex `PriVertex` and the `PrimMSTGraph` that uses this new vertex type.

In [4]:
from queue import PriorityQueue
from util import *

class PriVertex(Vertex):
    def __init__(self, tag: Tag):
        super().__init__(tag)
        self.pri: float = Infinity

    def __str__(self) -> str: return f"{super().__str__()} {self.priority()}"
    def priority(self) -> str: return f"{self.pri if self.pri != Infinity else ''}"

    def __lt__(self, v: "PriVertex") -> bool: return self.pri < v.pri # needed by PriorityQueue

class PrimMSTGraph(MSTGraph):
    def __init__(self, tag: Tag):
        super().__init__(tag)
    
    def makeV(self, vs: List[Tag]) -> None:
        for vtag in vs: self.vv[vtag] = PriVertex(vtag)

def mstPrim(g: PrimMSTGraph, r: PriVertex) -> Tree: # vertices must be PriVertex so that the queue can order them
    # initialize
    for u in g.getVV():
        u.par = None
        u.pri = Infinity
    r.pri = 0
    q = PriorityQueue()
    for u in g.getVV(): q.put(u)
    # discover MST in graph g
    while not q.empty():
        u = q.get()
        for v in g.adj(u):
            e = g.getE(makeEtag(u, v))
            if v in q.queue and e.wgt < v.pri:
                v.par = u
                v.pri = e.wgt
                q.queue.sort() # rearrange q to account for decreased v.pri
    # extract MST t from graph g using tree vertices vv
    t = Tree(f"{g.tag}†")
    for v in g.getVV():
        t.insV(v)
        if not v.isRoot(): t.insE(g.getE(makeEtag(v, v.par))) # see p.596
    return t
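The implementation above re-sorts the whole queue after every key decrease, which is simple but does more work than necessary. A common alternative, sketched below under stated assumptions, is to use `heapq` with lazy deletion: instead of decreasing a key in place, push a new entry and skip stale entries when they are popped. The function name `prim_lazy` and the adjacency-list format (`adj` maps each vertex tag to a list of `(neighbour, weight)` pairs) are hypothetical and independent of the `graph` notebook. The graph is assumed to be connected and undirected.

```python
import heapq
from typing import Dict, List, Set, Tuple

def prim_lazy(adj: Dict[str, List[Tuple[str, float]]], r: str) -> List[Tuple[str, str, float]]:
    """Return MST edges as (parent, vertex, weight), growing the tree from root r."""
    in_tree: Set[str] = set()
    tree: List[Tuple[str, str, float]] = []
    heap: List[Tuple[float, str, str]] = [(0.0, r, r)]  # (key, vertex, parent)
    while heap:
        key, u, par = heapq.heappop(heap)
        if u in in_tree:
            continue  # stale entry: u was already reached by a lighter edge
        in_tree.add(u)
        if u != par:
            tree.append((par, u, key))  # the root has no tree edge
        for v, w in adj[u]:
            if v not in in_tree:
                heapq.heappush(heap, (w, v, u))  # stands in for a decrease-key
    return tree

adj = {
    "a": [("b", 1), ("c", 3)],
    "b": [("a", 1), ("c", 2), ("d", 5)],
    "c": [("b", 2), ("a", 3), ("d", 4)],
    "d": [("c", 4), ("b", 5)],
}
print(prim_lazy(adj, "a"))  # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 4)]
```

The trade-off is that the heap may hold up to one entry per edge direction rather than one per vertex, in exchange for avoiding both the membership scan and the full re-sort.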

## Conclusion

In this notebook, we implemented Kruskal's and Prim's MST algorithms. The MSTs produced by this code may look different from those in CLRS. But upon close inspection, you will note that our trees are equivalent to the ones shown in Figure 21.4 p.593 (Kruskal's) and Figure 21.5 p.595 (Prim's): the total edge weights of our trees and of those in CLRS are identical. Such is the nature of graph algorithms; different, but correct, results may be obtained by different implementations of the same algorithm. This happens whenever several edges share the same weight, because ties can be broken in more than one way. Many factors can affect the appearance of the results: the choice of the first vertex to process, the order of vertex processing, the choice of the first edge to process, the order of edge processing, implementation details of the data structures like FIFO queues and priority queues, whether it is raining outside, and so on.
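The effect of tie-breaking is easy to reproduce on a toy graph. The sketch below is a compact, self-contained Kruskal's over `(u, v, weight)` tuples; the function name `kruskal` and the four-vertex graph are made up for this illustration and do not come from CLRS or from the notebooks. Edges `b-c` and `a-c` share weight 2, so swapping their input order changes which of them ends up in the tree, yet the total weight stays the same.

```python
from typing import Dict, List, Tuple

WEdge = Tuple[str, str, int]  # hypothetical (u, v, weight) triple

def kruskal(vertices: List[str], edges: List[WEdge]) -> List[WEdge]:
    parent: Dict[str, str] = {v: v for v in vertices}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree: List[WEdge] = []
    # sorted() is stable, so edges of equal weight keep their input order
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:  # endpoints lie in different trees, so no cycle is formed
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

vs = ["a", "b", "c", "d"]
es = [("a", "b", 1), ("b", "c", 2), ("a", "c", 2), ("c", "d", 3)]
t1 = kruskal(vs, es)
t2 = kruskal(vs, [es[0], es[2], es[1], es[3]])  # swap the two tied edges
print(t1)  # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]
print(t2)  # [('a', 'b', 1), ('a', 'c', 2), ('c', 'd', 3)]
print(sum(w for _, _, w in t1) == sum(w for _, _, w in t2))  # True
```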