# Minimum spanning tree

A minimum spanning tree (MST) is a subset of the edges of a connected, weighted, undirected graph that:

* Connects all vertices together
* Contains no cycle
* Has the minimum possible total edge weight
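
The first two conditions can be checked mechanically. The sketch below is only an illustration: `is_spanning_tree` is a hypothetical helper name, and edges are assumed to be `(u, v, weight)` triples. It does not check minimality, which would require comparing against the other spanning trees of the graph.

```python
def is_spanning_tree(vertices, tree_edges):
    """Return True if tree_edges connects every vertex without forming a cycle."""
    vertices = list(vertices)
    if not vertices:
        return False
    # A tree on V vertices has exactly V - 1 edges.
    if len(tree_edges) != len(vertices) - 1:
        return False
    adjacency = {v: [] for v in vertices}
    for u, v, _weight in tree_edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    # Depth-first search from an arbitrary vertex.
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    # V - 1 edges plus full connectivity implies there is no cycle.
    return len(seen) == len(vertices)


print(is_spanning_tree("ABC", [("A", "B", 2), ("B", "C", 2)]))  # True
```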

<u>NB:</u> Difference from SSSP (single-source shortest path)

* A minimum spanning tree finds the cheapest way to connect all vertices
* Single-source shortest path takes a source vertex and finds the cheapest path from this source to every other vertex
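
A small worked example may help. The triangle graph and its weights below are made up for illustration. Both trees are written out by hand rather than computed, so that the arithmetic stays visible.

```python
# Hypothetical triangle graph: A-B costs 2, B-C costs 2, A-C costs 3.
weights = {("A", "B"): 2, ("B", "C"): 2, ("A", "C"): 3}

mst = [("A", "B"), ("B", "C")]         # cheapest way to connect all vertices
spt_from_a = [("A", "B"), ("A", "C")]  # shortest paths from the source A

mst_total = sum(weights[edge] for edge in mst)         # 2 + 2 = 4
spt_total = sum(weights[edge] for edge in spt_from_a)  # 2 + 3 = 5

# Distance from A to C inside each tree.
a_to_c_in_mst = weights[("A", "B")] + weights[("B", "C")]  # 4, goes through B
a_to_c_in_spt = weights[("A", "C")]                        # 3, direct edge

print(mst_total, spt_total)          # 4 5
print(a_to_c_in_mst, a_to_c_in_spt)  # 4 3
```

The MST is cheaper overall (4 versus 5), but the shortest-path tree reaches C from A at a lower cost (3 versus 4).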

> Be careful not to confuse these two concepts

## Disjoint Set

We cover the disjoint set here because the minimum spanning tree algorithm uses it.

A disjoint set is a data structure that keeps track of elements partitioned into a number of disjoint (non-overlapping) sets. Each set has a representative, which identifies that set.

### MakeSet(N)

Creates the initial sets: each of the N elements starts in its own set

```py
A   B   C   D   E
```

### Union(x, y)

Merges the two sets that contain x and y

e.g: union(A, B)

```python
AB  C   D   E
```

e.g: union(A, E)


```python
ABE  C   D
```

### findSet(x)

Returns the name of the set that contains the element x

e.g: findSet(B) $\rightarrow$ AB

findSet(E) $\rightarrow$ ABE
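
The three operations can be sketched naively with plain Python sets to reproduce the A to E example. This is a deliberately simple illustration, not an efficient implementation; the function names `make_set`, `union` and `find_set` and the list-of-sets layout are assumptions made for this sketch.

```python
def make_set(items):
    """Put every item in its own singleton set."""
    return [{item} for item in items]


def find_set(sets, x):
    """Return the set that currently contains x."""
    for group in sets:
        if x in group:
            return group
    raise KeyError(x)


def union(sets, x, y):
    """Merge the sets containing x and y (no-op if they are already the same)."""
    group_x, group_y = find_set(sets, x), find_set(sets, y)
    if group_x is not group_y:
        sets.remove(group_y)
        group_x |= group_y  # in-place merge keeps group_x inside the list


sets = make_set("ABCDE")
union(sets, "A", "B")
union(sets, "A", "E")
print(sorted(find_set(sets, "B")))  # ['A', 'B', 'E']
```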

Let's create a disjoint set in Python:

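```python
class DisjointClass:
    def __init__(self, vertices):
        self.vertices = vertices
        # Every vertex starts as its own parent, i.e. the root of its own set
        self.parent = {}
        for v in vertices:
            self.parent[v] = v
        # rank is an upper bound on the height of the tree rooted at each vertex
        self.rank = dict.fromkeys(vertices, 0)

    def find(self, item):
        # Follow parent links until reaching the root (the set representative)
        if self.parent[item] == item:
            return item
        else:
            return self.find(self.parent[item])

    def union(self, x, y):
        x_root = self.find(x)
        y_root = self.find(y)
        if x_root == y_root:
            return  # already in the same set; do not inflate the rank

        # Union by rank: attach the shallower tree under the deeper one
        if self.rank[x_root] < self.rank[y_root]:
            self.parent[x_root] = y_root
        elif self.rank[x_root] > self.rank[y_root]:
            self.parent[y_root] = x_root
        else:
            self.parent[y_root] = x_root
            self.rank[x_root] += 1

vertices = ["A", "B", "C", "D", "E"]

ds = DisjointClass(vertices=vertices)
ds.find("A")

ds.union("A", "B")
ds.union("A", "C")
print(ds.find("A"))
print(ds.find("B"))
```

Output:

```
A
A
```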
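
The `find` method walks the whole parent chain on every call. A common optimization, not used in these notes, is path compression: after locating the root, every vertex on the path is re-pointed directly at it. The sketch below assumes the same `parent` dictionary layout as `DisjointClass`; the function name `find_with_compression` is hypothetical.

```python
def find_with_compression(parent, item):
    """Return the root of item's set and flatten the path that led to it."""
    # First pass: locate the root.
    root = item
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every vertex on the path directly at the root.
    while parent[item] != root:
        parent[item], item = root, parent[item]
    return root


# A chain D -> C -> B -> A, with A as the root.
parent = {"A": "A", "B": "A", "C": "B", "D": "C"}
print(find_with_compression(parent, "D"))  # A
print(parent["D"], parent["C"])            # A A
```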


## Kruskal's algorithm

Kruskal's algorithm is a greedy algorithm (we'll see this family later). A greedy algorithm picks the best available choice at each step.

It finds a **minimum spanning tree** for **weighted undirected graphs** by following two rules:
* Add edges in increasing order of cost
* Skip any edge that would create a cycle

The pseudo-code looks like this:

```python
kruskal(G):
    cost = 0
    for each vertex v:
        makeSet(v)                                    # ===> O(V)
    sort edges in non-decreasing order by weight      # ===> O(E log E)
    for each edge (u, v):                             # ===> E iterations
        if findSet(u) != findSet(v):                  # ===> O(log V) with union by rank  |
            union(u, v)                               # ===> O(log V) with union by rank  |  O(E log V)
            cost = cost + weight(u, v)                # ===> O(1)                         |
```

Time complexity: $O(V + E \log E + E \log V) = O(E \log E)$

Space complexity: $O(V+E)$
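
A short derivation of why the sum collapses to a single term, assuming that each find and union costs $O(\log V)$ (which holds with union by rank) and that the graph is simple and connected, so $V - 1 \le E \le V^2$:

$$
\log E \le \log V^2 = 2 \log V \quad\Rightarrow\quad E \log E = O(E \log V)
$$

$$
V \le E + 1 \quad\Rightarrow\quad V + E \log E + E \log V = O(E \log E)
$$

Under these assumptions $O(E \log E)$ and $O(E \log V)$ describe the same bound, and the sorting step dominates.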

Let's implement Kruskal's algorithm in Python:

```python
# Uses the DisjointClass defined in the Disjoint Set section
class Graph:
    def __init__(self, vertices) -> None:
        self.vertices = vertices  # number of vertices
        self.graph = []           # edge list of [source, destination, weight]
        self.nodes = []
        self.mst = []

    def add_edge(self, s, d, w):
        self.graph.append([s, d, w])

    def add_node(self, value):
        self.nodes.append(value)

    def print_solution(self):
        for s, d, w in self.mst:
            print("%s - %s - %s" % (s, d, w))

    def kruskal(self):
        i, e = 0, 0  # i: index in the sorted edge list, e: edges accepted so far
        ds = DisjointClass(self.nodes)
        self.graph = sorted(self.graph, key=lambda item: item[2])
        # An MST has V - 1 edges; also stop if the edges run out (disconnected graph)
        while e < self.vertices - 1 and i < len(self.graph):
            s, d, w = self.graph[i]
            i += 1
            x = ds.find(s)
            y = ds.find(d)
            if x != y:  # different sets, so this edge cannot create a cycle
                e += 1
                self.mst.append([s, d, w])
                ds.union(x, y)
        self.print_solution()


g = Graph(5)
g.add_node("A")
g.add_node("B")
g.add_node("C")
g.add_node("D")
g.add_node("E")

g.add_edge("A", "B", 5)
g.add_edge("A", "C", 13)
g.add_edge("A", "E", 15)
g.add_edge("B", "A", 5)
g.add_edge("B", "C", 10)
g.add_edge("B", "D", 8)
g.add_edge("C", "A", 13)
g.add_edge("C", "B", 10)
g.add_edge("C", "E", 20)
g.add_edge("C", "D", 6)
g.add_edge("D", "B", 8)
g.add_edge("D", "C", 6)
g.add_edge("E", "A", 15)
g.add_edge("E", "C", 20)

g.kruskal()
```

Output:

```
A - B - 5
C - D - 6
B - D - 8
A - E - 15
```
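
The pseudo-code also accumulates a total `cost`, which the class above does not report. The function below is a compact, self-contained sketch that returns both the tree and its cost for the same graph. The name `kruskal_with_cost` is hypothetical, each undirected edge is listed once, and union by rank is left out to keep the sketch short.

```python
def kruskal_with_cost(nodes, edges):
    """Return (mst_edges, total_cost) for undirected edges given as (u, v, w)."""
    parent = {node: node for node in nodes}

    def find(item):
        # Walk up to the representative of item's set.
        while parent[item] != item:
            item = parent[item]
        return item

    mst, cost = [], 0
    for u, v, w in sorted(edges, key=lambda edge: edge[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two different components
            parent[root_v] = root_u
            mst.append((u, v, w))
            cost += w
    return mst, cost


edges = [("A", "B", 5), ("A", "C", 13), ("A", "E", 15), ("B", "C", 10),
         ("B", "D", 8), ("C", "D", 6), ("C", "E", 20)]
mst, cost = kruskal_with_cost("ABCDE", edges)
print(cost)  # 34
```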


## Prim's algorithm

Prim's algorithm is also a greedy algorithm. It finds the minimum spanning tree of a weighted undirected graph as follows:

1. Take any vertex as the source, set its weight to $0$ and the weight of every other vertex to infinity
2. For every unvisited vertex adjacent to the current vertex, if its current weight is greater than the weight of the connecting edge, set it to that edge weight
3. Mark the current vertex as visited
4. Repeat these steps for all vertices, always picking the unvisited vertex with the smallest weight next
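
The steps above can be sketched with a priority queue that always yields the unvisited vertex with the smallest weight. This is a minimal illustration under stated assumptions: the function name `prim_keys`, the dictionary-of-dictionaries adjacency format, and the use of `heapq` with lazy deletion of stale entries are choices made for this sketch.

```python
import heapq
import math


def prim_keys(adjacency, source):
    """Return MST edges as (parent, vertex, weight), following the four steps."""
    key = {v: math.inf for v in adjacency}  # step 1: all weights start at infinity
    parent = {v: None for v in adjacency}
    key[source] = 0                         # step 1: the source gets weight 0
    visited = set()
    heap = [(0, source)]
    while heap:
        _, u = heapq.heappop(heap)          # step 4: smallest weight first
        if u in visited:
            continue                        # stale heap entry, skip it
        visited.add(u)                      # step 3: mark as visited
        for v, w in adjacency[u].items():
            if v not in visited and w < key[v]:
                key[v] = w                  # step 2: lower the neighbour's weight
                parent[v] = u
                heapq.heappush(heap, (w, v))
    return [(parent[v], v, key[v]) for v in adjacency if parent[v] is not None]


adjacency = {
    "A": {"B": 10, "C": 20},
    "B": {"A": 10, "C": 30, "D": 5},
    "C": {"A": 20, "B": 30, "D": 15, "E": 6},
    "D": {"B": 5, "C": 15, "E": 8},
    "E": {"C": 6, "D": 8},
}
tree = prim_keys(adjacency, "A")
print(sum(w for _, _, w in tree))  # 29
```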

```python
import sys


class Graph:
    def __init__(self, vertex_num, edges, nodes):
        self.vertex_num = vertex_num
        self.edges = edges  # adjacency matrix; 0 means "no edge"
        self.nodes = nodes
        self.mst = []

    def print_solution(self):
        print("Edge : Weight")
        for s, d, w in self.mst:
            print("%s - %s - %s" % (s, d, w))

    # Time complexity O(V^3) | Extra space O(V), on top of the V x V matrix
    def prim(self):
        visited = [False] * self.vertex_num
        edge_num = 0
        visited[0] = True  # start from the first vertex
        while edge_num < self.vertex_num - 1:
            min_weight = sys.maxsize
            s = d = None
            # Scan every edge leaving the visited set and keep the cheapest one
            for i in range(self.vertex_num):
                if visited[i]:
                    for j in range(self.vertex_num):
                        if not visited[j] and self.edges[i][j]:
                            if min_weight > self.edges[i][j]:
                                min_weight = self.edges[i][j]
                                s = i
                                d = j
            if d is None:
                break  # no edge leaves the visited set: the graph is disconnected

            self.mst.append([self.nodes[s], self.nodes[d], self.edges[s][d]])
            visited[d] = True
            edge_num += 1
        self.print_solution()


edges = [[0, 10, 20, 0, 0],
         [10, 0, 30, 5, 0],
         [20, 30, 0, 15, 6],
         [0, 5, 15, 0, 8],
         [0, 0, 6, 8, 0]]

nodes = ["A", "B", "C", "D", "E"]

g = Graph(5, edges=edges, nodes=nodes)
g.prim()
```

Output:

```
Edge : Weight
A - B - 10
B - D - 5
D - E - 8
E - C - 6
```


### Comparison of Kruskal and Prim

#### Kruskal

* Concentrates on edges
* Finalizes one edge in each iteration


#### Prim

* Concentrates on vertices
* Finalizes one vertex in each iteration

### Applications

#### Kruskal applications

* Landing cables
* TV network
* Tour operations
* LAN networks
* A network of pipes for drinking water or natural gas
* An electric grid
* Single-link clustering


#### Prim applications
* Networks of roads and rail tracks connecting all the cities
* Irrigation channels and placing microwave towers
* Designing a fiber optic grid or ICs
* Approximation algorithms for the Traveling Salesman Problem
* Cluster analysis
* Pathfinding algorithms used in AI