## Kruskal's Minimum Spanning Tree Algorithm

The algorithm has three main steps:

- Sort the edges by weight in ascending order
- Walk through the sorted edges and look at the two nodes each edge connects
  - If the nodes are already unified (in the same component), skip the edge, since including it would create a cycle
  - Otherwise, include the edge and unify the nodes
- Terminate when every edge has been processed or all vertices are unified
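
The three steps above can be sketched on a small explicit edge list. This is a minimal illustration, not the implementation used later in these notes: the function name `kruskal`, the inline `find` helper, and the 4-node `sample_edges` graph are all made up for the example.

```python
def kruskal(num_nodes, edges):
    """Return (total_weight, chosen_edges) for edges given as (weight, u, v) tuples."""
    parent = list(range(num_nodes))  # every node starts as its own root

    def find(node):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    total, chosen = 0, []
    for weight, u, v in sorted(edges):       # step 1: sort by weight
        root_u, root_v = find(u), find(v)    # step 2: look at both endpoints
        if root_u == root_v:
            continue                         # already unified: skip this edge
        parent[root_u] = root_v              # otherwise unify the two components
        total += weight
        chosen.append((u, v, weight))
        if len(chosen) == num_nodes - 1:     # step 3: all vertices are unified
            break
    return total, chosen


# Hypothetical 4-node graph, edges written as (weight, u, v).
sample_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
total, chosen = kruskal(4, sample_edges)
print(total)   # 6
print(chosen)  # [(0, 1, 1), (1, 3, 2), (1, 2, 3)]
```

The edges of weight 4 and 5 are never examined because the loop stops as soon as `num_nodes - 1` edges have been accepted.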

## Kruskal Example

Input Graph:

![](../../../%20images/kruzkal_mst_input.png)

Output Graph:

![](../../../%20images/kruzkal_output_graph.png)

## Implementation

A relevant problem for practicing Kruskal's MST algorithm is the LeetCode problem [Min Cost to Connect All Points](https://leetcode.com/problems/min-cost-to-connect-all-points/description/).

Essentially, we are given a set of points on a 2D plane. The cost of an edge between two points `(x1, y1)` and `(x2, y2)` is defined as their **Manhattan distance**, which is `|x1 - x2| + |y1 - y2|`. We must find the least costly way to connect all points with edges.
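
As a quick check of the cost function, the Manhattan distance is the sum of the absolute differences of the two coordinates. The helper name `manhattan_distance` below is chosen for illustration only.

```python
def manhattan_distance(point_a, point_b):
    """Sum of the absolute coordinate differences between two 2D points."""
    (x1, y1), (x2, y2) = point_a, point_b
    return abs(x1 - x2) + abs(y1 - y2)


print(manhattan_distance([0, 0], [2, 2]))   # 4
print(manhattan_distance([3, 10], [5, 2]))  # 10
```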

## Brainstorming

After learning about minimum spanning trees, it should be clear that this problem asks for the total cost of an MST, except that the edges are not given to us. Any two points may be connected, so we first need to generate an edge, with its cost, for every pair of points.
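
One way to enumerate every pair of points, sketched here with the standard library's `itertools.combinations` (the implementation further down uses nested loops instead, which produce the same pairs):

```python
from itertools import combinations

points = [[0, 0], [2, 2], [3, 10], [5, 2], [7, 0]]

# Each pair (i, j) with i < j is one candidate edge between points[i] and points[j].
pairs = list(combinations(range(len(points)), 2))

print(len(pairs))  # 10
print(pairs[:4])   # [(0, 1), (0, 2), (0, 3), (0, 4)]
```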

- Create a UnionFind class, which Kruskal's algorithm uses to check whether two points are already connected
- Generate edges as tuples of (distance, pointA, pointB)
  - Putting distance first means sorting the tuples orders them by distance
- Sort the edges in increasing order
- Loop through the edges in that order
  - If the two points can be unified, add the edge's distance to a running cost
  - Also decrement a counter of the edges still needed (it starts at the number of points minus 1), so that we know when to stop early
- Return the cost
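
To see why putting the distance first is convenient: Python compares tuples element by element, so a plain `sort()` orders edges by distance with no key function. The distances below are hypothetical values, not computed from real points.

```python
# Hypothetical edges written as (distance, pointA, pointB).
edges = [(8, 0, 3), (4, 0, 1), (4, 1, 3), (10, 2, 3)]

# Tuples are compared left to right, so the distance decides the order
# and the point indices only break ties.
edges.sort()

print(edges)  # [(4, 0, 1), (4, 1, 3), (8, 0, 3), (10, 2, 3)]
```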

## Algorithm


# Good example of a shorter Union Find implementation made for this situation, no extra fluff :)
class UnionFind:

    def __init__(self, size):

        self.sizes = [1] * size       # size of the tree rooted at each node
        self.ids = list(range(size))  # each node starts as its own root

    def find(self, node):
        if node != self.ids[node]:
            self.ids[node] = self.find(self.ids[node]) # path compression
        return self.ids[node]

    def unify(self, node1, node2):
        root1 = self.find(node1)
        root2 = self.find(node2)

        if root1 == root2:
            return False  # already connected, so the caller should not add this edge's cost

        # Union by size: attach the smaller tree under the root of the larger one
        if self.sizes[root1] > self.sizes[root2]:
            self.ids[root2] = root1
            self.sizes[root1] += self.sizes[root2]
        else:
            self.ids[root1] = root2
            self.sizes[root2] += self.sizes[root1]
        return True

def minCostConnectPoints(points):

    def manhattan_dist_helper(coord1, coord2):
        return abs(coord1[0] - coord2[0]) + abs(coord1[1] - coord2[1])


    edges = []
    num_edges = len(points) - 1  # an MST over n points has exactly n - 1 edges
    cost = 0

    # Generate an edge for every unordered pair of points
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # i and j are the indices that identify the two points in the union find
            edges.append((manhattan_dist_helper(points[i], points[j]), i, j))

    
    edges.sort()
    uf = UnionFind(len(points))
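    for distance, node1, node2 in edges:
        if uf.unify(node1, node2):
            cost += distance
            num_edges -= 1  # one fewer edge needed to complete the MST

        # Stop early once the MST has all of its edges
        if num_edges == 0:
            break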
    
    return cost

minCostConnectPoints([[0,0],[2,2],[3,10],[5,2],[7,0]]) # -> 20


20

## Analysis:

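Time Complexity: O(E * log(E)), dominated by sorting the edges. This is the same as O(E * log(V)), since E is at most V^2. Here every pair of points is an edge, so E = n(n - 1)/2 and the total is O(n^2 * log(n)) for n points.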

Space: O(n^2) for the edge list, which holds all n(n - 1)/2 pairs of points. The union find itself only needs O(n).
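
To get a feel for how the edge list grows, this small sketch counts the edges generated for a few input sizes. The helper name `complete_graph_edges` and the sample sizes are chosen only for illustration.

```python
def complete_graph_edges(n):
    """Number of unordered pairs among n points, i.e. edges generated for n points."""
    return n * (n - 1) // 2


for n in (5, 100, 1000):
    print(n, complete_graph_edges(n))
# 5 10
# 100 4950
# 1000 499500
```

Growing the input by a factor of 10 grows the edge list by roughly a factor of 100, which is why the edge list dominates both the sorting time and the memory use.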