Input: a connected weighted graph, given as an edge list. Output: a minimum spanning tree (MST).

Sort the edges by ascending weight.

For all edges $e$ in ascending order:

- If neither of $e$'s endpoints is colored, give them a new color. Give $e$ the same color.
- If precisely one endpoint is colored, give the other endpoint and $e$ the same color.
- If both endpoints have the same color, ignore $e$ and continue the loop with the next edge.
- If the endpoints have different colors, change the color of the smaller tree to the color of the larger one. Give $e$ the same color.

Output the set of colored edges. Because the graph is connected, only one color remains at this point.

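Complexity: $O(m \log m)$ for graphs with $m$ edges.

In the last case we recolor the nodes and edges of the smaller tree only, so every recolored node ends up in a tree at least twice as large as the one it was in before. A node can therefore be recolored at most $\log_2 n$ times, where $n \le m + 1$ is the number of nodes, and all recoloring together costs $O(n \log n)$.

Sorting the edges costs $O(m \log m)$ and the loop visits each edge once, so the total is $O(m \log m)$.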
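The recoloring bound can be checked empirically. The sketch below is an illustration, not the implementation used later in this notebook: `kruskal_recolor_counts` is a hypothetical helper that runs the coloring procedure on `(u, v, weight)` triples and counts how often each node changes color. The random test graph (a shuffled path plus some extra edges) is an assumption made only to have a connected input.

```python
import math
import random


def kruskal_recolor_counts(num_nodes, edges):
    """Run the coloring procedure on (u, v, weight) triples.

    Returns the MST edges and, per node, how often its color changed
    after the node was first colored.
    """
    color_of = {}                 # node -> color
    members = {}                  # color -> list of nodes with that color
    recolored = [0] * num_nodes   # node -> number of color changes
    mst = []
    next_color = 0

    for u, v, w in sorted(edges, key=lambda e: e[2]):
        cu, cv = color_of.get(u), color_of.get(v)
        if cu is None and cv is None:
            color_of[u] = color_of[v] = next_color
            members[next_color] = [u, v]
            next_color += 1
        elif cu is None:
            color_of[u] = cv
            members[cv].append(u)
        elif cv is None:
            color_of[v] = cu
            members[cu].append(v)
        elif cu == cv:
            continue  # the edge would close a cycle
        else:
            # Recolor the smaller tree with the color of the larger one.
            small, large = (cu, cv) if len(members[cu]) <= len(members[cv]) else (cv, cu)
            moved = members.pop(small)
            for node in moved:
                color_of[node] = large
                recolored[node] += 1
            members[large].extend(moved)
        mst.append((u, v, w))
    return mst, recolored


# Assumed test input: a random connected graph on n nodes.
random.seed(0)
n = 64
order = list(range(n))
random.shuffle(order)
edges = [(order[i], order[i + 1], random.random()) for i in range(n - 1)]
edges += [(a, b, random.random())
          for a in range(n) for b in range(a + 1, n) if random.random() < 0.1]

mst, recolored = kruskal_recolor_counts(n, edges)
print(len(mst), max(recolored), math.log2(n))
```

The largest recolor count should never exceed $\log_2 n$, since each recoloring at least doubles the size of the tree a node belongs to.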

__Correctness__

Every minimum spanning tree contains __one of the minimal-weight edges__ of __each__ cut.

Proof:

Observations:  
1. Adding an edge to a tree gives a graph with a unique cycle.  
2. Removing an edge from that cycle gives back a tree.

Partition the nodes into two sets, the left ellipse and the right ellipse. These two sets are connected by several edges, the edges of the cut.

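Consider a minimum spanning tree and assume it does not contain a minimal-weight edge of the cut. Add such an edge $e$ to the tree; by observation 1 we get a unique cycle. The cycle has to cross the cut a second time, so it contains another cut edge $f$, and $f$ is heavier than $e$ because the tree contains no minimal-weight edge of the cut. By observation 2, removing $f$ from the cycle gives back a spanning tree, and the weight of the tree has decreased.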

Thus __a spanning tree must contain a minimal-weight edge of every cut, otherwise it is not a minimum spanning tree.__
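The cut property can be checked by brute force on a small graph. The sketch below is only an illustration under stated assumptions: the five-node graph with distinct weights is made up, and `is_spanning_tree`, `brute_force_mst` and `violates_cut_property` are hypothetical helper names. It enumerates every cut and tests whether the lightest crossing edge belongs to the tree.

```python
from itertools import combinations


def is_spanning_tree(num_nodes, tree_edges):
    """True if the given edges are acyclic and there are exactly n - 1 of them."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _ in tree_edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge closes a cycle
        parent[root_u] = root_v
    return len(tree_edges) == num_nodes - 1


def brute_force_mst(num_nodes, edges):
    """Try every set of n - 1 edges and keep the lightest spanning tree."""
    trees = (t for t in combinations(edges, num_nodes - 1)
             if is_spanning_tree(num_nodes, t))
    return min(trees, key=lambda t: sum(w for _, _, w in t))


def violates_cut_property(num_nodes, edges, tree):
    """Return the first cut whose lightest edge is missing from tree, else None."""
    for size in range(1, num_nodes):
        for side in combinations(range(num_nodes), size):
            side = set(side)
            crossing = [e for e in edges if (e[0] in side) != (e[1] in side)]
            if min(crossing, key=lambda e: e[2]) not in tree:
                return side
    return None


# Assumed example: connected graph, all weights different.
edges = [(0, 1, 4), (0, 2, 1), (1, 2, 3), (1, 3, 6),
         (2, 3, 5), (2, 4, 7), (3, 4, 2)]
mst = brute_force_mst(5, edges)
print(mst, violates_cut_property(5, edges, mst))

# A spanning tree that is not minimal fails the check for some cut.
heavier = [(0, 1, 4), (1, 2, 3), (2, 3, 5), (3, 4, 2)]
print(violates_cut_property(5, edges, heavier))
```

Enumerating all cuts is exponential in the number of nodes, so this is only practical as a sanity check on tiny inputs.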

1. We will show that each time Kruskal's algorithm adds an edge, it is a minimal-weight edge of some cut, and so it must belong to the minimum spanning tree.

Let's first make the minimum spanning tree unique by assuming that __all weights are different__. Then every cut has a unique minimal-weight edge, which must be in the tree, so adding it is always safe.

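Kruskal's algorithm always considers the lightest of the remaining edges. When it adds an edge $e$ that joins two trees, look at the cut that separates one of these trees from the rest of the graph. No edge of this cut is lighter than $e$: such an edge would have been considered earlier, and because its endpoints were in different trees at that time, it would have been added. So $e$ is the minimal-weight edge of this cut.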

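2. What happens if some weights are equal? We slightly modify the weights so that all ties are broken but the sorted order stays the same. The modified weights are all different, so the argument from the case above applies, and if the modifications are small enough the resulting tree is also a minimum spanning tree for the original weights. This proves correctness in general.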
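The tie-breaking step can be made concrete with a small sketch. `perturb` is a hypothetical helper, and the value of `epsilon` is an assumption: it has to be small enough that the total added amount stays below the smallest gap between two different weights. The edge weights are taken from the example matrix in the code cell of this notebook.

```python
def perturb(edges, epsilon=1e-9):
    """Break ties by adding a tiny, position-dependent amount to each weight.

    Edges are (u, v, weight) triples. Python's sort is stable, so edges with
    equal weight keep their input order, and the i-th edge in sorted order
    gets i * epsilon added to its weight.
    """
    ordered = sorted(edges, key=lambda e: e[2])
    return [(u, v, w + i * epsilon) for i, (u, v, w) in enumerate(ordered)]


edges = [(0, 2, 7), (0, 3, 5), (1, 2, 7), (1, 3, 1), (2, 3, 9)]
perturbed = perturb(edges)

original_order = [(u, v) for u, v, _ in sorted(edges, key=lambda e: e[2])]
perturbed_order = [(u, v) for u, v, _ in sorted(perturbed, key=lambda e: e[2])]
weights = [w for _, _, w in perturbed]

print(perturbed_order == original_order)   # the sorting is unchanged
print(len(set(weights)) == len(weights))   # all weights are now different
```

In practice the same effect is obtained without touching the weights, by relying on a stable sort or by sorting on a `(weight, u, v)` key.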

import numpy 

def kruskal(g):
#####################################
#                                   #
#  Defining the function            #
#  kruskal(g) here!                 #
#                                   #
#  -  The function                  #
#     kruskal(g) takes              #
#     the weighted adjacency matrix #
#     of graph G, as input.         #
#                                   #
#  -  The function                  #
#     kruskal(g) must               #
#     return a list of all edges in #
#     the Minimum Spanning Tree     #
#     (MST) of G.                   #
#                                   #
#  Example:                         #
#            g: [[0 0 7 5]          #
#                [0 0 7 1]          #
#                [7 7 0 9]          #
#                [5 1 9 0]]         #
#                                   #
#   kruskal(g):[(1,3),(0,3),(0,2)]  #
#                                   #
#####################################
    n = len(g)
    
    adjacency_list = {}
    
    for i in range(0, n):
        for j in range(i + 1, n):  # upper triangle only; skip the diagonal (self-loops)
            if g[i][j] > 0:
                adjacency_list[(i, j)] = g[i][j]
                
    adjacency_list_sorted = {k: v for k, v in sorted(adjacency_list.items(), key=lambda item: item[1])}
    
    nodes_color = {}
    color_nodes = {}
    colored_edges = []
    colors = 0
    
    for edge, weight in adjacency_list_sorted.items():
        # If neither of e's endpoints are colored, give them a new color. Give e the same color.
        if edge[0] not in nodes_color and edge[1] not in nodes_color:
            nodes_color[edge[0]] = colors
            nodes_color[edge[1]] = colors
            
            color_nodes[colors] = [edge[0], edge[1]]
            
#             print(color_nodes)
            
            colors += 1
        # If precisely 1 endpoint is colored, give the other one and e the same color.
        elif edge[0] not in nodes_color and edge[1] in nodes_color:
            nodes_color[edge[0]] = nodes_color[edge[1]]
            
            color_nodes[nodes_color[edge[1]]].append(edge[0])
            
#             print(color_nodes)
            
        elif edge[0] in nodes_color and edge[1] not in nodes_color:
            nodes_color[edge[1]] = nodes_color[edge[0]]

            color_nodes[nodes_color[edge[0]]].append(edge[1])
            
#             print(color_nodes)
            
        # If both endpoints are colored
        elif edge[0] in nodes_color and edge[1] in nodes_color:
            # both endpoints are colored and have the same color, ignore and continue
            if nodes_color[edge[0]] == nodes_color[edge[1]]:
                continue
            else:
                # If both endpoints have different colors,
                # change the color of the smallest tree
                # to the color of the largest one. Give e the same color.
                color_0 = nodes_color[edge[0]]
                color_1 = nodes_color[edge[1]]
                if len(color_nodes[color_0]) > len(color_nodes[color_1]):
                    large, small = color_0, color_1
                else:
                    large, small = color_1, color_0

                # Remove the smaller tree and recolor only its nodes, so the
                # work is proportional to the size of the smaller tree.
                deleted_color_nodes = color_nodes.pop(small)
                for node in deleted_color_nodes:
                    nodes_color[node] = large
                color_nodes[large].extend(deleted_color_nodes)
                    
        # After coloring both endpoints, add their edge to the MST
        colored_edges.append(edge)
#     print(colored_edges)
    return colored_edges
            
      
# MAIN ---------------------------------------------------------
def main(graph):
    # graph is a weighted adjacency matrix; 0 means "no edge"
    output = kruskal(graph)
    return output

if __name__ == '__main__':

    graph = numpy.array([
        [0, 0, 7, 5, 3],
        [0, 0, 7, 1, 0],
        [7, 7, 0, 9, 0],
        [5, 1, 9, 0, 0],
        [3, 0, 0, 0, 0],
    ])

    print(main(graph))  # prints [(1, 3), (0, 4), (0, 3), (0, 2)]