### Karger's Algorithm with Union-Find


The `Union-Find` data structure can be used to maintain a collection of disjoint sets a.k.a. `connected components`. This data structure supports the following operations:

1) `MakeUnionFind(S)`: Given a set of elements `S`, this operation initializes a collection of singleton components on the elements of `S`.
2) `Find(u)`: Given an element `u`, this operation returns the connected component containing `u`.
3) `Union(A,B)`: Given two connected components `A` and `B`, this operation merges them into a single connected component.
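
To make the three operations concrete, here is a deliberately naive sketch that stores a component label for every element in a dictionary. The names `make_union_find`, `find`, and `union` are illustrative only and are not the pointer-based implementation described in this section; in particular, this `union` relabels elements in linear time.

```python
def make_union_find(S):
    # every element starts as the label of its own singleton component
    return {x: x for x in S}

def find(label, u):
    # a component is identified by the label stored for u
    return label[u]

def union(label, a, b):
    # relabel every element of component b with label a (O(n) per call)
    for x in label:
        if label[x] == b:
            label[x] = a

label = make_union_find(["a", "b", "c", "d"])
union(label, find(label, "a"), find(label, "b"))
union(label, find(label, "c"), find(label, "d"))
same_ab = find(label, "a") == find(label, "b")   # "a" and "b" were merged
same_ac = find(label, "a") == find(label, "c")   # "a" and "c" were not
```

Note that `union` takes component labels, so callers pass the results of `find` rather than arbitrary elements.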

##### Union-Find Pointer-based Implementation:

We define a `record` as an object that contains an element and a pointer. `MakeUnionFind(S)` creates a record for each element of $S$. Each record initially represents a singleton connected component whose `label` is the `id` of the element it contains (element ids are integers in the range $[0,|S|-1]$). The pointers are initially set to null, and the records are stored in a fixed-size array indexed by element `id`. A record with a null pointer is the record whose element labels its connected component.

We also create an array `Size[0:n-1]`, where $n=|S|$, filled with $1$s. The value `Size[i]` is the number of elements in the connected component with label `i`; it is meaningful only while `i` is the label of a component.

Given two connected components $A$ and $B$, let $u \in A$ and $v \in B$ be the records that label the respective components, also called the `root` records. To implement `Union(A,B)`, we first compare `Size[u]` and `Size[v]`, then set the pointer of the smaller component's root record to point to the root record of the larger component and add the smaller size to the larger one. For example, if `Size[u] > Size[v]`, we set `record[v].pointer = record[u]` and `Size[u] += Size[v]`, which merges $B$ into $A$.
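
The union-by-size rule can be sketched with two plain arrays instead of record objects. This is a minimal illustration under assumed names (`parent`, `size`, `root_of`, `union_by_size`), where `parent[i] = None` plays the role of the null pointer; it assumes the arguments are roots of two different components.

```python
n = 6
parent = [None] * n      # None marks a root, mirroring the null pointer
size = [1] * n           # size[i] is meaningful only while i is a root

def root_of(u):
    # plain pointer chasing, no path compression
    while parent[u] is not None:
        u = parent[u]
    return u

def union_by_size(u, v):
    # u and v are assumed to be root ids of two different components
    if size[u] < size[v]:
        u, v = v, u          # make u the root of the larger component
    parent[v] = u            # the smaller root now points to the larger root
    size[u] += size[v]
    return u                 # label of the merged component

r = union_by_size(0, 1)      # components {0, 1}
r = union_by_size(r, 2)      # components {0, 1, 2}
s = union_by_size(3, 4)      # components {3, 4}
t = union_by_size(s, r)      # the smaller {3, 4} is attached under the larger root
```

Always attaching the smaller component under the larger one keeps the pointer chains short: an element's depth grows only when its component at least doubles in size.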

Given an element whose id is `u`, the `Find(u)` operation first accesses the record containing that element, then follows the pointers back to the `root` record, which labels the connected component containing element `u`. To optimize this operation, we can use the `path compression` technique: during each call to `Find(u)`, we update the pointers of all records on the path from `u` up to, but excluding, the root so that they point directly to the root. The next time we call `Find()` on any of these elements, its pointer takes us straight to the root instead of traversing a long chain of records.
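
The effect of path compression is easiest to see on a hand-built chain. The sketch below uses an assumed `parent` array (with `None` marking the root) rather than record objects, and builds the chain by hand instead of through `Union`.

```python
# parent[i] is the id that element i points to; None marks the root.
# This is the chain 4 -> 3 -> 2 -> 1 -> 0.
parent = [None, 0, 1, 2, 3]

def find(u):
    # follow pointers to the root, remembering every non-root visited
    path = []
    while parent[u] is not None:
        path.append(u)
        u = parent[u]
    # path compression: point every visited element directly at the root
    for x in path:
        parent[x] = u
    return u

root = find(4)
# every element on the old chain now points straight at the root
```

After the single call `find(4)`, a later `find` on any of the elements 1 to 4 follows exactly one pointer.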


##### Implementing Karger Edge Contractions using Union-Find:

In Karger's algorithm, an edge contraction merges two vertices into a single super-vertex. We can implement this using a Union-Find data structure. Initially, each record contains a vertex. We also maintain a dynamic array, called the edge list, containing all the edges of the graph. To contract an edge $(u,v)$, we first check whether $u$ and $v$ are in the same connected component by comparing `Find(u)` and `Find(v)`. If `Find(u)` $\neq$ `Find(v)`, we invoke `Union()` on their respective connected components to merge the vertices, and delete the edge from the edge list. Otherwise, if `Find(u)` $=$ `Find(v)`, the edge is a self-loop on a super-vertex, so there is nothing to merge and we only delete it from the edge list.

So, after performing $n-2$ `Union()` operations on a connected graph, we are left with only two super-vertices. The edges remaining in the edge list whose endpoints lie in different super-vertices are the edges of the cut; any remaining edge with both endpoints in the same super-vertex is a self-loop and is discarded. This approach is simpler and more efficient than the adjacency list approach that we previously implemented.
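
The contraction procedure above can be sketched as follows. This is a minimal, self-contained illustration, not the notebook's class: `karger_cut`, the `parent`/`size` arrays, and the `rng` parameter are assumed names, vertices are assumed to be the integers `0..n-1`, and a random edge is drawn by shuffling the edge list once and popping from its end.

```python
import random

def karger_cut(n, edges, rng=random):
    """One run of random contraction; returns the cut edges it finds."""
    parent = [None] * n      # None marks the root of a super-vertex
    size = [1] * n

    def find(u):
        root = u
        while parent[root] is not None:
            root = parent[root]
        while u != root:     # path compression
            nxt = parent[u]
            parent[u] = root
            u = nxt
        return root

    pool = list(edges)
    rng.shuffle(pool)        # popping from a shuffled list draws a uniformly random remaining edge
    components = n
    while components > 2 and pool:
        u, v = pool.pop()    # the edge is deleted from the list either way
        ru, rv = find(u), find(v)
        if ru == rv:
            continue         # self-loop on a super-vertex: nothing to merge
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru      # union by size
        size[ru] += size[rv]
        components -= 1
    # keep only edges that cross between the two remaining super-vertices
    return [(u, v) for (u, v) in pool if find(u) != find(v)]

# two triangles joined by the single bridge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
rng = random.Random(0)
best = min((karger_cut(6, edges, rng) for _ in range(200)), key=len)
```

A single run returns some cut, not necessarily a minimum one, which is why the usage lines repeat the run and keep the smallest cut found.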


In [None]:
class record():
    def __init__(self, item_id):
        self.item_id = item_id
        self.pointer =  None

    def update_pointer(self, pointer):
        self.pointer = pointer


class UnionFind():
    def __init__(self):
        pass
    
    def makeUnionFind(self, S):    
        # S contains a list of n items 
        self.S = S
        # we assign ids 0, 1, 2, 3, ...n-1 to the items in S
        self.item_id = {self.S[i]: i for i in range(len(S))}
        # create records for each item
        self.records = [record(i) for i in range(len(S))]
        self.Size = [1 for i in range(len(S))]

    def find(self, u): 
        # convert u to its corresponding id
        u = self.item_id[u]
        # follow the pointers back to root, store every record along the way in a list
        path = []
        while self.records[u].pointer is not None:
            path.append(u)
            u = self.records[u].pointer.item_id
        root_id = u
        # path compression: reset the pointers of all the records in the path to the root
        for item_id in path:
            self.records[item_id].update_pointer(self.records[root_id])        
        return self.S[root_id]
    
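    def union(self, A, B):
        # resolve A and B to the ids of their root records, so that any member
        # of a component can be passed in, not only the component's label
        A = self.item_id[self.find(A)]
        B = self.item_id[self.find(B)]
        # already in the same component: pointing a root at itself would
        # create a cycle and make find() loop forever
        if A == B:
            return
        # update the pointer of the root of the smaller component to the root of the larger component
        if self.Size[A] < self.Size[B]:
            self.records[A].update_pointer(self.records[B])
            self.Size[B] += self.Size[A]
            self.Size[A] = 0
        else:
            self.records[B].update_pointer(self.records[A])
            self.Size[A] += self.Size[B]
            self.Size[B] = 0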

    def __str__(self):
        # create dictionary of component lists
        component_dict = {i:[] for i in self.S}
        for i in range(len(self.S)):
            component_dict[self.find(self.S[i])].append(self.S[i]) 
        
        return str(component_dict)                