# Union Find

Union Find, also known as a **disjoint set**, is a data structure that keeps track of elements that are split into one or more disjoint sets.

Functions:
- Union - Join two sets together
- Find - Find which set an element belongs to

## What it's used for 

- [Kruskal's Minimum Spanning Tree Algorithm](https://youtu.be/JZBQLXgSGfs)
- Grid Percolation
- Network Connectivity
- LCA (Lowest Common Ancestor) in trees
- Image Processing
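
As a rough illustration of the first use case, here is a minimal sketch of Kruskal's algorithm. The function name `kruskal`, the `(weight, u, v)` edge format, and the sample graph are assumptions made up for this example, and the inner `find` skips path compression to keep it short.

```python
def kruskal(num_nodes, edges):
    """Return (total_weight, chosen_edges) of a minimum spanning forest.

    edges is a list of (weight, u, v) tuples with nodes numbered 0..num_nodes-1.
    """
    parent = list(range(num_nodes))  # every node starts as its own root

    def find(node):
        # Follow parent links until we reach a self loop (the root)
        while parent[node] != node:
            node = parent[node]
        return node

    total = 0
    chosen = []
    # Consider the cheapest edges first
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge joins two different components, so it cannot form a cycle
            parent[root_u] = root_v
            total += weight
            chosen.append((u, v, weight))
    return total, chosen


# Hypothetical 4-node graph
edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 1, 3), (3, 2, 3)]
total, chosen = kruskal(4, edges)
print(total, chosen)
```

The union find is what makes the cycle check cheap: an edge is skipped exactly when both endpoints already share a root.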

## Creating a Union Find

We want to create an array-based union find, since it is efficient and easy to work with.

- Construct a bijection
  - a one-to-one mapping between the objects and the integers in the range [0, n), where n is the number of objects

### Example bijection

Let's say we're given these nodes and numbers to attach to them:

![](../../%20images/bijection_example_input.png)

Next we want to create a mapping of them like this:

```mermaid
graph LR
    subgraph mappings
        E --> 0
        F --> 1
        I --> 2
        D --> 3
        C --> 4
        A --> 5
        J --> 6
        L --> 7
        G --> 8
        K --> 9
        B --> 10
        H --> 11
    end
```

After this, we need to put these mapped objects into an array:

![](../../%20images/mapped_bijection_array.png)
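
The mapping above could be built in Python roughly like this. It is only a sketch: the names `labels`, `index_of`, `label_of`, and `parent` are made up for illustration, and the label order is copied from the mermaid diagram.

```python
# Labels in the order used by the mapping above (E -> 0, F -> 1, ...)
labels = ["E", "F", "I", "D", "C", "A", "J", "L", "G", "K", "B", "H"]

# Forward direction of the bijection: object -> integer in [0, n)
index_of = {label: i for i, label in enumerate(labels)}

# Reverse direction: the list itself maps integer -> object
label_of = labels

# Array-based union find: each element starts as its own parent (a root)
parent = list(range(len(labels)))

print(index_of["A"])  # 5
print(label_of[10])   # B
print(parent)
```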

To understand how to set up unions, [watch this clip from YouTube with nice animations](https://youtu.be/0jNmHPfA_yE?t=180).

Too lazy to draw all that out.

Note that the clip shows union find without path compression.

## Union Find Operations

Find Operation:

To find the root of a component, follow the parent nodes until a self loop is reached (a node whose parent is itself).
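
A minimal sketch of find on a plain parent array, with no path compression yet. The function name `find_root` and the sample array are assumptions for illustration.

```python
def find_root(parent, node):
    """Follow parent links until reaching a self loop, which marks the root."""
    while parent[node] != node:
        node = parent[node]
    return node


# Hypothetical parent array: 3 -> 2 -> 0 and 1 -> 0; 0 and 4 are roots
parent = [0, 0, 0, 2, 4]
print(find_root(parent, 3))  # 0
print(find_root(parent, 4))  # 4
```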

Union Operation:

To unify two elements, find the root node of each element's component.
  - If the roots are different, make one of them the parent of the other (usually the root of the smaller component points to the root of the larger)
  - If the roots are the same, do nothing, because the elements are already unified
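
The two cases above can be sketched on plain arrays as follows. The helper names `find_root` and `union` and the `size` array are assumptions for this example, and find is still done without path compression.

```python
def find_root(parent, node):
    """Follow parent links until reaching a self loop, which marks the root."""
    while parent[node] != node:
        node = parent[node]
    return node


def union(parent, size, a, b):
    """Merge the components containing a and b; return True if a merge happened."""
    root_a, root_b = find_root(parent, a), find_root(parent, b)
    if root_a == root_b:
        return False  # already unified, nothing to do
    # Make root_a the root of the larger (or equal) component
    if size[root_a] < size[root_b]:
        root_a, root_b = root_b, root_a
    parent[root_b] = root_a          # smaller root points to larger root
    size[root_a] += size[root_b]
    return True


parent = list(range(5))
size = [1] * 5
union(parent, size, 0, 1)
union(parent, size, 2, 3)
union(parent, size, 1, 3)
print(parent, size)
```

Only the entry in `size` for a root is meaningful; entries for non-root nodes are left stale.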

## Path Compression

With [path compression](https://youtu.be/VHRhJWacxis), we repoint nodes directly at the root as we look them up.
- Makes later lookups far more efficient, since we don't have to traverse the whole chain to find the root node each time.
- Combined with merging the smaller component into the larger, operations take nearly constant amortized time: O(α(n)), where α is the inverse Ackermann function
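
To make the effect concrete, here is a sketch of an iterative find with path compression. The name `find_compress` and the sample chain are assumptions for illustration; it uses two passes instead of recursion.

```python
def find_compress(parent, node):
    """Return the root of node and point every node on the path directly at it."""
    # First pass: locate the root
    root = node
    while parent[root] != root:
        root = parent[root]
    # Second pass: repoint each node on the path at the root
    while node != root:
        next_node = parent[node]
        parent[node] = root
        node = next_node
    return root


# Hypothetical worst-case chain: 4 -> 3 -> 2 -> 1 -> 0
parent = [0, 0, 1, 2, 3]
print(find_compress(parent, 4))  # 0
print(parent)                    # [0, 0, 0, 0, 0]
```

After one lookup from the bottom of the chain, every node on that path is a direct child of the root, so the next lookup takes a single step.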

# Data Structure

Here's the actual data structure in Python, using path compression. Translated from the Java in WilliamFiset's video.

```python
class UnionFind:

    def __init__(self, size):
        if size < 0:
            raise ValueError("Size can't be < 0 for UnionFind")

        # Number of elements in the union find
        self.size = size

        # Number of components in the union find
        self.num_components = size

        # Track sizes of each component
        # Each component is initially a size of 1
        self.sizes = [1] * size

        # ids[i] points to the parent of i, ex: if ids[i] == i, it is a root node
        self.ids = list(range(size))

    # Find root of node
    def find(self, node):
        if node != self.ids[node]:
            # Path compression: point this node directly at its root.
            # This is what gives the amortized time complexity.
            self.ids[node] = self.find(self.ids[node])
        return self.ids[node]

    def unify(self, node1, node2):
        root1 = self.find(node1)
        root2 = self.find(node2)

        # If already in same group, return!
        if root1 == root2:
            return

        # Merge 2 components together, merge smaller into larger one
        if self.sizes[root1] < self.sizes[root2]:
            self.sizes[root2] += self.sizes[root1]
            self.ids[root1] = root2
        else:
            self.sizes[root1] += self.sizes[root2]
            self.ids[root2] = root1

        # One less component after merging, optional if the problem you're on requires this info
        self.num_components -= 1
```

```python
# Real Example with 8 objects
ex_unionfind = UnionFind(8)
print(ex_unionfind.sizes, ex_unionfind.ids) # Shows sizes and ids array
print(ex_unionfind.find(2)) # -> 2
```

```
[1, 1, 1, 1, 1, 1, 1, 1] [0, 1, 2, 3, 4, 5, 6, 7]
2
```


```python
# Merge first half
ex_unionfind.unify(0, 1)
ex_unionfind.unify(1, 2)
ex_unionfind.unify(2, 3)

print("After merging 0,1,2,3: ")
print("ID's: ", ex_unionfind.ids)
print("Sizes: ", ex_unionfind.sizes, '\n')
```

```
After merging 0,1,2,3: 
ID's:  [0, 0, 0, 0, 4, 5, 6, 7]
Sizes:  [4, 1, 1, 1, 1, 1, 1, 1]
```



```python
# Merge 2nd half
ex_unionfind.unify(6, 7)
ex_unionfind.unify(7, 4)

print("After merging 4,6,7: ")
print("ID's: ", ex_unionfind.ids)
print("Sizes: ", ex_unionfind.sizes, '\n')
```

```
After merging 4,6,7: 
ID's:  [0, 0, 0, 0, 6, 5, 6, 6]
Sizes:  [4, 1, 1, 1, 1, 1, 3, 1]
```



```python
# Merging group of 4 with group of 3
ex_unionfind.unify(1, 7)

print("After merging 1,7: ")
print("ID's: ", ex_unionfind.ids)
print("Sizes: ", ex_unionfind.sizes)
```

```
After merging 1,7: 
ID's:  [0, 0, 0, 0, 6, 5, 0, 6]
Sizes:  [7, 1, 1, 1, 1, 1, 3, 1]
```


```python
# At this point, 7's root is actually 0, but ids[7] still holds its old parent (6) and sort of "lags" behind
# The next time we do a unify/find operation on that node, it will correct itself to the actual root
# When we merge 5 and 7, both end up pointing directly at 0: 7 thanks to path compression, 5 thanks to the merge
ex_unionfind.unify(5, 7)
print("After merging 5,7: ")
print("ID's: ", ex_unionfind.ids)
print("Sizes: ", ex_unionfind.sizes)
```

```
After merging 5,7: 
ID's:  [0, 0, 0, 0, 6, 0, 0, 0]
Sizes:  [8, 1, 1, 1, 1, 1, 3, 1]
```
