# Union Find algorithm
https://cuvids.io/app/video/97/watch/

## Applications
* Percolation (electricity, fluid flow, social interaction)
    * For N x N grid, check percolation by creating a virtual top site and a virtual bottom site. This prevents having to check each top square against each bottom square. (@ 7:14 in Applications video)
    * For N x N grid, to open a square, union it with each of its four neighboring squares that is already open.
* Games (Go, Hex)
* Dynamic Connectivity
* Least common ancestor
* Equivalence of finite state automata
* Hindley-Milner polymorphic type inference
* Kruskal's minimum spanning tree algorithm
* Matlab's bwlabel() function in image processing
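
The virtual-site idea from the percolation bullet can be sketched roughly as follows. This is a minimal illustration, not the course's reference solution: the class name `Percolation`, its method names, and the choice of indexes `n * n` and `n * n + 1` for the virtual sites are all assumptions made for this example, and it uses a plain (unweighted) parent array to stay short.

```python
class Percolation:
    """Hypothetical sketch: n x n grid plus two virtual sites."""

    def __init__(self, n: int):
        self.n = n
        self.is_open = [[False] * n for _ in range(n)]
        self.top = n * n         # virtual top site (assumed index)
        self.bottom = n * n + 1  # virtual bottom site (assumed index)
        self.parent = list(range(n * n + 2))

    def _root(self, i: int) -> int:
        while i != self.parent[i]:
            i = self.parent[i]
        return i

    def _union(self, p: int, q: int) -> None:
        self.parent[self._root(p)] = self._root(q)

    def _index(self, row: int, col: int) -> int:
        return row * self.n + col

    def open(self, row: int, col: int) -> None:
        if self.is_open[row][col]:
            return
        self.is_open[row][col] = True
        site = self._index(row, col)
        # top-row and bottom-row squares are tied to the virtual sites
        if row == 0:
            self._union(site, self.top)
        if row == self.n - 1:
            self._union(site, self.bottom)
        # union with each of the four neighbors that is already open
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            r, c = row + dr, col + dc
            if 0 <= r < self.n and 0 <= c < self.n and self.is_open[r][c]:
                self._union(site, self._index(r, c))

    def percolates(self) -> bool:
        # one connectivity query instead of checking every top/bottom pair
        return self._root(self.top) == self._root(self.bottom)


grid = Percolation(3)
grid.open(0, 1)
grid.open(1, 1)
before = grid.percolates()  # middle column not yet complete
grid.open(2, 1)
after = grid.percolates()   # open path from top row to bottom row
print(before, after)
```

With the virtual sites, the percolation check is a single `find`-style query rather than a comparison of every top square against every bottom square.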

## Dynamic connectivity problem
Given a set of N objects:
* Union: connect two objects
* Find: is there a path connecting two objects?

Note that we're not looking for the path between objects. That's a different algorithm. Here, we're just looking for a boolean stating whether objects are connected.

## Quick Find
* Eager algorithm
* integer array id[] of size N
* Interpretation: `p` and `q` are connected iff they have the same id. We want to find out whether some integers are connected, so we use those integers as indexes into an array, where we store that information.

id[]  =  [0, 1, 1, 8, 8, 0, 0, 1, 8, 8]
(index)   0  1  2  3  4  5  6  7  8  9

0, 5, and 6 are connected
1, 2, and 7 are connected
3, 4, 8, and 9 are connected
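
To make the interpretation concrete, here is a small sketch that recovers the connected groups from the example array by grouping indexes that share an id. The helper name `components` is made up for this illustration and is not part of the algorithm itself.

```python
from collections import defaultdict


def components(ids):
    """Group indexes by their stored id (hypothetical helper)."""
    groups = defaultdict(list)
    for index, component_id in enumerate(ids):
        groups[component_id].append(index)
    return sorted(groups.values())


ids = [0, 1, 1, 8, 8, 0, 0, 1, 8, 8]
print(components(ids))  # [[0, 5, 6], [1, 2, 7], [3, 4, 8, 9]]
```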

**Find O(1)**: To implement `find`, just check if `id[p]` and `id[q]` are equal to each other.
**Union O(n)**: To merge the components containing `p` and `q`, change every entry equal to `id[p]` to `id[q]`. Which of the two ids survives is an arbitrary choice.

**Defect**: Union too expensive (N array accesses)

In [1]:
class QuickFind:
    def __init__(self, n):
        self.id = list(range(n))

    def find(self, p: int, q: int) -> bool:
        return self.id[p] == self.id[q]

    def union(self, p: int, q: int) -> None:
        # cache both ids first: self.id[p] itself is overwritten during the loop
        pid, qid = self.id[p], self.id[q]
        for i in range(len(self.id)):
            if self.id[i] == pid:
                self.id[i] = qid

## Quick Union
* integer array `id[]` of size N
* interpretation: `id[i]` is the parent of `i`
* Root of `i` is `id[id[id[...id[i]...]]]`

Think of the array as representing a set of tree nodes.
id[] = [0, 1, 9, 4, 9, 6, 6, 7, 8, 9]
index   0, 1, 2, 3, 4, 5, 6, 7, 8, 9

This means that 3's parent is 4, and 4's parent is 9. The roots are 0, 1, 9, 6, 7, and 8:
0   1   9   6   7   8
       2 4  5
         3
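
As a quick check of the parent-link interpretation, the sketch below follows the links from a node up to its root in the example array. The function name `path_to_root` is an assumption for this illustration only.

```python
def path_to_root(ids, i):
    """Return the nodes visited while chasing parent links from i."""
    path = [i]
    while i != ids[i]:
        i = ids[i]
        path.append(i)
    return path


ids = [0, 1, 9, 4, 9, 6, 6, 7, 8, 9]
print(path_to_root(ids, 3))  # [3, 4, 9]
print(path_to_root(ids, 5))  # [5, 6]
```

The last element of each path is the root, so 3 and 5 are not connected here.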

**Find O(n)**: check if `p` and `q` have the same root.
**Union O(n)**: to merge the components containing `p` and `q`, set the id of `p`'s root to the id of `q`'s root. The cost comes from finding the two roots; the link itself is a single array write.

**Defect**: trees can get tall, which makes Find too expensive.
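
One way to see the defect is to pick an unlucky sequence of unions. The sketch below uses standalone functions over a plain list (the names `root`, `union`, and `depth` are illustrative) and always unions object 0 with the next object, which builds a single chain.

```python
def root(ids, i):
    while i != ids[i]:
        i = ids[i]
    return i


def union(ids, p, q):
    # quick-union rule: p's root points to q's root
    ids[root(ids, p)] = root(ids, q)


def depth(ids, i):
    """Number of links between i and its root."""
    steps = 0
    while i != ids[i]:
        i = ids[i]
        steps += 1
    return steps


n = 8
ids = list(range(n))
for q in range(1, n):
    union(ids, 0, q)

print(ids)            # [1, 2, 3, 4, 5, 6, 7, 7]
print(depth(ids, 0))  # 7
```

After n - 1 unions, finding the root of 0 walks through every other node, which is the linear worst case.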

In [2]:
class QuickUnion:
    def __init__(self, n):
        self.id = list(range(n)) # set id of each object to itself

    def root(self, i: int) -> int:
        # chase pointers until reach root
        while i != self.id[i]:
            i = self.id[i]
        return i

    def find(self, p: int, q: int) -> bool:
        # check if p and q have same root
        return self.root(p) == self.root(q)

    def union(self, p: int, q: int) -> None:
        # change root of p to point to root of q
        i = self.root(p)
        j = self.root(q)
        self.id[i] = j


## Weighted Quick Union
* modify quick union to avoid tall trees
* keep track of size of each tree
* balance by linking root of smaller tree to root of larger tree

**Find O(lg N)**: check if `p` and `q` have the same root
**Union O(lg N)**: change root of smaller tree to point to root of larger tree

**Analysis**:
* Find takes time proportional to depth of `p` and `q`.
* Union takes constant time once the roots are known, so it is O(lg N) overall because of the two root lookups
* Depth of any node is at most `lg N`
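
The usual argument for the `lg N` bound is that a node's depth grows by one only when its tree is linked under a tree at least as large, so the size of the tree containing it at least doubles each time, and a tree of at most N nodes can double at most `lg N` times. The sketch below is only an empirical sanity check of that bound, not a proof; the function names, the seed, and the number of random unions are arbitrary choices for this example.

```python
import math
import random


def root(parent, i):
    while i != parent[i]:
        i = parent[i]
    return i


def weighted_union(parent, size, p, q):
    p_root, q_root = root(parent, p), root(parent, q)
    if p_root == q_root:
        return
    # make p_root the root of the larger tree
    if size[p_root] < size[q_root]:
        p_root, q_root = q_root, p_root
    parent[q_root] = p_root
    size[p_root] += size[q_root]


def depth(parent, i):
    steps = 0
    while i != parent[i]:
        i = parent[i]
        steps += 1
    return steps


n = 1024
parent = list(range(n))
size = [1] * n
rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(5 * n):
    weighted_union(parent, size, rng.randrange(n), rng.randrange(n))

max_depth = max(depth(parent, i) for i in range(n))
print(max_depth, "<=", math.log2(n))
```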


In [3]:
class WeightedQuickUnion:
    def __init__(self, n):
        self.id = list(range(n))
        self.sz = [1] * n    # add array to store sizes of trees

    def root(self, i: int) -> int:
        while i != self.id[i]:
            i = self.id[i]
        return i

    def find(self, p: int, q: int) -> bool:
        return self.root(p) == self.root(q)

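    def union(self, p: int, q: int) -> None:
        # change root of smaller tree to point to root of larger tree
        # update size of larger tree
        p_root = self.root(p)
        q_root = self.root(q)
        if p_root == q_root:
            return  # already connected; linking again would double-count sizes
        # sizes are only kept up to date at the roots, so compare sz[root]
        if self.sz[p_root] < self.sz[q_root]:
            self.id[p_root] = q_root
            self.sz[q_root] += self.sz[p_root]
        else:
            self.id[q_root] = p_root
            self.sz[p_root] += self.sz[q_root]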


## Weighted Quick Union with Path Compression
When we look for the root of a given node, we touch every node on the path from that node to the root. While we're at it, we might as well make each of those nodes point directly to the root.

Two-pass implementation: add second loop to `root()` to set the `id[]` of each examined node to the root
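
A minimal sketch of the two-pass idea, written as a standalone function over a parent list rather than as a method; the name `root_two_pass` and the parameter `parent` are assumptions for this example.

```python
def root_two_pass(parent, i):
    # first pass: chase pointers to find the root
    r = i
    while r != parent[r]:
        r = parent[r]
    # second pass: point every node on the path directly at the root
    while i != r:
        next_node = parent[i]
        parent[i] = r
        i = next_node
    return r


parent = [1, 2, 3, 4, 4]  # a chain 0 -> 1 -> 2 -> 3 -> 4
found = root_two_pass(parent, 0)
print(found)   # 4
print(parent)  # [4, 4, 4, 4, 4]
```

After one call, every node that was on the path is a direct child of the root.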

Simpler one-pass variant: Make every other node in path point to its grandparent (thereby halving path length).

In [None]:
def root(self, i: int) -> int:
    while i != self.id[i]:
        self.id[i] = self.id[self.id[i]]  # <- Only one line of extra code for the one-pass variant
        i = self.id[i]
    return i
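
To see the halving effect, the same loop can be run on a small chain. This is a standalone sketch over a plain list; the name `root_halving` is made up for the illustration.

```python
def root_halving(parent, i):
    while i != parent[i]:
        parent[i] = parent[parent[i]]  # point i at its grandparent
        i = parent[i]
    return i


parent = [1, 2, 3, 4, 4]  # a chain 0 -> 1 -> 2 -> 3 -> 4
found = root_halving(parent, 0)
print(found)   # 4
print(parent)  # [2, 2, 4, 4, 4]
```

Node 0 started four links away from the root and ends up two links away; nodes skipped by the walk (here, node 1) keep their old parent.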