### Disjoint-Set Data Structure 

In [1]:
"""
Also Known As:
    - Disjoint Set Union (DSU)
    - Union-Find
    - Merge-Find Set

What is Disjoint Set Union:
    Disjoint-Set Union is a data structure used to manage a collection of disjoint (non-overlapping) sets.
    It efficiently supports merging sets, determining which set a specific element belongs to,
    and checking whether two elements belong to the same set.

Operations:
    1. Union: Merges two sets into a single set.
    2. Find: Identifies the set to which a particular element belongs.
    3. Make_set: Creates a new set containing a single element.
       (Used to initialize elements before performing Union or Find operations.)

links:
    - https://medium.com/@rishu__2701/getting-started-with-disjoint-set-data-structure-0a971a68f731
    - https://www.youtube.com/watch?v=C0O8T3C8irU
"""

pass

### Applications of Disjoint-Set (Union-Find)


In [2]:
"""
The Disjoint-Set data structure can be applied to problems involving `partitioning`, `connectivity`, or `equivalence relations`. 
Examples include:

1. Graph Algorithms (Connectivity):
   - Cycle Detection: Check if adding an edge creates a cycle.
   - Kruskal's Minimum Spanning Tree: Detect cycles and merge connected components.
   - Connected Components: Identify clusters of connected nodes in an undirected graph.

2. Resource Management:
   - Memory Allocation: Track memory regions that are allocated or freed.
   - File Systems: Manage partitions or file blocks.

3. Network Modeling:
   - Computer Networks: Determine if two devices are in the same subnet or can communicate.
   - Social Networks: Check if two people belong to the same group or community.

4. Image Processing:
   - Connected Component Labeling: Identify and label connected regions (e.g., grouping pixels into shapes).
"""

pass
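
The cycle-detection use case listed above can be sketched in a few lines. This is a minimal illustration, not a tuned implementation: the function name `has_cycle`, the plain `parent` list, and the assumption that nodes are numbered `0..num_nodes-1` are all choices made for this example.

```python
def has_cycle(num_nodes: int, edges: list[tuple[int, int]]) -> bool:
    """Return True if the undirected graph contains a cycle.

    Assumes nodes are labelled 0..num_nodes-1 (an assumption of this sketch).
    """
    parent = list(range(num_nodes))  # each node starts as its own root

    def find(x: int) -> int:
        # Walk up parent links until reaching a root (no optimizations here).
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            # u and v are already connected, so this edge closes a cycle.
            return True
        parent[root_v] = root_u  # merge the two components
    return False


print(has_cycle(4, [(0, 1), (1, 2), (2, 3)]))  # False (a simple path)
print(has_cycle(4, [(0, 1), (1, 2), (2, 0)]))  # True (a triangle)
```

Kruskal's algorithm uses the same check: edges are processed in increasing weight order, and an edge is kept only when its endpoints have different roots.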

### Implementations

In [3]:
"""
Disjoint Set Union (DSU) has several implementations, but the Forest Implementation is the most commonly used. 
In this notebook, we will explore the Forest Implementation, starting with the naive version and gradually optimizing it.
"""

pass

#### Naive implementation

In [4]:
class DisjointSet:
    def __init__(self):
        """
        Initialize an empty disjoint set.
        """
        self.parent = {}

    def make_set(self, x: int) -> None:
        """
        Create a new set containing only `x` (no-op if `x` already exists).
        """
        if x not in self.parent:
            self.parent[x] = x  # Each element is its own parent initially

    def find(self, x: int) -> int:
        """
        Find the representative of the set containing `x`.
        """
        if x == self.parent[x]:
            return x

        return self.find(self.parent[x])

    def union(self, x: int, y: int) -> None:
        """
        Merge the sets containing `x` and `y`.
        """
        root_x = self.find(x)
        root_y = self.find(y)

        if root_x != root_y:  # Only merge if they belong to different sets
            self.parent[root_y] = root_x

    def print_set(self) -> None:
        """
        Print the current state of the Disjoint Set in a table format.
        """

        print("+------------------+")
        print("| Element | Parent |")
        print("+------------------+")
        for value, parent in self.parent.items():
            print(f"| {value:^7} | {parent:^6} |")
        print("+------------------+\n")


dsu = DisjointSet()

for n in range(5):
    dsu.make_set(n)

dsu.print_set()
dsu.union(0, 3)
dsu.union(0, 2)
dsu.union(1, 4)
dsu.print_set()

+------------------+
| Element | Parent |
+------------------+
|    0    |   0    |
|    1    |   1    |
|    2    |   2    |
|    3    |   3    |
|    4    |   4    |
+------------------+

+------------------+
| Element | Parent |
+------------------+
|    0    |   0    |
|    1    |   1    |
|    2    |   0    |
|    3    |   0    |
|    4    |   1    |
+------------------+



In [5]:
"""
The problem with this naive implementation is that it can form long chains, which makes the `find` operation O(n) in the worst case.
Since `union` calls `find` twice, `union` is also O(n) in the worst case.

Complexity:
    - Space: O(n)
    - Time: O(n) per `find` / `union` (worst case)
"""

pass
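
To make the worst case concrete, the sketch below rebuilds the naive `union` rule (`parent[root_y] = root_x`) on a plain dictionary and feeds it a union order that produces a single chain. The helper names `build_chain` and `depth` are hypothetical and exist only for this illustration.

```python
def build_chain(n: int) -> dict[int, int]:
    """Apply naive unions in an order that yields one long chain."""
    parent = {i: i for i in range(n)}

    def find(x: int) -> int:
        while parent[x] != x:
            x = parent[x]
        return x

    def union(x: int, y: int) -> None:
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x  # same rule as the naive class

    # Each new element becomes the root of everything merged so far.
    for i in range(1, n):
        union(i, i - 1)
    return parent


def depth(parent: dict[int, int], x: int) -> int:
    """Count the parent links between `x` and its root."""
    steps = 0
    while parent[x] != x:
        x = parent[x]
        steps += 1
    return steps


parent = build_chain(6)
print(parent)            # {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 5}
print(depth(parent, 0))  # 5
```

With `n` elements, `find(0)` follows `n - 1` links. A recursive `find` on such a chain can also exceed Python's default recursion limit once the chain grows to around a thousand elements.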

#### Optimization: Path compression + Union by size/rank 

In [6]:
class DisjointSet:
    """
    Disjoint Set Union (DSU) implementation with optimizations:
        - Path Compression
        - Union by Rank

    Complexity:
        Space: O(n)
        Find: O(α(n)) amortized (α is the inverse Ackermann function)
        Union: O(α(n)) amortized
    """

    def __init__(self):
        """
        Initialize an empty Disjoint Set with rank support.
        """
        self.parent = {}  # Maps each element to its parent
        self.rank = {}  # Maps each root to its rank (an upper bound on tree height)

    def make_set(self, x: int) -> None:
        """
        Create a new set containing a single element.
        Initializes the element's parent as itself and rank as 0.
        """
        if x not in self.parent:
            self.parent[x] = x  # Each element is its own parent initially
            self.rank[x] = 0  # Initial rank is 0

    def find(self, x: int) -> int:
        """
        Find the representative of the set containing `x` with path compression.
        Every node visited on the way is re-pointed directly at the root, so later lookups are faster.
        """
        if x != self.parent[x]:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]

    def union(self, x: int, y: int) -> None:
        """
        Merge the sets containing `x` and `y` using union by rank.
        The root with the lower rank is attached under the root with the higher rank.
        """
        root_x = self.find(x)
        root_y = self.find(y)

        if root_x != root_y:
            # Attach the lower-rank root under the higher-rank root
            if self.rank[root_x] > self.rank[root_y]:
                self.parent[root_y] = root_x
            elif self.rank[root_x] < self.rank[root_y]:
                self.parent[root_x] = root_y
            else:
                # Equal ranks: pick root_x as the new root and increment its rank
                self.parent[root_y] = root_x
                self.rank[root_x] += 1

    def print_set(self) -> None:
        """
        Print the current state of the Disjoint Set in a table format.
        Displays each element, its parent, and its stored rank (meaningful only for roots).
        """
        print("+----------------------------+")
        print("| Element | Parent |  Rank   |")
        print("+----------------------------+")
        for key, parent in self.parent.items():
            print(f"| {key:^7} | {parent:^6} | {self.rank[key]:^7} |")
        print("+----------------------------+\n")


dsu = DisjointSet()

for n in range(5):
    dsu.make_set(n)
dsu.print_set()

dsu.union(3, 1)
dsu.union(3, 0)
dsu.union(4, 2)
dsu.union(0, 2)

dsu.print_set()

+----------------------------+
| Element | Parent |  Rank   |
+----------------------------+
|    0    |   0    |    0    |
|    1    |   1    |    0    |
|    2    |   2    |    0    |
|    3    |   3    |    0    |
|    4    |   4    |    0    |
+----------------------------+

+----------------------------+
| Element | Parent |  Rank   |
+----------------------------+
|    0    |   3    |    0    |
|    1    |   3    |    0    |
|    2    |   4    |    0    |
|    3    |   3    |    2    |
|    4    |   3    |    1    |
+----------------------------+



In [7]:
"""
Union by Rank vs Size:
Both heuristics keep the trees shallow by choosing which root becomes the parent during `union`:

    - Rank: Attach the root with the smaller rank (an upper bound on height) under the root with the larger rank.
    - Size: Attach the root of the smaller tree (by number of elements) under the root of the larger tree.

pass
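
For comparison, here is a minimal sketch of the size variant. The class name `DisjointSetBySize` and the iterative two-pass `find` are choices made for this example (the iterative form avoids deep recursion); the notebook's rank-based class above remains the reference version.

```python
class DisjointSetBySize:
    """Disjoint set with path compression and union by size (illustrative sketch)."""

    def __init__(self) -> None:
        self.parent: dict[int, int] = {}
        self.size: dict[int, int] = {}  # element count, valid only for roots

    def make_set(self, x: int) -> None:
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1

    def find(self, x: int) -> int:
        # First pass: locate the root.
        root = x
        while root != self.parent[root]:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while x != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, x: int, y: int) -> None:
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return
        # Make root_x the root of the larger (or equal-sized) tree.
        if self.size[root_x] < self.size[root_y]:
            root_x, root_y = root_y, root_x
        self.parent[root_y] = root_x
        self.size[root_x] += self.size[root_y]


dsu = DisjointSetBySize()
for n in range(5):
    dsu.make_set(n)

dsu.union(3, 1)
dsu.union(3, 0)
dsu.union(4, 2)
dsu.union(0, 2)

print(dsu.find(2))            # 3
print(dsu.size[dsu.find(0)])  # 5
```

Size has the practical advantage that the root's entry doubles as the component's element count, which some problems need anyway.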