## 📌 Kruskal's Algorithm: Overview

* **Goal**: Find a minimum-cost spanning tree (MST) of a connected, undirected, weighted graph.
* **Approach**: Bottom-up. Start with every vertex in its own disjoint set, then repeatedly add the smallest-weight remaining edge that doesn't form a cycle.

### Kruskal's Steps:

1. Sort all edges in increasing order of weight.
2. Initialize all vertices as disjoint sets.
3. For each edge (u, v) in sorted order:

   * If `Find(u) ≠ Find(v)` → they belong to different components:

     * Add edge to MST.
     * `Union(u, v)` to merge the components.
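
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation: it assumes vertices are numbered `0..n-1` and edges arrive as `(weight, u, v)` tuples, and it uses the simplest possible relabeling for `Union`. The function name `kruskal` and the sample graph are made up for the example.

```python
def kruskal(n, edges):
    """Return (total_weight, mst_edges) for a connected undirected graph.

    n     -- number of vertices, assumed to be labelled 0..n-1
    edges -- iterable of (weight, u, v) tuples
    """
    component = list(range(n))            # step 2: each vertex is its own set
    mst, total = [], 0

    for w, u, v in sorted(edges):         # step 1: increasing order of weight
        cu, cv = component[u], component[v]
        if cu != cv:                      # step 3: Find(u) != Find(v)
            mst.append((u, v, w))
            total += w
            # Union(u, v): relabel every vertex of cu as cv (naive, O(n))
            for k in range(n):
                if component[k] == cu:
                    component[k] = cv
            if len(mst) == n - 1:         # a spanning tree has n-1 edges
                break
    return total, mst


# Hypothetical 4-vertex graph used only for illustration.
edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
total, mst = kruskal(4, edges)
print(total, mst)   # 6 [(0, 1, 1), (1, 3, 2), (1, 2, 3)]
```

The edge `(0, 2)` of weight 4 is never examined here because the loop stops once `n − 1` edges have been accepted.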

---

## 📌 Disjoint Set (Union-Find) Data Structure

### Purpose:

Efficiently keep track of which vertices are in which components (connected subgraphs).

### Core Operations:

1. `MakeUnionFind(n)`
   Initializes n singleton components.
2. `Find(x)`
   Returns the component name or ID to which x belongs.
3. `Union(x, y)`
   Merges components of x and y.

---

## 📌 Naive Union-Find Implementation

### Component Mapping:

* Use an array/dictionary: `component[i]` = name of component that contains vertex i.

### `Find(i)`:

* Direct lookup: O(1)

### `Union(i, j)`:

* Let `c_old = component[i]`, `c_new = component[j]`
* Scan all elements; for each element `k`, if `component[k] == c_old`, update to `c_new`
* ⏱ Time Complexity: **O(n)** per union, so a sequence of m unions costs **O(m·n)** (bad for large graphs)
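
A minimal sketch of this naive scheme, assuming vertices are numbered `0..n-1` so a plain list can serve as the `component` mapping. The class name `NaiveUnionFind` is illustrative only.

```python
class NaiveUnionFind:
    """Naive union-find: O(1) find, O(n) union."""

    def __init__(self, n):
        # component[i] is the name of the component containing vertex i;
        # initially every vertex is its own singleton component.
        self.component = list(range(n))

    def find(self, i):
        return self.component[i]          # direct lookup

    def union(self, i, j):
        c_old, c_new = self.component[i], self.component[j]
        if c_old == c_new:
            return                        # already in the same component
        # Scan all n elements and rename c_old to c_new.
        for k in range(len(self.component)):
            if self.component[k] == c_old:
                self.component[k] = c_new


uf = NaiveUnionFind(5)
uf.union(0, 1)
uf.union(3, 4)
uf.union(1, 4)
print(uf.component)   # [4, 4, 2, 4, 4]
```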

---

## 📌 Optimized Union-Find using Reverse Mapping

### New Dictionaries:

* `component[i]`: Which component vertex i belongs to.
* `members[c]`: List of all vertices in component c.
* `size[c]`: Number of vertices in component c.

### Improved Union(i, j):

* Rename the smaller component into the larger one:

  * If `size[c_old] ≤ size[c_new]`: move `c_old` → `c_new`
  * Else: swap the roles and move `c_new` → `c_old`
* Only iterate over the members of the component being renamed (`members[c_old]` in the first case), not all vertices.
* Update, for each `k` in `members[c_old]`:

  * `component[k] = c_new`
  * `members[c_new].append(k)`
  * `size[c_new] += 1`

### Benefit:

* Each `Union` touches only the elements of the smaller component.
* Whenever a vertex is relabeled, the component containing it at least doubles in size.
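
The three dictionaries and the smaller-into-larger rule might look like this in Python. It is a sketch under the same assumption as before (vertices `0..n-1`), and the class name `UnionFind` is hypothetical. Deleting the emptied component's entries is a housekeeping choice of this sketch, not something the notes require.

```python
class UnionFind:
    """Union-find with reverse mapping and union-by-size."""

    def __init__(self, n):
        self.component = {i: i for i in range(n)}   # vertex -> component name
        self.members = {i: [i] for i in range(n)}   # component -> its vertices
        self.size = {i: 1 for i in range(n)}        # component -> vertex count

    def find(self, i):
        return self.component[i]

    def union(self, i, j):
        c_old, c_new = self.component[i], self.component[j]
        if c_old == c_new:
            return
        # Make sure c_old names the smaller component.
        if self.size[c_old] > self.size[c_new]:
            c_old, c_new = c_new, c_old
        # Relabel only the members of the smaller component.
        for k in self.members[c_old]:
            self.component[k] = c_new
            self.members[c_new].append(k)
            self.size[c_new] += 1
        # The old component no longer exists.
        del self.members[c_old]
        del self.size[c_old]


uf = UnionFind(6)
uf.union(0, 1)    # {0} merges into {1}
uf.union(2, 1)    # {2} merges into {1, 0}
uf.union(1, 3)    # {3} is smaller, so it merges into component 1
print(uf.find(3), uf.size[1], sorted(uf.members[1]))   # 1 4 [0, 1, 2, 3]
```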

---

## 📌 Key Theoretical Insight

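### Lemma: Over a sequence of m `Union` operations, a vertex changes its component label at most **O(log m)** times.

* Reason: Each time a vertex is relabeled, it is merged into a component at least twice as large as its old one.
* So, for any element:

  * Component sizes: 1 → 2 → 4 → 8 → ... → ≤ 2m (m unions touch at most 2m elements)
  * ⟹ At most **log₂(2m) = 1 + log₂ m = O(log m)** changes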

### Total Relabelings:

* m `Union` operations touch at most 2m distinct elements, so at most 2m elements are ever relabeled.
* Each gets relabeled ≤ log m times
* ⇒ Total relabeling work = O(m log m)
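
The bound can be checked empirically. The sketch below is an illustration, not a proof: it merges `n` singletons in random order with the smaller-into-larger rule and counts how often each vertex is relabeled. Here m = n − 1 unions are performed, and the helper name `max_relabels` and the seed are arbitrary choices for the example.

```python
import math
import random


def max_relabels(n, seed=0):
    """Merge n singletons randomly; return (max relabels per vertex, total)."""
    rng = random.Random(seed)
    component = list(range(n))
    members = {i: [i] for i in range(n)}
    relabels = [0] * n
    merged = 0
    while merged < n - 1:                 # n-1 unions leave one component
        i, j = rng.randrange(n), rng.randrange(n)
        c_old, c_new = component[i], component[j]
        if c_old == c_new:
            continue                      # same component, nothing to merge
        if len(members[c_old]) > len(members[c_new]):
            c_old, c_new = c_new, c_old   # always relabel the smaller side
        for k in members[c_old]:
            component[k] = c_new
            relabels[k] += 1
        members[c_new].extend(members.pop(c_old))
        merged += 1
    return max(relabels), sum(relabels)


n = 1024
m = n - 1                                 # number of unions performed
worst, total = max_relabels(n)
print(worst <= math.log2(2 * m))          # True
print(total <= 2 * m * math.log2(2 * m))  # True
```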

### Final Complexities:

| Operation        | Time                   |
| ---------------- | ---------------------- |
| MakeUnionFind(n) | O(n)                   |
| Find(x)          | O(1)                   |
| Union(x, y)      | **O(log m)** amortized |
| All Unions       | O(m log m) total       |

---

## 📌 Final Kruskal's Algorithm Complexity

1. Sort edges → O(m log m)
2. MakeUnionFind → O(n)
3. Two `Find` calls for each of the m edges → O(m)
4. `Union` for the n−1 edges accepted into the MST → O(n log n) total

Since `m ≤ n²`, we get:

* `log m ≤ 2 log n` ⇒ `log m = O(log n)`

### Total Time:

**O(m log n)**
(Using amortized union-find with union-by-size)

---

## 📌 Further Optimization (Advanced Idea - Not implemented here)

### Union-Find with Trees:

* Represent components as trees (with root as representative).
* `Find(x)`: Traverse parent pointers to root.
* `Union(x, y)`: Find both roots, then attach the root of the smaller tree as a child of the larger tree's root.

### With Path Compression:

* Flatten the tree during `Find(x)` by pointing every visited node directly at the root.
* Amortized complexity (when combined with the union-by-size rule above):

  * `Find`: O(α(n)) (inverse Ackermann, practically constant)
  * `Union`: O(α(n))
* Very efficient in practice.
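
Although these notes do not implement the tree version, a minimal sketch shows how little code it takes. It assumes vertices `0..n-1`, and the class name `TreeUnionFind` and the boolean return value of `union` are illustrative choices rather than anything fixed by the notes.

```python
class TreeUnionFind:
    """Tree-based union-find with union-by-size and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))   # a root is its own parent
        self.size = [1] * n            # only meaningful at roots

    def find(self, x):
        # First pass: walk parent pointers up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point visited nodes at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False               # already connected
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx            # make rx the root of the larger tree
        self.parent[ry] = rx           # attach the smaller tree under it
        self.size[rx] += self.size[ry]
        return True


uf = TreeUnionFind(6)
uf.union(0, 1)
uf.union(2, 3)
uf.union(1, 3)          # joins the trees rooted at 0 and 2
print(uf.find(3))       # 0
print(uf.parent[3])     # 0 (compressed: 3 now points straight at the root)
```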

---

## ✅ Summary Table

| Version               | Find Time | Union Time | Total Time (Kruskal) |
| --------------------- | --------- | ---------- | -------------------- |
| Naive                 | O(1)      | O(n)       | O(mn)                |
| Union-by-size         | O(1)      | O(log m)   | O(m log n)           |
| With Path Compression | O(α(n))   | O(α(n))    | O(m log n) (sorting dominates; union-find part is O(m α(n))) |

---