# PDSA Notes: Divide and Conquer - Counting Inversions

## 1. Introduction to Divide and Conquer

**Divide and Conquer** is an algorithmic design paradigm where a problem is broken into smaller, disjoint sub-problems, each solved independently, and the results are then combined efficiently.

### Key Steps:

* **Divide:** Split the problem into sub-problems.
* **Conquer:** Solve each sub-problem independently.
* **Combine:** Merge the sub-solutions into a final result.

### Classic Examples:

* **Merge Sort:** Splits the list into halves, sorts each half, and merges them.
* **Quick Sort:** Partitions based on a pivot, then recursively sorts partitions.

---

## 2. Real-world Motivation: Recommender Systems

* Systems like Amazon and Netflix recommend items by comparing user profiles.
* A profile is built from a user's **ranking or rating** of items (e.g., movies).
* To find similar users, we compare rankings to determine how alike they are.

---

## 3. Comparing Rankings: Inversions

Suppose 5 movies A, B, C, D, E are ranked by two users:

* **You:** D, B, C, A, E → label the movies by your order (D = 1, B = 2, C = 3, A = 4, E = 5), giving (1, 2, 3, 4, 5)
* **Friend:** B, A, C, D, E → in your labels, (2, 4, 3, 1, 5)

### Definitions:

* **Inversion:** A pair of items (i, j) such that:

  * You prefer i over j
  * Your friend prefers j over i

### Example:

From your ranking (1,2,3,4,5) and your friend's (2,4,3,1,5):

* Inversions: (1,2), (1,3), (1,4), (3,4) → Total: 4 (your friend places 1 after 2, 3, and 4, and places 3 after 4)

### Range of Inversions:

* **Minimum:** 0 (identical rankings)
* **Maximum:** n(n-1)/2 (completely opposite rankings)

This count becomes a **measure of dissimilarity**.
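
To see where the maximum comes from: every pair of items is either ordered the same way by both users or inverted, so the count can never exceed the number of pairs. A short worked instance for the five-movie example:

$$
\binom{n}{2} = \frac{n(n-1)}{2}, \qquad n = 5 \;\Rightarrow\; \frac{5 \cdot 4}{2} = 10
$$

So the 4 inversions in the example are 4 disagreements out of 10 possible pairs.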

---

## 4. Problem Reformulation

* Fix your ranking as 1 to n.
* Your friend's ranking becomes a **permutation** of 1 to n.
* Count how many pairs (i, j) exist such that:

  * i < j but in the permutation, i appears after j.

This is the number of **inversions** in the permutation.
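
The relabelling step can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `to_permutation` and the use of lists of movie names are assumptions made for the example.

```python
def to_permutation(yours, friends):
    """Relabel the friend's ranking using positions in your ranking (1-based)."""
    # label[item] = rank of item in your list, so your own ranking becomes 1..n
    label = {item: rank for rank, item in enumerate(yours, start=1)}
    return [label[item] for item in friends]


yours = ["D", "B", "C", "A", "E"]
friends = ["B", "A", "C", "D", "E"]
print(to_permutation(yours, friends))  # [2, 4, 3, 1, 5]
```

The result matches the permutation used in the movie example, so counting inversions in this list answers the original question about the two rankings.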

---

## 5. Brute Force Approach

* Compare all pairs (i, j), i < j
* Check if i appears after j in the permutation
* **Time complexity:** O(n^2), since all n(n-1)/2 pairs are checked
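
A minimal sketch of the brute-force approach, assuming the permutation is given as a Python list. The function name `inversions_brute_force` is hypothetical. It loops over pairs of positions rather than pairs of values, which finds the same inversions: a larger value sitting before a smaller one.

```python
def inversions_brute_force(perm):
    """Return every inversion as a pair (i, j) with i < j and i placed after j."""
    pairs = []
    n = len(perm)
    for a in range(n):              # position of the earlier element
        for b in range(a + 1, n):   # position of a later element
            if perm[a] > perm[b]:   # larger value comes first => inversion
                pairs.append((perm[b], perm[a]))
    return pairs


pairs = inversions_brute_force([2, 4, 3, 1, 5])
print(sorted(pairs))  # [(1, 2), (1, 3), (1, 4), (3, 4)]
print(len(pairs))     # 4
```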

---

## 6. Efficient Approach: Divide and Conquer (Merge Sort Style)

### Strategy:

* Use a modified **Merge Sort** to count inversions efficiently.
* **Divide:** Split array into two halves.
* **Conquer:** Recursively count inversions in both halves.
* **Combine:** Count inversions **across** the left and right halves during merge.

### Key Insight During Merge:

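* When the merge picks an element from the **right half** while elements of the **left half** are still waiting, the picked element forms an inversion with each of those waiting elements.
* Both halves are sorted, so every element still waiting in the left half is greater than the picked one: add the number of remaining left elements to the count.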
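
The insight can be checked directly on two small sorted halves. The sketch below is only an illustration: the values and the helper name `left_elements_greater_than` are made up for the example. It lists, for each right element, the left elements greater than it, and confirms that they always form a suffix of the sorted left half.

```python
def left_elements_greater_than(left, y):
    """Elements of the sorted list `left` that are greater than y."""
    return [x for x in left if x > y]


left = [2, 5, 9]   # sorted left half (illustrative values)
right = [1, 4, 6]  # sorted right half (illustrative values)

total = 0
for y in right:
    greater = left_elements_greater_than(left, y)
    # Because left is sorted, the greater elements are always a suffix of left.
    assert greater == left[len(left) - len(greater):]
    total += len(greater)
    print(y, greater)

print(total)  # 6
```

Since the greater elements are always a suffix, the merge never needs to scan them: their number is just the count of left elements not yet merged.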

---

## 7. Merge and Count: Example

Arrays:

* Left: \[1, 3, 7]
* Right: \[2, 6, 8]

Steps:

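1. Compare 1 and 2 → 1 < 2, take 1, no inversion.
2. Compare 3 and 2 → 3 > 2, take 2 → 2 has overtaken 3 and 7 → 2 inversions.
3. Compare 3 and 6 → 3 < 6, take 3, no inversion.
4. Compare 7 and 6 → 7 > 6, take 6 → 6 has overtaken 7 → 1 inversion.
5. Compare 7 and 8 → 7 < 8, take 7, then append the leftover 8, no inversion.

Merged list: \[1, 2, 3, 6, 7, 8]. Total cross-inversions counted during the merge: 2 + 1 = 3.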
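
The merge on these two lists can be replayed with a small tracing sketch. The function name `merge_and_count_trace` and the `steps` log are assumptions added for illustration. Each log entry records the value picked and how many inversions that pick contributed.

```python
def merge_and_count_trace(a, b):
    """Merge sorted lists a (left) and b (right).

    Returns (merged, cross_inversions, steps), where steps is a list of
    (picked_value, inversions_added) pairs in merge order.
    """
    merged, steps, count = [], [], 0
    i = j = 0
    while i < len(a) or j < len(b):
        # Take from the right only if it is strictly smaller (ties go left).
        take_right = j < len(b) and (i == len(a) or b[j] < a[i])
        if take_right:
            added = len(a) - i      # every left element still waiting is overtaken
            merged.append(b[j])
            j += 1
        else:
            added = 0
            merged.append(a[i])
            i += 1
        count += added
        steps.append((merged[-1], added))
    return merged, count, steps


merged, count, steps = merge_and_count_trace([1, 3, 7], [2, 6, 8])
print(merged)  # [1, 2, 3, 6, 7, 8]
print(count)   # 3
print(steps)   # [(1, 0), (2, 2), (3, 0), (6, 1), (7, 0), (8, 0)]
```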

---

## 8. Implementation Overview

**Function: merge\_and\_count(A, B)**

* A, B: sorted lists
* Merge them into sorted list C
* Each time an element of B (right list) is picked while elements of A (left list) remain:

  * Add the number of remaining elements in A to the count

**Function: sort\_and\_count(arr)**

* If list size ≤ 1: return list, 0
* Split list into two halves
* Recursively sort\_and\_count each half
* Use merge\_and\_count to merge and count cross-inversions
* Total inversion = left inversions + right inversions + cross-inversions
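
One possible Python rendering of the two functions outlined above. The function names follow the outline, but details such as breaking ties in favour of the left list and returning a `(sorted_list, count)` tuple are choices made for this sketch.

```python
def merge_and_count(a, b):
    """Merge sorted lists a (left) and b (right); return (merged, cross_inversions)."""
    merged = []
    count = 0
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:            # ties go left, so equal values are not inversions
            merged.append(a[i])
            i += 1
        else:
            count += len(a) - i     # b[j] overtakes everything still waiting in a
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])            # at most one of these two is non-empty
    merged.extend(b[j:])
    return merged, count


def sort_and_count(arr):
    """Return (sorted copy of arr, number of inversions in arr)."""
    if len(arr) <= 1:
        return list(arr), 0
    mid = len(arr) // 2
    left, left_count = sort_and_count(arr[:mid])
    right, right_count = sort_and_count(arr[mid:])
    merged, cross_count = merge_and_count(left, right)
    return merged, left_count + right_count + cross_count


print(sort_and_count([2, 4, 3, 1, 5]))  # ([1, 2, 3, 4, 5], 4)
```

The slices `arr[:mid]` and `arr[mid:]` copy the halves, which keeps the sketch short; an index-based version would avoid the copies without changing the O(n log n) bound.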

### Time Complexity:

* Same recurrence as Merge Sort:

  * T(n) = 2T(n/2) + O(n)
  * **T(n) = O(n log n)**
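
A sketch of why the recurrence solves to O(n log n), under two simplifying assumptions: n is a power of 2, and the merge cost is written as cn for some constant c. Unrolling the recurrence k times gives:

$$
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^k\,T(n/2^k) + k\,cn
\end{aligned}
$$

Setting $k = \log_2 n$ reaches the base case: $T(n) = n\,T(1) + cn \log_2 n = O(n \log n)$.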

---

## 9. Comparison: Brute Force vs Divide and Conquer

| Approach       | Output                      | Time Complexity |
| -------------- | --------------------------- | --------------- |
| Brute Force    | Explicit list of inversions | O(n^2)          |
| Divide-Conquer | Total inversion count       | O(n log n)      |

* Brute force gives **all inversion pairs**, but is slow.
* Divide and conquer gives **just the count**, but is fast.
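
A quick way to gain confidence that the two approaches agree is to compare their counts on random permutations. The sketch below is illustrative only: the function names, the fixed seed, and the sizes are arbitrary choices, and the merge is written inline to keep the block self-contained.

```python
import random


def count_brute_force(perm):
    """Count inversions by checking every pair of positions: O(n^2)."""
    n = len(perm)
    return sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])


def sort_and_count(arr):
    """Return (sorted copy, inversion count) using a merge-sort style recursion."""
    if len(arr) <= 1:
        return list(arr), 0
    mid = len(arr) // 2
    left, left_count = sort_and_count(arr[:mid])
    right, right_count = sort_and_count(arr[mid:])
    merged, count = [], left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            count += len(left) - i   # cross-inversions for this right element
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]
    return merged, count


rng = random.Random(0)               # fixed seed so the check is repeatable
for _ in range(100):
    perm = list(range(1, 51))
    rng.shuffle(perm)
    assert sort_and_count(perm)[1] == count_brute_force(perm)
print("both methods agree on 100 random permutations")
```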

---

## 10. Application in Recommender Systems

* We do **not** need the exact pairs of disagreements.
* We only need the **count of inversions** to determine **similarity**.
* The divide and conquer method is **efficient** for large datasets: O(n log n) rather than O(n^2).

---

## 11. Summary

* Inversions measure dissimilarity in rankings.
* Naive comparison takes O(n^2) time.
* Modified merge sort (sort and count) achieves O(n log n).
* Efficient for applications like recommender systems where only count matters.

---

## 12. Final Notes

* Divide and conquer is powerful when the **combine** step is cheap; here the merge runs in O(n) time.
* Merge and count combines sorting with inversion counting.
* Can be extended to other problems involving **pairwise relationships**.

---

End of Notes.
