# Sorting Algorithms (part 4)

## Merge Sort

* To sort **A** into **B**, both of length $n$
* If $n \leq 1$, nothing is to be done
* Otherwise
  - Sort `A[:n//2]` into **L**
  - Sort `A[n//2:]` into **R**
  - Merge **L** and **R** into **B**

Merging two sorted lists **A** and **B** into **C**

* If **A** is empty, copy **B** into **C**
* If **B** is empty, copy **A** into **C**
* Otherwise, compare the first elements of **A** and **B**
  - Move the smaller of the two to **C**
* Repeat till all the elements of **A** and **B** have been moved
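
The two procedures above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function names `merge` and `merge_sort` match those used in the analysis below, but the choice to return a new list (instead of filling a preallocated **B** or **C**) and to take from **A** on ties are assumptions made here.

```python
def merge(A, B):
    """Merge two sorted lists A and B into a new sorted list C."""
    m, n = len(A), len(B)
    C = []
    i, j = 0, 0  # current positions in A and B
    while i + j < m + n:
        if i == m:            # A is exhausted: copy the rest of B
            C.extend(B[j:])
            j = n
        elif j == n:          # B is exhausted: copy the rest of A
            C.extend(A[i:])
            i = m
        elif A[i] <= B[j]:    # assumption: take from A on ties
            C.append(A[i])
            i += 1
        else:
            C.append(B[j])
            j += 1
    return C


def merge_sort(A):
    """Return a new sorted list containing the elements of A."""
    n = len(A)
    if n <= 1:
        return list(A)        # nothing is to be done
    L = merge_sort(A[:n // 2])   # sort the left half
    R = merge_sort(A[n // 2:])   # sort the right half
    return merge(L, R)
```

For example, `merge_sort([5, 2, 9, 1])` splits the input into `[5, 2]` and `[9, 1]`, sorts each half recursively, and merges the two sorted halves.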

#### Analysing `merge()`

* Merge **A** of length `m`, **B** of length `n`
* Output list **C** has length $m + n$
* In each iteration, we add (at least) one element to **C**
* Hence, `merge()` takes time $O(m+n)$
* Recall that $m + n \leq 2 \max(m, n)$
* If $m \approx n$, `merge()` takes time $O(n)$

#### Analysing `merge_sort()`

* Let $T(n)$ be the time taken for input of size $n$
  - For simplicity, assume $n = 2^{k}$ for some $k$
* Recurrence
  - $T(0) = T(1) = 1$
  - $T(n) = 2T(n/2) + n$
    - Solve two sub-problems of size $n/2$
    - Merge the solutions in time $n/2 + n/2 = n$
* Unwind the recurrence to solve

![Analysis of Merge Sort](https://firebasestorage.googleapis.com/v0/b/kashif-resume.appspot.com/o/analysis.png?alt=media&token=7230156d-c072-407d-a457-cf4938dc74c5)
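
One way to write out the unwinding, using only the recurrence $T(n) = 2T(n/2) + n$, the base case $T(1) = 1$, and the assumption $n = 2^{k}$ stated above:

$$
\begin{aligned}
T(n) &= 2T(n/2) + n \\
     &= 2\left[2T(n/4) + n/2\right] + n = 2^{2}\,T(n/2^{2}) + 2n \\
     &= 2^{2}\left[2T(n/2^{3}) + n/2^{2}\right] + 2n = 2^{3}\,T(n/2^{3}) + 3n \\
     &\;\;\vdots \\
     &= 2^{j}\,T(n/2^{j}) + jn
\end{aligned}
$$

Setting $j = k = \log_2 n$ gives $T(n/2^{k}) = T(1) = 1$, so $T(n) = 2^{k} + kn = n + n \log_2 n$, which is $O(n \log n)$.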

### Summary

* Merge sort takes time $O(n \log n)$, so it can be used effectively on large inputs
* Variations on merge are possible
  - Union of two sorted lists - discard duplicates: if `A[i] == B[j]`, move just one copy to **C** and increment both `i` and `j`
  - Intersection of two sorted lists - when `A[i] == B[j]`, move one copy to **C** and increment both `i` and `j`; otherwise discard the smaller of `A[i]`, `B[j]`
  - List difference - elements in **A** but not in **B**
* Merge needs to create a new list to hold the merged elements
  - No obvious way to efficiently merge two lists in place
  - Extra storage can be costly
* Inherently recursive
  - Recursive calls and returns are expensive
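
The variations on merge listed above (union, intersection, list difference) can be sketched as follows. The function names are hypothetical, and the sketch assumes each input list is sorted and contains no repeated elements within itself; with internal repeats, the behaviour would need to be specified more carefully.

```python
def union(A, B):
    """Elements in A or B, with common elements kept once."""
    i, j, C = 0, 0, []
    while i < len(A) and j < len(B):
        if A[i] == B[j]:          # common element: keep one copy
            C.append(A[i])
            i += 1
            j += 1
        elif A[i] < B[j]:
            C.append(A[i])
            i += 1
        else:
            C.append(B[j])
            j += 1
    C.extend(A[i:])               # at most one of these is non-empty
    C.extend(B[j:])
    return C


def intersection(A, B):
    """Elements present in both A and B."""
    i, j, C = 0, 0, []
    while i < len(A) and j < len(B):
        if A[i] == B[j]:          # common element: keep one copy
            C.append(A[i])
            i += 1
            j += 1
        elif A[i] < B[j]:         # discard the smaller element
            i += 1
        else:
            j += 1
    return C


def difference(A, B):
    """Elements in A but not in B."""
    i, j, C = 0, 0, []
    while i < len(A) and j < len(B):
        if A[i] == B[j]:          # present in B: skip it
            i += 1
            j += 1
        elif A[i] < B[j]:         # A[i] cannot appear later in B
            C.append(A[i])
            i += 1
        else:
            j += 1
    C.extend(A[i:])               # B is exhausted: the rest of A survives
    return C
```

Each variation makes a single pass over both lists, so it runs in time $O(m+n)$, the same as `merge()`.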