# Mergesort

**Mergesort** is one of two classic sorting algorithms (along with quicksort) that are still heavily used more than 50 years after their invention. Both are critical components in the world's computational infrastructure.

The basic idea of mergesort is to divide the array into two halves, recursively sort each half, then merge the two halves. Merging is the key step: given two sorted subarrays `a[lo]` to `a[mid]` and `a[mid+1]` to `a[hi]`, the goal is to replace them with a single sorted subarray `a[lo]` to `a[hi]`.

It is good practice to put assertions into your code to check that what you expect is true (here, that the two subarrays passed to the merge function are actually sorted). In Python, assertions are enabled by default and can be disabled at runtime with the `-O` flag.
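As a minimal sketch of such a check, the helpers below test the merge precondition before any merging happens. The names `is_sorted` and `check_merge_preconditions` are hypothetical and not part of the code in these notes; indices are assumed to be inclusive.

```python
def is_sorted(arr, lo, hi):
    """Return True if arr[lo..hi] (inclusive) is in nondecreasing order."""
    return all(arr[i] <= arr[i + 1] for i in range(lo, hi))


def check_merge_preconditions(arr, mid):
    """Assert that both halves handed to a merge step are sorted."""
    assert is_sorted(arr, 0, mid), "left half is not sorted"
    assert is_sorted(arr, mid + 1, len(arr) - 1), "right half is not sorted"


# Both halves are sorted, so this passes silently
check_merge_preconditions(['A', 'G', 'L', 'O', 'R', 'H', 'I', 'M', 'S', 'T'], 4)
```

Calling `check_merge_preconditions` on a list whose left half starts with `'G', 'A'` would raise an `AssertionError`, because `'G' > 'A'` violates the precondition.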

The next step is to implement the **sort** part of the procedure, which recursively splits the array in half, sorts each half, then merges the sorted halves.

In [29]:
def merge(arr, mid):
    '''
    Assumes there's a midpoint mid such that the subarray from the indices
    0 to mid is sorted and the subarray from indices mid+1 to the end is
    sorted. Returns the sorted array
    '''
    if len(arr) == 1:
        return arr
    lo = 0
    hi = len(arr) - 1
#     assert all(arr[i] <= arr[i+1] for i in range(lo, mid))
#     assert all(arr[j] <= arr[j+1] for j in range(mid + 1, hi))
    
    aux = arr[lo:hi + 1]
    i, j = lo, mid + 1
    for k in range(lo, hi + 1):
        if i > mid:
            # i index is exhausted
            arr[k] = aux[j]
            j += 1
        elif j > hi:
            # j index is exhausted
            arr[k] = aux[i]
            i += 1
        elif aux[j] < aux[i]:
            arr[k] = aux[j]
            j += 1
        else:
            arr[k] = aux[i]
            i += 1

    return arr

a = ['A', 'G', 'L', 'O', 'R', 'H', 'I', 'M', 'S', 'T']
mid = 4

print("Merge: {}".format(merge(a, mid)))

# Force an assertion error
# b = ['G', 'A', 'L', 'R', 'O', 'H', 'I', 'M', 'S', 'T']
# merge(b, mid)

Merge: ['A', 'G', 'H', 'I', 'L', 'M', 'O', 'R', 'S', 'T']


In [26]:
def mergeSort(arr):
    # Arrays of length 0 or 1 are already sorted
    if len(arr) <= 1:
        return arr
    else:
        # Index of the last element of the left half
        mid = (len(arr) - 1) // 2
        return merge(mergeSort(arr[:mid + 1]) + mergeSort(arr[mid + 1:]), mid)


c = ['A', 'G', 'L', 'O', 'R', 'H', 'I', 'M', 'S', 'T']
mergeSort(c)

['A', 'G', 'H', 'I', 'L', 'M', 'O', 'R', 'S', 'T']

## Mergesort Analysis

Mergesort is a **divide and conquer** algorithm. Good algorithms are more effective than supercomputers: sorting 1 billion items takes on the order of $N^2 = 10^{18}$ compares with insertion sort, which is centuries of work on a home computer (assuming roughly $10^8$ compares per second), whereas mergesort needs about $N \lg N \approx 3 \times 10^{10}$ compares, a matter of minutes.

The proposition is that mergesort uses at most $N \lg N$ compares and $6 N \lg N$ array accesses to sort any array of size $N$.
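The compare bound can be checked empirically. The sketch below is an instrumented variant written for this purpose, not the `mergeSort` from these notes: `merge_sort_count` and `max_compares_observed` are hypothetical names, the function returns a sorted copy instead of sorting in place, and only compares are counted (array accesses are not).

```python
import math
import random


def merge_sort_count(arr):
    """Return (sorted copy of arr, number of compares used)."""
    if len(arr) <= 1:
        return list(arr), 0
    mid = len(arr) // 2
    left, c_left = merge_sort_count(arr[:mid])
    right, c_right = merge_sort_count(arr[mid:])
    merged, compares = [], c_left + c_right
    i = j = 0
    while i < len(left) and j < len(right):
        compares += 1  # exactly one compare per loop iteration
        if right[j] < left[i]:
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    # One side is exhausted; the remainder is copied without compares
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, compares


def max_compares_observed(n, trials=20, seed=0):
    """Largest compare count seen over several random inputs of size n."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        data = [rng.random() for _ in range(n)]
        result, compares = merge_sort_count(data)
        assert result == sorted(data)
        worst = max(worst, compares)
    return worst


# Print: size, worst observed compares, and the N lg N bound
for n in (1, 2, 10, 100, 1000):
    print(n, max_compares_observed(n), round(n * math.log2(n), 1))
```

Random inputs only sample the behaviour, so this illustrates the bound rather than proving it.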

**Running time**: The cost can be described with recurrence relations. The number of compares $C(N)$ and array accesses $A(N)$ needed to mergesort an array of size $N$ satisfy the recurrences:

$$
\begin{aligned}
C(N) &\leq C(\lceil N/2 \rceil) + C(\lfloor N/2 \rfloor) + N \text{ for } N > 1 \text{, with } C(1) = 0 \\
A(N) &\leq A(\lceil N/2 \rceil) + A(\lfloor N/2 \rfloor) + 6N \text{ for } N > 1 \text{, with } A(1) = 0
\end{aligned}
$$
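
As a sketch of why the compare recurrence gives $N \lg N$, make two simplifying assumptions: $N$ is a power of 2, and the bound holds with equality. Writing $D(N)$ for that worst-case count (a symbol introduced here only for illustration) and dividing both sides by $N$, the recurrence telescopes:

$$
\begin{aligned}
D(N) &= 2\,D(N/2) + N, \quad D(1) = 0 \\
\frac{D(N)}{N} &= \frac{D(N/2)}{N/2} + 1 \\
&= \frac{D(N/4)}{N/4} + 1 + 1 \\
&\;\;\vdots \\
&= \frac{D(1)}{1} + \lg N = \lg N
\end{aligned}
$$

Hence $D(N) = N \lg N$. The same steps with $6N$ in place of $N$ give $6 N \lg N$ for the array accesses.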

**Memory usage**: Mergesort uses extra space proportional to $N$ because the `aux[]` array needs to be of size $N$ for the last merge. **NOT in place**.

**Practical improvements**:

- Use insertion sort for small subarrays, since mergesort has too much overhead for them. Have a cutoff below which insertion sort takes over (7 items or fewer)
- Stop if already sorted - this happens if the biggest item in the first half is $\leq$ the smallest item in the second half. This helps for partially sorted arrays
- Eliminate the copy to the auxiliary array (saves time, not space) by switching the role of the input and auxiliary arrays in each recursive call
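
The first two improvements can be sketched as follows. This is an in-place, index-based variant written for illustration and not the `mergeSort` from these notes: the names `CUTOFF`, `insertion_sort`, `merge_range`, and `merge_sort_improved` are hypothetical, and the cutoff value of 7 is taken from the list above rather than measured. The third improvement (swapping the roles of the two arrays) is not shown.

```python
CUTOFF = 7  # assumed threshold; must be at least 1, tune empirically


def insertion_sort(arr, lo, hi):
    """Sort arr[lo..hi] (inclusive) in place."""
    for i in range(lo + 1, hi + 1):
        item = arr[i]
        j = i
        while j > lo and item < arr[j - 1]:
            arr[j] = arr[j - 1]
            j -= 1
        arr[j] = item


def merge_range(arr, aux, lo, mid, hi):
    """Merge sorted arr[lo..mid] and arr[mid+1..hi] in place, using aux."""
    aux[lo:hi + 1] = arr[lo:hi + 1]
    i, j = lo, mid + 1
    for k in range(lo, hi + 1):
        if i > mid:
            arr[k] = aux[j]
            j += 1
        elif j > hi:
            arr[k] = aux[i]
            i += 1
        elif aux[j] < aux[i]:
            arr[k] = aux[j]
            j += 1
        else:
            arr[k] = aux[i]
            i += 1


def _sort(arr, aux, lo, hi):
    # Improvement 1: small subarrays are handled by insertion sort
    if hi - lo + 1 <= CUTOFF:
        insertion_sort(arr, lo, hi)
        return
    mid = lo + (hi - lo) // 2
    _sort(arr, aux, lo, mid)
    _sort(arr, aux, mid + 1, hi)
    # Improvement 2: skip the merge if the halves are already in order
    if arr[mid] <= arr[mid + 1]:
        return
    merge_range(arr, aux, lo, mid, hi)


def merge_sort_improved(arr):
    """Sort arr in place and return it."""
    aux = list(arr)  # auxiliary array allocated once, not per merge
    _sort(arr, aux, 0, len(arr) - 1)
    return arr


print(merge_sort_improved(['A', 'G', 'L', 'O', 'R', 'H', 'I', 'M', 'S', 'T']))
```

With the already-sorted check, an input that is fully in order never reaches `merge_range`, so no merge compares are spent on it.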

## Bottom-up Mergesort

**Bottom-up mergesort** is a version without recursion: it starts with subarrays of size one, merges them in pairs into sorted subarrays of size two, then merges those into subarrays of size four, and so on.

There are about $\lg N$ passes, each using about $N$ compares, for a total cost of about $N \lg N$ compares.
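
To make the pass count concrete, the short sketch below lists the subarray widths that a bottom-up mergesort would use for a given input size. The function name `bottom_up_passes` is hypothetical and introduced only for illustration.

```python
def bottom_up_passes(n):
    """Return the subarray widths used by each pass over n items."""
    widths = []
    sz = 1
    while sz < n:
        widths.append(sz)
        sz *= 2  # each pass doubles the width of the sorted subarrays
    return widths


print(bottom_up_passes(10))  # [1, 2, 4, 8]
```

The number of widths is $\lceil \lg N \rceil$, so 10 items need 4 passes and 1024 items need 10.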

In [42]:
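def mergeSortBU(arr):
    N = len(arr)
    sz = 1  # width of the sorted subarrays being merged: 1, 2, 4, ...
    while sz < N:
        print("size: {}".format(sz))
        # Merge adjacent subarrays pairwise; the prints trace each merged slice
        for lo in range(0, N - sz, sz + sz):
            mid = sz - 1  # last index of the left subarray, relative to the slice
            hi = min(lo + sz + sz - 1, N - 1)
            merged = merge(arr[lo:hi + 1], mid)
            print("Merging {} with mid {}".format(merged, mid))
            arr[lo:hi + 1] = merged
            print("Arr: {}".format(arr))
        sz *= 2
    return arr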

d = ['A', 'G', 'L', 'O', 'R', 'H', 'I', 'M', 'S', 'T']
print("Bottom-up mergesort: {}".format(mergeSortBU(d)))

size: 1
Merging ['A', 'G'] with mid 0
Arr: ['A', 'G', 'L', 'O', 'R', 'H', 'I', 'M', 'S', 'T']
Merging ['L', 'O'] with mid 0
Arr: ['A', 'G', 'L', 'O', 'R', 'H', 'I', 'M', 'S', 'T']
Merging ['H', 'R'] with mid 0
Arr: ['A', 'G', 'L', 'O', 'H', 'R', 'I', 'M', 'S', 'T']
Merging ['I', 'M'] with mid 0
Arr: ['A', 'G', 'L', 'O', 'H', 'R', 'I', 'M', 'S', 'T']
Merging ['S', 'T'] with mid 0
Arr: ['A', 'G', 'L', 'O', 'H', 'R', 'I', 'M', 'S', 'T']
size: 2
Merging ['A', 'G', 'L', 'O'] with mid 1
Arr: ['A', 'G', 'L', 'O', 'H', 'R', 'I', 'M', 'S', 'T']
Merging ['H', 'I', 'M', 'R'] with mid 1
Arr: ['A', 'G', 'L', 'O', 'H', 'I', 'M', 'R', 'S', 'T']
size: 4
Merging ['A', 'G', 'H', 'I', 'L', 'M', 'O', 'R'] with mid 3
Arr: ['A', 'G', 'H', 'I', 'L', 'M', 'O', 'R', 'S', 'T']
size: 8
Merging ['A', 'G', 'H', 'I', 'L', 'M', 'O', 'R', 'S', 'T'] with mid 7
Arr: ['A', 'G', 'H', 'I', 'L', 'M', 'O', 'R', 'S', 'T']
Bottom-up mergesort: ['A', 'G', 'H', 'I', 'L', 'M', 'O', 'R', 'S', 'T']
