## 14.5 Merge sort

Let's now look at a divide-and-conquer approach to sorting.

### 14.5.1 Algorithm

**Merge sort** divides the sequence into two halves, sorts each one recursively,
and merges them by repeatedly taking the smaller of the two halves' first remaining items until both are empty.
The base cases are sequences of length&nbsp;0 or 1,
as they can't be split into smaller sequences.
When the length is odd, it doesn't matter which half has one element more.
The next figure shows the process applied to our familiar example.

![This diagram shows seven rows of sequences of letters.
When a sequence is split, it is connected with lines to
the two resulting sequences in the next row.
When two sequences are merged, they are connected with straight lines to
the resulting sequence in the next row.
The first row shows sequence SORTING.
The second row shows it has been split into SORT and ING.
The third row shows the first sequence is split into SO and RT,
and the second into IN and G.
In the fourth row, SO, RT and IN are further split into single letters.
In the fifth row, S and O are merged into OS, R and T into RT,
and I and N into IN.
In the sixth row, OS and RT are merged into ORST,
and IN is merged with G into GIN.
In the final seventh row, ORST and GIN are merged into GINORST,
the sorted sequence of letters.](14_5_mergesort.png)

Here's the recursive algorithm for mergesort(_unsorted_, _key_), where _n_ is the length of _unsorted_:

1. if _n_ < 2:
   1. let _sorted_ be _unsorted_
   2. stop
1. let _middle_ be floor(_n_ / 2)
1. let _left sorted_ be mergesort(_unsorted_[0:*middle*], _key_)
1. let _right sorted_ be mergesort(_unsorted_[_middle_:*n*], _key_)
1. let _sorted_ be merge(_left sorted_, _right sorted_, _key_)
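To see the order in which the algorithm splits the input, here's a minimal sketch that records each recursive call instead of sorting. The function name `trace_splits` and its `depth` parameter are my own, for illustration only. Note that with _middle_ = floor(_n_ / 2) the left half is the shorter one when _n_ is odd, so SORTING is split into SOR and TING, whereas the figure puts the extra letter in the left half. As stated earlier, either choice works.

```python
def trace_splits(sequence: str, depth: int = 0) -> list:
    """Return (depth, subsequence) pairs in the order merge sort visits them."""
    visits = [(depth, sequence)]
    n = len(sequence)
    if n >= 2:  # sequences of length 0 or 1 aren't split further
        middle = n // 2
        visits = visits + trace_splits(sequence[0:middle], depth + 1)
        visits = visits + trace_splits(sequence[middle:n], depth + 1)
    return visits

# Print each subsequence, indented by its level of recursion.
for depth, subsequence in trace_splits('SORTING'):
    print('  ' * depth + subsequence)
```

Each printed line corresponds to one call of mergesort; the indentation shows the level of recursion.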

The previous M269 team produced a [visualisation](https://learn2.open.ac.uk/mod/oucontent/view.php?id=1827810&extra=thumbnail_idm45069228661520) of step&nbsp;5,
so that you can see it in more detail.
The code shown in the visualisation doesn't use a key function.
Here's the algorithm for merge(_left sorted_, _right sorted_, _key_),
with the precondition that both input sequences are sorted.

1. let _left index_ be 0
1. let _right index_ be 0
1. let _sorted_ be the empty sequence
1. while _left index_ < │*left sorted*│ and _right index_ < │*right sorted*│:
   1. let _left item_ be _left sorted_[*left index*]
   1. let _right item_ be _right sorted_[*right index*]
   1. if _key_(_left item_) < _key_(_right item_):
      1. append _left item_ to _sorted_
      2. let _left index_ be _left index_ + 1
   1. otherwise:
      1. append _right item_ to _sorted_
      2. let _right index_ be _right index_ + 1
1. for _index_ from _left index_ to │*left sorted*│ – 1:
   1. append _left sorted_[*index*] to _sorted_
1. for _index_ from _right index_ to │*right sorted*│ – 1:
   1. append _right sorted_[*index*] to _sorted_

Step&nbsp;4 and its sub-steps merge both sequences until one of them has been completely copied to the output.
The unprocessed items in the other sequence are then copied by either
step&nbsp;5 or step&nbsp;6. At most one of those for-loops will copy any items.
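As a worked example, here's a sketch of the final merge in the figure, of ORST and GIN. It follows the steps above but, to keep it short, it works on strings without a key function and uses slices instead of the two for-loops. The name `merge_letters` and the two extra return values are hypothetical; they are only there to show how many items are left over in each input when the while-loop ends.

```python
def merge_letters(left_sorted: str, right_sorted: str) -> tuple:
    """Merge two sorted strings; also report the leftover items of each input."""
    left_index = 0
    right_index = 0
    merged = []
    # Step 4: take the smaller front item until one input is used up.
    while left_index < len(left_sorted) and right_index < len(right_sorted):
        if left_sorted[left_index] < right_sorted[right_index]:
            merged.append(left_sorted[left_index])
            left_index = left_index + 1
        else:
            merged.append(right_sorted[right_index])
            right_index = right_index + 1
    # Steps 5 and 6: at most one of these slices is non-empty.
    left_rest = left_sorted[left_index:]
    right_rest = right_sorted[right_index:]
    merged.extend(left_rest)
    merged.extend(right_rest)
    return ''.join(merged), len(left_rest), len(right_rest)

print(merge_letters('ORST', 'GIN'))  # ('GINORST', 4, 0)
```

The while-loop copies G, I and N, which exhausts the right input, so all four letters of ORST are copied afterwards and nothing is left to copy from GIN.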

#### Exercise 14.5.1

Is merge sort stable? If not, what has to be changed?

_Write your answer here._

[Hint](../31_Hints/Hints_14_5_01.ipynb)
[Answer](../32_Answers/Answers_14_5_01.ipynb)

### 14.5.2 Complexity

Let's first think whether there are different best and worst cases.
Is merge sort adaptive?

___

Even if the input is sorted, it's repeatedly halved, sorted, and merged in
the same order. The algorithm isn't adaptive.

Let's first obtain the complexity in a visual way.
In the above figure, each level splits sequences
in half and a later corresponding level merges the sorted halves.
Each splitting and merging corresponds to one recursive call.
Since the input is always split in half, there are about log$_2$&nbsp;_n_ levels of recursive calls
until the sequences have length&nbsp;1 and start being merged.
In the example, _n_ = 7 and log$_2$&nbsp;7 ≈ 2.8, so there are 3 recursive call levels.
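To check the number of levels for other input sizes, here's a small sketch that follows the longer half at each split, since it determines how deep the recursion goes. The function name `recursion_levels` is my own. I compare the result to log$_2$&nbsp;_n_ rounded up, computed with the integer method `bit_length` to avoid floating-point rounding.

```python
def recursion_levels(n: int) -> int:
    """Return the number of levels of recursive calls for an input of length n."""
    if n < 2:
        return 0
    larger_half = n - n // 2  # the longer half determines the depth
    return 1 + recursion_levels(larger_half)

for n in (7, 8, 9, 1000):
    # (n - 1).bit_length() is log2(n) rounded up, for n >= 1
    print(n, recursion_levels(n), (n - 1).bit_length())
```

For _n_ = 7 both values are 3, matching the figure.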

Each level processes the whole input, of length _n_, albeit split into subsequences.
For example, the second level processes four sequences of one or two characters
each, while the third level processes seven sequences of one character each.
Processing each character takes constant time, both
when it's being copied from the input to an unsorted half and
when it's copied from a sorted half to the output.
The complexity is therefore log _n_ × _n_ × Θ(1) = Θ(_n_ log _n_).
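The reasoning can be checked empirically. The following sketch, under my own assumptions, counts how many items the merges copy to their outputs. The name `count_merge_copies` is hypothetical, the merge is simplified (no key function, slices for the leftover items), and the copies made when splitting aren't counted. When _n_ is a power of 2, every merging level copies exactly _n_ items and there are log$_2$&nbsp;_n_ such levels.

```python
def count_merge_copies(items: list) -> int:
    """Return how many items the merges append while sorting the items."""
    copies = 0

    def sort(sequence: list) -> list:
        nonlocal copies
        if len(sequence) < 2:
            return sequence
        middle = len(sequence) // 2
        left = sort(sequence[:middle])
        right = sort(sequence[middle:])
        merged = []
        i = 0
        j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i = i + 1
            else:
                merged.append(right[j])
                j = j + 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        copies = copies + len(merged)  # every item of both halves was copied
        return merged

    sort(items)
    return copies

for exponent in range(3, 7):
    n = 2**exponent
    print(n, count_merge_copies(list(range(n))), n * exponent)
```

For each power of 2, the last two printed numbers are equal: the count is exactly _n_ × log$_2$&nbsp;_n_.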

Informal reasoning can sometimes have subtle flaws, so it's safest to
systematically define the complexity recursively, following the algorithm.

#### Exercise 14.5.2

Write the recursive definition of T and
confirm that merge sort has log-linear complexity.

- if _n_ < 2: T(_n_) = ...
- if _n_ ≥ 2: T(_n_) = ...

[Hint](../31_Hints/Hints_14_5_02.ipynb)
[Answer](../32_Answers/Answers_14_5_02.ipynb)

Merge sort has two advantages over insertion and selection sort:
it has better than quadratic complexity for unsorted input sequences and,
being a divide-and-conquer algorithm, can be implemented in parallel.
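To illustrate the second advantage, here's a minimal sketch of how the two halves could be sorted concurrently, since neither recursive call depends on the other. It's not part of M269's code: the function names are my own, the merge step uses `heapq.merge` from Python's standard library instead of the merge algorithm above, and there's no key function. In standard CPython builds, threads don't execute Python code simultaneously, so this only shows the structure of a parallel version and shouldn't be expected to run faster.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge

def merge_sort(items: list) -> list:
    """Sequential merge sort, using the standard library for merging."""
    if len(items) < 2:
        return list(items)
    middle = len(items) // 2
    return list(merge(merge_sort(items[:middle]), merge_sort(items[middle:])))

def parallel_merge_sort(items: list) -> list:
    """Sort each half in its own thread, then merge the two results."""
    if len(items) < 2:
        return list(items)
    middle = len(items) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The two halves are independent, so they can be sorted concurrently.
        left = pool.submit(merge_sort, items[:middle])
        right = pool.submit(merge_sort, items[middle:])
        return list(merge(left.result(), right.result()))

print(parallel_merge_sort(list('SORTING')))
```

Only the top-level split is done concurrently here; a fuller version might also split the work at deeper levels.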

### 14.5.3 Code and performance

The following implements the stable version of the algorithm.

In [1]:
%run -i ../m269_util
%run -i ../m269_sorting

def merge(left: list, right: list, key: Callable) -> list:
    """Merge both lists into one with keys in non-decreasing order.

    Preconditions: left and right have keys in non-decreasing order
    """
    left_index = 0
    right_index = 0
    sorted = []
    while left_index < len(left) and right_index < len(right):
        left_item = left[left_index]
        right_item = right[right_index]
        if key(left_item) <= key(right_item):
            sorted.append(left_item)
            left_index = left_index + 1
        else:
            sorted.append(right_item)
            right_index = right_index + 1
    for index in range(left_index, len(left)):
        sorted.append(left[index])
    for index in range(right_index, len(right)):
        sorted.append(right[index])
    return sorted

def merge_sorted(unsorted: list, key: Callable) -> list:
    """Return a permutation with keys in non-decreasing order.

    Preconditions: for any indices i and j,
    key(unsorted[i]) and key(unsorted[j]) are comparable
    """
    n = len(unsorted)
    if n < 2:
        return unsorted

    middle = n // 2
    left_sorted = merge_sorted(unsorted[0:middle], key)
    right_sorted = merge_sorted(unsorted[middle:n], key)
    return merge(left_sorted, right_sorted, key)

test(merge_sorted, sorting_tests)

Tests finished.
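The code above compares keys with `<=`, whereas the algorithm in Section&nbsp;14.5.1 uses <. Here's a small sketch of why that matters for stability. The function `merge_with`, which takes the comparison as a parameter, and the sample data are hypothetical and only serve this illustration.

```python
import operator
from typing import Callable

def merge_with(compare: Callable, left: list, right: list, key: Callable) -> list:
    """Merge two sorted lists, taking the left item when compare(...) is true."""
    merged = []
    i = 0
    j = 0
    while i < len(left) and j < len(right):
        if compare(key(left[i]), key(right[j])):
            merged.append(left[i])
            i = i + 1
        else:
            merged.append(right[j])
            j = j + 1
    return merged + left[i:] + right[j:]

def age(person: tuple) -> int:
    """Return the second element, used as the sorting key."""
    return person[1]

left = [('Ann', 30)]   # Ann comes before Bob in the unsorted input
right = [('Bob', 30)]  # both have the same key
print(merge_with(operator.le, left, right, age))  # [('Ann', 30), ('Bob', 30)]
print(merge_with(operator.lt, left, right, age))  # [('Bob', 30), ('Ann', 30)]
```

With `<=`, items with equal keys are taken from the left half first, so they keep their original relative order. With `<`, the item from the right half overtakes the one from the left half.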


Since the complexity is always the same,
I measure the run-time only for already sorted sequences.

In [2]:
for doubling in range(5):
    items = list(range(100 * 2**doubling))
    %timeit -r 5 merge_sorted(items, identity)

257 µs ± 18.4 µs per loop (mean ± std. dev. of 5 runs, 1000 loops each)
577 µs ± 29.8 µs per loop (mean ± std. dev. of 5 runs, 1000 loops each)
1.21 ms ± 71.3 µs per loop (mean ± std. dev. of 5 runs, 1000 loops each)
2.86 ms ± 423 µs per loop (mean ± std. dev. of 5 runs, 100 loops each)
5.81 ms ± 1 ms per loop (mean ± std. dev. of 5 runs, 100 loops each)


The run-times slightly more than double, but don't quadruple, when the input size doubles.
This is consistent with log-linear complexity, which lies between linear and quadratic.
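To make the comparison concrete, this sketch computes the ratios between consecutive mean run-times reported above and sets them against the ratio a log-linear function would predict when _n_ doubles, namely 2 × log(2 _n_) / log _n_. The variable names are my own, and the prediction ignores constant factors and lower-order terms, so only rough agreement should be expected.

```python
from math import log2

sizes = [100 * 2**doubling for doubling in range(5)]
times = [257, 577, 1210, 2860, 5810]  # mean run-times above, in microseconds

measured_ratios = []
predicted_ratios = []
for index in range(1, len(sizes)):
    smaller = sizes[index - 1]
    measured_ratios.append(times[index] / times[index - 1])
    # ratio of 2n log(2n) to n log(n)
    predicted_ratios.append(2 * log2(2 * smaller) / log2(smaller))

for index in range(len(measured_ratios)):
    print(f'{sizes[index]} to {sizes[index + 1]}: '
          f'measured {measured_ratios[index]:.2f}, '
          f'predicted {predicted_ratios[index]:.2f}')
```

All measured ratios lie between 2 (linear) and 4 (quadratic), and are close to the predicted ones given the variation in the measurements.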
