# Comparing Sorting Algorithms

## Lesson Overview

Now that you have seen the most commonly used sorting algorithms, it is important to know the advantages and disadvantages of each algorithm, in terms of:

- Time complexity
- Space complexity
- Code complexity

The questions in this lesson ask you to compare the sorting algorithms you have seen, in order to know which algorithm is appropriate to use for different problems.

## Question

Fill out the following table of time and space complexities. The complexities should be expressed using big-O notation.

**Algorithm**  | **Average time** | **Best case time** | **Worst case time** | **Space**
--- | --- | --- | --- | ---
Bubble sort | | | |
Insertion sort | | | |
Merge sort | | | |
Quicksort | | | |
Selection sort | | | |

In [None]:
#freetext

### Solution

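Some of these complexities depend on the implementation. For example, the [space complexity of quicksort](https://stackoverflow.com/questions/12573330/why-does-quicksort-use-ologn-extra-space) is $O(\log(n))$ only if the implementation recurses into the smaller partition first and handles the larger one iteratively; a naive recursive implementation can use $O(n)$ stack space in the worst case. Similarly, the $O(n)$ best case for bubble sort assumes that the algorithm stops as soon as a pass makes no swaps.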

**Algorithm**  | **Average time** | **Best case time** | **Worst case time** | **Space (all cases)**
--- | --- | --- | --- | ---
Bubble sort | $O(n^2)$ | $O(n)$ | $O(n^2)$ | $O(1)$
Insertion sort | $O(n^2)$ | $O(n)$ | $O(n^2)$ | $O(1)$
Merge sort | $O(n\log(n))$ | $O(n\log(n))$ | $O(n\log(n))$ | $O(n)$
Quicksort | $O(n\log(n))$ | $O(n\log(n))$ | $O(n^2)$ | $O(\log(n))$
Selection sort | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$
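
The difference between a best case of $O(n)$ and a best case of $O(n^2)$ can be made concrete by counting comparisons. The following is a minimal sketch rather than part of the lesson's own code: the function names `insertion_sort_comparisons` and `selection_sort_comparisons` are illustrative, and each one sorts a copy of its input while counting element comparisons.

```python
def insertion_sort_comparisons(arr):
  """Counts the comparisons made by an in-place insertion sort of a copy."""
  output = list(arr)
  comparisons = 0
  for i in range(1, len(output)):
    j = i
    # Walk the new element left until it is no smaller than its neighbor.
    while j > 0:
      comparisons += 1
      if output[j - 1] <= output[j]:
        break
      output[j - 1], output[j] = output[j], output[j - 1]
      j -= 1
  return comparisons


def selection_sort_comparisons(arr):
  """Counts the comparisons made by an in-place selection sort of a copy."""
  output = list(arr)
  comparisons = 0
  n = len(output)
  for i in range(n - 1):
    min_index = i
    # Every remaining element is compared, whatever the input order.
    for j in range(i + 1, n):
      comparisons += 1
      if output[j] < output[min_index]:
        min_index = j
    output[i], output[min_index] = output[min_index], output[i]
  return comparisons


already_sorted = list(range(8))
reversed_order = already_sorted[::-1]

print(insertion_sort_comparisons(already_sorted))  # 7, which is n - 1
print(insertion_sort_comparisons(reversed_order))  # 28, which is n(n-1)/2
print(selection_sort_comparisons(already_sorted))  # 28
print(selection_sort_comparisons(reversed_order))  # 28
```

Insertion sort does linear work on sorted input and quadratic work on reversed input, while selection sort does the same quadratic amount of work on both.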

## Question

Identify each of the following sorting algorithms, based on the state of the array after each iteration, as one of the following:

- Bubble sort
- Insertion sort
- Merge sort
- Quicksort
- Selection sort

The input array is [4, 2, 5, 4, 8, 1].

**Iteration** | **Algorithm 1** | **Algorithm 2** | **Algorithm 3** | **Algorithm 4** | **Algorithm 5**
--- | --- | --- | --- | --- | ---
0 | [] | [4] [2] [5] [4] [8] [1] | [2, 4, 4, 5, 1, 8] | [] | [2, 4, 1, 4, 5, 8]
1 | [4] | [2, 4] [4, 5] [1, 8] | [2, 4, 4, 1, 5, 8] | [1] | [1, 2, 4, 4, 5, 8]
2 | [2, 4] | [2, 4, 4, 5] [1, 8] | [2, 4, 1, 4, 5, 8] | [1, 2] |
3 | [2, 4, 5] | [1, 2, 4, 4, 5, 8] | [2, 1, 4, 4, 5, 8] | [1, 2, 4] |
4 | [2, 4, 4, 5] |  | [1, 2, 4, 4, 5, 8] | [1, 2, 4, 4] |
5 | [2, 4, 4, 5, 8] |  | | [1, 2, 4, 4, 5] |
6 | [1, 2, 4, 4, 5, 8] | | | [1, 2, 4, 4, 5, 8] |

### Solution

1. Insertion sort

   A key indicator here is that the output array is empty at iteration 0 and grows by one element per iteration, so the algorithm is either insertion sort or selection sort. At each iteration, the left-most element of the input array is moved to the output array and inserted at its correct sorted position.

1. Merge sort

   Merge sort looks different from other algorithms in that it divides the input into singleton arrays then merges them back together.

1. Bubble sort

   The array length remains the same at each iteration, as elements are swapped in place. A notable feature of bubble sort is that each iteration moves the largest element not yet in its final position to the right end of the unsorted part. For example, 8 is in place after iteration 0, then 5 after iteration 1, then 4 after iteration 2, and so on.

1. Selection sort

   Like insertion sort, selection sort is characterized by adding elements to an output array one by one. While insertion sort moves the left-most element of the input array, selection sort moves the minimum element of the input array.

1. Quicksort

   This is one of the trickiest to categorize, since the pivot at each iteration is not easy to spot. However, quicksort can be deduced by process of elimination: the array length stays the same at each iteration, as in bubble sort, but the array is sorted in far fewer iterations.
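
The contrast between insertion sort and selection sort can be reproduced in code. The following is a minimal sketch under the same "separate output array" view used in the table; the function names `insertion_sort_states` and `selection_sort_states` are illustrative and do not appear elsewhere in this lesson.

```python
import bisect


def insertion_sort_states(arr):
  """Output array after each iteration of an out-of-place insertion sort."""
  output = []
  states = [list(output)]  # Iteration 0: nothing has been inserted yet.
  for element in arr:
    # Take the left-most remaining input element and insert it at its
    # sorted position in the output.
    bisect.insort(output, element)
    states.append(list(output))
  return states


def selection_sort_states(arr):
  """Output array after each iteration of an out-of-place selection sort."""
  remaining = list(arr)
  output = []
  states = [list(output)]  # Iteration 0: nothing has been selected yet.
  while remaining:
    # Take the minimum remaining input element and append it to the output.
    smallest = min(remaining)
    remaining.remove(smallest)
    output.append(smallest)
    states.append(list(output))
  return states


for state in insertion_sort_states([4, 2, 5, 4, 8, 1]):
  print(state)

for state in selection_sort_states([4, 2, 5, 4, 8, 1]):
  print(state)
```

For the input `[4, 2, 5, 4, 8, 1]`, the first loop prints the Algorithm 1 column of the table and the second loop prints the Algorithm 4 column.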

## Question

One of the clients you work with is a grocery chain *QuickShop*, and the IT department there relies on sorting algorithms to sort its employees (e.g. by name, start date, salary) and products (e.g. by price, units sold, expiration date). 

The IT department at *QuickShop* currently uses bubble sort to do this, but the head of IT is pushing for the company to use merge sort instead, since merge sort is $O(n\log(n))$, and therefore more time efficient than bubble sort which is $O(n^2)$. The code for both of these algorithms is given here.

**Bubble sort**

In [None]:
#persistent
def bubble_sort(arr):
  """Sorts an array of integers in ascending order."""
  output = arr.copy()
  n = len(arr)

  for i in range(n-1):
    no_swaps = True # indicates whether any swaps were made

    # Only need to check the sub-array output[:(n-i)].
    for j in range(n-i-1):
      if output[j+1] < output[j]:
        # Swap the elements if out of order.
        output[j], output[j+1] = output[j+1], output[j]
        no_swaps = False

    # If no pairs were swapped, the array is sorted.
    if no_swaps:
      break
  
  return output

**Merge sort**

In [None]:
#persistent
def merge(arr1, arr2):
  """Merges two sorted arrays into one such that the final array is sorted."""
  
  output = []
  i, j = 0, 0
  
  # Loop while both arr1 and arr2 still have unmerged elements.
  while i < len(arr1) and j < len(arr2):
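    # Add the smaller element to the output. Using <= takes from arr1 on ties,
    # which keeps equal elements in their original order (a stable sort).
    if arr1[i] <= arr2[j]: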
      output.append(arr1[i])
      i += 1
    else:
      output.append(arr2[j])
      j += 1
  
  # Add the rest of arr1, if any, (which is already sorted) to output.
  while i < len(arr1):
    output.append(arr1[i])
    i += 1
  
  # Add the rest of arr2, if any, (which is already sorted) to output.
  while j < len(arr2):
    output.append(arr2[j])
    j += 1
  
  return output


def merge_sort(arr):
  """Sorts an array of integers in ascending order."""
  if len(arr) < 2:
    return arr

  # Split the array into two.
  midpoint = len(arr) // 2
  left = arr[:midpoint]
  right = arr[midpoint:]

  # Merge the arrays recursively.
  return merge(merge_sort(left), merge_sort(right))

In order to transition from bubble sort to merge sort, the head of IT has to demonstrate to the Chief Technology Officer (CTO) that merge sort is indeed faster than bubble sort, not just theoretically but in practice. To do this, the head of IT has written some code to generate a random list of length `n` and report the runtime for bubble sort and merge sort.

In [None]:
import random
import time


def compare_runtimes(n, a=0, b=999):
  """Runtimes of bubble sort and merge sort on one random array of length n."""
  random_list = [random.randint(a, b) for _ in range(n)]

  bubble_start = time.process_time()
  bubble_sorted = bubble_sort(random_list)
  bubble_time = time.process_time() - bubble_start

  merge_start = time.process_time()
  merge_sorted = merge_sort(random_list)
  merge_time = time.process_time() - merge_start

  return (bubble_time, merge_time)

This function can then be used to compare the runtime of bubble sort and merge sort for a random array.

In [None]:
bubble_time, merge_time = compare_runtimes(100)

print('Bubble sort takes %f seconds. Merge sort takes %f seconds.' %
      (bubble_time, merge_time))

To make this more robust, the head of IT wants to generalize this. She wants to have `n_iters` random runs of `compare_runtimes` and calculate the mean runtime of all of the runs for bubble sort and merge sort. Can you modify `compare_runtimes` to accept an `n_iters` parameter and return the mean runtime over `n_iters` iterations?

In [None]:
import random
import time


def mean(arr):
  """Mean of an array."""
  return 1.0 * sum(arr) / len(arr)


def compare_runtimes(n, n_iters, a=0, b=999):
  """Mean runtimes of bubble sort and merge sort over n_iters random n-arrays."""
  # TODO(you): Modify this function to return the mean over n_iters iterations,
  # instead of just one iteration.
  random_list = [random.randint(a, b) for _ in range(n)]

  bubble_start = time.process_time()
  bubble_sorted = bubble_sort(random_list)
  bubble_time = time.process_time() - bubble_start

  merge_start = time.process_time()
  merge_sorted = merge_sort(random_list)
  merge_time = time.process_time() - merge_start

  return (bubble_time, merge_time)

### Solution

You can do this by:

1. Moving the original function code to within a `for _ in range(n_iters)` loop.
2. Instead of returning the runtime, append it to a list of bubble/merge runtimes.
3. Return the mean of each runtime array.

In [None]:
import random
import time


def mean(arr):
  """Mean of an array."""
  return 1.0 * sum(arr) / len(arr)


def compare_runtimes(n, n_iters, a=0, b=999):
  """Mean runtimes of bubble sort and merge sort over n_iters random n-arrays."""
  bubble_times = []
  merge_times = []

  for _ in range(n_iters):
    random_list = [random.randint(a, b) for _ in range(n)]

    bubble_start = time.process_time()
    bubble_sorted = bubble_sort(random_list)
    bubble_times.append(time.process_time() - bubble_start)

    merge_start = time.process_time()
    merge_sorted = merge_sort(random_list)
    merge_times.append(time.process_time() - merge_start)

  return (mean(bubble_times), mean(merge_times))

In [None]:
# Mean runtime over 1000 iterations for 100-arrays, for bubble and merge sort.
compare_runtimes(100, 1000)

Based on the output above, merge sort appears to outperform bubble sort on average, at least for arrays of length 100.
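
The same kind of measurement can also be taken with the standard library's `timeit` module. The following is a minimal sketch and not the approach used in this lesson: the helper `mean_runtime` is hypothetical, and it times a single sorting function so that the built-in `sorted` can stand in for `bubble_sort` or `merge_sort`. Note that `timeit` uses a wall-clock timer by default, whereas `time.process_time` measures CPU time, so the two approaches are not expected to report identical numbers.

```python
import random
import timeit


def mean_runtime(sort_fn, n, n_iters, a=0, b=999):
  """Mean time taken by sort_fn over n_iters random arrays of length n."""
  times = []
  for _ in range(n_iters):
    random_list = [random.randint(a, b) for _ in range(n)]
    # number=1 runs the call once, so each random list is sorted exactly once.
    times.append(timeit.timeit(lambda: sort_fn(random_list), number=1))
  return sum(times) / len(times)


# The built-in sorted() is only a stand-in; pass bubble_sort or merge_sort
# instead to time the functions defined earlier.
print(mean_runtime(sorted, 100, 1000))
```

Timing one function at a time keeps the helper reusable for any sorting function that returns a new list.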

## Question

The head of IT can now show the CTO how the average runtime of bubble sort compares to merge sort for a given array length. Now, she wants to be able to show how this changes for different array lengths. She has decided that a visualization is the most appropriate way to show this.

Given the new `compare_runtimes` function you helped her with in the previous question, she has written some code to plot how the mean runtime changes as the input array length changes. (The first two code cells are the solution to the previous problem. The third code cell contains the new code.)

In [None]:
#persistent
import random
import time


def mean(arr):
  """Mean of an array."""
  return 1.0 * sum(arr) / len(arr)


def compare_runtimes(n, n_iters, a=0, b=999):
  """Mean runtimes of bubble sort and merge sort over n_iters random n-arrays."""
  bubble_times = []
  merge_times = []

  for _ in range(n_iters):
    random_list = [random.randint(a, b) for _ in range(n)]

    bubble_start = time.process_time()
    bubble_sorted = bubble_sort(random_list)
    bubble_times.append(time.process_time() - bubble_start)

    merge_start = time.process_time()
    merge_sorted = merge_sort(random_list)
    merge_times.append(time.process_time() - merge_start)

  return (mean(bubble_times), mean(merge_times))

In [None]:
# Mean runtime over 1000 iterations for 100-arrays, for bubble and merge sort.
compare_runtimes(100, 1000)

In [None]:
# This code cell takes a few seconds to run.

import matplotlib.pyplot as plt


mean_bubble_times = []
mean_merge_times = []
x_range = range(10, 50)
n_iters = 100

for n in x_range:
  mean_bubble_time, mean_merge_time = compare_runtimes(n, n_iters)
  mean_bubble_times.append(mean_bubble_time)
  mean_merge_times.append(mean_merge_time)


plt.plot(x_range, mean_bubble_times, label='bubble')
plt.plot(x_range, mean_merge_times, label='merge')
plt.xlabel('length of array')
plt.ylabel('CPU runtime (seconds)')
plt.legend()
plt.show()

This is exactly the opposite of what the head of IT wanted to see! Bubble sort, with a time complexity of $O(n^2)$, seems to be consistently quicker than merge sort, with a time complexity of $O(n\log(n))$. The head of IT has asked you to help her "fix" this graph. Can you help her?

### Solution

In general, if an algorithm *f* has a lower time complexity than an algorithm *g*, that does *not* necessarily imply that *f* runs faster for a specific value of *n*. Big-O notation ignores constant factors and lower-order terms, which can dominate when *n* is small. What it does imply is that as *n* increases, the runtime of *f* grows less quickly than the runtime of *g*, so *f* is eventually faster once *n* is large enough.
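
A toy cost model illustrates how constant factors can hide the better complexity at small *n*. The following is a minimal sketch: the function name `model_crossover` and the constants passed to it are invented for illustration and are not measurements of the `bubble_sort` and `merge_sort` functions in this lesson.

```python
import math


def model_crossover(quadratic_cost, linearithmic_cost, max_n=10**6):
  """Smallest n >= 2 where c2 * n * log2(n) < c1 * n**2, or None if not found.

  quadratic_cost is the assumed constant c1 of an O(n^2) algorithm and
  linearithmic_cost is the assumed constant c2 of an O(n log n) algorithm.
  """
  for n in range(2, max_n + 1):
    if linearithmic_cost * n * math.log2(n) < quadratic_cost * n ** 2:
      return n
  return None


# Hypothetical constants: each step of the O(n log n) algorithm is assumed to
# cost ten times as much as each step of the O(n^2) algorithm.
print(model_crossover(1.0, 10.0))  # 59
```

Under these assumed constants, the quadratic algorithm is cheaper for every array length from 2 to 58, and the linearithmic algorithm is cheaper from 59 onward.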

You might be able to see that in the graph presented to you by the head of IT, even though bubble sort appears to have a faster runtime than merge sort for all array lengths, the gap appears to be closing as $n$ increases. Let's see what happens when we increase `x_range` to values up to 100. 

In [None]:
# This code cell takes a few seconds to run.

mean_bubble_times = []
mean_merge_times = []
x_range = range(10, 100)
n_iters = 100

for n in x_range:
  mean_bubble_time, mean_merge_time = compare_runtimes(n, n_iters)
  mean_bubble_times.append(mean_bubble_time)
  mean_merge_times.append(mean_merge_time)

In [None]:
import matplotlib.pyplot as plt


plt.plot(x_range, mean_bubble_times, label='bubble')
plt.plot(x_range, mean_merge_times, label='merge')
plt.xlabel('length of array')
plt.ylabel('CPU runtime (seconds)')
plt.legend()
plt.show()

Finally! Merge sort seems to be outperforming bubble sort in terms of runtime, for larger values of $n$. Let's push this to 200. (Increasing to more than 200 takes a long time to run.)

In [None]:
# This code cell takes a few seconds to run.

mean_bubble_times = []
mean_merge_times = []
x_range = range(10, 200)
n_iters = 100

for n in x_range:
  mean_bubble_time, mean_merge_time = compare_runtimes(n, n_iters)
  mean_bubble_times.append(mean_bubble_time)
  mean_merge_times.append(mean_merge_time)

In [None]:
import matplotlib.pyplot as plt


plt.plot(x_range, mean_bubble_times, label='bubble')
plt.plot(x_range, mean_merge_times, label='merge')
plt.xlabel('length of array')
plt.ylabel('CPU runtime (seconds)')
plt.legend()
plt.show()

This graph shows clearly that merge sort has a lower runtime than bubble sort at large values of $n$. The "crossover" (where the runtimes are approximately equal) occurs at about $n=60$, and from then on, the runtime of bubble sort grows significantly more quickly than that of merge sort.
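
The crossover can also be estimated from the measured lists rather than read off the plot. The following is a minimal sketch, assuming a hypothetical helper `estimate_crossover` that takes sequences shaped like `x_range`, `mean_bubble_times`, and `mean_merge_times`; the timings in the example are invented for illustration, not measured.

```python
def estimate_crossover(lengths, bubble_times, merge_times):
  """Smallest length from which merge sort stays faster, or None if none."""
  crossover = None
  for n, bubble_time, merge_time in zip(lengths, bubble_times, merge_times):
    if merge_time < bubble_time:
      # Remember only the start of the current run of merge sort wins.
      if crossover is None:
        crossover = n
    else:
      # Bubble sort was at least as fast here, so discard any earlier candidate.
      crossover = None
  return crossover


# Invented timings (in seconds) for illustration only.
example_lengths = [10, 20, 30, 40]
example_bubble_times = [1e-5, 4e-5, 3e-5, 9e-5]
example_merge_times = [2e-5, 3e-5, 4e-5, 5e-5]

print(estimate_crossover(example_lengths, example_bubble_times,
                         example_merge_times))  # 40
```

Requiring merge sort to stay faster for all remaining lengths makes the estimate less sensitive to a single noisy measurement, such as the one at length 20 in the invented data.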

When the head of IT shows the CTO of *QuickShop* this graph, the leadership is convinced that choosing merge sort over bubble sort will save time and computational resources going forward, so the company decides to go ahead with the transition.