# Insertion Sort

## Lesson Overview

**Insertion sort** is a sorting algorithm that repeatedly removes the first element of the input array and searches for the right place to put it in a sorted output array.

> The average case time complexity of insertion sort is $O(n^2)$.

### Algorithm

An example implementation of insertion sort is outlined here.

0. **Initialize** an output array that will eventually contain all of the elements of the input array, sorted.

1. **Insert** the first element of the input array into the output array, at a position that keeps the output array sorted. (Be sure to remove the moved element from the input array.)

2. **Repeat** the insertion step until the input array is empty.

**Example**

The following table demonstrates sorting the array [2, 1, 4, 5] using insertion sort.

**Iteration** | **Input array** | **Output array**
--- | --- | ---
0 | [2, 1, 4, 5] | []
1 | [1, 4, 5] | [2]
2 | [4, 5] | [1, 2]
3 | [5] | [1, 2, 4]
4 | [] | [1, 2, 4, 5]
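
The table above can be reproduced with a short trace. This is only a sketch: the helper name `trace_insertion_sort` is made up for illustration, and the insertion step is delegated to the standard library's `bisect.insort` as a stand-in, not the function you will write later in this lesson.

```python
import bisect


def trace_insertion_sort(arr):
    """Returns (iteration, input array, output array) snapshots."""
    remaining = list(arr)  # Work on a copy so the caller's list is untouched.
    output = []
    snapshots = [(0, list(remaining), list(output))]

    iteration = 0
    while remaining:
        el = remaining.pop(0)      # Remove the first element of the input.
        bisect.insort(output, el)  # Stand-in for the sorted insertion step.
        iteration += 1
        snapshots.append((iteration, list(remaining), list(output)))

    return snapshots


for iteration, remaining, output in trace_insertion_sort([2, 1, 4, 5]):
    print(iteration, remaining, output)
# 0 [2, 1, 4, 5] []
# 1 [1, 4, 5] [2]
# 2 [4, 5] [1, 2]
# 3 [5] [1, 2, 4]
# 4 [] [1, 2, 4, 5]
```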

### Space complexity

Insertion sort can be implemented either in-place or out-of-place.

Insertion sort may look like an out-of-place sorting algorithm because it creates a new output array. However, it does not *copy* the input array; it *moves* elements from the input array to the output array. Insertion sort therefore does not create any *new* storage, so its space complexity is $O(1)$.
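
For comparison, here is a minimal sketch of an in-place variant, which keeps the sorted output in the front portion of the same list instead of a separate array. The name `insertion_sort_in_place` is hypothetical and is not used elsewhere in this lesson.

```python
def insertion_sort_in_place(arr):
    """Sorts arr in ascending order without allocating a second list."""
    for i in range(1, len(arr)):
        el = arr[i]  # Next element to insert into the sorted prefix arr[:i].
        j = i - 1
        # Shift every larger element of the sorted prefix one slot right.
        while j >= 0 and arr[j] > el:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = el  # Drop el into the gap that was opened up.


data = [2, 1, 4, 5, 2, 3, 7, 6]
insertion_sort_in_place(data)
print(data)  # [1, 2, 2, 3, 4, 5, 6, 7]
```

Here the only extra storage is the temporary `el` and two indices, regardless of the length of the list.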

## Question

Most of the heavy lifting in insertion sort comes from finding the right place to insert an element into the sorted array, so let's start there. Write a function that takes in a sorted array along with a new element, and inserts the element in the appropriate place in the array. Your function should not `return` anything; instead, it should modify the input array `arr`.

In [None]:
def insert_into_sorted_array(arr, el):
  """Inserts an integer el into arr, an array of sorted integers."""
  # TODO(you): Implement
  print('This function has not been implemented.')

### Hint

`arr.insert(idx, el)` inserts the element `el` at index `idx` of an array `arr`.
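
As a quick check of the argument order, the following sketch (with made-up values) exercises Python's built-in `list.insert`, which takes the index first and the element second.

```python
arr = [1, 2, 4]

arr.insert(2, 3)  # Index first, then the element to insert.
print(arr)  # [1, 2, 3, 4]

arr.insert(0, 0)  # Inserting at index 0 puts the element at the front.
print(arr)  # [0, 1, 2, 3, 4]

arr.insert(len(arr), 9)  # Inserting at len(arr) behaves like append.
print(arr)  # [0, 1, 2, 3, 4, 9]
```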

### Unit Tests

Run the following cell to check your answer against some unit tests.

In [None]:
arr = []

insert_into_sorted_array(arr, 4)
print(arr) # Should print: [4]

insert_into_sorted_array(arr, 2)
print(arr) # Should print: [2, 4]

insert_into_sorted_array(arr, 2)
print(arr) # Should print: [2, 2, 4]

insert_into_sorted_array(arr, 1)
print(arr) # Should print: [1, 2, 2, 4]

### Solution

In [None]:
def insert_into_sorted_array(arr, el):
  """Inserts an integer el into arr, an array of sorted integers."""

  # Iterate through arr and insert el at the first position whose value is
  # greater than or equal to el. If such a value is found, exit the function.
  for i in range(len(arr)):
    if el <= arr[i]:
      arr.insert(i, el)
      return

  # If no element was found in arr whose value is greater than or equal to el,
  # insert el at the end of arr.
  arr.append(el)

## Question

What is the big-O time complexity of `insert_into_sorted_array` in the best, average, and worst case?

In [None]:
def insert_into_sorted_array(arr, el):
  """Inserts an integer el into arr, an array of sorted integers."""

  # Iterate through arr and insert el at the first position whose value is
  # greater than or equal to el. If such a value is found, exit the function.
  for i in range(len(arr)):
    if el <= arr[i]:
      arr.insert(i, el)
      return

  # If no element was found in arr whose value is greater than or equal to el,
  # insert el at the end of arr.
  arr.append(el)

In [None]:
#freetext

### Solution

Treating each individual operation in the function (the comparison, `arr.insert`, and `arr.append`) as $O(1)$, the complexity is determined by the number of loop iterations before `return` is called. Let $n$ be the length of the input array.

In the trivial case, `len(arr) == 0`, and the function requires 0 iterations, so it is $O(1)$.

In the best non-trivial case, the element being added is less than the first element of the sorted array, so `el <= arr[0]`. This case requires only 1 iteration, so is $O(1)$.

In the worst case, the element being added is greater than the maximum element of the sorted array, so `el > arr[i]` for all `i`. This case requires $n$ iterations *plus* the final append, which is equivalent to $n+1$ iterations, which is $O(n)$.

Remember that the average case complexity is the mean complexity averaged over all possible insertion indices. If `el` is inserted at index 0, it requires 1 iteration. If `el` is inserted at index 1, it requires 2 iterations. If `el` is inserted at index 2, it requires 3 iterations, and so on. If `el` is inserted at index $n-1$, it requires $n$ iterations. If `el` is appended to the end of the array, it effectively requires $n+1$ iterations. Assuming every insertion index is equally likely, this is equivalent to taking the mean of the integers from 1 to $n+1$. We have

\begin{align*}
\frac{1}{n+1} \sum\limits_{i=1}^{n+1} i &= \frac{1}{n+1} \frac{(n+2)(n+1)}{2} \\
&= \frac{n+2}{2} \\
&= \frac{1}{2} n + 1 \\
&= O(n), \\
\end{align*}

where the first equality comes from the [formula for an arithmetic sum](https://en.wikipedia.org/wiki/1_%2B_2_%2B_3_%2B_4_%2B_%E2%8B%AF).

Therefore, the average case complexity, like the worst case, is $O(n)$, while the best case is $O(1)$. For this function, the average case number of iterations, $\frac{n+2}{2}$, is exactly halfway between the best case number of iterations, 1, and the worst case number of iterations, $n+1$.
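
The average can also be checked empirically. The sketch below is illustrative only: `count_iterations` and `average_iterations` are hypothetical helpers that mirror the loop in `insert_into_sorted_array`, count the final append as one extra iteration (the same convention as the derivation), and assume each insertion index is equally likely.

```python
def count_iterations(arr, el):
    """Counts the iterations needed to place el into the sorted list arr."""
    steps = 0
    for value in arr:
        steps += 1
        if el <= value:
            return steps
    return steps + 1  # Count the final append as one more iteration.


def average_iterations(n):
    """Averages the iteration count over all n + 1 insertion indices."""
    arr = [2 * k + 1 for k in range(n)]         # 1, 3, 5, ...
    candidates = [2 * k for k in range(n + 1)]  # One value per insertion index.
    total = sum(count_iterations(arr, el) for el in candidates)
    return total / (n + 1)


for n in (1, 4, 10):
    print(n, average_iterations(n), (n + 2) / 2)
# 1 1.5 1.5
# 4 3.0 3.0
# 10 6.0 6.0
```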

## Question

Use `insert_into_sorted_array` to implement insertion sort.

In [None]:
#persistent
def insert_into_sorted_array(arr, el):
  """Inserts an integer el into arr, an array of sorted integers."""

  # Iterate through arr and insert el at the first position whose value is
  # greater than or equal to el. If such a value is found, exit the function.
  for i in range(len(arr)):
    if el <= arr[i]:
      arr.insert(i, el)
      return

  # If no element was found in arr whose value is greater than or equal to el,
  # insert el at the end of arr.
  arr.append(el)

In [None]:
def insertion_sort(arr):
  """Sorts an array of integers in ascending order."""
  # TODO(you): Implement
  print('This function has not been implemented.')

### Hint

Use the `pop` method to repeatedly remove the first element of the input array. Call `insert_into_sorted_array` to add each popped element to the output array such that the output array stays sorted.

### Unit Tests

Run the following cell to check your answer against some unit tests.

In [None]:
print(insertion_sort([2, 1, 4, 5, 2, 3, 7, 6]))
# Should print: [1, 2, 2, 3, 4, 5, 6, 7]

### Solution

Insertion sort loops through each element in the input array and inserts it into the output array by repeatedly calling `insert_into_sorted_array`.

In [None]:
def insertion_sort(arr):
  """Sorts an array of integers in ascending order."""
  output = []

  # Move elements one at a time from the front of arr into the sorted output.
  while len(arr) > 0:
    el = arr.pop(0)
    insert_into_sorted_array(output, el)
  
  return output
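
One consequence of moving elements with `pop` is that the caller's list is left empty after sorting. The sketch below repeats the two solution functions so that it runs on its own, then shows the side effect. The wrapper `insertion_sorted` is a hypothetical helper, not part of the lesson, for callers who want to keep their input.

```python
def insert_into_sorted_array(arr, el):
    """Inserts an integer el into arr, an array of sorted integers."""
    for i in range(len(arr)):
        if el <= arr[i]:
            arr.insert(i, el)
            return
    arr.append(el)


def insertion_sort(arr):
    """Sorts an array of integers in ascending order (empties arr)."""
    output = []
    while len(arr) > 0:
        el = arr.pop(0)
        insert_into_sorted_array(output, el)
    return output


def insertion_sorted(arr):
    """Hypothetical wrapper: sorts a copy so the caller's list survives."""
    return insertion_sort(list(arr))


data = [3, 1, 2]
print(insertion_sort(data))  # [1, 2, 3]
print(data)                  # [] because every element was popped.

kept = [3, 1, 2]
print(insertion_sorted(kept))  # [1, 2, 3]
print(kept)                    # [3, 1, 2]
```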

## Question

What is the big-O time complexity of `insertion_sort`, in the best, average, and worst case? 

Remember that `insert_into_sorted_array` is $O(n)$ in the worst and average case, and $O(1)$ in the best case.

In [None]:
def insertion_sort(arr):
  """Sorts an array of integers in ascending order."""
  output = []

  # Move elements one at a time from the front of arr into the sorted output.
  while len(arr) > 0:
    el = arr.pop(0)
    insert_into_sorted_array(output, el)
  
  return output

In [None]:
#freetext

### Solution

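As shown in a previous question and restated in the reminder above, `insert_into_sorted_array` is $O(n)$ in the average and worst case, and $O(1)$ in the best case. The implementation of `insertion_sort` is essentially $n$ calls to `insert_into_sorted_array`, treating `arr.pop(0)` as a constant-time step. In the best case every call is $O(1)$, for a total of $n \cdot O(1) = O(n)$. In the average and worst case every call is $O(n)$, for a total of $n \cdot O(n) = O(n^2)$. Therefore, `insertion_sort` is $O(n)$ in the best case, and $O(n^2)$ in the average and worst case.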
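
To see these cases concretely, the sketch below counts the `el <= output[i]` comparisons made while sorting. The function `insertion_sort_comparisons` is a hypothetical instrumented version written for this illustration. It counts comparisons only, so it ignores any cost of shifting elements inside the list. For this particular implementation, which scans the output array from the front, the fewest comparisons occur when the input is in descending order, and the most occur when the input is already in ascending order.

```python
def insertion_sort_comparisons(arr):
    """Returns (sorted list, number of comparisons made while sorting)."""
    output = []
    comparisons = 0
    for el in arr:
        for i in range(len(output)):
            comparisons += 1
            if el <= output[i]:
                output.insert(i, el)
                break
        else:
            # No larger-or-equal value was found, so el goes at the end.
            output.append(el)
    return output, comparisons


n = 100
_, best = insertion_sort_comparisons(list(range(n, 0, -1)))   # Descending input.
_, worst = insertion_sort_comparisons(list(range(1, n + 1)))  # Ascending input.

print(best)   # 99, which is n - 1 and grows linearly with n.
print(worst)  # 4950, which is n * (n - 1) / 2 and grows quadratically with n.
```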