## InsertionSort

InsertionSort works like sorting cards into a hand. The element to be inserted is swapped backwards through the already sorted subarray until it reaches an element smaller than or equal to it. Because it never moves past an equal element, equal elements keep their relative order, which makes it a stable sorting algorithm. In the best case (an already sorted input), no elements in the sorted subarray have to be swapped, and the time complexity is $O(n)$. In the worst case (a reverse-sorted input), it has to perform $1+2+3+\dots+(n-1) = \frac{n(n-1)}{2}$ swaps, which makes the time complexity $O(n^2)$. It sorts in place, so the memory complexity is $O(1)$.

## Generating a random sequence

In [21]:
import numpy as np

In [22]:
seed = 17
np.random.seed(seed)
n = 10
high = 100
low = 0
randseq = np.random.randint(low, high, n).tolist()
print(randseq)

[15, 6, 22, 57, 45, 22, 31, 68, 39, 84]


# The algorithm step-by-step

## Step 1: The first element is already considered sorted

## Step 2: The second element is taken and compared with the first

In [23]:
# Swap ONLY if the second element is smaller than the first
if randseq[1] < randseq[0]:
    randseq[0], randseq[1] = randseq[1], randseq[0]

print(randseq)

[6, 15, 22, 57, 45, 22, 31, 68, 39, 84]


## Step 3: Perform the same for the third element

In [24]:
# Start by storing the index of the element to be inserted
inserterindex = 2
# Keep swapping until the previous element is smaller than or equal to the
# element being inserted, or until the start of the list (index 0) is reached
while inserterindex > 0 and randseq[inserterindex] < randseq[inserterindex-1]:
    # Swap the element to be inserted with the previous element
    randseq[inserterindex-1], randseq[inserterindex] = randseq[inserterindex], randseq[inserterindex-1]
    # Update the index of the element to be inserted
    inserterindex -= 1

print(randseq)

[6, 15, 22, 57, 45, 22, 31, 68, 39, 84]


This is then repeated for every remaining element of the sequence.
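To make that repetition concrete, here is a minimal sketch that prints the list after each element has been inserted. The helper name `insertion_passes` is hypothetical and not part of the notebook, and the sequence is hard-coded (copied from the output above) so that the sketch runs without NumPy.

```python
def insertion_passes(seq):
    """Yield (index, snapshot) after each element has been inserted."""
    seq = list(seq)  # work on a copy so the caller's list is untouched
    for i in range(1, len(seq)):
        j = i
        # Swap backwards until the previous element is smaller or equal
        while j > 0 and seq[j] < seq[j - 1]:
            seq[j - 1], seq[j] = seq[j], seq[j - 1]
            j -= 1
        yield i, list(seq)


# The sequence generated above, hard-coded for this sketch
example = [15, 6, 22, 57, 45, 22, 31, 68, 39, 84]
snapshots = list(insertion_passes(example))
for i, snapshot in snapshots:
    print(f"after inserting index {i}: {snapshot}")
```

Passes in which the element is already in place (indices 2 and 3, for example) leave the list unchanged, which is what keeps the best case cheap.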

## Combining the steps into one algorithm

In [25]:
# Generating the random sequence again
np.random.seed(seed)
n = 10
high = 100
low = 0
randseq = np.random.randint(low, high, n).tolist()
print(randseq)

[15, 6, 22, 57, 45, 22, 31, 68, 39, 84]


In [26]:
def InsertionSort(seq):
    # Sorts seq in place and returns it
    array_length = len(seq)
    # For each element in the sequence except the first
    for i in range(1, array_length):
        # Start by storing the index of the element to be inserted
        inserterindex = i
        # Keep swapping until the previous element is smaller than or equal to
        # the element being inserted, or until index 0 is reached
        while inserterindex > 0 and seq[inserterindex] < seq[inserterindex-1]:
            # Swap the element to be inserted with the previous element
            seq[inserterindex-1], seq[inserterindex] = seq[inserterindex], seq[inserterindex-1]
            # Update the index of the element to be inserted
            inserterindex -= 1
    return seq

In [27]:
print(InsertionSort(randseq))

[6, 15, 22, 22, 31, 39, 45, 57, 68, 84]
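The sorted output contains two 22s, but with plain integers it is impossible to see which one came first. The following sketch illustrates the stability claim by attaching a label to each value and comparing only the value. The function `insertion_sort_by_key` and its `key` parameter are assumptions introduced for this illustration; they are not part of the notebook's `InsertionSort`.

```python
def insertion_sort_by_key(items, key):
    """Insertion sort that compares key(item) only (hypothetical helper)."""
    items = list(items)  # sort a copy
    for i in range(1, len(items)):
        j = i
        # Strict '<' means an element never moves past an equal key,
        # which is what preserves the original order of equal elements
        while j > 0 and key(items[j]) < key(items[j - 1]):
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
    return items


# (value, label) pairs; three entries share the value 22
cards = [(22, "a"), (57, "b"), (22, "c"), (6, "d"), (22, "e")]
result = insertion_sort_by_key(cards, key=lambda card: card[0])
print(result)
# [(6, 'd'), (22, 'a'), (22, 'c'), (22, 'e'), (57, 'b')]
```

The three entries with value 22 come out in their original order `a`, `c`, `e`. Replacing `<` with `<=` in the loop condition would still sort the values but would reverse that order.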


# Timing the algorithm

In [28]:
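def InsertionSortTester(n, high, low=0):
    # Generate n random integers in [low, high] and sort them.
    # Note: timings of this function include generating the sequence.
    randseq = np.random.randint(low, high+1, n).tolist()
    return InsertionSort(randseq)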

In [29]:
print(InsertionSortTester(100,100))

[1, 1, 2, 2, 3, 3, 4, 5, 6, 6, 7, 7, 8, 10, 10, 12, 13, 13, 13, 14, 15, 17, 17, 17, 18, 20, 21, 24, 25, 26, 27, 27, 27, 30, 32, 33, 35, 35, 36, 38, 38, 40, 41, 41, 43, 44, 44, 45, 46, 47, 47, 49, 49, 49, 49, 50, 50, 51, 51, 52, 54, 56, 56, 56, 56, 57, 57, 59, 60, 61, 61, 62, 63, 63, 64, 65, 66, 67, 67, 72, 74, 74, 76, 79, 83, 83, 84, 87, 88, 89, 90, 91, 91, 91, 93, 96, 98, 99, 100, 100]


## Standard Cases

In [30]:
%timeit InsertionSortTester(10,10)

4.08 μs ± 29.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


In [31]:
%timeit InsertionSortTester(100,100)

143 μs ± 511 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [32]:
%timeit InsertionSortTester(10_000,10_000)

1.43 s ± 16.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## Best case: all elements already sorted ($O(n)$)

In [33]:
# Sequence of already sorted elements
def best_case(n):
    # Avoid the name `sorted`, which would shadow the Python built-in
    sorted_seq = list(range(n))
    return InsertionSort(sorted_seq)

In [34]:
%timeit best_case(10_000)

362 μs ± 3.85 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


## Worst case: all elements sorted in reverse ($O(n^2)$)

In [35]:
def worst_case(n):
    # Sequence of elements sorted in reverse
    rev_sorted = list(range(n-1, -1, -1))
    return InsertionSort(rev_sorted)

In [36]:
%timeit worst_case(10_000)

2.85 s ± 12.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## Case with multiple duplicates

In [37]:
def duplicates(n):
    # Many duplicates, with the largest element first and the smallest last
    dup = [n] + [n // 2] * (n-2) + [0]
    return InsertionSort(dup)

In [38]:
%timeit duplicates(10_000)

1.85 ms ± 3.94 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [39]:
def duplicates2(n):
    # Many duplicates, with the smallest element first and the largest last
    dup = [0] + [n // 2] * (n-2) + [n]
    return InsertionSort(dup)

In [40]:
%timeit duplicates2(10_000)

729 μs ± 8.79 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


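As expected, the worst case is far slower than the best case: 2.85 s versus 362 μs for 10,000 elements. The random sequences fall in between, and going from 100 to 10,000 elements (a factor of 100) raises the running time from 143 μs to 1.43 s (a factor of 10,000), which is consistent with $O(n^2)$. With many duplicates, the performance is much closer to the best case than to the worst. This is to be expected, because equal elements are never swapped past each other: the second duplicate sequence is already sorted, and in the first only the leading and trailing elements are out of place, so the number of swaps grows linearly with $n$.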
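The timings can be cross-checked by counting swaps instead of measuring time, which removes noise from the machine and from building the lists. The sketch below is an illustration under stated assumptions: `count_swaps` is a hypothetical helper that repeats the same swap loop with a counter added, and `n = 1_000` is chosen only to keep the run short.

```python
def count_swaps(seq):
    """Return the number of swaps insertion sort performs on seq."""
    seq = list(seq)  # count on a copy
    swaps = 0
    for i in range(1, len(seq)):
        j = i
        while j > 0 and seq[j] < seq[j - 1]:
            seq[j - 1], seq[j] = seq[j], seq[j - 1]
            swaps += 1
            j -= 1
    return swaps


n = 1_000  # smaller than in the timings, purely to keep this quick
inputs = {
    "sorted": list(range(n)),
    "reversed": list(range(n - 1, -1, -1)),
    "duplicates": [n] + [n // 2] * (n - 2) + [0],
    "duplicates2": [0] + [n // 2] * (n - 2) + [n],
}
swap_counts = {name: count_swaps(seq) for name, seq in inputs.items()}
for name, swaps in swap_counts.items():
    print(f"{name:>12}: {swaps} swaps")
```

For this `n`, the reversed input needs $n(n-1)/2 = 499500$ swaps and the sorted input needs none. The first duplicate sequence needs $2n-3 = 1997$ swaps: each of the $n-2$ middle elements moves once past the leading `n`, and the trailing `0` then moves past the other $n-1$ elements. The second duplicate sequence is already in order and needs no swaps.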