
In [12]:
# Template for testing Insertion-Sort on small lists, outputting each intermediate step

In [13]:
import random
import time
import numpy as np
from typing import Any

In [14]:
def insertion(arr: list[Any], *, print_flag: bool = False):

  # For your convenience, here is the pseudocode from the textbook for Insertion-Sort.
  # You can adapt it for this problem; few changes are required.
  # Note that n is taken from len(arr) here, so it doesn't need to be a parameter.
  # Also, it is nice to have a boolean flag controlling whether the intermediate result is printed.
  # The pseudocode is written to be "in-place," so the input list changes.

  # Insertion-Sort(A, n)
  # for i = 2 to n:
  #    key = A[i]
  #    // insert A[i] into sorted sequence A[1]..A[i-1]
  #    j = i - 1
  #    while j > 0 and A[j] > key:
  #       A[j+1] = A[j]
  #       j = j - 1
  #    A[j+1] = key

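  # The textbook pseudocode is 1-indexed; Python lists are 0-indexed,
  # so i starts at 1 and the inner loop runs while j >= 0.
  for i in range(1, len(arr)):
    key = arr[i]
    j = i - 1

    # Shift each element of the sorted prefix that is larger than key one slot right
    while j >= 0 and arr[j] > key:
      arr[j + 1] = arr[j]
      j -= 1

    # Place key in the gap left by the shifts
    arr[j + 1] = key

    if print_flag:
      print(arr)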

In [15]:
n = 8
arr = list(range(n))
random.shuffle(arr)

print(f"Unsorted: {arr}")
insertion(arr, print_flag=True)
print(f"Sorted: {arr}")

Unsorted: [7, 0, 6, 4, 2, 5, 1, 3]
[0, 7, 6, 4, 2, 5, 1, 3]
[0, 6, 7, 4, 2, 5, 1, 3]
[0, 4, 6, 7, 2, 5, 1, 3]
[0, 2, 4, 6, 7, 5, 1, 3]
[0, 2, 4, 5, 6, 7, 1, 3]
[0, 1, 2, 4, 5, 6, 7, 3]
[0, 1, 2, 3, 4, 5, 6, 7]
Sorted: [0, 1, 2, 3, 4, 5, 6, 7]
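
The trace above shows each pass moving elements right to make room for the key. As a rough illustration of why input order matters for the timings that follow, the sketch below counts those shifts. `count_shifts` is a hypothetical helper written for this note, not part of the assignment template; it sorts a copy so the caller's list is left alone.

```python
from typing import Any

def count_shifts(arr: list[Any]) -> int:
  """Insertion-sort a copy of arr and return the number of element shifts."""
  work = list(arr)  # copy so the caller's list is untouched
  shifts = 0
  for i in range(1, len(work)):
    key = work[i]
    j = i - 1
    while j >= 0 and work[j] > key:
      work[j + 1] = work[j]
      j -= 1
      shifts += 1
    work[j + 1] = key
  return shifts

print(count_shifts([7, 0, 6, 4, 2, 5, 1, 3]))  # the shuffled list above: 18
print(count_shifts(list(range(8))))            # already sorted: 0
print(count_shifts(list(range(8))[::-1]))      # reversed: 8 * 7 / 2 = 28
```

A sorted list needs no shifts at all, while a reversed list of length $n$ needs $n(n-1)/2$, which is where the quadratic worst case comes from.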


In [16]:
#
# HW04 - time Insertion Sort on random data for a variety of array sizes.
#

rng = np.random.default_rng()
sizes = [10, 50, 100, 1000, 2500, 7500, 10000, 12500, 17500, 20000]
TEST_RUNS = len(sizes)  # one timed run per size

for n in sizes:
  # Random integers in [0, 100); `high` is exclusive
  arr = rng.integers(low=0, high=100, size=n).tolist()

  # Time only the sort itself, not the data generation
  start = time.perf_counter()
  insertion(arr)
  end = time.perf_counter()

  print(f"n = {n:<8,} time = {(end - start)*1_000:.3f} ms")

n = 10       time = 0.010 ms
n = 50       time = 0.091 ms
n = 100      time = 0.296 ms
n = 1,000    time = 30.280 ms
n = 2,500    time = 152.713 ms
n = 7,500    time = 1385.619 ms
n = 10,000   time = 2741.543 ms
n = 12,500   time = 4779.251 ms
n = 17,500   time = 7717.755 ms
n = 20,000   time = 10719.440 ms
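
Each size above is timed once, so a single slow or fast run can skew a data point. One possible refinement, sketched below under assumptions, is to repeat each measurement on fresh random input and report the median. `median_time_ms` and its parameters are hypothetical names introduced here, and the standard `random` module is used in place of NumPy to keep the snippet self-contained.

```python
import random
import statistics
import time
from typing import Callable

def median_time_ms(sort_fn: Callable[[list[int]], object], n: int,
                   repeats: int = 5, seed: int = 0) -> float:
  """Median wall-clock time in ms of `repeats` runs of sort_fn on fresh random lists of length n."""
  gen = random.Random(seed)
  times = []
  for _ in range(repeats):
    data = [gen.randrange(100) for _ in range(n)]  # same value range as the cell above
    start = time.perf_counter()
    sort_fn(data)
    times.append(time.perf_counter() - start)
  return statistics.median(times) * 1_000

# list.sort stands in here so the snippet runs on its own;
# in the notebook, pass `insertion` instead.
print(f"{median_time_ms(list.sort, 1_000):.3f} ms")
```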


## Results

Because Insertion Sort has an expected runtime of $\Theta\left(n^2\right)$ on random input, we will calculate all ratios with a denominator of $n^2$.

| n      | time (ms) | $\dfrac{T(n)}{n^2}$ (ms) |
| ------ | --------- | ------------------------ |
| 10     | 0.010     | 0.0001                   |
| 50     | 0.091     | 0.0000364                |
| 100    | 0.296     | 0.0000296                |
| 1,000  | 30.280    | 0.00003028               |
| 2,500  | 152.713   | 0.00002443               |
| 7,500  | 1385.619  | 0.00002463               |
| 10,000 | 2741.543  | 0.00002742               |
| 12,500 | 4779.251  | 0.00003059               |
| 17,500 | 7717.755  | 0.00002520               |
| 20,000 | 10719.440 | 0.00002680               |


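Looking at our ratios, as $n$ gets larger the values stop drifting and stay in a narrow band between about $2.4 \times 10^{-5}$ and $3.1 \times 10^{-5}$ ms, averaging roughly $2.7 \times 10^{-5}$ ms for $n \geq 1{,}000$.
A ratio $T(n)/n^2$ that stays roughly constant as $n$ grows means the runtime scales in proportion to $n^2$, which is consistent with the algorithm running in $\Theta\left(n^2\right)$ time on random data.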
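
The ratios can be recomputed directly from the printed timings, which makes it easy to check the table and to see what level the values settle at. The snippet below is a small sketch that hard-codes the measurements from the run above; the variable names are chosen for illustration.

```python
# (n, time in ms) pairs copied from the random-data run above
measurements = [
  (10, 0.010), (50, 0.091), (100, 0.296), (1_000, 30.280), (2_500, 152.713),
  (7_500, 1385.619), (10_000, 2741.543), (12_500, 4779.251),
  (17_500, 7717.755), (20_000, 10719.440),
]

# T(n) / n^2 for every measured size
ratios = {n: t / n**2 for n, t in measurements}
for n, r in ratios.items():
  print(f"n = {n:<8,} T(n)/n^2 = {r:.3e} ms")

# Average over the larger sizes, where fixed overhead matters least
large = [r for n, r in ratios.items() if n >= 1_000]
mean_large = sum(large) / len(large)
print(f"mean for n >= 1,000: {mean_large:.2e} ms")
```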


In [17]:
#
# HW04 - time Insertion Sort on sorted data for a variety of array sizes.
#

for n in sizes:
  # Already-sorted input: the inner while loop of insertion() never runs
  arr = list(range(n))

  start = time.perf_counter()
  insertion(arr)
  end = time.perf_counter()

  print(f"n = {n:<8,} time = {(end - start)*1_000:.3f} ms")

n = 10       time = 0.004 ms
n = 50       time = 0.008 ms
n = 100      time = 0.013 ms
n = 1,000    time = 0.113 ms
n = 2,500    time = 0.292 ms
n = 7,500    time = 0.929 ms
n = 10,000   time = 1.528 ms
n = 12,500   time = 1.543 ms
n = 17,500   time = 4.476 ms
n = 20,000   time = 5.679 ms
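
On already-sorted input the inner `while` loop never executes, so each of the $n - 1$ passes performs a single comparison and the total work is linear in $n$. The same ratio check can be applied here with a denominator of $n$ instead of $n^2$. The sketch below hard-codes the timings printed above; at these sub-millisecond to few-millisecond scales timer noise is significant, so the ratios should be expected to be rougher than in the random-data case.

```python
# (n, time in ms) pairs copied from the sorted-data run above
sorted_runs = [
  (10, 0.004), (50, 0.008), (100, 0.013), (1_000, 0.113), (2_500, 0.292),
  (7_500, 0.929), (10_000, 1.528), (12_500, 1.543), (17_500, 4.476), (20_000, 5.679),
]

per_n = [t / n for n, t in sorted_runs]      # candidate: linear growth
per_n2 = [t / n**2 for n, t in sorted_runs]  # candidate: quadratic growth

for (n, _), a, b in zip(sorted_runs, per_n, per_n2):
  print(f"n = {n:<8,} T(n)/n = {a:.2e} ms   T(n)/n^2 = {b:.2e} ms")
```

For these measurements $T(n)/n$ stays within a factor of about four across a 2,000-fold range of $n$, while $T(n)/n^2$ falls by more than three orders of magnitude, which is the pattern linear growth would produce.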
