
In [1]:
# Template for testing Selection Sort on small lists, outputting each intermediate step

In [2]:
import random
import time
import numpy as np
from typing import Any

In [3]:
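def selection(arr: list[Any], print_flag: bool = False) -> None:
    """Sort arr in place with selection sort; print arr after each pass if print_flag is set."""
    n = len(arr)

    for i in range(n):
        # Find the index of the smallest element in the unsorted suffix arr[i:].
        smallest = i

        for j in range(i + 1, n):
            if arr[smallest] > arr[j]:
                smallest = j

        # Swap that element into position i, extending the sorted prefix by one.
        arr[i], arr[smallest] = arr[smallest], arr[i]

        if print_flag:
            print(arr)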

In [4]:
n = 8
arr = list(range(n))
random.shuffle(arr)

print(arr)
selection(arr, print_flag=True)
print(arr)

[2, 0, 4, 7, 6, 5, 1, 3]
[0, 2, 4, 7, 6, 5, 1, 3]
[0, 1, 4, 7, 6, 5, 2, 3]
[0, 1, 2, 7, 6, 5, 4, 3]
[0, 1, 2, 3, 6, 5, 4, 7]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7]


In [5]:
#
# HW04 - time Selection Sort on random data for a variety of array sizes.
#

rng = np.random.default_rng()
TEST_RUNS = 10
sizes = [10, 50, 100, 1000, 2500, 7500, 10000, 12500, 17500, 20000]

for n in sizes:
    arr = rng.integers(low=0, high=100, size=n).tolist()

    start = time.perf_counter()
    selection(arr)
    end = time.perf_counter()

    print(f"n = {n:<8,} time = {(end - start) * 1_000:.3f} ms")

n = 10       time = 0.008 ms
n = 50       time = 0.108 ms
n = 100      time = 0.265 ms
n = 1,000    time = 40.189 ms
n = 2,500    time = 222.407 ms
n = 7,500    time = 3215.181 ms
n = 10,000   time = 2916.149 ms
n = 12,500   time = 5061.201 ms
n = 17,500   time = 7664.191 ms
n = 20,000   time = 13272.571 ms
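
The single-run timings above are somewhat noisy: the n = 7,500 run took longer than the n = 10,000 run. One common way to reduce that noise, sketched here as an assumption and not as part of the original assignment, is to time each size several times and keep the minimum. The helper `best_of` and its `repeats` parameter are hypothetical names, the input is generated with `random.randrange` instead of NumPy to keep the block dependency-free, and `selection` is repeated only so the block runs on its own.

```python
import random
import time


def selection(arr: list) -> None:
    """Same in-place selection sort as above, repeated so this block runs alone."""
    n = len(arr)
    for i in range(n):
        smallest = i
        for j in range(i + 1, n):
            if arr[smallest] > arr[j]:
                smallest = j
        arr[i], arr[smallest] = arr[smallest], arr[i]


def best_of(n: int, repeats: int = 5) -> float:
    """Hypothetical helper: minimum wall-clock time in ms over `repeats` runs."""
    times = []
    for _ in range(repeats):
        # Fresh random input for every run, values in [0, 100) as in the cell above.
        arr = [random.randrange(100) for _ in range(n)]
        start = time.perf_counter()
        selection(arr)
        times.append((time.perf_counter() - start) * 1_000)
    return min(times)


# Small sizes keep the demonstration quick; larger sizes work the same way.
for n in [10, 50, 100, 1000]:
    print(f"n = {n:<8,} best of 5 = {best_of(n):.3f} ms")
```

Taking the minimum rather than the mean is a design choice: interference from other processes can only make a run slower, so the fastest run is usually the closest to the undisturbed cost.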


## Results

Because Selection Sort has an expected runtime of $\Theta\left(n^2\right)$, we divide each measured time $T(n)$ (in milliseconds) by $n^2$. If the bound is tight, this ratio should settle near a constant as $n$ grows.

| $n$    | $T(n)$ (ms) | $\dfrac{T(n)}{n^2}$ (ms) |
| ------ | ----------- | ------------------------ |
| 10     | 0.008       | 0.0000800                |
| 50     | 0.108       | 0.0000432                |
| 100    | 0.265       | 0.0000265                |
| 1,000  | 40.189      | 0.0000402                |
| 2,500  | 222.407     | 0.0000356                |
| 7,500  | 3215.181    | 0.0000572                |
| 10,000 | 2916.149    | 0.0000292                |
| 12,500 | 5061.201    | 0.0000324                |
| 17,500 | 7664.191    | 0.0000250                |
| 20,000 | 13272.571   | 0.0000332                |

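Looking at the ratios, for $n \geq 1{,}000$ they stay within a narrow band of roughly $2.5 \times 10^{-5}$ to $4 \times 10^{-5}$ ms instead of growing or shrinking steadily with $n$. The one exception is $n = 7{,}500$, where the measured time exceeds even the $n = 10{,}000$ run; each size was timed only once, so some scatter is expected.
A roughly constant $T(n)/n^2$ is consistent with the algorithm running in $\Theta\left(n^2\right)$ time.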
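
The ratio column can be reproduced directly from the measured times. The sketch below copies the ($n$, $T(n)$) pairs from the random-data run above; the variable names `measurements`, `ratios`, and `large` are illustrative choices, and the $n \geq 1{,}000$ cutoff is an assumption made only to summarize the larger inputs.

```python
# Measured times in milliseconds, copied from the random-data run above.
measurements = [
    (10, 0.008),
    (50, 0.108),
    (100, 0.265),
    (1_000, 40.189),
    (2_500, 222.407),
    (7_500, 3215.181),
    (10_000, 2916.149),
    (12_500, 5061.201),
    (17_500, 7664.191),
    (20_000, 13272.571),
]

# T(n) / n^2 for every measurement.
ratios = [t / n**2 for n, t in measurements]

for (n, t), r in zip(measurements, ratios):
    print(f"n = {n:<8,} T(n) = {t:>10.3f} ms   T(n)/n^2 = {r:.7f}")

# Spread of the ratio over the larger inputs (n >= 1,000, an arbitrary cutoff).
large = [r for (n, _), r in zip(measurements, ratios) if n >= 1_000]
print(f"min = {min(large):.2e}, max = {max(large):.2e}")
```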


In [6]:
#
# HW04 - time Selection Sort on sorted data for a variety of array sizes.
#

for n in sizes:
    arr = list(range(n))

    start = time.perf_counter()
    selection(arr)
    end = time.perf_counter()

    print(f"n = {n:<8,} time = {(end - start) * 1_000:.3f} ms")

n = 10       time = 0.006 ms
n = 50       time = 0.074 ms
n = 100      time = 0.173 ms
n = 1,000    time = 26.749 ms
n = 2,500    time = 155.289 ms
n = 7,500    time = 1695.416 ms
n = 10,000   time = 3431.555 ms
n = 12,500   time = 3860.293 ms
n = 17,500   time = 8824.608 ms
n = 20,000   time = 10751.659 ms
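
The sorted-data timings are of the same order of magnitude as the random-data ones. That fits the structure of `selection`: the inner loop always scans the whole unsorted suffix, whatever the input order. As an illustrative check, the sketch below uses `selection_count`, a hypothetical variant of `selection` that counts element comparisons, and compares sorted and shuffled input against the closed form $n(n-1)/2$.

```python
import random


def selection_count(arr: list) -> int:
    """Hypothetical variant of selection(): sorts in place, returns the comparison count."""
    n = len(arr)
    comparisons = 0
    for i in range(n):
        smallest = i
        for j in range(i + 1, n):
            comparisons += 1  # one element comparison per inner-loop step
            if arr[smallest] > arr[j]:
                smallest = j
        arr[i], arr[smallest] = arr[smallest], arr[i]
    return comparisons


n = 1000
sorted_input = list(range(n))
shuffled_input = sorted_input.copy()
random.shuffle(shuffled_input)

count_sorted = selection_count(sorted_input)
count_shuffled = selection_count(shuffled_input)

# Both counts should equal n(n-1)/2, independent of the input order.
print(count_sorted, count_shuffled, n * (n - 1) // 2)
```

Only the number of `smallest = j` updates depends on the input order, which may account for part of the remaining difference between the two sets of timings.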
