In [6]:
# setup
from IPython.display import display,HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('../rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import random
import time
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})

<h1>Sorting</h1>

We have seen binary search, which allows us to search a sorted list much faster than we can search an unsorted list. Now, we describe the process of sorting the list in the first place.

We first formalize the problem so that we have a clear understanding of what is allowed.

Assume that the input is given as items on a RAM machine in the cells indexed by $0,1,\dots,n-1$. Often, these items are integers, but they could be more complicated, like students' names. Two distinct items can have the same value. For example, two students may have the same name.


In any step, we are allowed to read the item at an index, overwrite the item at an index or compare two items to see which one is greater.

By using an additional memory cell, we can swap items using $3$ steps. Since $3\in O(1)$, we can consider swapping to be a single operation without affecting the asymptotic runtime analysis.
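
The three-step swap can be sketched as follows. This is a minimal illustration; the helper name `swap` and the variable `temp` (standing in for the additional memory cell) are assumptions made for this example.

```python
def swap(cells, i, j):
    """Swap the items at indices i and j using one extra memory cell."""
    temp = cells[i]      # step 1: copy the item at index i into the extra cell
    cells[i] = cells[j]  # step 2: overwrite index i with the item at index j
    cells[j] = temp      # step 3: overwrite index j with the saved item

items = [10, 20, 30]
swap(items, 0, 2)
print(items)  # [30, 20, 10]
```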

At the end of the algorithm, we should have a sorted list of the items in the cells $0,\dots,n-1$.

Example: Suppose that we want to sort the list $1,2,3,4,5,\dots,n-2, n-1,0$. We can almost do this in a single step: move the item $0$ to the index $-1$. This is an $O(1)$ algorithm. However, it does not produce the desired output, because $0$ ends up at index $-1$, not at index $0$. To solve the problem correctly, every item needs to be moved. For example, $1$ is currently at index $0$ but belongs at index $1$. This means that any correct algorithm requires at least $n$ operations on this list, so sorting takes $\Omega(n)$ work in the worst case.

Some vocabulary related to sorting:

- In-place: Uses a constant amount of extra space. All sorting methods based on comparing and swapping are in-place.
- Stable: If two items have the same value, they should be in the same order after the sorting as before.
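
To illustrate stability, here is a small sketch with made-up student records (the names and ID numbers are hypothetical). It relies on Python's built-in `sorted`, which is documented to be stable, rather than on any algorithm from this section.

```python
# Hypothetical records: (name, student_id). Two students share the name "Kim".
students = [("Kim", 3), ("Lee", 1), ("Kim", 2), ("Ada", 4)]

# Sort by name only. Because the sort is stable, ("Kim", 3) stays
# ahead of ("Kim", 2), exactly as in the input.
by_name = sorted(students, key=lambda record: record[0])
print(by_name)  # [('Ada', 4), ('Kim', 3), ('Kim', 2), ('Lee', 1)]
```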

<h3>Three basic sorting algorithms</h3>

Below we describe the three basic sorting algorithms. These are not fast or practical at scale. They can be useful to sort small lists because they are simple to implement. They are important because many CS concepts are explained in terms of these basic algorithms.

These three algorithms are just general ideas. You might find slightly different implementations of these ideas, and the choice of implementation can affect the runtime.

1. Bubble Sort:
    - This sorting method compares the first two items and swaps them if they are out of order, then does the same for the second and third items, etc. Lastly, it compares the item at index $n-2$ with the item at index $n-1$, swapping them if they are out of order. After this pass, the largest item is on the bottom (at the last index). It sinks like a stone, while the other items "bubble" up to the top. Then we repeat on the other $n-1$ items.
    - Bubble sort is the most important sorting method to know, because everyone knows it, and it provides a common example.
2. Selection Sort:
    - This sorting method steps through the list to "select" the smallest element. Then it puts the smallest element first. Then it repeats on the other $n-1$ elements that are not first.
    - Selection sort is almost the reverse of Bubble sort. Bubble sort puts the largest item last, then repeats. Selection sort puts the smallest item first, then repeats.
3. Insertion Sort:
    - Insertion sort is the method that contract bridge players use to organize their hands. It consists of a loop over $k$ from $1$ to $n-1$. At stage $k$ of the loop, the first $k$ items (indices $0$ to $k-1$) are already sorted. The algorithm takes the item at index $k$ and determines where to put it so that the first $k+1$ items are sorted.

In [14]:
#copied from https://www.geeksforgeeks.org/comparison-among-bubble-sort-selection-sort-and-insertion-sort/
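def bubble_sort(arr, verbose=True):
    """Sort arr in place; print the list after each pass if verbose."""
    n = len(arr)
    for i in range(n):
        # After pass i, the largest i + 1 items are in their final positions.
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
        if verbose:
            print(arr)

def selection_sort(arr, verbose=True):
    """Sort arr in place; print the list after each selection if verbose."""
    n = len(arr)
    for i in range(n):
        # Find the smallest item in arr[i:] and move it to index i.
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]
        if verbose:
            print(arr)

def insertion_sort(arr, verbose=True):
    """Sort arr in place; print the list after each insertion if verbose."""
    n = len(arr)
    for i in range(1, n):
        # arr[0:i] is sorted; shift larger items right to make room for key.
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
        if verbose:
            print(arr)

def time_functions():
    # Generate a list of 10000 random integers
    arr = [random.randint(1, 10000) for _ in range(10000)]

    # Sort a copy of the list using each algorithm and time it.
    # Printing is turned off so that only the sorting itself is timed.
    start_time = time.perf_counter()
    bubble_sort(arr.copy(), verbose=False)
    bubble_sort_time = time.perf_counter() - start_time

    start_time = time.perf_counter()
    selection_sort(arr.copy(), verbose=False)
    selection_sort_time = time.perf_counter() - start_time

    start_time = time.perf_counter()
    insertion_sort(arr.copy(), verbose=False)
    insertion_sort_time = time.perf_counter() - start_time

    print("Bubble Sort time:", bubble_sort_time)
    print("Selection Sort time:", selection_sort_time)
    print("Insertion Sort time:", insertion_sort_time)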


In [15]:
#bubble_sort([8,9,10,2,3,0,1,9,8,3,19])
#selection_sort([8,9,10,2,3,0,1,9,8,3,19])
insertion_sort([8,9,10,2,3,0,1,9,8,3,19])

[8, 9, 10, 2, 3, 0, 1, 9, 8, 3, 19]
[8, 9, 10, 2, 3, 0, 1, 9, 8, 3, 19]
[2, 8, 9, 10, 3, 0, 1, 9, 8, 3, 19]
[2, 3, 8, 9, 10, 0, 1, 9, 8, 3, 19]
[0, 2, 3, 8, 9, 10, 1, 9, 8, 3, 19]
[0, 1, 2, 3, 8, 9, 10, 9, 8, 3, 19]
[0, 1, 2, 3, 8, 9, 9, 10, 8, 3, 19]
[0, 1, 2, 3, 8, 8, 9, 9, 10, 3, 19]
[0, 1, 2, 3, 3, 8, 8, 9, 9, 10, 19]
[0, 1, 2, 3, 3, 8, 8, 9, 9, 10, 19]


<h3>Analysis of basic sorting algorithms</h3>

All of the basic sorting algorithms have worst-case and average-case runtime $O(n^2)$.

The source, GeeksforGeeks, claims that the best-case runtime for Bubble Sort is $O(n)$, and that this occurs when the list is already sorted. This does not hold for the implementation above, which performs $(n-1)+(n-2)+\dots+1+0 = n(n-1)/2$ comparisons on every list of length $n$. It is possible to implement Bubble Sort so that the claim holds, by stopping as soon as a pass makes no swaps.
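
A variant with an $O(n)$ best case might look like the sketch below. It is not the implementation used above: the function name `bubble_sort_early_exit`, the `swapped` flag, and the comparison counter are assumptions added for illustration.

```python
def bubble_sort_early_exit(arr):
    """Bubble sort that stops after a pass with no swaps.

    Sorts arr in place and returns the number of comparisons performed.
    """
    n = len(arr)
    comparisons = 0
    for i in range(n):
        swapped = False
        for j in range(n - i - 1):
            comparisons += 1
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break  # no swaps in this pass, so the list is already sorted
    return comparisons

print(bubble_sort_early_exit(list(range(10))))         # 9
print(bubble_sort_early_exit(list(range(9, -1, -1))))  # 45
```

On an already sorted list of length $10$, the first pass makes $9$ comparisons and no swaps, so the loop stops. On a reversed list, no pass is skipped until the list is sorted, and the count is $9+8+\dots+1=45$.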

We can analyze these runtimes more cleanly by implementing the algorithms recursively and deriving recursive equations for the runtime.

For Bubble Sort, this becomes $Bubble(n)=Bubble(n-1)+\Theta(n)$, and so $Bubble(n) \in \Theta(n^2)$.

Selection Sort obeys the same recursive equation, $Selection(n)=Selection(n-1)+\Theta(n)$, and so $Selection(n) \in \Theta(n^2)$ as well.
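
As an illustration of where the recursive equation for Bubble Sort comes from, here is a sketch of a recursive version. The function name `bubble_sort_recursive` and the optional parameter `n` are assumptions made for this example. Python's default recursion limit makes this suitable only for short lists.

```python
def bubble_sort_recursive(arr, n=None):
    """Sort arr[0:n] in place: one bubbling pass, then recurse on n - 1 items."""
    if n is None:
        n = len(arr)
    if n <= 1:
        return  # a list of at most one item is already sorted
    # One pass over the first n items: n - 1 comparisons, the linear term.
    for j in range(n - 1):
        if arr[j] > arr[j + 1]:
            arr[j], arr[j + 1] = arr[j + 1], arr[j]
    # The largest of the first n items is now at index n - 1.
    # Sorting the remaining n - 1 items is the Bubble(n - 1) term.
    bubble_sort_recursive(arr, n - 1)

data = [8, 9, 10, 2, 3, 0, 1, 9, 8, 3, 19]
bubble_sort_recursive(data)
print(data)  # [0, 1, 2, 3, 3, 8, 8, 9, 9, 10, 19]
```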

Insertion Sort comes with a catch. If we use binary search to find where each item belongs, Insertion Sort appears to obey the recursive equation $Insertion(n) = Insertion(n-1) + O(\log(n))$, which would give $Insertion(n) \in O(n\log(n))$. But it is well-known that Insertion Sort is an $O(n^2)$ algorithm. Where does this discrepancy come from?
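
One way to explore this question is to run Insertion Sort with binary search and count the work it does. The sketch below is an assumption-laden illustration, not the implementation used earlier: the function name `binary_insertion_sort` and the `writes` counter are made up for this example, and the binary search is delegated to `bisect.bisect_right` from the Python standard library.

```python
import bisect

def binary_insertion_sort(arr):
    """Insertion sort that finds each insertion point by binary search.

    Sorts arr in place and returns the number of item writes performed
    while shifting and placing items.
    """
    writes = 0
    for i in range(1, len(arr)):
        key = arr[i]
        # Binary search in the sorted prefix arr[0:i]: O(log i) comparisons.
        # bisect_right places key after equal items, which keeps the sort stable.
        pos = bisect.bisect_right(arr, key, 0, i)
        # Shift arr[pos:i] one cell to the right to open a gap at pos.
        for j in range(i, pos, -1):
            arr[j] = arr[j - 1]
            writes += 1
        arr[pos] = key
        writes += 1
    return writes

data = list(range(100, 0, -1))      # reversed list of length 100
print(binary_insertion_sort(data))  # 5049
print(data == sorted(data))         # True
```

Comparing the returned count for lists of different lengths with $n\log(n)$ may help in answering the question.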

