In [None]:
# setup
from IPython.display import display,HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('../rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import random
import time
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})
# Bold formatting
bold_start = "\033[1m"
bold_end = "\033[0m"

<h1>Fast sorting</h1>

Last time, we covered the basic sorting algorithms: Bubblesort, Selectionsort, and Insertionsort. These algorithms run in $O(n^2)$ time in the worst case, where $n$ is the length of the list. This time, we will cover two faster sorting algorithms, mergesort and quicksort.

<h3>Mergesort</h3>

Mergesort applies the most obvious divide and conquer strategy to sorting. We divide the list into two roughly equal parts, then sort each part. We combine the two parts in a merge step.

The merge step takes two sorted lists (let's call them list1 and list2) and returns a sorted list (new_list) that contains the items of both original lists. 

The basic operation of the merge step proceeds by finding the smallest item in list1 or list2. Since list1 is already sorted, we can find its smallest item in $O(1)$ time by looking at its first item. Similarly, list2 is already sorted, so we can find the smallest item of list2 in $O(1)$ time. The smaller of these two items is the smallest item of list1 and list2 combined and belongs at the front of new_list.

The merge step repeats this basic operation to repeatedly find the smallest item of list1 or list2 and puts that item next in new_list.

In [16]:
def merge_step(list1,list2):
    #Assumes list1 and list2 are sorted.
    #Returns the same list as sorted(list1+list2).
    new_list=[]
    i,j = 0,0 #Indices of the smallest items of list1 and list2 that have not been moved yet.
    while(i<len(list1) and j<len(list2)):
        #Iteratively copy the smaller of the two smallest remaining items to new_list until one list is exhausted.
        #Advancing an index takes O(1) time, whereas slicing off the first item would take O(n) time per step.
        if list1[i]<=list2[j]: #Taking from list1 on ties keeps the merge stable.
            new_list.append(list1[i])
            i+=1
        else:
            new_list.append(list2[j])
            j+=1
    #At this point, at least one of list1 or list2 is exhausted.
    #Put the remaining part of the other list on the end of new_list.
    new_list.extend(list1[i:])
    new_list.extend(list2[j:])
    return new_list

def mergesort(list_to_sort,quiet_mode = False):
    #A divide-and-conquer method for sorting.
    #list_to_sort is the input. quiet_mode is a boolean that determines whether partial outputs will be printed.
    if len(list_to_sort)<=1: #Base case.
        return list_to_sort
    #The divide step:
    middle = len(list_to_sort)//2 
    list1 = list_to_sort[:middle] 
    list2 = list_to_sort[middle:] 
    #Conquer, recursively:
    list1 = mergesort(list1,quiet_mode=quiet_mode)
    list2 = mergesort(list2,quiet_mode=quiet_mode)
    #Complete conquering at this step by calling merge_step to combine the sorted lists, list1 and list2.
    merged = merge_step(list1,list2)
    #Print the partial outputs unless quiet_mode is set.
    if not quiet_mode:
        print(','.join([str(l) for l in list1]), "|",','.join([str(l) for l in list2]))
        print(','.join([str(l) for l in merged]))
        print("---")
    return merged

def check_mergesort():
    arbitrary_list = [1,3,1,8,1,3,0,92,2,-4,2,-10,45,6]
    assert mergesort(arbitrary_list,quiet_mode=True)==sorted(arbitrary_list)
check_mergesort()

<h3>Analysis of mergesort</h3>

Mergesort is a divide-and-conquer algorithm, so we can obtain a recursive equation for the worst-case runtime.

This equation is $W_M(n)=2W_M(\frac{n}{2})+\Theta(n)$.

Drawing the recursion tree and solving, we find that we are in the "balanced" case, and the runtime is $\Theta(n\log(n))$.
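As a quick numerical sanity check of this recurrence, the sketch below evaluates $W_M(n)=2W_M(\frac{n}{2})+n$ for powers of two. It assumes that the $\Theta(n)$ term is exactly $n$ and that $W_M(1)=0$; both are simplifications, and `mergesort_work` is a name introduced here for illustration.

```python
def mergesort_work(n):
    # Evaluates W(n) = 2*W(n/2) + n with W(1) = 0, for n a power of two.
    # Assumption: the Theta(n) merge cost is taken to be exactly n.
    if n <= 1:
        return 0
    return 2 * mergesort_work(n // 2) + n

for k in range(1, 11):
    n = 2 ** k
    # Under these assumptions the recurrence solves to exactly n*log2(n) = n*k.
    print(n, mergesort_work(n), n * k)
```

Each of the $\log_2(n)$ levels of the recursion tree contributes $n$, which is what the "balanced" case means.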

The recursive equation makes it clear that we can parallelize mergesort easily by making both recursive calls in parallel.

The span of mergesort is

$S_M(n)=S_M(\frac{n}{2})+\Theta(n)$.

Using this recursive equation, we see that the span is $n+\frac{n}{2}+\frac{n}{4}+\dots \leq 2n\in O(n).$
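The geometric sum can also be checked numerically. This is a minimal sketch that assumes the $\Theta(n)$ term is exactly $n$ and that $S_M(1)=1$; `mergesort_span` is a hypothetical helper name.

```python
def mergesort_span(n):
    # Evaluates S(n) = S(n/2) + n with S(1) = 1, for n a power of two.
    # Assumption: the sequential merge costs exactly n on the critical path.
    if n <= 1:
        return 1
    return mergesort_span(n // 2) + n

for k in range(1, 11):
    n = 2 ** k
    # The sum n + n/2 + ... + 1 equals 2n - 1, which stays below 2n.
    print(n, mergesort_span(n), 2 * n)
```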

Actually, we can do better by parallelizing the merge step. There is no obvious way to parallelize the merge step, but we will show in the coming week that it can be performed in $O(\log(n))$ span by applying a clever trick. This means that the span of the improved mergesort algorithm $M^\prime$ satisfies $S_{M^\prime}(n) = S_{M^\prime}(\frac{n}{2})+O(\log(n))$. By repeatedly applying this recursive equation, we obtain
\begin{align*}
S_{M^\prime}(n)=\sum_{i=0}^{\log(n)}O(\log(2^i))=\sum_{i=0}^{\log(n)}O(i) =O((\log(n))^2).
\end{align*}
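
To spell out the last step, here is the arithmetic sum written explicitly. This assumes that $n$ is a power of two, that logarithms are base 2, and that the hidden constant in $O(\log(n))$ is some $c>0$ (a symbol introduced here only for illustration):

\begin{align*}
\sum_{i=0}^{\log(n)} c\cdot i = c\cdot\frac{\log(n)\,(\log(n)+1)}{2} \in O((\log(n))^2).
\end{align*}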

<h3>Quicksort</h3>

Quicksort is another divide-and-conquer approach to sorting. The idea is simple: Designate some item of the list as the "pivot." Then, compare every other item to the pivot and move the item before or after the pivot according to that comparison. Finally, recursively sort the items before the pivot and the items after it.

In [27]:
def quicksort(arr,quietmode=False): #Code generated by ChatGPT3.5
    if len(arr) <= 1:
        return arr
    pivot_index = len(arr) // 2
    pivot = arr[pivot_index] #Chooses the pivot to be in the middle.
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    if not quietmode:
        #We print the unsorted array with the pivot in bold.
        print(','.join([str(l) for l in arr[:pivot_index]]) + f",{bold_start}" + str(pivot) +f"{bold_end}," + ','.join([str(l) for l in arr[pivot_index+1:]]))
        #Then we print what it will look like when you move relative to the pivot.
        print(','.join([str(l) for l in sorted(left)]) +f",{bold_start}"+','.join([str(l) for l in middle])+f"{bold_end}," + ','.join([str(l) for l in sorted(right)]))
        print('---')
    return quicksort(left,quietmode=quietmode) + middle + quicksort(right,quietmode=quietmode) #Makes two recursive calls.

def check_quicksort():#Test was not generated by ChatGPT
    arbitrary_list = [1,3,1,8,1,3,0,92,2,-4,2,-10,45,6]
    assert quicksort(arbitrary_list,quietmode=True)==sorted(arbitrary_list)
check_quicksort()


<h3>Analysis of quicksort</h3>

Quicksort is a divide-and-conquer method. The recursive formula depends on the input. The combine step (rearranging the list around the pivot) takes $O(n)$ work, because we must compare each of the other $n-1$ items to the pivot. After each comparison, we can move the item to the appropriate place in constant time.

<h5>Details on the combine step</h5>
Let's analyze moving the items to their appropriate places in constant time. We saw in the case of insertion sort that this movement can sometimes be resource-intensive, so we should be careful here. The Python implementation above runs in linear time, but it builds new lists with list comprehensions, so it requires additional memory and is therefore not an in-place sorting method.

Here's one way to perform the combine step of quicksort in-place.

- Loop through the list and count the size of left, middle, and right.
- Swap the items of middle into their places. Note the index of the last item in middle.
- Initialize left and right counters to 0. These counters count the number of items that have been correctly placed in left and right, respectively.
- Loop through the list again. Only progress to the next item if the item currently considered does not move.
- Every time you encounter an item that should go to left, swap it with the item at the index of the left counter. Increment the left counter.
- Every time you encounter an item that should go to right, swap it with the item at the index of the right counter plus the index of the first position after middle (one more than the index of the last item in middle). Increment the right counter.
- Every time you encounter an item that should go to middle, keep it where it is.

We looped through the list twice. The first loop just compares and counts, so it takes $O(n)$ time. At each step in the second loop, the left counter increases, or the right counter increases, or an item of middle is encountered. Thus, the total amount of work in the second loop is $O(n)$.
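
For comparison, here is a minimal sketch of a different, commonly used in-place scheme: a single-pass three-way ("Dutch national flag") partition. It is not the two-loop procedure described above, and the function name `partition_three_way` and its return convention are choices made here for illustration.

```python
def partition_three_way(arr, pivot):
    # Rearranges arr in place so that items < pivot come first,
    # then items == pivot, then items > pivot.
    # Returns (lt, gt) with arr[:lt] < pivot, arr[lt:gt] == pivot, arr[gt:] > pivot.
    lt, i, gt = 0, 0, len(arr)
    while i < gt:
        if arr[i] < pivot:
            arr[lt], arr[i] = arr[i], arr[lt]
            lt += 1
            i += 1
        elif arr[i] > pivot:
            gt -= 1
            arr[gt], arr[i] = arr[i], arr[gt]
            # Do not advance i: the item swapped in has not been examined yet.
        else:
            i += 1
    return lt, gt

example = [3, 8, 1, 3, 0, 92, 3, -4]
lt, gt = partition_three_way(example, 3)
print(example[:lt], example[lt:gt], example[gt:])
```

Every iteration either advances `i` or decreases `gt`, so the loop body runs at most $n$ times and uses only $O(1)$ extra memory.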

<h5>Worst case analysis of Quicksort</h5>

In the worst case, each pivot is the largest (or smallest) item of the list. This leads to an unbalanced recursion, where all of the work is on the recursive call to the left (or right) part of the list. In this case, the recursive equation for quicksort is

\begin{align*}
W_{Quicksort}(n)=W_{Quicksort}(n-1)+W_{Quicksort}(1) + \Theta(n).
\end{align*}

Unraveling the recursive equation, we find that
\begin{align*}
W_{Quicksort}(n)=\sum_{i=0}^n \Theta(i) \in \Theta(n^2).
\end{align*}

It's clear that the worst-case span of quicksort obeys the same recursive equation and is also $\Theta(n^2)$.
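
To see the quadratic behavior concretely, the sketch below counts comparisons for a quicksort variant that always picks the first item as the pivot. This variant is an assumption made for illustration (the implementation above picks the middle item instead); with a first-item pivot, an already sorted list triggers the worst case.

```python
def count_comparisons_first_pivot(arr):
    # Hypothetical quicksort variant that always uses the first item as the pivot.
    # Returns (sorted_list, number_of_items_compared_to_a_pivot).
    if len(arr) <= 1:
        return list(arr), 0
    pivot = arr[0]
    rest = arr[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    sorted_left, c_left = count_comparisons_first_pivot(left)
    sorted_right, c_right = count_comparisons_first_pivot(right)
    # We count one comparison for each of the len(rest) items other than the pivot.
    return sorted_left + [pivot] + sorted_right, len(rest) + c_left + c_right

for n in [10, 100, 500]:
    _, comparisons = count_comparisons_first_pivot(list(range(n)))
    # On sorted input every pivot is the smallest item, giving n(n-1)/2 comparisons.
    print(n, comparisons, n * (n - 1) // 2)
```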

<h3>Average case analysis of Quicksort</h3>

Sorting problems have an easily-defined average-case behavior. The average is the average over all $n!$ permutations of the list, where we assume that each permutation is equally likely. This relies on the fact that our sorting algorithms only use comparisons, and are blind to the actual items in the list. Thus, effectively, there are only finitely many inputs and we can define an average distribution as the uniform distribution over these finitely many inputs.

<h5>Heuristic argument for average case of Quicksort</h5>

The worst-case behavior of quicksort occurs when every single pivot is the maximal or minimal item. This would be extremely unlucky. In general, we should expect that the pivot will be somewhere in the middle of the list. Let's assume that each time, the pivot ends up at index $cn$, where $0<c<1$ is a fixed constant. Under this assumption,

\begin{align*}
A_{Quicksort}(n) \approx A_{Quicksort}(cn) +A_{Quicksort}((1-c)n) + O(n).
\end{align*}

Each level of the recursion tree costs $O(n)$. The number of levels is somewhere between $\log_{\frac{1}{c}}(n)$ and $\log_{\frac{1}{1-c}}(n)$. Up to asymptotics, the base of the logarithm does not matter, as long as $c\neq 0,1$, so there are $\Theta(\log(n))$ levels. Therefore, the total work in the average case is about $\Theta(n\log(n))$.
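
The heuristic recurrence can be evaluated numerically. The sketch below is only illustrative: it assumes the $O(n)$ term is exactly $n$, rounds $cn$ down (keeping it at least 1), and uses $c=0.25$ as an arbitrary example value; `heuristic_work` is a name introduced here.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def heuristic_work(n, c=0.25):
    # Evaluates A(n) = A(cn) + A((1-c)n) + n with A(0) = A(1) = 0.
    # Assumptions: the O(n) term is exactly n, and cn is rounded down but kept >= 1.
    if n <= 1:
        return 0
    left = max(1, int(c * n))
    right = n - left
    return heuristic_work(left, c) + heuristic_work(right, c) + n

for n in [10**3, 10**4, 10**5]:
    # If the work is Theta(n log n), this ratio should stay bounded as n grows.
    ratio = heuristic_work(n) / (n * math.log2(n))
    print(n, round(ratio, 3))
```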

<h5>Careful analysis of Quicksort average case</h5>

The heuristic argument assumed that the pivot is always at index $cn$. Technically, we must be much more careful and keep track of the probability that the pivot lands at each index. There is a better argument to analyze the average case behavior of quicksort. Our analysis is simplified if we assume that the values of the items are unique. In this case, each item can be identified with a number in $\{0,1,\dots,n-1\}$ that is its index after the list is sorted.

Let $X_{i,j}$ be a random variable (depending on the permutation of the $n$ items) that is $1$ if item $i$ is compared to item $j$ and $0$ otherwise. Since $X_{i,j}$ is an indicator random variable (its values are $0$ and $1$), its expected value is the probability that $i$ is compared with $j$. Let $X= \sum_{i<j}X_{i,j}$ be the random variable that records the number of comparisons being made. Linearity of expectation states that $\mathbb{E}(X) = \sum_{i<j}\mathbb{E}(X_{i,j})$. In other words, to calculate the average number of comparisons, we calculate the probability that each pair will be compared and sum these probabilities.

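At each step of quicksort, we only compare items to the pivot. To calculate $\mathbb{E}(X_{i,j})$ (the probability that a pair of items $i<j$ will be compared), note that choosing the permutation uniformly at random amounts to choosing the pivots uniformly at random. Since we are assuming that the values of the items are unique, the probability that $i$ is compared with $j$ is precisely the probability that $i$ or $j$ is chosen to be the pivot before any of the items in the set $\{k \in list \mid i<k<j \}$. Indeed, if some such $k$ is chosen first, then $i$ and $j$ land on opposite sides of the pivot and are never compared. Each of the $j-i+1$ items $i,i+1,\dots,j$ is equally likely to be the first of them chosen as a pivot, and two of these choices ($i$ and $j$) lead to a comparison. This gives us $\mathbb{E}(X_{i,j})=\frac{2}{j-i+1}$.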
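
For small $n$, this probability can be checked exactly by enumerating all $n!$ permutations. The sketch below assumes a quicksort variant that always picks the first item as the pivot (chosen here for simplicity, not the middle-pivot implementation above); `compared_pairs` and `comparison_probability` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import permutations

def compared_pairs(arr):
    # Runs a first-item-pivot quicksort on a list of distinct items and
    # returns the set of pairs (small, large) that were compared.
    if len(arr) <= 1:
        return set()
    pivot, rest = arr[0], arr[1:]
    pairs = {(min(pivot, x), max(pivot, x)) for x in rest}
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x > pivot]
    return pairs | compared_pairs(left) | compared_pairs(right)

def comparison_probability(n, i, j):
    # Exact fraction of the n! permutations of 0..n-1 in which i and j are compared.
    perms = list(permutations(range(n)))
    hits = sum((i, j) in compared_pairs(list(p)) for p in perms)
    return Fraction(hits, len(perms))

n = 5
for i in range(n):
    for j in range(i + 1, n):
        # The two printed fractions should agree for every pair.
        print(i, j, comparison_probability(n, i, j), Fraction(2, j - i + 1))
```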

Thus, the total work of quicksort in the average case is 

\begin{align*}
    A_{Quicksort}(n)= \sum_{0\leq i<j< n} \mathbb{E}(X_{i,j}) = \sum_{0\leq i<j< n} \frac{2}{j-i+1}.
\end{align*}

Let's change variables and set $x=j-i+1$. Each value of $x$ is realized by $n-x+1$ different pairs $(i,j)$, so

\begin{align*}
    A_{Quicksort}(n) = \sum_{2\leq x \leq n} (n-x+1) \frac{2}{x} = \left((n+1)\sum_{2\leq x \leq n} \frac{2}{x}\right)-2(n-1).
\end{align*}

Next, we use the nice trick of bounding the harmonic sum by an integral:
\begin{align*}
\sum_{2\leq x \leq n} \frac{2}{x} = 2\sum_{2\leq x \leq n} \frac{1}{x} \leq 2\int_{1}^n \frac{1}{x}\,dx = 2\ln(n).
\end{align*}

The inequality comes from interpreting the summation as a Riemann sum. For each $x\geq 2$, the term $\frac{1}{x}$ is the area of a rectangle of width $1$ and height $\frac{1}{x}$ sitting over the interval $[x-1,x]$. Since $\frac{1}{x}$ is decreasing, the curve passes through the top-right corner of each rectangle and lies above the rest of it, so the integral is an upper bound for the sum. Substituting this bound into the expression for $A_{Quicksort}(n)$ completes the argument that

\begin{align*}
A_{Quicksort}(n) \in O(n\log(n))
\end{align*}
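
As a numerical illustration, the sketch below sums $\frac{2}{j-i+1}$ over all pairs directly and prints it next to $2n\ln(n)$. The helper name `expected_comparisons` and the choice of $2n\ln(n)$ as the comparison value are assumptions made here for illustration.

```python
import math

def expected_comparisons(n):
    # Sums 2/(j-i+1) over all pairs 0 <= i < j < n.
    return sum(2 / (j - i + 1) for i in range(n) for j in range(i + 1, n))

for n in [10, 100, 1000]:
    # The exact sum should stay below 2*n*ln(n), consistent with O(n log n).
    print(n, round(expected_comparisons(n), 1), round(2 * n * math.log(n), 1))
```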