In [1]:
from IPython.core.display import HTML
with open('../style.css') as file:
    css = file.read()
HTML(css)

The function `logging` is used as a *decorator*.  It takes a function 
`f` as its argument and returns a new function `logged_f` that returns 
the same result as `f`.  Additionally, `logged_f` prints the call with its 
arguments before `f` is invoked and, once `f` has returned, prints the call 
together with its result.

The *decorator* `logging` is useful for debugging.

In [None]:
def logging(f):
    def logged_f(*a):
        print(f'{f.__name__}{a}')
        r = f(*a)
        print(f'{f.__name__}{a} = {r}')
        return r
    return logged_f
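
To see what the decorator prints, here is a minimal usage sketch.  The function `gcd` is a hypothetical example that is not part of this notebook, and the decorator is repeated only so that the cell runs on its own.  Because the name `gcd` is rebound to the logged version, the recursive calls are logged as well.

```python
def logging(f):
    # same decorator as above, repeated so that this cell is self-contained
    def logged_f(*a):
        print(f'{f.__name__}{a}')
        r = f(*a)
        print(f'{f.__name__}{a} = {r}')
        return r
    return logged_f

@logging
def gcd(x, y):
    # hypothetical example function: Euclid's algorithm
    return x if y == 0 else gcd(y, x % y)

result = gcd(12, 18)
# prints:
# gcd(12, 18)
# gcd(18, 12)
# gcd(12, 6)
# gcd(6, 0)
# gcd(6, 0) = 6
# gcd(12, 6) = 6
# gcd(18, 12) = 6
# gcd(12, 18) = 6
```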

# An Array-Based Implementation of Quick-Sort

The function $\texttt{sort}(L)$ sorts the list $L$ in place.

In [2]:
def sort(L):
    quickSort(0, len(L) - 1, L)

The function `quickSort(start, end, L)` sorts the sublist `L[start:end+1]` in place.

In [3]:
def quickSort(start, end, L):
    if end <= start:
        return  # at most one element, nothing to do
    m = partition(start, end, L)  # m is the split index
    quickSort(start, m - 1, L)
    quickSort(m + 1, end  , L)

The function $\texttt{partition}(\texttt{start}, \texttt{end}, L)$ returns an index $m$ into the list $L$ and 
regroups the elements of $L$ such that after the function returns the following holds:
 
  - $\forall i \in \{\texttt{start}, \cdots, m-1\} : L[i] \leq L[m]$,
  - $\forall i \in \{ m+1, \cdots, \texttt{end} \}  : L[m] <    L[i]$,
  - $L[m] = \texttt{pivot}$.
  
Here, `pivot` is the element that is at the index `end` at the time of the invocation 
of the function, i.e. we have

  - $L[\texttt{end}] = \texttt{pivot}$
  
at invocation time.
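
The three conditions above can be turned into an executable check.  The following is only a sketch: the helper `satisfiesPartitionSpec` is a hypothetical name that does not occur elsewhere in this notebook, and the caller has to remember the value of `pivot` before the list is regrouped.

```python
def satisfiesPartitionSpec(start, end, L, m, pivot):
    """Check the three conditions for the sublist L[start:end+1] and the split index m."""
    left_ok  = all(L[i] <= L[m] for i in range(start, m))        # first condition
    right_ok = all(L[m] <  L[i] for i in range(m + 1, end + 1))  # second condition
    return left_ok and right_ok and L[m] == pivot                # third condition

# The pivot 5 sits at index 3, smaller elements to its left, bigger ones to its right.
print(satisfiesPartitionSpec(0, 5, [3, 1, 2, 5, 9, 7], 3, 5))  # True
# Here the 9 at index 1 violates the first condition.
print(satisfiesPartitionSpec(0, 5, [3, 9, 2, 5, 1, 7], 3, 5))  # False
```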
  
The for-loop of `partition` maintains the following invariants:

 - $\forall i \in \{\texttt{start}, \cdots, \texttt{left} \} : L[i] \leq \texttt{pivot}$,
 - $\forall i \in \{\texttt{left}+1, \cdots, \texttt{idx}-1\} : \texttt{pivot} < L[i]$,
 - $L[\texttt{end}] = \texttt{pivot}$.

These invariants are depicted below:

![Invariants for partitioning](lomuto.png)

This algorithm was suggested by *Nico Lomuto*.  It is not the most efficient implementation of `partition`, but
it is easier to understand than the algorithm given by *Tony Hoare*, which uses two separate loops.

In [4]:
#@logging
def partition(start, end, L):
    pivot = L[end]
    left  = start - 1
    for idx in range(start, end):
        if L[idx] <= pivot:
            left += 1
            swap(left, idx, L)
    swap(left + 1, end, L)
    return left + 1
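
To convince ourselves that the loop invariants really hold, we can assert them at the start of every iteration.  The function `partitionChecked` below is a sketch for experimentation, not part of the implementation: the name is hypothetical and the swaps are written inline so that the cell runs on its own.

```python
def partitionChecked(start, end, L):
    pivot = L[end]
    left  = start - 1
    for idx in range(start, end):
        # the three invariants of the for-loop
        assert all(L[i] <= pivot for i in range(start, left + 1))
        assert all(pivot < L[i]  for i in range(left + 1, idx))
        assert L[end] == pivot
        if L[idx] <= pivot:
            left += 1
            L[left], L[idx] = L[idx], L[left]
    L[left + 1], L[end] = L[end], L[left + 1]
    return left + 1

L = [3, 8, 2, 5, 1, 4, 7, 6]
m = partitionChecked(0, len(L) - 1, L)
print(m, L)  # 5 [3, 2, 5, 1, 4, 6, 7, 8]
```

The pivot `6` ends up at index `5`; all elements to its left are less than or equal to it and all elements to its right are bigger.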

The function $\texttt{swap}(x, y, L)$ swaps the elements at the indices $x$ and $y$ in $L$.

In [5]:
def swap(x, y, L):
    L[x], L[y] = L[y], L[x]

## Testing

In [6]:
import random as rnd

In [7]:
def demo():
    L = [ rnd.randrange(1, 20) for n in range(1, 16) ]
    print("L = ", L)
    sort(L)
    print("L = ", L)

In [8]:
demo()

L =  [17, 17, 1, 19, 14, 12, 8, 5, 15, 11, 6, 3, 5, 7, 1]
L =  [1, 1, 3, 5, 5, 6, 7, 8, 11, 12, 14, 15, 17, 17, 19]


In [9]:
def isOrdered(L):
    for i in range(len(L) - 1):
        assert L[i] <= L[i+1]

In [10]:
from collections import Counter

In [11]:
def sameElements(L, S):
    assert Counter(L) == Counter(S)

The function $\texttt{testSort}(n, k)$ generates $n$ random lists of length $k$, sorts them, and checks whether the output is sorted and contains the same elements as the input.

In [12]:
def testSort(n, k):
    for i in range(n):
        L = [ rnd.randrange(2*k) for x in range(k) ]
        oldL = L[:]
        sort(L)
        isOrdered(L)
        sameElements(oldL, L)
        assert len(L) == len(oldL)
        print('.', end='')
    print()
    print("All tests successful!")
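
Random lists rarely hit the corner cases, so it may be worthwhile to test a few of them explicitly.  The helper `checkEdgeCases` below is a hypothetical addition that accepts any function sorting a list in place.  In this notebook we would call `checkEdgeCases(sort)`; the sketch uses the built-in `list.sort` as a stand-in so that the cell runs on its own.

```python
def checkEdgeCases(sortInPlace):
    """Run sortInPlace on a few small corner cases and compare with sorted()."""
    cases = [ [],            # empty list
              [7],           # single element
              [2, 2, 2],     # all elements equal
              [3, 2, 1],     # sorted descendingly
              [1, 2, 3] ]    # already sorted
    for case in cases:
        L = case[:]
        sortInPlace(L)
        assert L == sorted(case), f'{case} was sorted into {L}'
    return len(cases)

print(checkEdgeCases(list.sort))  # stand-in; use checkEdgeCases(sort) in this notebook
```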

In [13]:
%%time
testSort(100, 20000)

....................................................................................................
All tests successful!
CPU times: user 9.28 s, sys: 157 ms, total: 9.43 s
Wall time: 9.56 s


Next, we sort a million random integers drawn from `range(1000)`, so every value occurs about 1000 times on average.

In [15]:
%%time
k = 1_000_000
L = [ rnd.randrange(1000) for x in range(k) ]
sort(L)

CPU times: user 2min 12s, sys: 956 ms, total: 2min 13s
Wall time: 2min 15s


Next, we sort a hundred thousand integers drawn from `range(100)`, so that every value occurs about 1000 times on average.

In [16]:
L = [ rnd.randrange(100) for x in range(100_000) ]

In [17]:
%%time
sort(L)

CPU times: user 13.2 s, sys: 140 ms, total: 13.3 s
Wall time: 13.5 s


Finally, we test the worst case and sort 5000 integers that are already sorted in ascending order.  Since quicksort is recursive and the recursion depth grows linearly with the length of a sorted list, we have to increase the <em style="color:blue">recursion limit</em> of *Python*; otherwise we would get an error telling us that the maximum recursion depth has been exceeded.

In [None]:
import sys

In [None]:
sys.setrecursionlimit(20000)
sys.version

In [None]:
L = list(range(5000))

In [None]:
%%time
sort(L)
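
To see why a sorted list is the worst case, we can count the comparisons `L[idx] <= pivot`.  The function `countComparisons` below is a sketch with a hypothetical name: it repeats the quicksort from above with a counter added.  The lists have only 100 elements so that the default recursion limit suffices.

```python
def countComparisons(L):
    """Sort L in place and return the number of comparisons with the pivot."""
    count = 0
    def quickSort(start, end):
        nonlocal count
        if end <= start:
            return
        pivot = L[end]
        left  = start - 1
        for idx in range(start, end):
            count += 1                      # one comparison per iteration
            if L[idx] <= pivot:
                left += 1
                L[left], L[idx] = L[idx], L[left]
        L[left + 1], L[end] = L[end], L[left + 1]
        m = left + 1
        quickSort(start, m - 1)
        quickSort(m + 1, end)
    quickSort(0, len(L) - 1)
    return count

n = 100
print(countComparisons(list(range(n))))  # 4950 = n * (n - 1) / 2
print(countComparisons([7] * n))         # 4950, equal elements behave the same way
```

In both cases the pivot stays at the index `end`, so every call of `partition` shrinks the sublist by just one element and the number of comparisons grows quadratically.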

If we *shuffle* the list before calling `sort`, the worst-case behaviour becomes extremely unlikely.

In [None]:
rnd.shuffle(L)

In [None]:
%%time
sort(L)
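
Instead of shuffling the whole list up front, a common variation is to choose the pivot at random in every call.  The following is a sketch of that variation and not the implementation used in this notebook: the name `randomizedSort` is hypothetical, and the only change is that a randomly chosen element is moved to the index `end` before partitioning.

```python
import random

def randomizedSort(L):
    """Sort L in place using quicksort with a randomly chosen pivot."""
    def quickSort(start, end):
        if end <= start:
            return
        # variation: move a random element to index end, then partition as before
        r = random.randint(start, end)
        L[r], L[end] = L[end], L[r]
        pivot = L[end]
        left  = start - 1
        for idx in range(start, end):
            if L[idx] <= pivot:
                left += 1
                L[left], L[idx] = L[idx], L[left]
        L[left + 1], L[end] = L[end], L[left + 1]
        m = left + 1
        quickSort(start, m - 1)
        quickSort(m + 1, end)
    quickSort(0, len(L) - 1)

L = list(range(5000))   # sorted input, the worst case for the fixed pivot
randomizedSort(L)
```

For a list of distinct elements the recursion depth of this variant is expected to grow only logarithmically, so it should not be necessary to raise the recursion limit here.  Lists containing many equal elements remain a problem, because they are partitioned in the same way as before.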