In [1]:
#%autosave 0
from IPython.display import HTML, display
display(HTML('<style>.container { width:100% !important; } </style>'))

# An Array-Based Implementation of Quick-Sort

The function $\texttt{sort}(L)$ sorts the list $L$ in place.

In [2]:
def sort(L):
    quickSort(0, len(L) - 1, L)

The function $\texttt{quickSort}(a, b, L)$ sorts the sublist $L[a:b+1]$ in place.

In [3]:
def quickSort(a, b, L):
    if b <= a:
        return  # at most one element, nothing to do
    m = partition(a, b, L)  # m is the split index
    quickSort(a, m - 1, L)
    quickSort(m + 1, b, L)
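
As a small illustration that `quickSort` only touches the index range it is given, the following sketch sorts positions 2 to 5 of a short list and leaves the remaining entries alone. To keep the cell self-contained, it repeats `quickSort` together with the functions `partition` and `swap` that are defined further down in this notebook. The list `sample` is made up for this illustration.

```python
def swap(x, y, L):
    L[x], L[y] = L[y], L[x]

def partition(start, end, L):
    pivot = L[end]
    left  = start - 1
    for idx in range(start, end):
        if L[idx] <= pivot:
            left += 1
            swap(left, idx, L)
    swap(left + 1, end, L)
    return left + 1

def quickSort(a, b, L):
    if b <= a:
        return
    m = partition(a, b, L)
    quickSort(a, m - 1, L)
    quickSort(m + 1, b, L)

sample = [9, 8, 7, 3, 5, 1, 0, 2]   # made-up example data
quickSort(2, 5, sample)             # sorts only sample[2:6]
print(sample)                       # [9, 8, 1, 3, 5, 7, 0, 2]
```

The entries at the indices 0, 1, 6, and 7 keep their positions, while the four entries in between end up in ascending order.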

The function $\texttt{partition}(\texttt{start}, \texttt{end}, L)$ returns an index $m$ and rearranges the elements of $L$ 
between the indices $\texttt{start}$ and $\texttt{end}$ (both inclusive) such that the following holds when the function returns:
 
  - $\forall i \in \{\texttt{start}, \cdots, m-1\} : L[i] \leq L[m]$,
  - $\forall i \in \{ m+1, \cdots, \texttt{end} \}  : L[m] <    L[i]$,
  - $L[m] = \texttt{pivot}$.
  
Here, $\texttt{pivot}$ is the element that is at the index $\texttt{end}$ at the time of the invocation 
of the function, i.e. we have

  - $L[\texttt{end}] = \texttt{pivot}$
  
at invocation time.
  
The for-loop of `partition` maintains the following invariants:

 - $\forall i \in \{\texttt{start}, \cdots, \texttt{left} \} : L[i] \leq \texttt{pivot}$,
 - $\forall i \in \{\texttt{left}+1, \cdots, \texttt{idx}-1\} : \texttt{pivot} < L[i]$,
 - $L[\texttt{end}] = \texttt{pivot}$.
 
This partitioning scheme is due to Nico Lomuto.  It is not the most efficient implementation of `partition`, but
it is easier to understand than schemes that use two separate inner loops, such as the one proposed by Hoare.

In [4]:
def partition(start, end, L):
    pivot = L[end]
    left  = start - 1
    for idx in range(start, end):
        if L[idx] <= pivot:
            left += 1
            swap(left, idx, L)
    swap(left + 1, end, L)
    return left + 1
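
The postconditions stated for `partition` can be checked on a small example.  The sketch below is only an illustration: the helper `checkPartition` and the list `sample` are hypothetical names that do not occur elsewhere in this notebook, and the swap is inlined so that the cell runs on its own.

```python
def partition(start, end, L):
    pivot = L[end]
    left  = start - 1
    for idx in range(start, end):
        if L[idx] <= pivot:
            left += 1
            L[left], L[idx] = L[idx], L[left]      # inlined swap
    L[left + 1], L[end] = L[end], L[left + 1]      # move the pivot into place
    return left + 1

def checkPartition(start, end, L):
    """Hypothetical helper: run partition and assert its postconditions."""
    pivot = L[end]
    m = partition(start, end, L)
    assert L[m] == pivot
    assert all(L[i] <= L[m] for i in range(start, m))
    assert all(L[m] < L[i] for i in range(m + 1, end + 1))
    return m

sample = [7, 2, 9, 4, 2, 5]    # made-up data, the pivot is 5
m = checkPartition(0, len(sample) - 1, sample)
print(m, sample)               # 3 [2, 4, 2, 5, 9, 7]
```

The three elements that are less than or equal to the pivot are collected at the indices 0 to 2, the pivot ends up at index 3, and the two larger elements follow it.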

The function $\texttt{swap}(x, y, L)$ swaps the elements at index $x$ and $y$ in $L$.

In [5]:
def swap(x, y, L):
    L[x], L[y] = L[y], L[x]

## Testing

In [6]:
import random as rnd

In [7]:
def demo():
    L = [ rnd.randrange(1, 200) for n in range(1, 16) ]
    print("L = ", L)
    sort(L)
    print("L = ", L)

In [8]:
demo()

L =  [147, 193, 87, 27, 99, 140, 29, 155, 124, 97, 43, 184, 132, 199, 48]
L =  [27, 29, 43, 48, 87, 97, 99, 124, 132, 140, 147, 155, 184, 193, 199]


In [9]:
def isOrdered(L):
    for i in range(len(L) - 1):
        assert L[i] <= L[i+1]

In [10]:
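from collections import Counter  # multiset comparison

def sameElements(L, S):
    assert Counter(L) == Counter(S)  # same elements with the same multiplicities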

The function $\texttt{testSort}(n, k)$ generates $n$ random lists of length $k$, sorts them, and checks whether the output is sorted and contains the same elements as the input.

In [11]:
def testSort(n, k):
    for i in range(n):
        L = [ rnd.randrange(2*k) for x in range(k) ]
        oldL = L[:]
        sort(L)
        isOrdered(L)
        sameElements(oldL, L)
        assert len(L) == len(oldL)
        print('.', end='')
    print()
    print("All tests successful!")

In [12]:
%%time
testSort(100, 20000)

....................................................................................................
All tests successful!
CPU times: user 8.17 s, sys: 49.4 ms, total: 8.22 s
Wall time: 8.21 s


Next, we sort a million random integers.

In [13]:
k = 1000000
L = [ rnd.randrange(2 * k) for x in range(k) ]

In [14]:
%%time
sort(L)

CPU times: user 4.38 s, sys: 6.61 ms, total: 4.39 s
Wall time: 4.39 s


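Again, we sort a million integers.  This time, many of the integers have the same value.  Since `partition` moves every element that is equal to the pivot into the left part, many equal values lead to very unbalanced splits, and the running time degrades sharply.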

In [15]:
L = [ rnd.randrange(1000) for x in range(k) ]

In [16]:
%%time
sort(L)

CPU times: user 1min 59s, sys: 89 ms, total: 1min 59s
Wall time: 2min
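
To get a feeling for why equal values are so expensive, the following sketch counts the element comparisons that the Lomuto scheme performs.  The function `countComparisons` is a hypothetical, instrumented copy of the sorting code from this notebook and is not part of the original implementation. The list sizes and the seed are chosen arbitrarily.

```python
import random

def countComparisons(L):
    """Sort L in place with the Lomuto scheme and return the number of
    element comparisons performed (hypothetical instrumentation)."""
    count = 0

    def partition(start, end):
        nonlocal count
        pivot = L[end]
        left  = start - 1
        for idx in range(start, end):
            count += 1                      # one comparison with the pivot
            if L[idx] <= pivot:
                left += 1
                L[left], L[idx] = L[idx], L[left]
        L[left + 1], L[end] = L[end], L[left + 1]
        return left + 1

    def quickSort(a, b):
        if b <= a:
            return
        m = partition(a, b)
        quickSort(a, m - 1)
        quickSort(m + 1, b)

    quickSort(0, len(L) - 1)
    return count

n = 200
equalCount = countComparisons([42] * n)
print(equalCount)                           # 19900, i.e. n * (n - 1) / 2

distinct = random.Random(0).sample(range(10 * n), n)   # arbitrary fixed seed
distinctCount = countComparisons(distinct)
print(distinctCount)                        # considerably smaller
```

If all elements are equal, every call of `partition` places the pivot at the right end of its range, so each call removes only a single element and the number of comparisons is quadratic in the length of the list.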


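Next, we test the worst case and sort 5000 integers that are already in ascending order.  Because the pivot is always the last element and hence the maximum of its sublist, every call of `partition` splits off only the pivot, so the recursion depth grows linearly with the length of the list.  We therefore have to raise the *recursion limit* of *Python*, because otherwise we would get an error that the maximum recursion depth has been exceeded.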

In [17]:
import sys

In [18]:
sys.setrecursionlimit(10000)

In [19]:
L = list(range(5000))

In [20]:
%%time
sort(L)

CPU times: user 2.81 s, sys: 3.7 ms, total: 2.82 s
Wall time: 2.82 s
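
Raising the recursion limit is one way to deal with the deep recursion.  As an alternative that this notebook does not use, the recursion can be replaced by an explicit stack of index pairs.  The following is a minimal sketch under that assumption. The name `quickSortIterative` is hypothetical, and the running time on sorted input remains quadratic.

```python
def partition(start, end, L):
    pivot = L[end]
    left  = start - 1
    for idx in range(start, end):
        if L[idx] <= pivot:
            left += 1
            L[left], L[idx] = L[idx], L[left]
    L[left + 1], L[end] = L[end], L[left + 1]
    return left + 1

def quickSortIterative(L):
    """Sketch: quicksort that keeps the pending index ranges on an
    explicit stack instead of using the call stack."""
    stack = [(0, len(L) - 1)]
    while stack:
        a, b = stack.pop()
        if b <= a:
            continue                  # at most one element, nothing to do
        m = partition(a, b, L)
        stack.append((a, m - 1))      # left part, still to be sorted
        stack.append((m + 1, b))      # right part, still to be sorted

data = list(range(2000))              # sorted input, more elements than the
                                      # usual default recursion limit of 1000
quickSortIterative(data)              # runs without changing the recursion limit
print(data[:5], data[-5:])            # [0, 1, 2, 3, 4] [1995, 1996, 1997, 1998, 1999]
```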


If we shuffle the list before sorting it, the worst-case behaviour becomes extremely unlikely.

In [21]:
import random

In [22]:
random.shuffle(L)

In [23]:
%%time
sort(L)

CPU times: user 15.1 ms, sys: 743 µs, total: 15.8 ms
Wall time: 15.3 ms
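
The effect of shuffling can also be observed in the recursion depth.  The sketch below uses a hypothetical, instrumented variant of `quickSort` called `depthOfQuickSort` that returns the deepest recursion level it reaches. The list length and the seed are arbitrary choices made for this illustration.

```python
import random

def partition(start, end, L):
    pivot = L[end]
    left  = start - 1
    for idx in range(start, end):
        if L[idx] <= pivot:
            left += 1
            L[left], L[idx] = L[idx], L[left]
    L[left + 1], L[end] = L[end], L[left + 1]
    return left + 1

def depthOfQuickSort(a, b, L, depth=1):
    """Sort L[a:b+1] in place and return the deepest recursion level
    that was reached (hypothetical instrumentation of quickSort)."""
    if b <= a:
        return depth
    m = partition(a, b, L)
    return max(depthOfQuickSort(a, m - 1, L, depth + 1),
               depthOfQuickSort(m + 1, b, L, depth + 1))

n = 500
ordered = list(range(n))
depthOrdered = depthOfQuickSort(0, n - 1, ordered)
print(depthOrdered)                    # 500: one recursion level per element

shuffled = list(range(n))
random.Random(0).shuffle(shuffled)     # fixed seed, chosen arbitrarily
depthShuffled = depthOfQuickSort(0, n - 1, shuffled)
print(depthShuffled)                   # much smaller than n
```

For the sorted list the depth equals the length of the list, while for the shuffled list it stays far below that value.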
