In [1]:
from IPython.display import HTML
with open('../style.css') as file:
    css = file.read()
HTML(css)

# QuickSelect

The function `quickSelect` takes two arguments:
* `L` is a list of numbers,
* `k` is a natural number such that `k <= len(L)`. 

The function call `quickSelect(L, k)` returns a list of length `k` that contains the `k` smallest
elements of `L`, i.e. it satisfies the following specification:
- `len(quickSelect(L, k)) == k`,
- `set(quickSelect(L, k)) == set(sorted(L)[:k])`.
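
The two conditions of the specification can be written as an executable check. The following is only a sketch: the helper names `satisfies_spec` and `satisfies_multiset_spec` are hypothetical and not part of the notebook. The second helper is an assumption about what may be intended: it compares multisets via `collections.Counter`, which is stricter than the set-based condition when `L` contains duplicates.

```python
from collections import Counter

def satisfies_spec(L, k, result):
    """Check the two conditions of the specification (length and set equality)."""
    return len(result) == k and set(result) == set(sorted(L)[:k])

def satisfies_multiset_spec(L, k, result):
    """Stricter variant (an assumption): every element must occur equally often."""
    return Counter(result) == Counter(sorted(L)[:k])

L = [1, 2, 2, 3]
print(satisfies_spec(L, 3, [2, 1, 2]))           # the order of the result does not matter
print(satisfies_spec(L, 3, [1, 1, 2]))           # accepted, although 1 occurs only once in L
print(satisfies_multiset_spec(L, 3, [1, 1, 2]))  # rejected by the multiset comparison
```

The set-based condition accepts `[1, 1, 2]` because it ignores how often an element occurs, so a test based on it cannot detect a result with the wrong multiplicities.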

The function can be implemented recursively via the following equations:
1. $\texttt{len}(L) < k \rightarrow \texttt{quickSelect}(L,k) = \Omega$,

   because when the length of `L` is less than `k` there is no way to select the `k` smallest
   elements from `L`.
2. $\texttt{len}(L) = k \rightarrow \texttt{quickSelect}(L,k) = L$,

   because when the list `L` has exactly `k` elements, then the `k` smallest elements of `L` are
   all elements of `L`.
3. Otherwise we assume that `L = [x] + R` and partition `R` as in *QuickSort*, i.e. we define
   $$ S := [y \in R \mid y \leq x] \quad \mbox{and} \quad B := [y \in R \mid y > x]. $$
   Then there are three cases:
   - $k \leq \texttt{len}(S) \rightarrow \texttt{quickSelect}(L, k) = \texttt{quickSelect}(S, k)$,
   - $k = \texttt{len}(S) + 1 \rightarrow \texttt{quickSelect}(L, k) = S + [x]$,
   - $k > \texttt{len}(S) + 1 \rightarrow 
      \texttt{quickSelect}(L, k) = S + [x] + \texttt{quickSelect}(B, k - \texttt{len}(S) - 1)$.

In [2]:
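def quickSelect(L, k):
    """Return a list containing the k smallest elements of L (in no particular order)."""
    # Case 1: selecting more elements than L contains is undefined.
    assert k <= len(L), f'quickSelect({L}, {k})' 
    # Case 2: all elements of L are needed.
    if len(L) == k:
        return L
    # Case 3: use the first element x as pivot and partition the rest.
    x, *R = L
    S = [y for y in R if y <= x]   # elements not bigger than the pivot
    B = [y for y in R if y >  x]   # elements bigger than the pivot
    if k <= len(S):                # S already contains enough elements
        return quickSelect(S, k)
    if k == len(S) + 1:            # S together with the pivot is exactly the answer
        return S + [x]
    # S and the pivot are not enough: take the missing elements from B.
    return S + [x] + quickSelect(B, k - len(S) - 1)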
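
To see how the three cases interact, it can help to print the partition at every level of the recursion. The following is a minimal sketch: `quick_select_traced` and its `depth` parameter are hypothetical names introduced only for this illustration. Apart from the printing, it follows the same recursion as `quickSelect`.

```python
def quick_select_traced(L, k, depth=0):
    """Same recursion as quickSelect, but prints the partition at each level."""
    assert k <= len(L), f'quick_select_traced({L}, {k})'
    indent = '  ' * depth          # indentation shows the recursion depth
    if len(L) == k:
        print(f'{indent}len(L) == k, return {L}')
        return L
    x, *R = L
    S = [y for y in R if y <= x]
    B = [y for y in R if y > x]
    print(f'{indent}k = {k}, x = {x}, S = {S}, B = {B}')
    if k <= len(S):
        return quick_select_traced(S, k, depth + 1)
    if k == len(S) + 1:
        return S + [x]
    return S + [x] + quick_select_traced(B, k - len(S) - 1, depth + 1)

result = quick_select_traced([7, 8, 11, 12, 2, 5, 3, 7, 9, 3, 2], 5)
print(result)
```

For this input the first call selects the pivot `7` and recurses into `S`, the second call selects the pivot `2` and recurses into `B`, and the third call ends in the case `k == len(S) + 1`.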

## Testing

In [3]:
quickSelect([7, 8, 11, 12, 2, 5, 3, 7, 9, 3, 2], 5)

[2, 2, 3, 3, 5]

In [4]:
import random as rnd
rnd.seed(42)

The function `testSelect(n, m)` generates `n` random lists `L` of length `m` and checks whether 
`quickSelect(L, k)` satisfies its specification.  Here, the number `k` is a random natural number 
that is less than the length of the list.

In [5]:
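def testSelect(n, m):
    for _ in range(n):
        # random list of length m with elements from {0, ..., 2*m - 1}
        L = [ rnd.randrange(2*m) for _ in range(m) ]
        k = rnd.randrange(m)       # 0 <= k < m
        C = L[:]                   # work on a copy so that L stays available for the check
        S = quickSelect(C, k)
        assert len(S) == k
        assert set(sorted(L)[:k]) == set(S)
        print('.', end='')
    print()
    print("All tests successful!")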

In [6]:
%%time
testSelect(100, 100_000)

....................................................................................................
All tests successful!
Wall time: 8.1 s
