# Quicksort
Quicksort is an in-place, divide-and-conquer sorting algorithm. Bentley devotes almost an entire column to it in _Programming Pearls_, stepping through several versions. Most of this section is just a rehashing of what he presents.

## Problem statement
- **Input**: A collection of `n` integers called `arr`.
- **Output**: The same collection of `n` integers, rearranged so that for any `i` such that `0 <= i < n-1`, `arr[i] <= arr[i+1]` holds.
- **Constraints**: 
    - `arr` contains only integers, which may be non-unique
    - `-(2^31) <= arr[i] <= (2^31)-1`
    - `0 <= n <= 10^7`
    - The input collection may be in any order (including already sorted!)

In [74]:
import random
import time

INT_MAX = (2**31)-1
INT_MIN = -(2**31)
MAX_SIZE = 10**7

def is_sorted(arr):
    i, j = 0, 1
    while j < len(arr):
        if arr[i] > arr[j]:
            print(f"Unsorted: {arr[i]} before {arr[j]}")
            return False
        i+=1
        j+=1
    return True
    

def sort_test(sort_fn, cases=None, use_max_size=True):
    if not cases:
        cases = {
            "Simple": [random.randint(-9, 9) for i in range(15)],
            "Typical, no duplicates": list(set([random.randint(INT_MIN, INT_MAX) for _ in range(1000)])),
            "Typical, with duplicates": [random.randint(INT_MIN, INT_MAX) for _ in range(100)]*10,
            "Null": [],         
            "Singleton": [random.randint(INT_MIN, INT_MAX)],        
            "Even len": [random.randint(INT_MIN, INT_MAX) for _ in range(10)],
            "Odd len": [random.randint(INT_MIN, INT_MAX) for _ in range(11)],
            "All equal elements": [1 for _ in range(1000)]
        }
        if use_max_size:
            cases["Max size, duplicates allowed"] = [random.randint(INT_MIN, INT_MAX) for _ in range(MAX_SIZE)] 
    for name, case in cases.items():
        start = time.time_ns()
        sort_fn(case)
        end = time.time_ns()
        assert is_sorted(case), f"Failed: {name}\n {case}"
        print(f"Pass: {name} ({end-start} ns, n={len(case)}).")
    print("All tests pass!")
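
The harness above checks ordering only. The problem statement also requires the output to be the same collection of integers, so a stricter check would compare element counts as well. A minimal sketch of such a check follows; `is_sorted_permutation` is a hypothetical helper, not part of the original notebook.

```python
from collections import Counter

def is_sorted_permutation(original: list[int], result: list[int]) -> bool:
    """True if `result` is non-decreasing and holds exactly the elements of `original`."""
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    same_elements = Counter(original) == Counter(result)
    return in_order and same_elements

before = [3, 1, 2, 1]
after = list(before)
after.sort()  # stand-in for any in-place sort under test
print(is_sorted_permutation(before, after))         # True
print(is_sorted_permutation(before, [1, 1, 1, 1]))  # False: ordered, but not the same elements
```

Using this inside a test loop would mean keeping a copy of each case before sorting it, which doubles memory use for the largest case.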

## Analysis
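Quicksort is a comparison-based sort, so it needs $\Omega(n \log n)$ comparisons in the worst case. It runs in $O(n \log n)$ time on average, but degrades to $O(n^2)$ when the partitions are badly unbalanced.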
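
As a rough sketch of where these figures come from (these are the standard textbook arguments, not something taken from Bentley's column): any comparison sort must be able to distinguish all $n!$ input orderings, so its decision tree has at least $n!$ leaves and therefore height

$$
h \ge \log_2(n!) = \Theta(n \log n).
$$

For quicksort itself, partitioning a subarray of size $n$ costs $\Theta(n)$. If every partition splits the subarray roughly in half, the cost satisfies

$$
T(n) = 2\,T(n/2) + \Theta(n) \;\Rightarrow\; T(n) = \Theta(n \log n),
$$

whereas if every partition leaves one side empty, it satisfies

$$
T(n) = T(n-1) + \Theta(n) \;\Rightarrow\; T(n) = \Theta(n^2).
$$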

## Version 1 - Recursive, Lomuto partitioning

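This is the most basic version of quicksort: it takes the first element of each subarray as the pivot and partitions around it in a single left-to-right pass.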

In [75]:
"""
lo  := index of first / leftmost element in subarray; arr[lo] is the pivot
hi  := index of last / rightmost element in subarray
mid := index maintaining the invariant; at every iteration,
       arr[lo+1..mid] < arr[lo] and arr[mid+1..i-1] >= arr[lo]
"""

def quicksort(arr: list[int]) -> None:
    def qsort(lo: int, hi: int) -> None:
        if lo >= hi:
            return  # zero or one element: already sorted
        mid = lo
        for i in range(lo+1, hi+1):
            if arr[i] < arr[lo]:
                mid += 1
                arr[mid], arr[i] = arr[i], arr[mid]
        # Move the pivot between the two regions; it is now in its final position.
        arr[lo], arr[mid] = arr[mid], arr[lo]
        qsort(lo, mid-1)
        qsort(mid+1, hi)
    qsort(0, len(arr)-1)

sort_test(quicksort)

Pass: Simple (19098 ns, n=15).
Pass: Typical, no duplicates (1399372 ns, n=1000).
Pass: Typical, with duplicates (1317790 ns, n=1000).
Pass: Null (901 ns, n=0).
Pass: Singleton (867 ns, n=1).
Pass: Even len (7397 ns, n=10).
Pass: Odd len (6983 ns, n=11).
Pass: All equal elements (94377392 ns, n=1000).
Pass: Max size, duplicates allowed (47222568069 ns, n=10000000).
All tests pass!


## Version 2 - Optimization for duplicate entries (two-sided partitioning)
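The quicksort above does very poorly on some edge cases, most notably when every element of the input array is equal. No element is ever less than the pivot, so each partition strips off only one element and the runtime becomes $O(n^2)$ (compare the "All equal elements" timing above with the other n=1000 cases). Bentley handles this with a different partitioning strategy: two indices scan inward from opposite ends of the subarray, and both stop on elements equal to the pivot, so a block of duplicates is "jumped over" via swapping and ends up split between the two sides. This does more swaps than strictly necessary, but sidesteps the $O(n^2)$ runtime on duplicate-heavy input.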
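
To make the quadratic behavior concrete, the sketch below counts element comparisons made by a first-element-pivot, single-scan partition like Version 1. `count_lomuto_comparisons` is a hypothetical helper written for this illustration; it uses an explicit stack rather than recursion so that the all-equal case cannot hit Python's recursion limit.

```python
import random

def count_lomuto_comparisons(values):
    """Sort a copy of `values` with first-element-pivot quicksort; return the comparison count."""
    arr = list(values)
    comparisons = 0
    stack = [(0, len(arr) - 1)]  # pending (lo, hi) subarrays
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        mid = lo
        for i in range(lo + 1, hi + 1):
            comparisons += 1
            if arr[i] < arr[lo]:
                mid += 1
                arr[mid], arr[i] = arr[i], arr[mid]
        arr[lo], arr[mid] = arr[mid], arr[lo]
        stack.append((lo, mid - 1))
        stack.append((mid + 1, hi))
    assert arr == sorted(values)
    return comparisons

n = 1000
all_equal = count_lomuto_comparisons([1] * n)
shuffled = count_lomuto_comparisons(random.Random(0).sample(range(n), n))
print(all_equal)  # 499500, i.e. n*(n-1)/2
print(shuffled)   # far fewer on a shuffled input of distinct values
```

With all elements equal, a subarray of size `m` costs `m-1` comparisons and shrinks by only one element, which gives the `n*(n-1)/2` total.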

In [76]:
"""
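lo: leftmost / lowest index of subarray
hi: rightmost / highest index of subarray
i: left scan index; moves right, stopping at elements >= t
j: right scan index; moves left, stopping at elements <= t
t: target (pivot) value, taken from arr[lo]
"""

def quicksort_two_partitions(arr: list[int]) -> None:
    def qsort(lo: int, hi: int) -> None:
        # Body reconstructed along the lines of Bentley's two-sided partition.
        if lo >= hi:
            return
        t = arr[lo]
        i, j = lo, hi+1
        while True:
            i += 1
            while i <= hi and arr[i] < t:
                i += 1
            j -= 1
            while arr[j] > t:  # arr[lo] == t acts as a sentinel
                j -= 1
            if i > j:
                break
            arr[i], arr[j] = arr[j], arr[i]
        arr[lo], arr[j] = arr[j], arr[lo]  # pivot into its final position
        qsort(lo, j-1)
        qsort(j+1, hi)
    qsort(0, len(arr)-1)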

sort_test(quicksort_two_partitions, use_max_size=False)

Pass: Simple (4378 ns, n=15).
Pass: Typical, no duplicates (144239 ns, n=1000).
Pass: Typical, with duplicates (121938 ns, n=1000).
Pass: Null (866 ns, n=0).
Pass: Singleton (547 ns, n=1).
Pass: Even len (1136 ns, n=10).
Pass: Odd len (1201 ns, n=11).
Pass: All equal elements (4247 ns, n=1000).
All tests pass!


## Version 3 - Optimization for pre-sorted subarrays (random partition choice)
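Duplicate entries aren't the only cause of $O(n^2)$ performance. It also occurs when the array is already sorted: with the first element as the pivot, every partition leaves one side empty and puts all remaining elements on the other. Choosing the pivot at random makes that outcome unlikely for any particular input order.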
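
One way this could look is sketched below: a randomly chosen element is swapped into the first position, and the subarray is then partitioned with a two-sided scan in the style Bentley describes. This is an illustrative sketch, not the notebook's own Version 3; the name `quicksort_random_pivot` is an assumption made for this example.

```python
import random

def quicksort_random_pivot(arr: list[int]) -> None:
    """Sketch: in-place quicksort with a uniformly random pivot and two-sided partitioning."""
    def qsort(lo: int, hi: int) -> None:
        if lo >= hi:
            return
        # Swap a randomly chosen element into arr[lo] so it becomes the pivot.
        p = random.randint(lo, hi)
        arr[lo], arr[p] = arr[p], arr[lo]
        t = arr[lo]
        i, j = lo, hi + 1
        while True:
            i += 1
            while i <= hi and arr[i] < t:
                i += 1
            j -= 1
            while arr[j] > t:  # arr[lo] == t stops this scan at lo at the latest
                j -= 1
            if i > j:
                break
            arr[i], arr[j] = arr[j], arr[i]
        arr[lo], arr[j] = arr[j], arr[lo]
        qsort(lo, j - 1)
        qsort(j + 1, hi)
    qsort(0, len(arr) - 1)

already_sorted = list(range(5000))
quicksort_random_pivot(already_sorted)  # pre-sorted input no longer forces lopsided partitions
```

The random choice does not remove the $O(n^2)$ worst case; it only stops any fixed input order, such as a sorted array, from triggering it reliably.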