# Quicksort

In [1]:
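def quicksort(arr, lower, higher):
    """Sort arr[lower..higher] (inclusive bounds) in place."""
    if lower < higher:
        # After partitioning, the pivot sits at its final sorted index.
        pivot = partition(arr, lower, higher)
        quicksort(arr, lower, pivot - 1)
        quicksort(arr, pivot + 1, higher)


def partition(arr, lower, higher):
    """Partition around the last element (Lomuto scheme); return the pivot's final index."""
    pivot = arr[higher]
    i = lower  # arr[lower..i-1] holds the elements <= pivot seen so far
    for j in range(lower, higher):
        if arr[j] <= pivot:
            swap(arr, i, j)
            i += 1
    swap(arr, i, higher)  # place the pivot between the two parts
    return i


def swap(arr, i, j):
    """Exchange arr[i] and arr[j] in place."""
    arr[i], arr[j] = arr[j], arr[i]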


arr = [1, 3, 6, 8, 9, 12, 4]
quicksort(arr, 0, len(arr) - 1)
arr

[1, 3, 4, 6, 8, 9, 12]
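The running-time proof in the next section is about random pivot choices, whereas the code above always takes the last element as pivot. A minimal sketch of a randomized variant follows, assuming the same partition scheme with one extra swap; the name `randomized_quicksort` and the `rng` parameter are illustrative and not part of the original notebook.

```python
import random


def randomized_quicksort(arr, lower, higher, rng=random):
    """Sort arr[lower..higher] in place, choosing each pivot uniformly at random."""
    if lower < higher:
        # Assumption: move a uniformly random element to the end, then
        # partition exactly as in the last-element scheme.
        k = rng.randint(lower, higher)  # both bounds are inclusive
        arr[k], arr[higher] = arr[higher], arr[k]

        pivot = arr[higher]
        i = lower
        for j in range(lower, higher):
            if arr[j] <= pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[higher] = arr[higher], arr[i]

        randomized_quicksort(arr, lower, i - 1, rng)
        randomized_quicksort(arr, i + 1, higher, rng)


data = [1, 3, 6, 8, 9, 12, 4]
randomized_quicksort(data, 0, len(data) - 1)
print(data)
```

The sorted result does not depend on the random choices; only the number of comparisons does.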

---

## Proof of average running time

**Claim**: The average runtime of Quicksort with uniformly random pivot choices is $O(n \log n)$ for every input array A of length n, where the average is taken over the pivot choices.

$\Omega$: Sample space (all possible outcomes of random choices in QS).

For $\sigma \in \Omega$: $C(\sigma) = $ # of comparisons between two input elements given random choices $\sigma$.

$z_i$ = $i^{th}$ smallest element of A

For $\sigma \in \Omega$, indices $i < j$, let  
$\quad X_{ij}(\sigma) = $ number of times $z_i$, $z_j$ get compared with pivot sequence $\sigma$

$\Rightarrow$ A fixed pair of elements gets compared zero or one times: a comparison happens only when one of them is the pivot, and the pivot is excluded from both recursive calls.  
$\Rightarrow$ Thus: Each $X_{ij}$ is an indicator (0-1) variable.

$\Rightarrow \forall \sigma: C(\sigma) = \sum_{i=1}^{n-1} \sum_{j=i+1}^n X_{ij}(\sigma)$  
$\Rightarrow E(C) = \sum_{i=1}^{n-1} \sum_{j=i+1}^n E(X_{ij}) \quad$ (by linearity of expectation; $E(X_{ij})$ is the prob. that $z_i$, $z_j$ get compared)

Fix $z_i$, $z_j$ with $i < j$, and consider the set $z_i, z_{i+1}, ..., z_j$:

$\Rightarrow$ As long as no pivot is chosen among $z_i ... z_j$, all of them are passed to the same recursive call.

Consider the first pivot that is chosen among $z_i ... z_j$:

1) If $z_i$ or $z_j$ gets chosen as pivot first, they get compared.  
2) If one of $z_{i+1}, ... , z_{j-1}$ gets chosen first, they never get compared (they end up in different recursive calls).

$\Rightarrow Prob(z_i, z_j\ \text{compared}) = \frac{2}{j - i + 1} \quad$ (2 choices for case 1 out of $j - i + 1$ equally likely choices)
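The probability above can be checked exactly for small inputs. The sketch below assumes distinct elements identified by their ranks and a pivot drawn uniformly from the current subarray; it averages over every possible pivot with exact fractions. The function name `prob_compared` is illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def prob_compared(lo, hi, i, j):
    """Exact probability that ranks i < j get compared while the
    subarray holding ranks lo..hi is sorted with uniformly random pivots."""
    total = Fraction(0)
    for p in range(lo, hi + 1):  # p is the rank of the chosen pivot
        if p == i or p == j:
            total += 1  # case 1: the pair is compared
        elif p < i:
            total += prob_compared(p + 1, hi, i, j)  # both go to the right call
        elif p > j:
            total += prob_compared(lo, p - 1, i, j)  # both go to the left call
        # i < p < j: case 2, the pair is separated and contributes 0
    return total / (hi - lo + 1)


n = 8
for i, j in [(1, 2), (1, 8), (3, 6)]:
    print(i, j, prob_compared(1, n, i, j), Fraction(2, j - i + 1))
```

Each printed row shows the computed probability next to $\frac{2}{j - i + 1}$ for comparison.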

$\begin{align*}
\Rightarrow E(C) =& \sum_{i=1}^{n-1} \sum_{j=i+1}^n \frac{2}{j - i + 1} \\
=& 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^n \frac{1}{j - i + 1}
\end{align*}$

**Note:** For each fixed i, substituting $k = j - i + 1$ bounds the inner sum:

$\sum_{j = i + 1}^n \frac{1}{j - i + 1} = \sum_{k = 2}^{n - i + 1} \frac{1}{k} \leq \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + ... + \frac{1}{n}$

$\Rightarrow E(C) \leq 2 n \sum_{k = 2}^n \frac{1}{k}$



$\begin{align*}
\sum_{k = 2}^n \frac{1}{k} \leq& \int_1^n \frac{1}{x} dx \\
=& \ln x \bigg\rvert^n_1 \\
=& \ln n - \ln 1 \\
=& \ln n
\end{align*}$

<br>

$\Rightarrow E(C) \leq 2n \ln n = O(n \log n)$
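As a sanity check on this bound, the sketch below counts comparisons in a simulated randomized Quicksort and sets the average next to the double sum for $E(C)$ and the bound $2n \ln n$. It assumes distinct elements and counts one comparison per non-pivot element per call, as in the analysis; the helper names, the seed, and the values of `n` and `trials` are illustrative choices.

```python
import math
import random


def count_comparisons(values, rng):
    """Number of element-vs-pivot comparisons made by randomized quicksort
    on a list of distinct values."""
    if len(values) <= 1:
        return 0
    pivot = values[rng.randrange(len(values))]
    smaller = [v for v in values if v < pivot]
    larger = [v for v in values if v > pivot]
    # Every non-pivot element is compared with the pivot exactly once.
    return (len(values) - 1
            + count_comparisons(smaller, rng)
            + count_comparisons(larger, rng))


def expected_comparisons(n):
    """Evaluate the double sum E(C) = sum over i < j of 2 / (j - i + 1)."""
    return sum(2 / (j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))


n = 200
trials = 500
rng = random.Random(0)  # fixed seed so the run is repeatable
average = sum(count_comparisons(list(range(n)), rng)
              for _ in range(trials)) / trials

print(f"simulated average: {average:.1f}")
print(f"double sum E(C):   {expected_comparisons(n):.1f}")
print(f"bound 2 n ln n:    {2 * n * math.log(n):.1f}")
```

The simulated average should land close to the double sum, and both should stay below $2n \ln n$.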

---

## $\Omega(n \log n)$ lower bound for comparison-based sorting

**Theorem:** Every comparison-based sorting algorithm has a worst-case running time of $\Omega(n \log n)$.

Consider a correct comparison-based sorting method and an input array of length $n$.

$\rightarrow$ Array of length n: {1, 2, ... , n} in some order. There are $n!$ possible input arrays.  
$\rightarrow$ Suppose the algorithm makes $\leq k$ comparisons on each of the $n!$ inputs.  
$\Rightarrow$ Each comparison has two outcomes, so across all $n!$ possible inputs the algorithm exhibits $\leq 2^k$ distinct executions.

**By Pigeonhole Principle:**

If $2^k < n!$, then the algorithm executes identically on two distinct inputs.  
$\Rightarrow$ It rearranges both inputs in the same way, so it must get one of them incorrect.
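The counting argument can be observed on a concrete algorithm. The sketch below uses insertion sort as an example of a comparison-based method (an assumption for illustration; any correct comparison sort would do), records the sequence of comparison outcomes for every permutation of a small array, and compares the number of distinct sequences with $n!$ and $2^k$.

```python
from itertools import permutations
from math import factorial


def insertion_sort_trace(values):
    """Sort a copy of values with insertion sort and return the tuple of
    comparison outcomes (True means the compared pair was out of order)."""
    arr = list(values)
    trace = []
    for i in range(1, len(arr)):
        j = i
        while j > 0:
            out_of_order = arr[j - 1] > arr[j]
            trace.append(out_of_order)
            if not out_of_order:
                break
            arr[j - 1], arr[j] = arr[j], arr[j - 1]
            j -= 1
    return tuple(trace)


n = 5
traces = [insertion_sort_trace(p) for p in permutations(range(1, n + 1))]
k = max(len(t) for t in traces)  # worst-case number of comparisons

print("distinct executions:", len(set(traces)))
print("n! =", factorial(n))
print("2^k =", 2 ** k)
```

Because the sort is correct, every permutation produces its own outcome sequence, so the number of distinct executions equals $n!$ and cannot exceed $2^k$.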

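**So:** Since the method is correct:

$2^k \geq n!$  
$n! \geq (\frac{n}{2})^\frac{n}{2}$, since the largest $\frac{n}{2}$ factors of $n!$ are each $\geq \frac{n}{2}$  
$\Rightarrow 2^k \geq (\frac{n}{2})^\frac{n}{2}$  
$\Rightarrow k \geq \frac{n}{2} \log_2 \frac{n}{2} = \Omega(n \log n)$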
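To get a feel for the numbers, the sketch below computes the smallest $k$ with $2^k \geq n!$ and the weaker bound $\frac{n}{2} \log_2 \frac{n}{2}$ for a few values of $n$. The function name `min_comparisons` and the chosen values of $n$ are illustrative.

```python
import math


def min_comparisons(n):
    """Smallest k with 2**k >= n!, computed with exact integer arithmetic."""
    # For an integer f >= 1, (f - 1).bit_length() equals ceil(log2(f)).
    return (math.factorial(n) - 1).bit_length()


for n in (4, 16, 64, 256):
    weaker_bound = (n / 2) * math.log2(n / 2)
    print(n, min_comparisons(n), round(weaker_bound, 1))
```

The first number in each row is $n$, the second is the smallest valid $k$, and the third is the simplified lower bound derived above.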