$$
\newcommand{\proof}{\textbf{Proof: }}
\newcommand{\theorem}{\textbf{Theorem: }}
$$

In [1]:
from common.utility import show_implementation

# Recursion

In programming, we frequently use recursion to solve our problems, similar to how we solve the problems in the introduction.
Now, we will look at proving that our recursive algorithms are correct.

## Exponentiation

Suppose that we're tasked to compute $a^n$, given only the multiplication operator.

The naive method is to repeatedly multiply by $a$, which takes $n-1$ multiplications.

However, notice that we can compute $a^8$ using only 3 multiplications instead of 7.

$$
a^2 = a \times a\\
a^4 = a^2 \times a ^2\\
a^8 = a^4 \times a^4
$$

Following this logic, the recursive algorithm to raise $a$ to any $n$ is:
$$
a^n = 
\begin{cases}
1, \quad n = 0\\
\left(a^{\frac{n}{2}} \right)^2, \quad n > 0 \text{ is even}\\
\left(a^{\frac{n-1}{2}}\right)^2 \times a, \quad n \text{ is odd}\\
\end{cases}
$$

Since the recurrence has the runtime of $T(n) \leq T(\frac{n}{2}) + 2$, by the master theorem, this is $T(n) = O(\log n)$
(assuming multiplication of two integers is always $O(1)$).
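
The recurrence can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation; the function name `power` is chosen here and does not come from the accompanying modules, and $n$ is assumed to be a non-negative integer.

```python
def power(a, n):
    """Compute a**n for a non-negative integer n by repeated squaring."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 1  # base case: a^0 = 1

    # n // 2 equals n/2 when n is even and (n-1)/2 when n is odd
    half = power(a, n // 2)
    if n % 2 == 0:
        return half * half
    return half * half * a
```

Each call performs at most 2 multiplications and halves $n$, which matches the $T(n) \leq T(\frac{n}{2}) + 2$ recurrence.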

## Merge sort
Merge sort was defined previously as below:


In [2]:
from module.sort import merge_sort, _merge

show_implementation(merge_sort)
show_implementation(_merge)

def merge_sort(A):
    n = len(A)
    if n <= 1:
        return A
    
    m = n //2
    arr1 = merge_sort(A[:m]) # T(floor(n/2))
    arr2 = merge_sort(A[m:]) # T(ceil(n/2))
    return _merge(arr1, arr2) # Theta(n)
def _merge(arr1, arr2):
    i, j = 0, 0
    arr = [None for _ in arr1 + arr2]

    for x in range(len(arr)):
        if i < len(arr1) and j < len(arr2):
            if arr1[i] < arr2[j]:
                arr[x] = arr1[i]
                i += 1
            else:
                arr[x] = arr2[j]
                j += 1
        elif i < len(arr1):
            arr[x] = arr1[i]
            i += 1
        else:
            arr[x] = arr2[j]
            j += 1
    return arr


We know that this is correct by induction on the length of the array.
In the base case, an array of at most one element is already sorted.
In the inductive step, if the two subarrays are sorted, then the `_merge` function produces a sorted array.
Combining these 2 pieces of information, the algorithm is correct.

## Quick sort
The quick sort algorithm is as follows:
1. Pick a pivot
2. Partition the array into elements that are smaller than or equal to the pivot, and elements that are larger
3. Recursively call quick sort on both the left and right partitions

In [3]:
from module.sort import quick_sort

show_implementation(quick_sort)

def quick_sort(arr, pivot_algo=_random_pivot):
    if len(arr) <= 1:
        return arr
    
    index = pivot_algo(arr)
    left, right = _partition(arr, index)
    left = quick_sort(left, pivot_algo)
    right = quick_sort(right, pivot_algo)
    return left + [arr[index]] + right


In [4]:
quick_sort([1, 2, 4, 5, 3, 9, 4, 2])

[1, 2, 2, 3, 4, 4, 5, 9]

Similar to merge sort, we split the array into 2 parts.
Then we use our sorting routine on each of the smaller arrays.
Every element of the left part is at most the pivot and every element of the right part is larger, so the concatenated result is sorted if the two subarrays are sorted. Since the base case is also correct, quick sort is correct by induction.
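
The `_partition` helper comes from `module.sort` and its source is not shown here. As a rough sketch of what a helper with the same call shape might look like (the name `partition_sketch` and its body are assumptions, not the module's actual code), the pivot is left out of both parts because `quick_sort` re-inserts it between them:

```python
def partition_sketch(arr, index):
    """Split arr around arr[index], excluding the pivot itself from both parts."""
    pivot = arr[index]
    left, right = [], []
    for i, x in enumerate(arr):
        if i == index:
            continue  # the caller puts the pivot back between left and right
        if x <= pivot:
            left.append(x)   # smaller than or equal to the pivot
        else:
            right.append(x)  # strictly larger than the pivot
    return left, right
```

This takes a single pass over the array, which is the $O(n)$ cost per call.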

### Analysis
The recurrence we get depends on $r$, the rank of the pivot.
We get
$$
T(n) = T(r-1) + T(n-r) + O(n)
$$

The solution of this recurrence therefore varies with how the pivots are chosen.

#### Bad pivot
If $r$ is always $1$ or $n$ in every recursive call, then one partition is empty and the other holds $n-1$ elements, so we obtain:
$$
T(n) = T(n-1) + T(0) + O(n) = T(n-1) + O(n)
$$

which solves to $O(n^2)$ by unrolling the recurrence (the master theorem does not cover recurrences of this form). This means quick sort is rather inefficient when the pivot always has the smallest or largest rank.
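
As a sketch of where the $O(n^2)$ bound comes from, assume the $O(n)$ term is at most $cn$ for some constant $c$ (a symbol introduced here for illustration). Expanding the recurrence one level at a time gives:
$$
\begin{align}
T(n) &\leq T(n-1) + cn \\
&\leq T(n-2) + c(n-1) + cn \\
&\leq T(1) + c\sum_{i=2}^{n} i \\
&= T(1) + c\left(\frac{n(n+1)}{2} - 1\right) = O(n^2)
\end{align}
$$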

This happens, for example, when we always select the first element as the pivot, and our input array is already sorted.

In [5]:
import random

n = 1_000
arr = [i for i in range(n)]

In [6]:
%timeit -n 10 quick_sort(arr, pivot_algo=lambda arr:0)

31.1 ms ± 760 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


#### Good pivot

If we can somehow choose a pivot with rank about $n/2$ (the median), then the recurrence becomes

$$
T(n) = T(\lceil n/2 \rceil - 1) + T(\lfloor n/2 \rfloor) + O(n) \leq 2T(n/2) + O(n)
$$


By the master theorem, the complexity is $O(n \log n)$.

[Randomizing the pivot](./randomized_algorithm.ipynb#Randomized-quick-sort) can help achieve our goal of finding a "good enough" pivot.

In [7]:
from module.sort import _random_pivot

%timeit -n 10 quick_sort(arr, pivot_algo=_random_pivot)

1.77 ms ± 175 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


## Quick select
Suppose instead that we wish to find the element with a given rank $r$ in the array (0-indexed in the implementation here).


In [8]:
from module.sort import quick_select

show_implementation(quick_select)

quick_select([1, 2, 4, 5, 3, 9, 4, 2], 6)

def quick_select(arr, r):
    index = randint(0, len(arr) - 1) 
    left, right = _partition(arr, index)
    if len(left) == r:
        return arr[index]

    if r > len(left):
        return quick_select(right, r-len(left)-1)
    
    return quick_select(left, r)


5

### Analysis
Notice that its runtime is very similar to that of quick sort, except that it recurses into only one of the two partitions. The partition step still costs $O(n)$, so in the worst case we get:
$$
T(n) = \max(T(k-1), T(n-k)) + O(n)
$$
where $k$ is the rank of the pivot.
Notice that in both quick sort and quick select, finding a way to determine a good pivot is important.
Picking a pivot whose rank is somewhere in the middle of the array would greatly speed up our algorithm, while always picking pivots at either end of the array leads to a worse runtime.

Hence, we will now demonstrate a deterministic way to find a pivot which has desirable rank.

## Finding good pivot
1. Partition the array into groups of 5 (pad with infinity if needed)
2. Sort each group and find the median
3. The pivot is the median of these medians

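Notice that at least roughly $3n/10$ elements are smaller than the chosen pivot, and at least roughly $3n/10$ are larger: about half of the $n/5$ group medians are no larger than the pivot, and each of those groups contributes 3 elements. So at most about $7n/10$ elements remain in whichever partition we recurse into.
Plugging this into our recursive formula for quick select, we get

$$
T(n) \leq T(7n/10) + T(n/5) + Cn
$$
where $T(7n/10)$ is the cost of the recursive subproblem, $T(n/5)$ is the cost of recursively finding the median of the $n/5$ group medians, and $Cn$ covers sorting the groups of 5 and partitioning.
Since $7/10 + 1/5 < 1$, the work shrinks geometrically at each level of the recursion, and the complexity is $O(n)$.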

Hence, we have devised a method to deterministically find a good pivot for quick select.
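
The pivot rule above can be sketched as follows. This is an illustrative version under a few assumptions: the names `median_of_medians` and `select` are chosen here and are not from `module.sort`, the last group is simply allowed to be shorter instead of being padded with infinity, the input is non-empty, and a three-way split (smaller, equal, larger) is used so that duplicates are handled without relying on `_partition`.

```python
def median_of_medians(arr):
    """Return a pivot value: the median of the medians of groups of 5."""
    if len(arr) <= 5:
        return sorted(arr)[len(arr) // 2]
    # Sort each group of at most 5 elements and take its median.
    groups = [arr[i:i + 5] for i in range(0, len(arr), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # Recursively find the median of the roughly n/5 medians.
    return select(medians, len(medians) // 2)


def select(arr, r):
    """Return the element of 0-indexed rank r in a non-empty list arr."""
    pivot = median_of_medians(arr)
    left = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]

    if r < len(left):
        return select(left, r)
    if r < len(left) + len(equal):
        return pivot
    return select(right, r - len(left) - len(equal))
```

For example, `select([1, 2, 4, 5, 3, 9, 4, 2], 6)` returns `5`, the same answer as the `quick_select` call earlier.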

## Integer multiplication
We have long taken for granted that the time complexity to multiply any 2 numbers is $O(1)$.
What if this assumption does not hold for large numbers?
How would we derive a multiplication algorithm for larger numbers that our builtin integer multiplication can no longer support?

Suppose that we have 2 $n$-digit integers that we need to multiply, and adding or multiplying two **digits** is $O(1)$.
How would we compute their product?

Suppose the numbers that we want to multiply are represented by the digits $a_1, a_2, \dots, a_n$ and $b_1, b_2, \dots, b_n$ in base 10.
The naive way would be to multiply $a_n$ by $b_1, b_2, \dots, b_n$,
then multiply $a_{n-1}$ by $b_1, b_2, \dots, b_n$ and append a $0$ at the end,
then multiply $a_{n-2}$ by $b_1, b_2, \dots, b_n$ and append $00$ at the end, and so on.
Then, we sum all the products.
This requires $n$ multiplications of a 1-digit number with an $n$-digit number, each costing $O(n)$, thus $O(n^2)$ in total.
Since there are $n$ partial products, each with $O(n)$ digits, the summation also takes $O(n^2)$.
Thus, the overall complexity is $O(n^2)$.

First, notice that we have the following identity
$$
pq = (10^m a + b)(10^m c + d) = 10^{2m} ac + 10^m (ad + bc) + bd
$$

Notice that to compute $pq$, we simply need to find $ac, ad, bc$ and $bd$, each of which is a product of numbers with about half as many digits.
This suggests that we can compute these products recursively.

Computing the complexity, we get 
$$
T(n) = 4T(\lceil n/2 \rceil) + O(n)
$$

Using the master theorem, we get $O(n^2)$, which is no better than the naive method.

Note that we do not actually need $ad$ and $bc$ individually; what we need is their sum $ad + bc$.
Notice that 
$$
ad + bc = ac + bd - (a-b)(c-d)
$$

The solution now becomes apparent: we recursively compute $(a-b)(c-d)$ instead of $ad$ and $bc$, reusing $ac$ and $bd$, which we need anyway.
This means the number of recursive multiplications we need is now 3.


Recomputing the complexity, we get 
$$
T(n) = 3T(\lceil n/2 \rceil) + O(n)
$$

Using the master theorem, we get $O(n^{\log_2 3}) \approx O(n^{1.585})$.
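
The three-multiplication scheme can be sketched on Python integers as follows. This is a hedged illustration: the name `karatsuba` is chosen here, the built-in `*` is used only when one operand is a single digit (and for the powers of ten, which stand in for digit shifts), and signs are handled explicitly because $a-b$ and $c-d$ may be negative.

```python
def karatsuba(p, q):
    """Multiply two integers using three recursive half-size products."""
    # Handle signs up front, since (a - b) and (c - d) may be negative.
    if p < 0 or q < 0:
        sign = -1 if (p < 0) != (q < 0) else 1
        return sign * karatsuba(abs(p), abs(q))
    # Base case: one operand is a single digit.
    if p < 10 or q < 10:
        return p * q

    m = max(len(str(p)), len(str(q))) // 2
    a, b = divmod(p, 10 ** m)  # p = 10^m * a + b
    c, d = divmod(q, 10 ** m)  # q = 10^m * c + d

    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    # ad + bc = ac + bd - (a - b)(c - d), so one more product suffices.
    ad_plus_bc = ac + bd - karatsuba(a - b, c - d)

    return 10 ** (2 * m) * ac + 10 ** m * ad_plus_bc + bd
```

Each call makes exactly 3 recursive calls on operands with about half as many digits, matching the recurrence above.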

## Matrix multiplication

Suppose that we wish to compute the product of two matrices $A,B$.
The naive method is to take the dot product of a row of $A$ with a column of $B$ to obtain one entry of $AB$.

This means that if the matrices $A,B$ are $n \times n$, then it takes $O(n)$ to compute one entry of $AB$.
Hence, computing all $n^2$ entries of $AB$ takes $O(n \times n^2) = O(n^3)$.

### Recursive

Suppose that we express $A,B$ and $AB = C$ as follows:
$$
A = \pmatrix{A_{11} & A_{12} \\ A_{21} & A_{22}} \quad
B = \pmatrix{B_{11} & B_{12} \\ B_{21} & B_{22}} \quad
C = \pmatrix{C_{11} & C_{12} \\ C_{21} & C_{22}} 
$$

Notice that:
$$
C_{11} = A_{11}B_{11} + A_{12}B_{21} \\
C_{12} = A_{11}B_{12} + A_{12}B_{22} \\
C_{21} = A_{21}B_{11} + A_{22}B_{21} \\
C_{22} = A_{21}B_{12} + A_{22}B_{22}
$$

Hence, we can split each matrix into 4 quadrants, and solve the 8 smaller sub-problems of matrix multiplication recursively.

The time complexity we would obtain is 
$$
T(n) = 8T(\frac{n}{2}) + O(n^2)
$$

Using the master theorem, this reduces to $O(n^3)$, sadly.

### Strassen's algorithm

Now, we will make a leap and make the following statement.
By setting $M$'s as such:
$$
\begin{align}
M_1 &= (A_{11} + A_{22})(B_{11} + B_{22}) \\
M_2 &= (A_{21} + A_{22}) B_{11}\\
M_3 &= A_{11}(B_{12}-B_{22})\\
M_4 &= A_{22}(B_{21} - B_{11})\\
M_5 &= (A_{11}+A_{12})B_{22}\\
M_6 &= (A_{21}-A_{11})(B_{11}+B_{12})\\
M_7 &= (A_{12}-A_{22})(B_{21}+B_{22})\\
\end{align}
$$

We assert that we would get the following
$$
\begin{align}
C_{11} &= M_1 + M_4 - M_5 + M_7\\
C_{12} &= M_3 + M_5\\
C_{21} &= M_2 + M_4\\
C_{22} &= M_1 - M_2 + M_3 + M_6\\
\end{align}
$$

This means we have reduced the number of recursive multiplications from 8 to 7:
$$
T(n) = 7T(\frac{n}{2}) + O(n^2)
$$
which reduces to $O(n^{\log_2 7}) \approx O(n^{2.81})$, a modest improvement over $O(n^3)$.
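
The seven products and four combinations can be sketched in plain Python on nested lists. This is a minimal illustration under stated assumptions: the matrices are square with a side length that is a power of 2, and the helper names (`add`, `sub`, `split`, `strassen`) are chosen here for illustration.

```python
def add(X, Y):
    """Entry-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Entry-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(X):
    """Split a square matrix into quadrants X11, X12, X21, X22."""
    h = len(X) // 2
    return ([row[:h] for row in X[:h]], [row[h:] for row in X[:h]],
            [row[:h] for row in X[h:]], [row[h:] for row in X[h:]])


def strassen(A, B):
    """Multiply n x n matrices (n a power of 2) with 7 recursive products."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]

    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)

    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))

    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)

    # Stitch the quadrants back into one matrix.
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```

For instance, `strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]])` gives `[[19, 22], [43, 50]]`, which agrees with the row-by-column method.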