# Using the Master Theorem to Improve the Efficiency of Programs

### The Master Theorem

For a recurrence of the form below, with constants $a \geq 1$, $b > 1$, and $d \geq 0$:

$$T(n) = \Theta(n^d) + a \cdot T({n \over b})$$

1. If $d \neq \log_b a$, then $T(n) \in \Theta(n^{\max{(d, \log_b a)}})$
2. If $d = \log_b a$, then $T(n) \in \Theta(n^d \log n)$

Examples:
* $T(n) = \Theta(n^2) + T({n \over 2}) \in \Theta(n^2)$
* $T(n) = \Theta(n) + T({n \over 2}) \in \Theta(n)$
* $T(n) = \Theta(1) + T({n \over 2}) \in \Theta(\log n)$
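
The two cases can be applied mechanically. The following is a minimal sketch, not part of the original notes: the function name `master_theorem` and its string output format are assumptions chosen for illustration. It compares $d$ with $\log_b a$ and reports the resulting bound.

```python
import math

def master_theorem(a, b, d):
    """Return the bound for T(n) = Theta(n^d) + a*T(n/b) as a string.

    Hypothetical helper for illustration; assumes a >= 1, b > 1, d >= 0.
    """
    critical = math.log(a, b)                 # log_b a
    if math.isclose(d, critical):             # case 2: d = log_b a
        return "Theta(log n)" if d == 0 else f"Theta(n^{d:g} log n)"
    return f"Theta(n^{max(d, critical):g})"   # case 1: the larger exponent wins

print(master_theorem(1, 2, 2))   # first example above
print(master_theorem(1, 2, 1))   # second example above
print(master_theorem(1, 2, 0))   # third example above
```

The three calls correspond to the three examples in the list, with $a = 1$ and $b = 2$ in each.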


### Multiplying two n-digit integers

In some applications (e.g. cryptography), we need to multiply very large numbers.

Question: what is the complexity of multiplying two n-digit integers?

* Input: $X = x_1 x_2 \cdots x_n$ and $Y = y_1 y_2 \cdots y_n$
* Output: $X * Y$

Let $X_l = x_1 \cdots x_{n \over 2}$ and $X_r = x_{{n \over 2} + 1} \cdots x_{n}$.

Let $Y_l = y_1 \cdots y_{n \over 2}$ and $Y_r = y_{{n \over 2} + 1} \cdots y_{n}$.

Example: X = 123456,  Y = 789321

What are $X_l$, $X_r$, $Y_l$, $Y_r$?

#### Simple way of multiplying 2 n-digit integers

```
              123456  = X
              789321  = Y
             --------
              123456 
             246912
             ....
```

Multiply the last digit of Y by each digit of X.

Multiply the second-to-last digit of Y by each digit of X, writing the result one place further to the left.

...

Add all the rows together.

We can write a program to do this. Essentially, for each digit of Y (starting from the last), multiply it by each digit of X and record the resulting row. Finally, add all the rows up.
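
One possible version of such a program is sketched below. It is an illustration rather than the notes' own code: the name `schoolbook_multiply`, the choice to pass the numbers as digit strings, and the multiplication counter are all assumptions made for this example.

```python
def schoolbook_multiply(X, Y):
    """Multiply two non-negative integers given as decimal digit strings.

    Returns the product and the number of single-digit multiplications used.
    """
    total = 0
    digit_products = 0
    for i, y in enumerate(reversed(Y)):          # digits of Y, last digit first
        row, carry = 0, 0
        for j, x in enumerate(reversed(X)):      # digits of X, last digit first
            carry, digit = divmod(int(x) * int(y) + carry, 10)
            row += digit * 10 ** j
            digit_products += 1
        row += carry * 10 ** len(X)              # leftover carry is the leading digit
        total += row * 10 ** i                   # shift the row i places, then add
    return total, digit_products

product, count = schoolbook_multiply("123456", "789321")
print(product, count)
```

For two 6-digit inputs the counter ends at $6 \cdot 6 = 36$, one multiplication per pair of digits.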

What's the running time complexity of this simple way of multiplying two n-digit integers? $\Theta(n^2)$

Can we do faster than $\Theta(n^2)$?

This simple way requires multiplying each digit of X by each digit of Y.

**Hypothesis: any algorithm that multiplies two n-digit numbers has $\Omega(n^2)$ running time complexity.**

This means that any algorithm or program that multiplies two n-digit numbers must take at least on the order of $n^2$ steps.

If this is true, we cannot do faster than $\Theta(n^2)$.

### Let's try a recursive strategy

* Input: $X = x_1 x_2 \cdots x_n$ and $Y = y_1 y_2 \cdots y_n$
* Output: $X * Y$

Let $X_l = x_1 \cdots x_{n \over 2}$ and $X_r = x_{{n \over 2} + 1} \cdots x_{n}$.

Let $Y_l = y_1 \cdots y_{n \over 2}$ and $Y_r = y_{{n \over 2} + 1} \cdots y_{n}$.

---

If X=123456, Y=789321

Xl = 123,  Xr = 456

Yl = 789,  Yr = 321

$123456 * 789321 = (123000 + 456) * (789000 + 321)$

$123456 * 789321 = (123*10^3 + 456) * (789*10^3 + 321)$

$(123*10^3 + 456) * (789*10^3 + 321) = 123*789*10^6 + 123*321*10^3 + 456*789*10^3 + 456*321$


$X*Y = (X_l * 10^{n \over 2} + X_r) * (Y_l * 10^{n \over 2} + Y_r)$

$X*Y = X_l*Y_l*10^n + X_l*Y_r*10^{n \over 2} + X_r*Y_l*10^{n \over 2} + X_r*Y_r$
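
The expansion can be checked numerically on the running example. This is a small sanity check only; the variable names (`half`, `expanded`) are chosen here for illustration, and the split uses integer `divmod` rather than digit copying.

```python
X, Y, n = 123456, 789321, 6

half = 10 ** (n // 2)            # 10^(n/2)
X_l, X_r = divmod(X, half)       # 123, 456
Y_l, Y_r = divmod(Y, half)       # 789, 321

# Four products, shifted as in the formula above
expanded = (X_l * Y_l * 10 ** n
            + X_l * Y_r * half
            + X_r * Y_l * half
            + X_r * Y_r)

assert expanded == X * Y
print(expanded)
```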

Let's translate this into code/Python.

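```python
# Assumption: X and Y are non-negative integers with n = 2^k digits
# (leading zeros allowed); n is passed explicitly.
def split_in_half(Z, n):                # first n/2 digits and last n/2 digits of Z
    return divmod(Z, 10 ** (n // 2))

def shift(Z, k):                        # append k zeros to Z, i.e. multiply by 10^k
    return Z * 10 ** k

def multiply(X, Y, n):                  # T(n): multiply two n-digit numbers
    if n == 1:                          # c1
        return X * Y
    X_l, X_r = split_in_half(X, n)      # c2 * n
    Y_l, Y_r = split_in_half(Y, n)      # c3 * n
    A = multiply(X_l, Y_l, n // 2)      # T(n/2)
    B = multiply(X_l, Y_r, n // 2)      # T(n/2)
    C = multiply(X_r, Y_l, n // 2)      # T(n/2)
    D = multiply(X_r, Y_r, n // 2)      # T(n/2)
    return shift(A, n) + shift(B, n // 2) + shift(C, n // 2) + D    # c4 * n
```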

**split_in_half** returns two integers by copying the first n/2 digits of X and the last n/2 digits of X.

**shift(Z, k)** left-shifts Z by k places, i.e. multiplies Z by $10^k$, i.e. appends k zeros to the right of Z.

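The running time equation of multiply is: $T(n) = \Theta(n) + 4 \cdot T({n \over 2}) \in \Theta(n^2)$

To get the complexity of T, compare $d = 1$ and $\log_2 4 = 2$. Since $d \neq \log_2 4$, case 1 applies and $T(n) \in \Theta(n^{\max(1, 2)}) = \Theta(n^2)$.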

After all this work, we have a recursive program with exactly the same complexity as the simple program.

Let's see if we can do this better.

$X*Y = X_l*Y_l*10^n + X_l*Y_r*10^{n \over 2} + X_r*Y_l*10^{n \over 2} + X_r*Y_r$

We have 4 recursive calls, each has running time $T({n \over 2})$.  Left shifting and splitting in half take linear time.

This gives a $\Theta(n^2)$ complexity.

Which part dominates the running time of this algorithm: the recursive part or the non-recursive part? The recursive part, since $\log_2 4 = 2 > d = 1$.

If we want to make this faster, we'll need to be more clever with the recursive calculation.

$X_l*Y_r*10^{n \over 2} + X_r*Y_l*10^{n \over 2} = (X_l*Y_r + X_r*Y_l)*10^{n \over 2}$

---

Now consider the product $(X_l + X_r) * (Y_l + Y_r)$:

$(X_l + X_r) * (Y_l + Y_r) = X_l*Y_l + X_l*Y_r + X_r*Y_l + X_r*Y_r$

$(X_l + X_r) * (Y_l + Y_r) - X_l*Y_l - X_r*Y_r = X_l*Y_r + X_r*Y_l$

$((X_l + X_r) * (Y_l + Y_r) - X_l*Y_l - X_r*Y_r)*10^{n \over 2} = (X_l*Y_r + X_r*Y_l)*10^{n \over 2}$
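
As a quick check with the running example ($X_l = 123$, $X_r = 456$, $Y_l = 789$, $Y_r = 321$), the arithmetic works out as follows:

$(123 + 456) * (789 + 321) = 579 * 1110 = 642690$

$642690 - 123*789 - 456*321 = 642690 - 97047 - 146376 = 399267$

$123*321 + 456*789 = 39483 + 359784 = 399267$

Both routes give the same middle term, and the first route needs only one multiplication beyond $X_l*Y_l$ and $X_r*Y_r$.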

---

$X*Y = X_l*Y_l*10^n + (X_l*Y_r + X_r*Y_l)*10^{n \over 2} + X_r*Y_r$

multiply(X, Y) requires 4 recursive calls:
* $multiply(X_l, Y_l)$
* $multiply(X_l, Y_r)$
* $multiply(X_r, Y_l)$
* $multiply(X_r, Y_r)$

We can actually do this with only 3 recursive calls:
* $multiply(X_l, Y_l)$
* $multiply(X_r, Y_r)$
* $multiply(X_l+X_r, Y_l+Y_r)$

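$T(n) = \Theta(n) + 3 \cdot T({n \over 2}) \in \Theta(n^{\log_2 3})$

Since $d = 1 < \log_2 3 \approx 1.585$, case 1 applies. This is asymptotically faster than $\Theta(n^2)$, so the $\Omega(n^2)$ hypothesis above is false.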
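
The three-call scheme is commonly known as Karatsuba multiplication. Below is a minimal sketch of it, under assumptions that differ slightly from the earlier code: the function name `karatsuba` is chosen here, the inputs are plain non-negative Python integers of any length (not exactly $2^k$ digits), and the split point is derived from the longer operand.

```python
def karatsuba(X, Y):
    """Multiply non-negative integers X and Y with 3 recursive calls per level."""
    if X < 10 or Y < 10:                    # base case: a single-digit operand
        return X * Y
    n = max(len(str(X)), len(str(Y)))       # digits in the longer operand
    half = n // 2                           # number of low-order digits split off
    X_l, X_r = divmod(X, 10 ** half)
    Y_l, Y_r = divmod(Y, 10 ** half)
    A = karatsuba(X_l, Y_l)                 # X_l * Y_l
    D = karatsuba(X_r, Y_r)                 # X_r * Y_r
    E = karatsuba(X_l + X_r, Y_l + Y_r)     # (X_l + X_r) * (Y_l + Y_r)
    middle = E - A - D                      # equals X_l*Y_r + X_r*Y_l
    return A * 10 ** (2 * half) + middle * 10 ** half + D

assert karatsuba(123456, 789321) == 123456 * 789321
```

Note that the sums $X_l + X_r$ and $Y_l + Y_r$ can be one digit longer than the halves, which is why this sketch recomputes the digit count at each level instead of assuming it halves exactly.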

Summary:
+ By reducing 4 recursive calls to 3 recursive calls, we improve the efficiency of multiplication.

+ We could reduce the number of recursive calls because the results of two of them, $X_l*Y_l$ and $X_r*Y_r$, are reused to compute the middle term.
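
To get a feel for the size of the saving, the sketch below counts single-digit multiplications by unrolling the recursion. It is an idealized count under stated assumptions: $n$ is a power of 2, every subproblem has exactly $n/2$ digits (ignoring the possible extra digit in $X_l + X_r$), and the helper name `base_multiplications` is hypothetical.

```python
def base_multiplications(n, calls):
    """Single-digit multiplications when an n-digit problem (n = 2^k) is split
    into `calls` subproblems of n/2 digits each, down to 1-digit problems."""
    if n == 1:
        return 1
    return calls * base_multiplications(n // 2, calls)

for n in (4, 64, 1024):
    print(n, base_multiplications(n, 4), base_multiplications(n, 3))
# 4 16 9
# 64 4096 729
# 1024 1048576 59049
```

With 4 calls the count is exactly $n^2$; with 3 calls it is $3^{\log_2 n} = n^{\log_2 3}$, and the gap widens as $n$ grows.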