# Lab 09 Examples (Karatsuba Fast Multiplication)

Press \<shift> \<enter> in each code cell to run the code. Be sure to start with the ```#include``` directives to load the required libraries.

```c++
// For Lab 09, we are limited to using only <iostream>, <string>, and <cmath>

#include <iostream>
#include <string>
#include <cmath>
```

# Overview

In today's lab, we will compare the number of operations involved in addition and multiplication.
We will then look at the divide and conquer approach to multiplication. On its own, it is not asymptotically faster than the elementary school method, but it prepares us for the Karatsuba algorithm, which improves upon it and is asymptotically faster.

## The Number of Operations in Addition and Multiplication

When programming, we typically think of addition and multiplication as single operations that complete in constant time. However, this assumes that the numbers being added or multiplied fit within the fixed size of standard data types (like `int` or `float`). When working with very large numbers (e.g., thousands or millions of digits), it becomes important to consider how the number of operations scales as the length $n$ grows.

Let $n$ be the number of digits (base 10) or bits (base 2) in the inputs. Then the running times are:

- Addition: $\mathcal{O}(n)$ — linear time.
- Multiplication (grade‑school): $\mathcal{O}(n^2)$ — quadratic time (each digit of one operand is multiplied by each digit of the other).

### Addition

```text
  2518
+ 3841
  ----
  6359
```

The number of single-digit additions is linear in the number of digits $n$. We can iterate over the digits from right to left, and a single loop executes $n$ times. A carry requires only a constant amount of extra work per digit, so the loop never has to repeat. Addition can be performed in a single pass over the digits.
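
The single right-to-left pass can be sketched as follows. This is a minimal sketch, assuming the operands are non-negative integers stored as decimal strings; the function name `addDigitStrings` is hypothetical and not part of the lab.

```cpp
#include <string>

// Hypothetical helper (not from the lab): adds two non-negative decimal
// numbers stored as strings, visiting each digit position exactly once.
std::string addDigitStrings(const std::string& a, const std::string& b) {
    const std::size_t n = a.size() > b.size() ? a.size() : b.size();
    std::string result(n + 1, '0');  // one extra slot for a final carry
    int carry = 0;

    // k counts digit positions from the right (k = 0 is the ones place).
    for (std::size_t k = 0; k < n; ++k) {
        int sum = carry;
        if (k < a.size()) sum += a[a.size() - 1 - k] - '0';
        if (k < b.size()) sum += b[b.size() - 1 - k] - '0';
        result[n - k] = static_cast<char>('0' + sum % 10);
        carry = sum / 10;  // constant extra work per digit
    }
    result[0] = static_cast<char>('0' + carry);

    // Drop the leading zero when there was no final carry.
    if (result.size() > 1 && result[0] == '0') result.erase(0, 1);
    return result;
}
```

For example, `addDigitStrings("2518", "3841")` returns `"6359"`, matching the worked example above.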

### Multiplication (elementary school)

```text
      2518
    x 3841
    ------
      2518
    10072 
   20144  
   7554   
----------
   9671638
```

The number of single-digit multiplications is quadratic in the number of digits $n$. Each digit of one operand is multiplied by each digit of the other operand. If each operand has $n$ digits, there are $n \times n = n^2$ single-digit multiplications. Thus, the running time is $\mathcal{O}(n^2)$. In a program, nested loops would multiply each digit of one operand by each digit of the other.

Addition is also involved in multiplication, but because addition is asymptotically faster, the multiplication step dominates the overall running time: the total of $\mathcal{O}(n^2 + n)$ simplifies to $\mathcal{O}(n^2)$.
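
The nested loops can be sketched as follows. This is a minimal sketch, assuming non-negative operands stored as decimal strings; the function name `multiplyDigitStrings` and the counter parameter are hypothetical, added only to make the $n^2$ count visible.

```cpp
#include <string>

// Hypothetical helper (not from the lab): grade-school multiplication of two
// non-negative decimal strings. digitMultiplications reports how many
// single-digit multiplications were performed.
std::string multiplyDigitStrings(const std::string& a, const std::string& b,
                                 int& digitMultiplications) {
    digitMultiplications = 0;
    const std::size_t n = a.size();
    const std::size_t m = b.size();
    std::string product(n + m, '0');  // the product has at most n + m digits

    for (std::size_t i = 0; i < n; ++i) {        // digits of a, right to left
        const int da = a[n - 1 - i] - '0';
        int carry = 0;
        for (std::size_t j = 0; j < m; ++j) {    // digits of b, right to left
            const int db = b[m - 1 - j] - '0';
            const std::size_t pos = n + m - 1 - (i + j);
            const int total = (product[pos] - '0') + da * db + carry;
            ++digitMultiplications;
            product[pos] = static_cast<char>('0' + total % 10);
            carry = total / 10;
        }
        // Store the row's final carry one position further left.
        const std::size_t carryPos = n + m - 1 - (i + m);
        product[carryPos] = static_cast<char>(product[carryPos] + carry);
    }

    // Strip leading zeros, keeping a single "0" for a zero product.
    const std::size_t firstNonZero = product.find_first_not_of('0');
    return firstNonZero == std::string::npos ? "0" : product.substr(firstNonZero);
}
```

Calling it with `"2518"` and `"3841"` returns `"9671638"` and sets the counter to `16`, one multiplication for each pair of digits.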

### A Number of Length 4

For numbers of length `4`, multiplication will require up to `16` single-digit multiplications ($4^2$). Addition will require up to `4` single-digit additions. As $n$ increases, the gap widens quickly: at $n = 1000$, that is up to 1,000,000 single-digit multiplications versus 1,000 single-digit additions.

![Linear v Quadratic](https://latessa.github.io/cpp-labs/images/Lab09/linear_v_quadratic.png)

## Divide and Conquer Multiplication

### An Alternate Representation of a Number

Note that another way of expressing a number such as `1984` is to split it into two parts:

$$1984 = (19 * 10^{\frac{n}{2}}) + (84 * 10^0) \mid n = \text{length of the complete number}$$

$$1900 + 84$$

Here $n = 4$, so $10^{\frac{n}{2}} = 10^2$. The $10^2$ factor shifts the `19` two places to the left (the 2 in the exponent). The $10^0$ factor shifts the `84` zero places to the left (the 0 in the exponent), meaning it stays in place.

This can also be done in binary. For example, the binary number `1101 1010` can be split into two parts:

$$1101\ 1010 = (1101 \texttt{ << } \frac{n}{2}) + (1010 \texttt{ << } 0) \mid n = \text{length of the complete number}$$

$$1101\ 0000 + 1010$$

We use this alternate representation of numbers to implement the divide and conquer multiplication so that we can split the original problem into smaller problems and then reassemble the results of the smaller problems to get the final result.
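
Both splits can be sketched in a few lines. This is a minimal sketch, assuming the values fit in built-in integer types; the helper names `powerOfTen`, `splitDecimal`, and `splitBinary` are hypothetical and not part of the lab.

```cpp
// Hypothetical helpers (not from the lab) illustrating the two-part split.

// Returns 10^exponent using integer arithmetic to avoid floating-point rounding.
long long powerOfTen(int exponent) {
    long long result = 1;
    for (int i = 0; i < exponent; ++i) result *= 10;
    return result;
}

// Decimal split: afterwards, value == left * 10^halfDigits + right.
void splitDecimal(long long value, int halfDigits,
                  long long& left, long long& right) {
    const long long shift = powerOfTen(halfDigits);
    left = value / shift;   // drops the rightmost halfDigits digits
    right = value % shift;  // keeps only the rightmost halfDigits digits
}

// Binary split: afterwards, value == (left << halfBits) + right.
void splitBinary(unsigned int value, int halfBits,
                 unsigned int& left, unsigned int& right) {
    left = value >> halfBits;                 // drops the low halfBits bits
    right = value & ((1u << halfBits) - 1u);  // masks off the low halfBits bits
}
```

With `value = 1984` and `halfDigits = 2`, `splitDecimal` yields `19` and `84`. With `value = 0xDA` (binary `1101 1010`) and `halfBits = 4`, `splitBinary` yields `1101` and `1010`.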

### Setting Up the Problem

Let's multiply two 4-digit numbers, `1203` times `4536`. Using our alternate representation, we can name each half of each number.  Let's refer to `1203` as `A` and `4536` as `B`. Let's refer to the left half of `A` as `AL` and the right half as `AR`. Similarly, let's refer to the left half of `B` as `BL` and the right half as `BR`.

```text
    AL  AR
    12  03

    BL  BR
    45  36

    LENGTH n = 4
```

In base 10, we can express the multiplication of `A` and `B` as:

$$ (AL * 10^{\frac{n}{2}} + AR * 10^0) * (BL * 10^{\frac{n}{2}} + BR * 10^0) $$

Applying the distributive property (FOIL), we can rewrite the expression. FOIL stands for First, Outside, Inside, Last. These are the four products we get when multiplying two binomials.

$$ (AL * BL)10^{n} + (AL * BR)10^{\frac{n}{2}} + (AR * BL)10^{\frac{n}{2}} + (AR * BR)10^{0} $$

$$ C_{2}  = (AL * BL) = (12 * 45) = 540 $$
$$ C_{1A} = (AL * BR) = (12 * 36) = 432 $$
$$ C_{1B} = (AR * BL) = (03 * 45) = 135 $$
$$ C_{0}  = (AR * BR) = (03 * 36) = 108 $$

Append the appropriate number of zeros (i.e., multiply by the appropriate power of 10) to each of these four products and then add them together to get the final result.  This is where bit shifts would be used in binary.

$$
\begin{aligned}
C_{2}10^{n} &\;+\; & C_{1A}10^{\frac{n}{2}} &\;+\; &  C_{1B}10^{\frac{n}{2}} &\;+\; & C_{0}10^{0} \\
(540)10^{4} &\;+\; & (432)10^{2} &\;+\; & (135)10^{2} &\;+\; & (108)10^{0} \\
5400000 &\;+\; & 43200 &\;+\; & 13500 &\;+\; & 108 \\
\end{aligned}
$$

$$ = 5,456,808 $$
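
The same single level of splitting can be checked in code. This is a minimal sketch, assuming two 4-digit operands that fit in `long long`; the function name `multiplyFourDigitOnce` is hypothetical.

```cpp
// Hypothetical helper (not from the lab): one level of divide and conquer
// for two 4-digit numbers, mirroring the worked example.
long long multiplyFourDigitOnce(long long a, long long b) {
    const long long shift = 100;  // 10^(n/2) with n = 4

    // Split each operand into its left and right halves.
    const long long AL = a / shift, AR = a % shift;
    const long long BL = b / shift, BR = b % shift;

    // The four FOIL products.
    const long long C2  = AL * BL;  // FIRSTS
    const long long C1A = AL * BR;  // OUTERS
    const long long C1B = AR * BL;  // INNERS
    const long long C0  = AR * BR;  // LASTS

    // Shift each product into place and add.
    return C2 * shift * shift + C1A * shift + C1B * shift + C0;
}
```

`multiplyFourDigitOnce(1203, 4536)` returns `5456808`, the same value as the hand calculation.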

Splitting the original numbers into halves, though, is often just the beginning. Two 1000-digit numbers split in half still leave 500-digit numbers to multiply. Thanks to recursion, the rest of the work can be done with recursive calls to the same function.

## The Recursive Pseudocode (Basic Divide & Conquer)

```c++
function multiply(A, B):
    
    if length(A) == 1 or length(B) == 1:
        return A times B  // Base case: at least one operand is a single digit

    // Prepare for the recursive steps by determining the half-length.
    n = max(length(A), length(B))
    half_n = n / 2  // integer division

    // Split A and B into their respective halves. Assign each half to a variable.
    // Each right half holds the rightmost half_n digits; each left half holds the rest.
    AL = left half of A
    AR = right half of A
    BL = left half of B
    BR = right half of B

    // Recursively call the multiply function, passing the appropriate
    // parameters for the FOIL components.
    // (AL * BL), (AL * BR), (AR * BL), (AR * BR)
    //  FIRSTS     OUTERS     INNERS      LASTS
    C2  = multiply(AL, BL) // FIRSTS
    C1A = multiply(AL, BR) // OUTERS
    C1B = multiply(AR, BL) // INNERS
    C0  = multiply(AR, BR) // LASTS

    // Calculate the addition part using the results from the recursive
    // calls. Accumulate the final result, appending the appropriate number
    // of zeros (i.e., multiplying by the appropriate power of 10).
    // Using 2 * half_n rather than n keeps the shift correct when n is odd.
    sum = C2 adjusted by 10^(2 * half_n)
    sum = sum + C1A adjusted by 10^half_n
    sum = sum + C1B adjusted by 10^half_n
    sum = sum + C0 adjusted by 10^0

    return sum
```

To get a better intuition for the total number of multiplications involved, we'll now complete our divide and conquer multiplication worksheet for two 4-digit numbers, `1203` times `4536`.

[Divide and Conquer Multiplication Worksheet](https://latessa.github.io/cpp-labs/worksheets/Lab09/Multiply_DC_Worksheet.pdf)

We see that the divide and conquer multiplication method requires `4` multiplications of smaller numbers at each level of recursion. For our two 4-digit numbers, we performed a total of `16` single-digit multiplications, which is the same as the elementary school method for 4-digit numbers.
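
The count of `16` can be reproduced with a small runnable sketch of the recursion. It is one possible rendering, not the lab's reference solution: it assumes operands that fit in `long long`, passes the digit width explicitly so that halves such as `03` keep their leading zero, and assumes that width is a power of two. The name `dcMultiply` and the `count` parameter are hypothetical.

```cpp
// Hypothetical sketch (not the lab's reference solution) of the basic
// divide and conquer recursion on built-in integers.
//   a, b   : non-negative operands
//   digits : width both operands are treated as having (assumed power of two)
//   count  : incremented once per base-case multiplication
long long dcMultiply(long long a, long long b, int digits, int& count) {
    if (digits <= 1) {
        ++count;       // one single-digit multiplication
        return a * b;
    }

    const int half = digits / 2;
    long long shift = 1;                       // becomes 10^half
    for (int i = 0; i < half; ++i) shift *= 10;

    // Split each operand; the right half keeps the rightmost `half` digits.
    const long long AL = a / shift, AR = a % shift;
    const long long BL = b / shift, BR = b % shift;

    // Four recursive multiplications per level.
    const long long C2  = dcMultiply(AL, BL, half, count);  // FIRSTS
    const long long C1A = dcMultiply(AL, BR, half, count);  // OUTERS
    const long long C1B = dcMultiply(AR, BL, half, count);  // INNERS
    const long long C0  = dcMultiply(AR, BR, half, count);  // LASTS

    return C2 * shift * shift + (C1A + C1B) * shift + C0;
}
```

Starting with `count = 0`, the call `dcMultiply(1203, 4536, 4, count)` returns `5456808` and leaves `count` at `16`.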

The Karatsuba algorithm improves upon this by reducing the number of multiplications needed at each level of recursion from `4` to `3`, leading to a faster overall multiplication method for large numbers.

### The Karatsuba Optimization (using A = 1203, B = 4536)

$$
\begin{aligned}
AL &= 12 &\qquad AR &= 03\\
BL &= 45 &\qquad BR &= 36
\end{aligned}
$$

Compute the “FIRSTS” and “LASTS” products. These are the same as in the basic divide and conquer method:

$$
\begin{aligned}
C_2 &= AL \times BL = 12 \times 45 = 540\\[4pt]
C_0 &= AR \times BR = 03 \times 36 = 108
\end{aligned}
$$

Compute the middle term C1. Here the original two multiplications (C1A and C1B) are replaced with a single multiplication, along with some additions and subtractions.

First, recall how C1 is computed in the basic divide and conquer method:

$$
\begin{aligned}
C_1 &= (AL \times BR) + (AR \times BL) \\[4pt]
    &= (12 \times 36) + (03 \times 45) = 432 + 135 = 567
\end{aligned}
$$

Karatsuba computes C1 using one multiplication instead of two:
$$
\begin{aligned}
C_1 &= (AL + AR)\times(BL + BR) - C_2 - C_0\\[4pt]
    &= (12 + 03)\times(45 + 36) - 540 - 108\\[4pt]
    &= 15 \times 81 - 648 = 1215 - 648 = 567
\end{aligned}
$$
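
The identity works because expanding the single product produces all four FOIL terms, two of which are already known as $C_2$ and $C_0$:

$$
\begin{aligned}
(AL + AR)\times(BL + BR) &= (AL \times BL) + (AL \times BR) + (AR \times BL) + (AR \times BR)\\[4pt]
                         &= C_2 + C_1 + C_0
\end{aligned}
$$

Subtracting $C_2$ and $C_0$ therefore leaves exactly $C_1 = (AL \times BR) + (AR \times BL)$.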

With Karatsuba's enhancement, each recursion level requires 3 multiplications instead of 4, improving asymptotic performance for large inputs. Instead of $\mathcal{O}(n^2)$, Karatsuba's algorithm runs in $\mathcal{O}(n^{\log_2 3}) \approx \mathcal{O}(n^{1.585})$.
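
To see how much the change from 4 to 3 recursive calls matters, the sketch below counts base-case multiplications for a given length. It assumes $n$ is a power of two and that every recursive call receives operands of exactly half the length; the function name `countBaseMultiplications` is hypothetical.

```cpp
// Hypothetical helper (not from the lab): number of base-case multiplications
// when each level makes `branches` recursive calls on operands of half the
// length. Assumes n is a power of two.
long long countBaseMultiplications(int n, int branches) {
    if (n <= 1) return 1;  // a single-digit multiplication
    return branches * countBaseMultiplications(n / 2, branches);
}
```

For $n = 4$ this gives `16` with 4 branches and `9` with 3 branches. For $n = 1024$ it gives `1048576` ($4^{10}$) versus `59049` ($3^{10}$).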

The recursive pseudocode for Karatsuba is not much different from the basic divide and conquer method. The only change is in how C1 is computed.

## The Recursive Pseudocode (Karatsuba)

```c++
function karatsuba(A, B):
    if length(A) == 1 or length(B) == 1:
        return A times B  // Base case: at least one operand is a single digit

    // Prepare for the recursive steps by determining the half-length.
    n = max(length(A), length(B))
    half_n = n / 2  // integer division

    // Split A and B into their respective halves. Assign each half to a variable.
    // Each right half holds the rightmost half_n digits; each left half holds the rest.
    AL = left half of A
    AR = right half of A
    BL = left half of B
    BR = right half of B

    // Recursively call the karatsuba function, passing the appropriate parameters
    // for C2 and C0, which are the same as in the basic divide and conquer method.
    C2 = karatsuba(AL, BL) // FIRSTS
    C0 = karatsuba(AR, BR) // LASTS

    // Recursively call the karatsuba function to compute C1 using the optimized method.
    // This uses the results from C2 and C0 to reduce the number of multiplications
    // needed to compute C1.
    C1 = karatsuba(AL + AR, BL + BR) - C2 - C0

    // Calculate the addition part using the results from the recursive
    // calls. Accumulate the final result, appending the appropriate number
    // of zeros (i.e., multiplying by the appropriate power of 10).
    // Using 2 * half_n rather than n keeps the shift correct when n is odd.
    sum = C2 adjusted by 10^(2 * half_n)
    sum = sum + C1 adjusted by 10^half_n
    sum = sum + C0 adjusted by 10^0

    return sum
```

To get a better intuition for the total number of multiplications involved, a similar worksheet is provided to step through the `1203` times `4536` multiplication with the Karatsuba optimization.

[Karatsuba Multiplication Worksheet](https://latessa.github.io/cpp-labs/worksheets/Lab09/Multiply_Karatsuba_Worksheet.pdf)

We see that the Karatsuba multiplication method requires `3` multiplications of smaller numbers at each level of recursion. For our two 4-digit numbers, we performed a total of `9` single-digit multiplications, which is fewer than the `16` single-digit multiplications required by both the elementary school method and the basic divide and conquer method for 4-digit numbers.
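
The count of `9` can be reproduced with a runnable sketch of the Karatsuba recursion. As with any small illustration, it rests on assumptions: the operands fit in `long long`, the digit width is passed explicitly, and that width is a power of two. The name `karatsubaMultiply` and the `count` parameter are hypothetical, not the lab's reference solution.

```cpp
// Hypothetical sketch (not the lab's reference solution) of the Karatsuba
// recursion on built-in integers.
//   a, b   : non-negative operands
//   digits : width both operands are treated as having (assumed power of two)
//   count  : incremented once per base-case multiplication
long long karatsubaMultiply(long long a, long long b, int digits, int& count) {
    if (digits <= 1) {
        ++count;       // base case: multiply directly
        return a * b;  // the sums AL + AR may exceed one digit here
    }

    const int half = digits / 2;
    long long shift = 1;                       // becomes 10^half
    for (int i = 0; i < half; ++i) shift *= 10;

    // Split each operand; the right half keeps the rightmost `half` digits.
    const long long AL = a / shift, AR = a % shift;
    const long long BL = b / shift, BR = b % shift;

    // Three recursive multiplications per level instead of four.
    const long long C2 = karatsubaMultiply(AL, BL, half, count);  // FIRSTS
    const long long C0 = karatsubaMultiply(AR, BR, half, count);  // LASTS
    const long long C1 =
        karatsubaMultiply(AL + AR, BL + BR, half, count) - C2 - C0;

    return C2 * shift * shift + C1 * shift + C0;
}
```

Starting with `count = 0`, the call `karatsubaMultiply(1203, 4536, 4, count)` returns `5456808` and leaves `count` at `9`. Because a sum such as `AL + AR` can carry into an extra digit, a base-case multiplication in this sketch is not always strictly single-digit, although it is in this example.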