## What is the most interesting algorithm?

### The Strassen Algorithm

The most mind-blowing fact about this algorithm is that someone attempted it.

Let me show you what I mean.

To start, we need to know about matrix multiplication. Let's write down two 3-by-3 matrices:

![title](TwoMatrices.png)

Let's call the left matrix $\mathbf{A}$ and the right $\mathbf{B}$. We *define* matrix multiplication between these two as a calculation which results in another matrix with the same number of rows as $\mathbf{A}$ and the same number of columns as $\mathbf{B}$. Specifically, we calculate each element of the result as:

![title](CalcAnElement.png)

Let's call the resulting matrix $\mathbf{C}$. So if you'd like to calculate the element of $\mathbf{C}$ that's in the second row and first column, you pick out the second row of $\mathbf{A}$ and the first column of $\mathbf{B}$. Then you multiply their corresponding elements together and sum the products. Every other element of $\mathbf{C}$ is defined the same way. So the result is:

![title](Multiply.png)
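Here is a minimal sketch of that single-element calculation in code. It assumes the pictured matrices are the same `A` and `B` used in the NumPy cells at the end of this post; the names `row`, `col` and `c_21` are just illustrative.

```python
import numpy as np

# Assumed to match the matrices shown in the images above
A = np.array([[1, 3, 2], [3, 7, 6], [4, 2, 1]])
B = np.array([[5, 4, 8], [3, 5, 9], [2, 4, 4]])

row = A[1, :]  # second row of A (Python indices start at 0)
col = B[:, 0]  # first column of B

# Multiply corresponding elements, then sum the products
c_21 = sum(int(a) * int(b) for a, b in zip(row, col))
print(c_21)  # 3*5 + 7*3 + 6*2 = 48
```

Repeating this for every row/column pair fills in all nine elements of $\mathbf{C}$.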

To understand why Strassen's Algorithm is interesting, we need to *count* how many simple add/multiply calculations are involved in producing $\mathbf{C}$. Well, that's not hard. To produce that number 48, we had 3 multiplications and 2 additions, for a total of 5 simple calcs. Since there are 9 elements in $\mathbf{C}$, we have a total of $9\times 5=45$ calculations.

But we need to get more general. Let's say $\mathbf{A}$ and $\mathbf{B}$ were $n$-by-$n$ matrices. Then the formula for the total number of calculations is:

$$
T(n) = n^2(n + (n-1)) = n^3 + n^2(n-1)
$$
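To sanity-check that count, here is a small sketch that multiplies two square matrices the schoolbook way while tallying every multiplication and addition. The function name `multiply_and_count` is made up for this example, and it uses plain Python lists rather than NumPy so each operation is visible.

```python
def multiply_and_count(A, B):
    """Schoolbook product of two n-by-n lists of lists, counting simple operations."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    mults = adds = 0
    for i in range(n):
        for j in range(n):
            # First product needs no addition
            total = A[i][0] * B[0][j]
            mults += 1
            # Each remaining product costs one multiplication and one addition
            for k in range(1, n):
                total += A[i][k] * B[k][j]
                mults += 1
                adds += 1
            C[i][j] = total
    return C, mults, adds


A = [[1, 3, 2], [3, 7, 6], [4, 2, 1]]
B = [[5, 4, 8], [3, 5, 9], [2, 4, 4]]
C, mults, adds = multiply_and_count(A, B)
print(mults, adds, mults + adds)  # 27 18 45
```

For $n = 3$ that is $27$ multiplications plus $18$ additions, matching $T(3) = 3^2(3 + 2) = 45$.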

Now, when some mathy people look at this, they want to drop the specifics and consider only the dominant behavior as $n$ gets large. In other words, they care about the generic speed with which this function grows. For this case, they would say $T(n)$ grows in proportion to $n^3$. More specifically, they say '$T(n)$ is in the set $\mathcal{O}(n^3)$', which means you could choose a constant $c$ such that $T(n)<c\cdot n^3$ for all $n$ beyond a certain point. To simplify, just think of this as a technical way of pointing out the highest exponent term in $T(n)$.
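To make that concrete, here is one choice of constant that works for our $T(n)$ (just one of many valid choices):

$$
T(n) = n^2(n + (n-1)) = 2n^3 - n^2 < 2\cdot n^3 \quad \text{for all } n \geq 1
$$

So $c = 2$ does the job, and the 'certain point' is simply $n = 1$.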

So let's think about that statement: $T(n)$ grows in proportion to $n^3$. It has to, right? The result, $\mathbf{C}$, has $n^2$ elements and we have to calculate each in turn. For each of those, we have some near-multiple of $n$ calculations to do. This *must* amount to something on the order of $n^3$ calculations... right?

### No.

Strassen's algorithm was the first strike to chip away at the $n^3$ shell, showing us that we could get the cost of matrix multiplication to grow *below* $n^3$.

That is absolutely wild. How the hell are we avoiding the work of matrix multiplication that seems baked into its definition?

I have no clue, and no one suspected it was worth an attempt until Volker Strassen (link) came along. His algorithm showed it could be done with a cost near $\mathcal{O}(n^{2.8})$. This sparked a flurry of research, and we've since made significant progress:

![title](MatrixMultGrowth.png)
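For the curious, here is a minimal sketch of the standard textbook formulation of Strassen's idea. It is not tuned for speed. It assumes square matrices whose size is a power of two (so our 3-by-3 example would need padding first), and the function name `strassen` and the `cutoff` parameter are choices made for this illustration. The trick is to split each matrix into four blocks and get away with 7 block multiplications instead of the obvious 8.

```python
import numpy as np


def strassen(A, B, cutoff=1):
    """Multiply two n-by-n arrays (n a power of two) using seven block products."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ B  # small enough: fall back to the ordinary product
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # Seven recursive multiplications instead of eight
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)

    # Recombine the products into the four blocks of the result
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])


rng = np.random.default_rng(0)
X = rng.integers(-5, 5, size=(8, 8))
Y = rng.integers(-5, 5, size=(8, 8))
print(np.array_equal(strassen(X, Y), X @ Y))
```

Each level of recursion does 7 half-size multiplications plus some $n^2$-ish additions, which works out to a cost of about $n^{\log_2 7} \approx n^{2.807}$. That is where the 2.8 comes from.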

In [1]:
import numpy as np

A = np.array([[1,3,2],[3,7,6],[4,2,1]])
B = np.array([[5,4,8],[3,5,9],[2,4,4]])
C = A.dot(B)
A

array([[1, 3, 2],
       [3, 7, 6],
       [4, 2, 1]])

In [2]:
B

array([[5, 4, 8],
       [3, 5, 9],
       [2, 4, 4]])

In [3]:
C

array([[ 18,  27,  43],
       [ 48,  71, 111],
       [ 28,  30,  54]])