The most important thing about reading this blog post is to not get scared off by the formulas. The post may look like all the crap you normally skim over, so you may be tempted to skim over this one. Don't. __None of this is hard.__ Just read the post top to bottom, and I promise you every individual step and the whole thing put together will make sense.

## High school math

In high school your math teacher may have started a treatment of linear algebra by making you solve a system of linear equations, at which point you very sensibly zoned out because you knew you'd go on to program computers and never have to solve a system of linear equations again (don't worry, I won't be talking much about them here).

$$
\begin{eqnarray}
0x_1 + 1x_2 = 0\\
-1x_1 - 0x_2 = 0
\end{eqnarray}
\tag{1}
$$

You also may have learned that for some reason you can take the coefficients and put them in a 2D array like this:
$A=\begin{bmatrix}
0 & 1 \\
-1 & 0 \\
\end{bmatrix}$.
You've now defined a matrix $A$, and you can re-express the system of linear equations above as follows:

$$
\newcommand\qvec[1]{\begin{bmatrix}#1\end{bmatrix}}
A\qvec{x_1\\x_2}=0
\tag{2}
$$

If you're _really_ hellbent on cleaning things up, you can express the vector $\qvec{x_1, x_2}$ as $x=\qvec{x_1 \\ x_2}$, which now gives you a really clean equation:

$$
Ax=0
\tag{3}
$$

Equations 1 - 3 are just different ways to say the exact same thing. In different situations you may prefer one notation over another, but there is no material difference between them. They are all equivalent.
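To make the equivalence concrete, here is a minimal TypeScript sketch that writes equation 1 both ways: once spelled out row by row, and once with the coefficients stored in a 2D array. The names `spelledOut`, `fromMatrix`, and `Row` are my own, chosen for illustration.

```ts
type Row = [number, number];

// Equation 1 spelled out: the left-hand side of each equation.
const spelledOut = (x1: number, x2: number): number[] => [
  0 * x1 + 1 * x2,
  -1 * x1 - 0 * x2,
];

// The same coefficients stored as a 2D array: the matrix A.
const A: Row[] = [
  [0, 1],
  [-1, 0],
];

// The same left-hand sides, this time read off the rows of A.
const fromMatrix = (x1: number, x2: number): number[] =>
  A.map(([a1, a2]) => a1 * x1 + a2 * x2);

console.log(spelledOut(1, 2)); // [2, -1]
console.log(fromMatrix(1, 2)); // [2, -1]
```

Both functions compute the same left-hand sides. The only difference is where the coefficients live.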

## Matrix-vector multiplication

I'll talk about what matrix-vector multiplication _means_ in a moment. For now let's look at how the operation is defined. The precise definition of matrix-vector multiplication flows out of the notation above, so you never again have to look it up on Wikipedia. If you need to multiply a matrix by a vector, say $\begin{bmatrix}
0 & 1 \\
-1 & 0 \\
\end{bmatrix}
\qvec{1 \\ 2}$, just recall that this is equivalent to the left side of the system of equations above. So you take the $\qvec{1, 2}$ vector, jam it into every row, and add up the columns:
$$
\qvec{
0*1 + 1*2\\
-1*1 - 0*2
}=\qvec{2 \\ -1}
$$

"Jam it into every row and add up the columns" is a slightly less formal way of saying "dot product"-- the sum of the pairwise multiplication of elements of two vectors. We obtain the result $\qvec{2, -1}$ by computing the dot product of $\qvec{1, 2}$ with every row of $A$.

(This in itself is curious. The dot product of two vectors represents the degree to which they point in the same direction. What does that have to do with linear equations or rows of a matrix in matrix-vector multiplication? I will not be answering this question here, but hope to get to it in a future post.)

If you forget how this works, just think of converting the matrix-vector multiplication notation back into the linear equation system notation again.
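The "dot product with every row" recipe is short enough to write out in code. This is a minimal sketch rather than a library-grade implementation: the names `dot` and `matVec` are my own, and it assumes the vector has as many elements as the matrix has columns.

```ts
type Vector = number[];
type Matrix = number[][]; // an array of rows

// Dot product: the sum of the pairwise products of two vectors' elements.
function dot(a: Vector, b: Vector): number {
  return a.reduce((sum, ai, i) => sum + ai * b[i], 0);
}

// Matrix-vector multiplication: dot the vector with every row of the matrix.
function matVec(m: Matrix, x: Vector): Vector {
  return m.map((row) => dot(row, x));
}

const A: Matrix = [
  [0, 1],
  [-1, 0],
];

console.log(matVec(A, [1, 2])); // [2, -1]
```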

## Matrices as functions

Now let's look at what matrix-vector multiplication means. This blew my mind when I first learned about it. You can think of a matrix as a function, and you can think of multiplying a matrix by a vector as applying that function to the vector. So when you see $Ax$, autocomplete it in your head to "calling some function $A$ with argument $x$".

This is actually not so strange: you can think of many structures as functions. For example, you can think of the number $3$ as a function. When you multiply it by things, it makes them three times bigger. Thinking about matrices this way happens to be very convenient.
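One way to make this literal in code is to wrap a matrix in a closure, so that "multiplying by $A$" really is a function call. The helper name `asFunction` is hypothetical, and the sketch assumes the argument has as many elements as the matrix has columns.

```ts
type Vector = number[];
type Matrix = number[][];

// Turn a matrix into a function that multiplies that matrix by its argument.
const asFunction = (m: Matrix) => (x: Vector): Vector =>
  m.map((row) => row.reduce((sum, a, i) => sum + a * x[i], 0));

// The matrix A from equation 1, now callable like any other function.
const A = asFunction([
  [0, 1],
  [-1, 0],
]);

// The number 3 viewed the same way: a function that makes things three times bigger.
const triple = (n: number): number => 3 * n;

const y = A([1, 2]); // [2, -1]
const fifteen = triple(5); // 15
```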

The fact that $Ax=0$ denotes both the linear system in equation 1, and a call to a function $A$ with argument $x$ (getting the zero vector in return) leads to a curious insight about the relationship between high school math and programming.

In high school you're given equations and asked to find their roots. We already established that a system of equations is equivalent to matrix-vector multiplication, which can be thought of as function application. And so, in high school you're _given_ a function $A$ along with its output, and asked to _find_ the inputs that match that output.

Programming is usually the exact opposite. In programming what you're _given_ is the shape of inputs and outputs, and your job is to _construct_ a function $A$ that converts one to the other. The computer then executes the functions you construct, often at scale.
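To see the high school direction in code, here is a deliberately naive sketch: given the function $A$ and a desired output, search for the inputs that produce it. The brute-force search over a small integer grid and the name `findInputs` are illustrative assumptions of mine, not how anyone solves linear systems in practice.

```ts
type Vec2 = [number, number];

// The function we are given: multiplication by the matrix A from equation 1.
const A = ([x1, x2]: Vec2): Vec2 => [0 * x1 + 1 * x2, -1 * x1 + 0 * x2];

// High school direction: given a function and an output, find matching inputs.
// Tries every integer pair in [-2, 2] x [-2, 2], for illustration only.
function findInputs(f: (x: Vec2) => Vec2, output: Vec2): Vec2[] {
  const found: Vec2[] = [];
  for (let x1 = -2; x1 <= 2; x1++) {
    for (let x2 = -2; x2 <= 2; x2++) {
      const [y1, y2] = f([x1, x2]);
      if (y1 === output[0] && y2 === output[1]) found.push([x1, x2]);
    }
  }
  return found;
}

// On this grid, only [0, 0] is mapped to the zero vector.
const roots = findInputs(A, [0, 0]);
```

The programming direction is the `A` line itself: somebody had to construct that function from a description of its inputs and outputs.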

## What do matrices do?

## Matrix-matrix multiplication

We're now ready to cover matrix-matrix multiplication, what it means, and how it works. Suppose we have two matrices, $M$ and $N$, and a vector $x$. What does $MNx$ mean? It helps to remember that matrix multiplication is associative:
$$
(MN)x=M(Nx)
\tag{5}
$$
On the right-hand side, $Nx$ returns a vector, which we then multiply by $M$. Thinking of matrices as functions, if we have two functions, $m$ and $n$, that simply multiply the corresponding matrix by their argument, then $M(Nx)$ is nothing more than $m(n(x))$. In other words, the product $MN$ is the composition of the two functions: apply $N$ first, then $M$.
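Equation 5 can be checked numerically. The sketch below uses the standard row-by-column definition of the matrix product, which this post does not spell out. The values of `N` and `x` and the helper names `matVec` and `matMul` are made up for illustration.

```ts
type Vector = number[];
type Matrix = number[][];

// Matrix-vector product: dot the vector with every row.
const matVec = (m: Matrix, x: Vector): Vector =>
  m.map((row) => row.reduce((sum, a, i) => sum + a * x[i], 0));

// Matrix-matrix product: entry (i, j) is row i of m dotted with column j of n.
const matMul = (m: Matrix, n: Matrix): Matrix =>
  m.map((row) =>
    n[0].map((_, j) => row.reduce((sum, a, k) => sum + a * n[k][j], 0)),
  );

const M: Matrix = [
  [0, 1],
  [-1, 0],
];
const N: Matrix = [
  [2, 0],
  [0, 3],
];
const x: Vector = [1, 2];

const left = matVec(matMul(M, N), x); // (MN)x = [6, -2]
const right = matVec(M, matVec(N, x)); // M(Nx) = [6, -2]
```

Multiplying the matrices first and then applying the result gives the same vector as applying `N` and then `M`.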

## Type systems

Let's look at another example of matrix-vector multiplication:

$$\begin{bmatrix}
0 & 1 \\
-1 & 0 \\
0 & 0
\end{bmatrix}
\qvec{1 \\ 2}
=\qvec{2 \\ -1 \\ 0}\tag{4}$$

Let's call the matrix on the left $M$. In the equation above we get our result by performing the dot product of $\qvec{1, 2}$ with every row of $M$. In other words, we treat each row of $M$ as a vector, perform a pairwise multiplication of its elements with $\qvec{1, 2}$, and sum them.

Since you can't perform a dot product of two vectors with different dimensions, for matrix-vector multiplication to work the number of elements in $\qvec{1, 2}$ must be equal to the number of columns in $M$. Switching to thinking of $M$ as a function, _we've now learned something about the type of its input_. $M$'s arguments _must_ be vectors of two dimensions.

A similar argument applies to $M$'s rows. Because we perform a dot product of $\qvec{1, 2}$ with every row of $M$ and $M$ has three rows, the output vector must necessarily have three elements. And so, the number of rows in $M$ tells us about the type of its output.

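Here is a simple way of expressing this in TypeScript:

```ts
// Input type: one element per column of M.
type C = [number, number];
// Output type: one element per row of M.
type R = [number, number, number];

// M as a function: dot the input with each row of the matrix in equation 4.
const M = (v: C): R => [
  0 * v[0] + 1 * v[1],
  -1 * v[0] + 0 * v[1],
  0 * v[0] + 0 * v[1],
];
```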
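When the sizes aren't known at compile time, the same constraint can be enforced at run time instead. This is a sketch under that assumption: the function name `applyMatrix` and the error message are my own.

```ts
type Matrix = number[][];

// Multiply matrix m by vector x, rejecting any x whose length
// does not match the number of columns in m.
function applyMatrix(m: Matrix, x: number[]): number[] {
  for (const row of m) {
    if (row.length !== x.length) {
      throw new Error(`expected ${row.length} elements, got ${x.length}`);
    }
  }
  return m.map((row) => row.reduce((sum, a, i) => sum + a * x[i], 0));
}

// The matrix from equation 4: two columns in, three rows out.
const M4: Matrix = [
  [0, 1],
  [-1, 0],
  [0, 0],
];

const y = applyMatrix(M4, [1, 2]); // [2, -1, 0]
```

Calling `applyMatrix(M4, [1, 2, 3])` throws, because a three-element vector can't be dotted with two-element rows.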

## Conclusion

My goal with this series is to give you an intuitive understanding of common linear algebra concepts, so that when you encounter them, you can have a good sense of what they mean and not be completely lost. This post should give you an intuition for matrices, matrix multiplication, what it means, and how it works.