# Dot Products and Matrix Multiplications
> CogWorks 2019 (Petar Griggs)

### What is a Vector?

We can represent a vector numerically as either a column or row of numbers:
\begin{equation}
\vec{u}=\begin{bmatrix}3 \\ 4\end{bmatrix}\;\;\text{or}\;\;\vec{u}=\begin{bmatrix}3 & 4\end{bmatrix}
\end{equation}


Visually, we can represent a vector as an arrow in space, connecting the origin to the point described by the vector:

\begin{equation}
\vec{u}=\begin{bmatrix}x \\ y\end{bmatrix}\;\;\text{or}\;\;\vec{u}=\begin{bmatrix}x & y\end{bmatrix}
\end{equation}

So, the vector $\vec{u}=\begin{bmatrix}3 \\ 4\end{bmatrix}$ can be represented as:

![vec](pics/vector.png)

The *dimensionality* of a vector is simply the number of elements in that vector. On a 2D plane, we need to specify two values, $x$ and $y$, to localize ourselves; this is what is depicted above. In 3D space, we need three values, $x$, $y$, and $z$, to specify our location. In general, if we are in $N$ dimensions, we need $N$ values to specify a point in that space. This idea generalizes to higher dimensions that we mere mortals can't visualize so easily.

We can use a NumPy array to represent a vector in code. For example,

\begin{equation}
\vec{u}=\begin{bmatrix}3 & 4\end{bmatrix}
\end{equation}

can simply be represented as

```python
np.array([3, 4])  # 1D array of size 2 represents a 2D vector
```

An important note on terminology: The "dimensionality" of the vector is **not** the same as the "dimensionality" of the array.

The arrays:

```python
np.array([1.2])   # 1D array of size 1 represents a 1D vector
np.array([3, 4])  # 1D array of size 2 represents a 2D vector
np.array([-9.1, 4.0, 5.0])  # 1D array of size 3 represents a 3D vector
```

are all 1-dimensional NumPy arrays, as each requires only one integer index to uniquely specify an element in the array, but they represent a 1D, 2D, and 3D vector, respectively.

Thus, any NumPy array that stores a single flat sequence of numbers is considered a 1-dimensional array, in NumPy-lingo. The *size* of that array (i.e. the number of elements it contains) corresponds to the dimensionality of the vector.
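
The distinction can be checked directly on an array. The short sketch below uses the standard `ndim`, `size`, and `shape` attributes; the variable name `v` is just an illustrative choice.

```python
import numpy as np

v = np.array([-9.1, 4.0, 5.0])

print(v.ndim)   # 1: one index is needed to pick out an element
print(v.size)   # 3: three elements, so this represents a 3D vector
print(v.shape)  # (3,)
```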

As we will see below, a 2-dimensional NumPy array can be thought of as a matrix, or as an object that stores multiple vectors. E.g. the matrix

\begin{equation}
X =\begin{bmatrix}3 & 4 & 8 \\ 2 & 4 & 0\end{bmatrix}
\end{equation}

can be thought of as storing two 3-dimensional vectors, $\begin{bmatrix}3 & 4 & 8\end{bmatrix}$ and $\begin{bmatrix}2 & 4 & 0\end{bmatrix}$. 

We can represent this as a shape-(2, 3) NumPy array:

```python
x = np.array([[3, 4, 8],
              [2, 4, 0]])
```
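
As a quick check of the "matrix as a stack of vectors" picture, the sketch below rebuilds the same array and pulls out each row as its own 1D array.

```python
import numpy as np

x = np.array([[3, 4, 8],
              [2, 4, 0]])

print(x.ndim)   # 2: a row index and a column index are needed
print(x.shape)  # (2, 3): two vectors, each with three elements
print(x[0])     # [3 4 8]
print(x[1])     # [2 4 0]
```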

### The Dot Product

We can define the dot product operation to compute the similarity between two vectors. To start, we can compute the dot product as:
\begin{equation}
\vec{x}\cdot\vec{y}=\sum_{i=0}^{n-1} x_i y_i
\end{equation}

All this notation means is that we find the element-wise product of the two vectors (by multiplying the corresponding elements in each vector together), then sum the resultant vector. As an example, take vectors $\vec{x}=\begin{bmatrix}2 & 9\end{bmatrix}$ and $\vec{y}=\begin{bmatrix}7 & 4\end{bmatrix}$. Then we can compute the dot product as:
\begin{equation}
\vec{x}\cdot\vec{y}=(2 \times 7) + (9 \times 4) = 14 + 36 = 50
\end{equation}

When we compute the dot product, we need to ensure that our vectors are of equal dimensionality. After all, if one vector has more elements than the other, how are we supposed to compute an element-wise product? Note also that the result of a dot product is always a single number (a scalar).
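
To make the definition concrete, here is a small sketch that carries out the two steps by hand for the example vectors above: an element-wise product followed by a sum. The names `elementwise` and `dot` are illustrative only.

```python
import numpy as np

x = np.array([2, 9])
y = np.array([7, 4])

elementwise = x * y      # multiply corresponding elements: [14, 36]
dot = elementwise.sum()  # add them up: 14 + 36

print(elementwise)  # [14 36]
print(dot)          # 50
```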

NumPy has a handy [`matmul` function](https://docs.scipy.org/doc/numpy/reference/generated/numpy.matmul.html) that will compute the dot product between two vectors for us! 

```python
>>> import numpy as np

>>> x = np.array([2, 9])
>>> y = np.array([7, 4])
>>> np.matmul(x, y)
50
```

NumPy arrays support Python's `@` operator, which invokes `np.matmul`; this is quite convenient:

```python
>>> x @ y  # the same as `np.matmul(x, y)`
50
```


I mentioned before that the dot product will help us measure how much two vectors overlap with each other, or how similar they are. Geometrically speaking, the dot product is related to the angle, $\theta$, between vectors $\vec{x}$ and $\vec{y}$:
\begin{equation}
\cos(\theta)=\frac{\vec{x}\cdot\vec{y}}{||\vec{x}||||\vec{y}||}
\end{equation}

where $||\vec{x}||=\sqrt{x_0^2 + x_1^2 + \cdots}$ is the magnitude of a vector. If our two vectors are already normalized, such that $||\vec{x}||=||\vec{y}||=1$, then the dot product itself is $\cos(\theta)$.

What good is this formula though? Well, it tells us that the dot product is proportional to the cosine of the angle between the two vectors. In that way, the dot product can tell us how much the two vectors overlap!
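
The formula translates almost directly into NumPy. The sketch below is one way to write it; `cos_angle` is a hypothetical helper name (not a NumPy function), and it assumes neither vector has zero magnitude.

```python
import numpy as np

def cos_angle(x, y):
    """Cosine of the angle between two 1D arrays (hypothetical helper).

    Assumes both vectors have non-zero magnitude.
    """
    return (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

x = np.array([2, 9])
y = np.array([7, 4])

c = cos_angle(x, y)
theta_deg = np.degrees(np.arccos(c))  # recover the angle itself, in degrees

print(c, theta_deg)
```

Dividing by the magnitudes is what turns the raw dot product into a pure measure of direction, independent of how long the vectors are.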

Let's take a look at a few extreme cases to build an intuition. First, when two vectors are *parallel* and thus overlap completely, the angle between them will be $0$. As a simple example, consider the vectors illustrated below:

![vec](pics/vec1.png)

![vec](pics/vec2.png)

---

When two vectors are parallel, their dot product is simply the product of their magnitudes. In the case that both vectors are normalized, this value will be $1$. $\theta=0$ also happens to be where cosine achieves its maximum. So, when two vectors are parallel, their dot product will be maximized.

Below, create two parallel vectors in NumPy and use `matmul` to find the dot product of the two vectors. Then, divide the dot product by the magnitudes of your vectors, using `np.linalg.norm` to find the magnitudes, to confirm that indeed $\cos(\theta)=1$.

```python
x = # vector
y = # vector

dot_prod = # use np.matmul!

cos_theta = dot_prod / # divide by the magnitude of your vectors

print(cos_theta)
```
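
For reference, one possible way to complete this exercise is sketched here. The specific vectors are arbitrary choices; any vector and a positive multiple of it would serve equally well.

```python
import numpy as np

x = np.array([1.0, 2.0])
y = 3.0 * x  # a positive multiple of x points in the same direction

dot_prod = np.matmul(x, y)

# divide by the product of the two magnitudes
cos_theta = dot_prod / (np.linalg.norm(x) * np.linalg.norm(y))

print(cos_theta)  # approximately 1, up to floating-point rounding
```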

When two vectors are *antiparallel*, they face in opposite directions:

![vec](pics/vec3.png)

![vec](pics/vec4.png)

---

In this case, the angle between the two vectors is $180^\circ$. When $\theta=180^\circ$, cosine is at its minimum value of $-1$. Thus, when two vectors are antiparallel, their dot product will be minimized (and when two normalized vectors are antiparallel, their dot product will be $-1$).

Just as before, create two antiparallel vectors in NumPy and use `matmul` to find the dot product of the two vectors. Divide the dot product by the magnitudes of your vectors to confirm that $\cos(\theta)=-1$.

```python
x = # vector
y = # vector

dot_prod = # use np.matmul!

cos_theta = dot_prod / # divide by the magnitude of your vectors

print(cos_theta)
```
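
A possible solution for the antiparallel case is sketched here, again with arbitrarily chosen vectors: multiplying a vector by a negative number flips its direction.

```python
import numpy as np

x = np.array([1.0, 2.0])
y = -2.0 * x  # a negative multiple of x points in the opposite direction

dot_prod = np.matmul(x, y)
cos_theta = dot_prod / (np.linalg.norm(x) * np.linalg.norm(y))

print(cos_theta)  # approximately -1, up to floating-point rounding
```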

Lastly, consider two perpendicular vectors:

![vec](pics/vec5.png)

![vec](pics/vec6.png)

---

When two vectors are perpendicular to one another, they do not overlap at all. This also means that $\theta=90^\circ$, and consequently $\cos(\theta)=0$. Thus, the dot product of two perpendicular vectors will always be $0$.

Again, create two perpendicular vectors and use `matmul` to find the dot product. The result should confirm that for perpendicular vectors, $\cos(\theta)=0$.

```python
x = # vector
y = # vector

dot_prod = # use np.matmul!

print(dot_prod)
```
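
One way this could look, as a sketch: in 2D, swapping the two components of a vector and negating one of them gives a perpendicular vector. The vectors below are an arbitrary example of that trick.

```python
import numpy as np

x = np.array([3, 4])
y = np.array([-4, 3])  # x rotated by 90 degrees

dot_prod = np.matmul(x, y)  # (3 * -4) + (4 * 3)

print(dot_prod)  # 0
```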

### What about Matrices?

Matrices have a number of interpretations, but we can simply consider a matrix to be a collection of vectors of the same dimensionality. If we have $M$ vectors, each of dimension $N$, we could then pack the vectors into a matrix. 

We can have each column of our matrix be a separate vector:
\begin{align*}
&\quad\;\,\begin{array}{ccc}\leftarrow & M & \rightarrow\end{array} \\
V = \begin{array}{c}\uparrow \\ N \\ \downarrow\end{array}\;\;&\begin{bmatrix}\uparrow & \uparrow & \cdots & \uparrow \\ \vec{v}_1 & \vec{v}_2 & \cdots & \vec{v}_M \\ \downarrow & \downarrow & \cdots & \downarrow\end{bmatrix}
\end{align*}

Here $V$ is a $(N,M)$ matrix. We could alternatively construct a matrix where each row is a distinct vector:
\begin{align*}
&\;\;\begin{array}{ccc}\leftarrow & N & \rightarrow\end{array} \\
W = \begin{array}{c}\uparrow \\ M \\ \downarrow\end{array}\;\;&\begin{bmatrix}\leftarrow & \vec{w}_1 & \rightarrow \\ \leftarrow & \vec{w}_2 & \rightarrow \\ \vdots & \vdots & \vdots \\ \leftarrow & \vec{w}_M & \rightarrow\end{bmatrix}
\end{align*}

Here $W$ is a $(M,N)$ matrix. In either case, our matrix is simply a collection of vectors. In Python, we can represent a matrix as a 2D NumPy array (in fact, there is really no difference between a 2-dimensional NumPy array and a matrix).
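
Both packings can be built from the same 1D arrays with `np.stack`. In the sketch below, the vectors `v1` and `v2` are illustrative, with $M=2$ vectors of dimension $N=3$; the names `W` and `V` follow the row-wise and column-wise layouts described above.

```python
import numpy as np

v1 = np.array([3, 4, 8])
v2 = np.array([2, 4, 0])

W = np.stack([v1, v2], axis=0)  # each row is a vector: shape (M, N) = (2, 3)
V = np.stack([v1, v2], axis=1)  # each column is a vector: shape (N, M) = (3, 2)

print(W.shape)  # (2, 3)
print(V.shape)  # (3, 2)
print(np.array_equal(V, W.T))  # True: the two layouts are transposes of each other
```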

With matrices, we can define the matrix multiplication operation. Consider the two matrices $V$ and $W$, which have shapes $(M,N)$ and $(N,L)$, respectively. We will think of $V$ as having $M$ vectors of dimension $N$ (with each vector as a row) and $W$ as having $L$ vectors of dimension $N$ (with each vector as a column). Performing matrix multiplication will yield a $(M,L)$ matrix, where element $i,j$ is equal to:
\begin{equation}
(VW)_{ij}=\sum_{k=0}^{N-1}V_{ik}W_{kj}
\end{equation}

This formula bears a striking resemblance to the dot product we defined earlier. In fact, by performing a matrix multiplication, we are actually performing repeated dot products. The dot product between the $i^\text{th}$ row of $V$ and $j^\text{th}$ column of $W$ is computed, then filled into element $i,j$ of our output matrix:

\begin{align*}
&\begin{bmatrix}\quad\!\uparrow & \quad\;\;\;\;\!\uparrow & \quad\cdots & \!\uparrow \\ \quad\!\vec{w}_1 & \quad\;\;\;\vec{w}_2 & \quad\cdots & \;\;\;\;\vec{w}_L\;\;\;\;\:\!\!\! \\ \quad\!\downarrow & \quad\;\;\;\;\!\downarrow & \quad\cdots & \!\downarrow\end{bmatrix} \\
VW=\begin{bmatrix}\leftarrow & \vec{v}_1 & \rightarrow \\ \leftarrow & \vec{v}_2 & \rightarrow \\ \vdots & \vdots & \vdots \\ \leftarrow & \vec{v}_M & \rightarrow\end{bmatrix}&\begin{bmatrix}\vec{v}_1\cdot\vec{w}_1 & \vec{v}_1\cdot\vec{w}_2 & \cdots & \vec{v}_1\cdot\vec{w}_L \\ \vec{v}_2\cdot\vec{w}_1 & \vec{v}_2\cdot\vec{w}_2 & \cdots & \vec{v}_2\cdot\vec{w}_L \\ \vdots & \vdots & \ddots & \vdots \\ \vec{v}_M\cdot\vec{w}_1 & \vec{v}_M\cdot\vec{w}_2 & \cdots & \vec{v}_M\cdot\vec{w}_L\end{bmatrix}
\end{align*}
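
The picture above can be spelled out as two nested loops. The sketch below is written for clarity rather than speed, and `matmul_by_dot_products` is a hypothetical helper name; in practice `np.matmul` (or `@`) does the same job far more efficiently.

```python
import numpy as np

def matmul_by_dot_products(V, W):
    """Multiply a shape-(M, N) array by a shape-(N, L) array, one dot product
    at a time (hypothetical helper, for illustration only)."""
    M, N = V.shape
    N_w, L = W.shape
    if N != N_w:
        raise ValueError("inner dimensions must match")

    out = np.zeros((M, L))
    for i in range(M):
        for j in range(L):
            # element (i, j) is the dot product of row i of V and column j of W
            out[i, j] = V[i, :] @ W[:, j]
    return out

V = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])  # shape (2, 3)
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])       # shape (3, 2)

result = matmul_by_dot_products(V, W)
print(result)
print(np.allclose(result, V @ W))  # True
```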

It is important to note that the shapes of the matrices do matter. We can matrix multiply our $(M,N)$ and $(N,L)$ matrices ***only*** because the inner dimensions are both $N$. That is, the number of columns in matrix-1 and the number of rows in matrix-2 are the same. We cannot perform a matrix multiplication if this condition is not met.
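
NumPy enforces this condition for us. The sketch below, using arbitrary arrays of ones, shows a compatible pair of shapes and then a mismatched pair, for which `@` raises a `ValueError`.

```python
import numpy as np

A = np.ones((2, 3))
B = np.ones((3, 4))
C = np.ones((2, 4))

print((A @ B).shape)  # (2, 4): inner dimensions are both 3

try:
    A @ C  # inner dimensions are 3 and 2, which do not match
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("cannot multiply:", err)
```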

Because matrix multiplication is simply repeated dot products, each value we compute is related to the angle between a pair of vectors. Thus, the output matrix when multiplying matrices $V$ and $W$ of shapes $(M,N)$ and $(N,L)$ is a matrix of $M\times L$ dot products, where element $i,j$ tells us the similarity between row $i$ of matrix $V$ and column $j$ of matrix $W$.
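
If the rows of $V$ and the columns of $W$ are normalized first, every entry of the product becomes exactly $\cos(\theta)$ for the corresponding pair of vectors. The sketch below illustrates this with small, arbitrarily chosen matrices; the names `V_unit`, `W_unit`, and `cosines` are illustrative.

```python
import numpy as np

V = np.array([[1.0, 0.0],
              [1.0, 1.0]])        # rows are vectors: (M, N) = (2, 2)
W = np.array([[2.0, 0.0, -3.0],
              [0.0, 5.0, 0.0]])   # columns are vectors: (N, L) = (2, 3)

# scale every row of V and every column of W to unit magnitude
V_unit = V / np.linalg.norm(V, axis=1, keepdims=True)
W_unit = W / np.linalg.norm(W, axis=0, keepdims=True)

cosines = V_unit @ W_unit  # shape (M, L); entry (i, j) is cos(theta) for row i, column j
print(cosines)
```

Here the first row of $V$ is parallel to the first column of $W$, perpendicular to the second, and antiparallel to the third, so the first row of `cosines` should come out close to $1$, $0$, and $-1$.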

The [`matmul` function](https://docs.scipy.org/doc/numpy/reference/generated/numpy.matmul.html) that NumPy provides also performs matrix multiplication on two arrays. Now, create and matrix multiply several arrays to get a feel for the function. Compare the result of the matrix multiplication with the dot product between a row and column in the two matrices to confirm that matrix multiplication is just a repeated dot product.