# Dot Products and Duality

A full understanding of the role dot products play in math can only really be found in the light of linear transformations.

First, though, here is the standard way dot products are introduced.

Numerically, if you have two vectors of the same dimension (two lists of numbers of the same length), taking their dot product means pairing up all of the coordinates, multiplying those pairs together, and adding the results.

$$\left[\begin{array}{c}
    2 \\
    7\\
    1\\
\end{array}\right]
\cdot
\left[\begin{array}{c}
    8 \\
    2\\
    8\\
\end{array}\right] = 2 \cdot 8 + 7 \cdot 2 + 1 \cdot 8$$
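The pairing-and-adding recipe can be sketched in a few lines of Python. The function name `dot` is only an illustrative choice, not something the notes define:

```python
def dot(v, w):
    """Pair up coordinates, multiply each pair, and add the results."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(v, w))


result = dot([2, 7, 1], [8, 2, 8])
print(result)  # 2*8 + 7*2 + 1*8 = 38
```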

So the vector $[1, 2]$ dotted with $[3,4]$ would be $1\cdot3 + 2\cdot4 = 11$. Luckily, this computation has a really nice geometric interpretation. To think about the dot product between two vectors $\vec{v}$ and $\vec{w}$, imagine projecting $\vec{w}$ onto the line that passes through the origin and the tip of $\vec{v}$. Multiplying the length of this projection by the length of $\vec{v}$ gives the dot product $\vec{v} \cdot \vec{w}$.

$$\left[\begin{array}{c}
    4 \\
    1\\
\end{array}\right]
\cdot
\left[\begin{array}{c}
    2 \\
    -1\\
\end{array}\right] = (\text{Length of projected} \ \vec{w})(\text{Length of}\ \vec{v})
$$

**Except** that when this projection of $\vec{w}$ points in the opposite direction from $\vec{v}$, the dot product is actually negative.

#### In General
When two vectors are generally pointing in the same direction, their dot product is positive. When they are perpendicular, meaning the projection of one onto the other is the 0 vector, the dot product is 0. And if they're pointing generally in the opposite direction, their dot product is negative $(\vec{v}\cdot\vec{w}\lt 0)$. 
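As a rough numerical check of this picture, the sketch below computes "signed length of the projection times length of $\vec{v}$" from lengths and angles alone, then compares it with the coordinate formula. The helper name `projection_times_length` and the three sample vectors for $\vec{w}$ are assumptions made for illustration.

```python
import math


def dot(v, w):
    return v[0] * w[0] + v[1] * w[1]


def projection_times_length(v, w):
    """(signed length of w projected onto v) * (length of v), using angles only."""
    length_v = math.hypot(v[0], v[1])
    length_w = math.hypot(w[0], w[1])
    # Angle from v to w; its cosine is negative when w points away from v.
    angle_between = math.atan2(w[1], w[0]) - math.atan2(v[1], v[0])
    signed_projection = length_w * math.cos(angle_between)
    return signed_projection * length_v


v = [4, 1]
# One vector roughly along v, one perpendicular to v, one roughly opposite.
for w in ([2, -1], [-1, 4], [-2, 1]):
    numeric = dot(v, w)
    geometric = projection_times_length(v, w)
    print(w, numeric, geometric)
```

Up to floating-point rounding, the two columns agree: positive for the first vector, zero for the perpendicular one, and negative for the last.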

This interpretation is weirdly asymmetric: it treats the two vectors very differently. Yet order doesn't matter. You could instead project $\vec{v}$ onto $\vec{w}$, multiply the length of the projected $\vec{v}$ by the length of $\vec{w}$, and get the same result.

The intuition for why order doesn't matter: if $\vec{v}$ and $\vec{w}$ happened to have the same length, we could leverage some symmetry. Projecting $\vec{w}$ onto $\vec{v}$ and then multiplying the length of that projection by the length of $\vec{v}$ is a complete mirror image of projecting $\vec{v}$ onto $\vec{w}$ and then multiplying the length of that projection by the length of $\vec{w}$.

Now let's say that we scale $\vec{v}$ by 2, so that it becomes $2\vec{v}$. In this case the symmetry is broken. Consider how to interpret the dot product between this new vector $2\vec{v}$ and $\vec{w}$. If you think of $\vec{w}$ getting projected onto $\vec{v}$, then the dot product $(2\vec{v})\cdot\vec{w}$ will be exactly twice the dot product $\vec{v}\cdot\vec{w}$, that is, $2(\vec{v}\cdot\vec{w})$. This is because scaling $\vec{v}$ by $2$ doesn't change the length of the projection of $\vec{w}$, but it doubles the length of the vector that you're projecting onto. On the other hand, let's say you're thinking about $\vec{v}$ getting projected onto $\vec{w}$. In that case, the length of the projection is the thing that gets scaled when we multiply $\vec{v}$ by 2, while the length of the vector that you're projecting onto stays constant.

So the overall effect is still to just double the dot product. Even though symmetry is broken in this case, the effect that this scaling has on the value of the dot product is the same under both interpretations.
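A small numerical sanity check of the scaling argument, using the sample vectors $[4, 1]$ and $[2, -1]$ from the earlier example; the helper names `dot` and `scale` are illustrative assumptions:

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def scale(c, v):
    return [c * a for a in v]


v = [4, 1]
w = [2, -1]

base = dot(v, w)               # 4*2 + 1*(-1) = 7
doubled = dot(scale(2, v), w)  # 8*2 + 2*(-1) = 14
swapped = dot(w, scale(2, v))  # same pairs in the other order, also 14

print(base, doubled, swapped)
```

Doubling $\vec{v}$ doubles the result whichever vector is listed first, which is what the two projection pictures predict.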

#### What does dot product have to do with projection?

Behold: **duality**. But before we talk about it, we need to discuss linear transformations again, this time from multiple dimensions to one dimension, i.e. the number line. Linear transformations of this kind are much more restricted than your run-of-the-mill function with a 2D input and a 1D output.

As with transformations in higher dimensions, there are formal properties that make these functions linear, such as the two below. We will instead focus on a certain visual property that's equivalent to all the formal stuff.

$$L\left(\vec{v} + \vec{w}\right) = L\left(\vec{v}\right) + L\left(\vec{w}\right)$$ 
$$L\left(c\vec{v}\right) = cL\left(\vec{v}\right)$$

If you take a line of evenly spaced dots and apply a transformation, a linear transformation will keep those dots evenly spaced once they land in the output space, which is the number line. If some line of dots ends up unevenly spaced, the transformation is not linear.

As we have seen before, one of these linear transformations is completely determined by where it takes $\hat{i}$ and $\hat{j}$. But in this case, each of those basis vectors just lands on a number. So when we record where they land as the columns of a matrix, each of those columns has just a single number: the result is a 1x2 matrix.

Let's say you have a linear transformation that takes $\hat{i}$ to $1$ and $\hat{j}$ to $-2$. To follow where a vector with coordinates, say, $[4, 3]$ ends up, think of breaking this vector up as $4\hat{i} + 3\hat{j}$. A consequence of linearity is that after the transformation, the vector lands on 4 times the place where $\hat{i}$ lands, $1$, plus 3 times the place where $\hat{j}$ lands, $-2$. In this case that means it lands on $-2$. When you do this calculation purely numerically, it's a matrix-vector multiplication.

$$\vec{v} = \left[\begin{array}{c} 4 \\ 3 \\ \end{array}\right] \quad\Rightarrow\quad L\left(\vec{v}\right) = 4(1) + 3(-2) = -2$$

$$\left[\begin{array}{cc} 1 & -2 \\ \end{array}\right] \left[\begin{array}{c}4\\3 \end{array}\right] = 1 \cdot 4 + (-2) \cdot 3 = -2$$
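The same calculation can be sketched in Python. The factory function `make_transformation` is a hypothetical helper that builds the 2D-to-1D map from the two numbers where $\hat{i}$ and $\hat{j}$ land. The spot checks at the end test the two formal linearity properties on a few sample values, which is evidence rather than proof.

```python
def make_transformation(i_lands, j_lands):
    """Return a 2D -> 1D linear map, given where i-hat and j-hat land."""
    def L(v):
        x, y = v
        # v = x*i-hat + y*j-hat, so by linearity L(v) = x*L(i-hat) + y*L(j-hat)
        return x * i_lands + y * j_lands
    return L


L = make_transformation(1, -2)  # corresponds to the 1x2 matrix [1  -2]
print(L([4, 3]))                # 4*1 + 3*(-2) = -2

# Spot-check additivity and scaling on sample inputs.
v, w, c = [4, 3], [-1, 5], 2.5
additive = L([v[0] + w[0], v[1] + w[1]]) == L(v) + L(w)
scales = L([c * v[0], c * v[1]]) == c * L(v)
print(additive, scales)
```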

Now, this numerical operation of multiplying a 1x2 matrix by a vector feels just like taking the dot product of two vectors. There is a nice association between 1x2 matrices and 2D vectors, defined by tilting the numerical representation of a vector on its side to get the associated matrix, or by tipping the matrix back up to get the associated vector.

$$\left[\begin{array}{c} 1\\2\\ \end{array}\right] \longleftrightarrow \left[\begin{array}{cc} 1 & 2 \end{array}\right] $$

**What does this association mean geometrically?**
It suggests that there's some kind of connection between linear transformations that take vectors to numbers and vectors themselves.

For example, imagine that we don't already know that the dot product relates to projection. Imagine we place a copy of the number line diagonally in space, with the number 0 sitting at the origin. Now think of the two-dimensional unit vector whose tip sits where the number 1 on the number line is, and call it $\hat{u}$.

If we project 2D vectors straight onto this diagonal number line, in effect we've just defined a function that takes 2D vectors to numbers. What's more, this function is actually linear, as it passes the visual test that any line of evenly spaced dots remains evenly spaced once it lands on the number line. To be clear, even though we have embedded the number line in 2D space like this, the outputs of the function are numbers, not 2D vectors. It is a function that takes in two coordinates and outputs a single coordinate. The vector $\hat{u}$, however, is a two-dimensional vector living in the input space. It's just situated in such a way that it overlaps with the embedding of the number line.

With this projection, we just defined a linear transformation from 2D vectors to numbers, so we're going to be able to find some kind of 1x2 matrix that describes that transformation. To find it, let's zoom in on this diagonal number line setup and think about where $\hat{i}$ and $\hat{j}$ each land, since those landing spots are going to be the columns of the matrix.

![uhat](uhat_zoom2.png)

Since $\hat{i}$ and $\hat{u}$ are both unit vectors, projecting $\hat{i}$ onto the line passing through $\hat{u}$ looks totally symmetric to projecting $\hat{u}$ onto the x-axis. So when we ask what number $\hat{i}$ lands on when it gets projected, the answer is going to be the same as whatever $\hat{u}$ lands on when it's projected onto the x-axis. But projecting $\hat{u}$ onto the x-axis just means taking the x-coordinate of $\hat{u}$. So by symmetry, the number where $\hat{i}$ lands when it's projected onto that diagonal number line is going to be the x-coordinate of $\hat{u}$.

![uhat](uhat-3.png)

The reasoning is almost identical for the $\hat{j}$ case. For all the same reasons, the y-coordinate of $\hat{u}$ gives us the number where $\hat{j}$ lands when it's projected onto the number line copy.

So the entries of the 1x2 matrix describing the projection transformation are going to be the coordinates of $\hat{u}$.

$$\left[\begin{array}{cc}u_x & u_y \\ \end{array}\right]$$

And computing this projection transformation for arbitrary vectors in space, which requires multiplying that matrix by those vectors, is computationally identical to taking a dot product with $\hat{u}$. This is why taking the dot product with a unit vector can be interpreted as projecting a vector onto the span of that unit vector and taking the length.
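The argument can be checked numerically with a small sketch. Here the number line is assumed to be tilted 30 degrees from the x-axis (an arbitrary choice), and `project_onto_line` is a hypothetical helper that finds the landing spot using only lengths and angles, never a dot product.

```python
import math

theta = math.radians(30)  # assumed tilt of the diagonal number line
u_hat = (math.cos(theta), math.sin(theta))  # unit vector pointing at "1" on the line


def project_onto_line(v):
    """Number where v lands on the tilted number line, computed geometrically."""
    length = math.hypot(v[0], v[1])
    angle = math.atan2(v[1], v[0])
    return length * math.cos(angle - theta)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


# Where i-hat and j-hat land: these are the entries of the 1x2 matrix.
matrix = (project_onto_line((1, 0)), project_onto_line((0, 1)))
print(matrix, u_hat)  # the two pairs should agree up to rounding

# Projecting an arbitrary vector agrees with dotting it with u-hat.
v = (4, 3)
print(project_onto_line(v), dot(u_hat, v))
```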

**So what about non-unit vectors?**

Let's say we take that unit vector $\hat{u}$ and scale it up by a factor of 3. Numerically, each of its components gets multiplied by 3. So the matrix associated with that vector takes $\hat{i}$ and $\hat{j}$ to 3 times the values where they landed before. Since this is all linear, it implies more generally that the new matrix can be interpreted as projecting any vector onto the number line copy and multiplying where it lands by 3. This is why the dot product with a non-unit vector can be interpreted as first projecting onto that vector, then scaling up the length of that projection by the length of the vector.
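A quick numerical illustration, assuming the particular unit vector $\hat{u} = [0.6, 0.8]$ and the sample input $[4, 3]$ (both chosen only for convenience):

```python
import math


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


u_hat = (0.6, 0.8)                        # unit vector: 0.36 + 0.64 = 1
scaled = (3 * u_hat[0], 3 * u_hat[1])     # 3 * u-hat, a vector of length 3

v = (4, 3)
projection = dot(u_hat, v)                # where v lands on the number line
with_scaled = dot(scaled, v)              # dot product with the non-unit vector

# The second value should be 3 times the first, up to floating-point rounding.
print(projection, with_scaled)
print(math.isclose(with_scaled, math.hypot(*scaled) * projection))
```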

**Summary**
We had a linear transformation from 2D space to the number line, which was not defined in terms of numerical vectors or numerical dot products. It was just defined by projecting space onto a diagonal copy of the number line. But because the transformation is linear, it was necessarily described by some 1x2 matrix. And since multiplying a 1x2 matrix by a 2D vector is the same as turning that matrix on its side and taking a dot product, this transformation was related to some 2D vector.

The lesson here is that anytime we have one of these linear transformations whose output space is the number line, no matter how it was defined, there's going to be some unique vector $\vec{v}$ corresponding to that transformation, in the sense that applying the transformation is the same thing as taking a dot product with that vector. This is an example of **duality**.
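One way to see this concretely: given any linear function from 2D vectors to numbers, the corresponding vector can be read off by feeding in $\hat{i}$ and $\hat{j}$. In the sketch below, `mystery` is a made-up linear function written without any reference to dot products, and `dual_vector` is a hypothetical helper name.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def dual_vector(f):
    """Recover the 2D vector whose dot product reproduces the linear map f."""
    return (f((1, 0)), f((0, 1)))


def mystery(v):
    # A linear map described procedurally; it simplifies to 3x - 2y.
    x, y = v
    return (x + y) * 3 - 5 * y


vec = dual_vector(mystery)
print(vec)  # (3, -2)

# Applying the map matches dotting with the recovered vector.
samples = [(4, 3), (-1, 7), (0, 0), (10, -6)]
matches = all(mystery(s) == dot(vec, s) for s in samples)
print(matches)
```

The recovery step only works because the function is linear; for a nonlinear function, the values at $\hat{i}$ and $\hat{j}$ would not determine its behavior elsewhere.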

Duality shows up in many different ways and forms throughout math. Loosely speaking, it refers to situations where you have a natural but surprising correspondence between two types of mathematical thing. We can say that the **dual** of a vector is the linear transformation that it encodes, and the dual of a linear transformation from space to one dimension is a certain vector in that space. The dot product is a very useful geometric tool for understanding projections and for testing whether or not vectors tend to point in the same direction, which is probably the most important thing to remember about the dot product. But at a deeper level, dotting two vectors together is a way to translate one of them into the world of transformations.

