In [None]:
import numpy
%matplotlib inline
from matplotlib import pyplot

In [None]:
import sys
sys.path.append('../scripts/')

# Our helper, with the functions: 
# plot_vector, plot_linear_transformation, plot_linear_transformations
from plot_helper import *

## What's a vector?

Vectors are everywhere: physics, engineering, mathematics, computer science, video games, and more. Each field's interpretation of what a vector *is* may be different, but  vectors live a similar life in every space.

The first episode in the wonderful video series, [_"Essence of Linear Algebra"_](http://3b1b.co/eola) tells you of three different ideas about vectors [1]:

1. For physicists, a vector is an "arrow" of a given length (magnitude) and direction. It can represent directional quantities like velocity, force, acceleration.
2. For computer scientists, a vector is an ordered list of numbers. It can represent a set of variables or features stored in order.
3. For mathematicians, vectors are generic objects that behave a certain way when they are added or scaled:  $\mathbf{u}+\mathbf{v}$, $\alpha\mathbf{v}$.

<img src="../images/whatsavector.png" style="width: 500px;"/> 
#### How you think of a vector depends on who you are...

In physics, vectors are almost always two- or three-dimensional (although in some fancy branches of physics they do go to higher dimensions). Vectors help physicists describe things like motion and electromagnetic fields on a plane or in physical 3D space.

In computer science and in data science, vectors are often multi-dimensional, that is, they have many components. They contain a set of ordered variables in a data model, like for example: the age, weight, daily hours of sleep, weekly hours of exercise, and blood pressure of an individual (five dimensions).

Let's start with the idea of a vector as an "arrow" (magnitude plus direction). We visualize a vector by placing this arrow with its tail at the origin of a coordinate system.
But changing the position of the tail doesn't change the vector's magnitude or direction, so the vector is the same no matter where we draw it. 

In the code cell below, we define a list with a single vector of coordinates $(2, 2)$, and we use our custom function `plot_vector()` to plot the vector with its tail at four different positions on a 2D coordinate system. 

In [None]:
vectors = [(2,2)]
tails = [(-3,-2), (-3,1), (0,0), (1,-3)]
plot_vector(vectors, tails)
pyplot.title("The same vector, with its tail at four locations on the 2D plane.");

In the 2D plane, we can see clearly the connection between the "arrow" idea of vector, and the "list of numbers," which in this case represents the coordinates of the arrow head when the tail is at the origin of the coordinate system.

The first coordinate designates the horizontal distance between head and tail, and the second coordinate designates the vertical distance between head and tail. We will typically denote the horizontal and vertical axes as $x$ and $y$.

In three dimensions, $x$ and $y$ usually denote the perpendicular axes on the horizontal plane, and the vertical axis is denoted by $z$. A 3D vector thus has three components: $(x, y, z)$.
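
The link between the "arrow" and the "list of numbers" can also be checked numerically. The sketch below is only an illustration (the variable names are ours, not part of the helper): it moves the vector $(2, 2)$ to the four tail positions used in the plot, and confirms that head minus tail always gives back the same components, with the same magnitude and direction.

```python
import numpy

# The vector from the text and the four tail positions used in the plot.
vector = numpy.array((2, 2))
tails = numpy.array([(-3, -2), (-3, 1), (0, 0), (1, -3)])

# Each head is the tail plus the vector (broadcast over the four tails).
heads = tails + vector

# Head minus tail recovers the same components at every position.
displacements = heads - tails

# Magnitude (Euclidean length) and direction (angle from the x-axis, in degrees).
magnitude = numpy.linalg.norm(vector)
angle = numpy.degrees(numpy.arctan2(vector[1], vector[0]))

print(heads)
print(displacements)
print(magnitude, angle)
```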

##### Note:

Our helper function `plot_vector()` takes two lists as arguments. It can either plot one vector with its tail at several locations, or several vectors with their tails at one location. It can also plot several vectors with their tails at different locations, but in that case, the two lists have to match in length (if they don't, the function will give an error).

## Fundamental vector operations

Two operations are the foundation of everything: **vector addition**, and **multiplication by a scalar** (i.e., scaling).

Let's first visualize vector addition. Suppose we have two vectors: 

$$
   \mathbf{a} = \left[ \begin{array}{c} -2 \\ 1  \end{array} \right], \quad  
   \mathbf{b} = \left[ \begin{array}{c} 1 \\ -3  \end{array} \right] 
$$

We can visualize vector addition as follows: draw vector $\mathbf{a}$ with its tail at the origin; then draw vector $\mathbf{b}$ with its tail on the head of $\mathbf{a}$. If you now draw a vector from the origin to the head of $\mathbf{b}$, that vector is $\mathbf{a} + \mathbf{b}$.

With our helper function for plotting 2D vectors, it looks like this:

In [None]:
# vector addition
a = numpy.array((-2,1))
b = numpy.array((1,-3))
origin = numpy.array((0,0))

vectors = [a, b, a+b]
tails   = [origin, a, origin]
plot_vector(vectors, tails)
pyplot.title("Adding vectors with coordinates $(-2, 1)$ and $(1,-3)$.");

In this visualization of vector addition, the head of $\mathbf{a} + \mathbf{b}$ ends up at the coordinates  resulting from adding the tail-to-head horizontal and vertical distances of $\mathbf{a}$ and $\mathbf{b}$. In other words, from adding the respective coordinates:

$$
   \left[ \begin{array}{c} -2 \\ 1  \end{array} \right] +  
   \left[ \begin{array}{c} 1 \\ -3  \end{array} \right] =
   \left[ \begin{array}{c} -2+1 \\ 1-3  \end{array} \right] =
   \left[ \begin{array}{c} -1 \\ -2  \end{array} \right]
$$
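
As a quick check of the component-wise rule, NumPy arrays add element by element, so the sum can be computed directly. This is a minimal sketch using the same coordinates as above; it also shows that the order of the addition does not matter.

```python
import numpy

a = numpy.array((-2, 1))
b = numpy.array((1, -3))

# Adding two arrays adds their respective components.
total = a + b
print(total)  # [-1 -2]

# Vector addition is commutative: a + b equals b + a.
print(numpy.array_equal(a + b, b + a))  # True
```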


Let's now look at multiplication by a scalar: essentially, the length of the vector is *scaled* by the scalar factor. If you multiply a vector by $2$, its length (magnitude) doubles. 

For example, if we scale by $2$ the vector $\mathbf{c} = \left[ \begin{array}{c} 2 \\ 1  \end{array} \right]$, it looks like this:

In [None]:
# vector scaling
c = numpy.array((2,1))
vectors = [c, 2*c]
plot_vector(vectors)
pyplot.title("Scaling of the vector $(2,1)$ by the scalar $2$.");

The head of the vector $2\mathbf{c}$ ends up at the coordinates resulting from scaling the tail-to-head horizontal and vertical distance of $\mathbf{c}$:

$$
  2\cdot\left[ \begin{array}{c} 2 \\ 1  \end{array} \right] =
  \left[ \begin{array}{c} 2\cdot 2 \\ 2\cdot 1  \end{array} \right] =
  \left[ \begin{array}{c} 4 \\ 2  \end{array} \right]
$$
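
The effect of scaling on the length can be verified with `numpy.linalg.norm`. The sketch below (variable names are illustrative) scales $\mathbf{c}$ by $2$ and by $-1$: the first doubles the length, while a negative scalar keeps the length scaled by its absolute value and reverses the direction.

```python
import numpy

c = numpy.array((2, 1))
length = numpy.linalg.norm(c)

# Scaling by 2 multiplies every component, and the length, by 2.
doubled = 2 * c
doubled_length = numpy.linalg.norm(doubled)

# Scaling by -1 keeps the length but flips the direction.
flipped = -1 * c
flipped_length = numpy.linalg.norm(flipped)

print(doubled, doubled_length / length)
print(flipped, flipped_length / length)
```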

## Basis vectors

With the ideas of vector addition and multiplication by a scalar fresh in your mind, now imagine this. Any nonzero horizontal vector (i.e., having zero as its second component) can be scaled to have length $1$. 

For example, the vector $\,\mathbf{u} = \left[ \begin{array}{c} u \\ 0  \end{array} \right]$, with $u \neq 0$, scaled by $1/u$ becomes $\left[ \begin{array}{c} 1 \\ 0  \end{array} \right]$.

Similarly, any nonzero vertical vector (having zero as its first component) can be scaled to have length $1$.

Going the opposite way, 
- scaling the vector $\,\mathbf{i}=\left[ \begin{array}{c} 1 \\ 0  \end{array} \right]$ can give us all possible horizontal vectors, and 
- scaling the vector $\,\mathbf{j}=\left[ \begin{array}{c} 0 \\ 1  \end{array} \right]$ can give us all possible vertical vectors. 

Since every vector is the sum of a horizontal and a vertical one, it means we can generate *all vectors* by adding scaled versions of $\mathbf{i}$ and $\mathbf{j}$. That's why they are called **basis vectors**.

For any vector, its components are the scalars we need to multiply the basis vectors by to generate it. For example:

$$
 \left[ \begin{array}{c} 3 \\ 2  \end{array} \right] =
 3\cdot\left[ \begin{array}{c} 1 \\ 0  \end{array} \right] +
 2\cdot\left[ \begin{array}{c} 0 \\ 1  \end{array} \right] =
 3\mathbf{i} + 2\mathbf{j}
$$

Let's visualize this using our helper function.

In [None]:
# basis vector
i = numpy.array((1,0))
j = numpy.array((0,1))

vec = 3*i + 2*j
vectors = [i, j, 3*i, 2*j, vec]
plot_vector(vectors)
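
The same decomposition can be checked without a plot. In this small sketch (the name `rebuilt` is just for illustration), we read off the components of a vector and use them as the scalars multiplying $\mathbf{i}$ and $\mathbf{j}$; the result is the vector we started with.

```python
import numpy

i = numpy.array((1, 0))
j = numpy.array((0, 1))
vec = numpy.array((3, 2))

# With i and j as the basis, the components ARE the scalars we need.
horizontal_part = vec[0] * i   # a purely horizontal vector
vertical_part = vec[1] * j     # a purely vertical vector
rebuilt = horizontal_part + vertical_part

print(horizontal_part, vertical_part, rebuilt)
print(numpy.array_equal(rebuilt, vec))
```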

## Linear combination and span

Adding two vectors that were each multiplied by a scalar is called a **linear combination** of those two vectors. Thus, we say that every vector is some linear combination of the basis vectors.

This brings us to the idea of the **span** of two vectors: the set of all possible linear combinations of the two. The second episode of the series _"Essence of Linear Algebra"_ uses rich visuals to bring these ideas to life [2]. Recommended!
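
To make the definition concrete, here is a minimal sketch of a linear combination as a function. The name `linear_combination` is hypothetical (it is not part of NumPy or of our plotting helper); it multiplies each vector by its scalar and adds up the results.

```python
import numpy

def linear_combination(scalars, vectors):
    """Hypothetical helper: return the sum of scalars[k] * vectors[k]."""
    scalars = numpy.asarray(scalars)
    vectors = numpy.asarray(vectors)  # one vector per row
    # A 1D array times a 2D array with @ sums scalars[k] * vectors[k] over k.
    return scalars @ vectors

i = numpy.array((1, 0))
j = numpy.array((0, 1))
a = numpy.array((-2, 1))
b = numpy.array((1, -3))

print(linear_combination((3, 2), (i, j)))  # 3*i + 2*j
print(linear_combination((1, 1), (a, b)))  # 1*a + 1*b, the plain sum a + b
```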


In the code cells below, we will use the NumPy function [`randint`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.randint.html) to get random integers in an interval (in this case, from $-10$ up to, but not including, $10$, because the upper bound of `randint` is exclusive). We then create a list of 30 random vectors on the plane via a linear combination of the basis vectors $\mathbf{i}$ and $\mathbf{j}$, and we draw them all.  

In [None]:
from numpy.random import randint

In [None]:
# span
vectors = []
i = numpy.array((1,0))
j = numpy.array((0,1))

for _ in range(30):
    m = randint(-10,10)
    n = randint(-10,10)
    vectors.append(m*i + n*j)
    
plot_vector(vectors)
pyplot.title("Thirty random vectors, created from the basis vectors");

You can imagine that if we created more and more vectors in this way, letting the scalars take any real value and not just the integers used here, we would eventually fill up the 2D plane. Indeed, the *span* of the basis vectors is the whole 2D space. 

What if we tried the same experiment, but making linear combinations of the vectors $\mathbf{a}$ and $\mathbf{b}$, defined above?

In [None]:
vectors = []
for _ in range(30):
    m = randint(-10,10)
    n = randint(-10,10)
    vectors.append(m*a + n*b)
    
plot_vector(vectors)
# raw string, so that the backslashes in the LaTeX commands are not read as escape sequences
pyplot.title(r"Thirty random vectors, created as linear combinations of $\mathbf{a}$ and $\mathbf{b}$.");

In fact, we can *still* fill up the whole plane with the infinitely many linear combinations of $\mathbf{a}$ and $\mathbf{b}$: they span the full 2D space. We're not forced to use the unit vectors $\mathbf{i}$ and $\mathbf{j}$ as our basis vectors: other pairs of vectors could form a basis. With $\mathbf{i}$ and $\mathbf{j}$, we saw that the coordinates of a vector $\mathbf{v}$ are the scalars needed in its corresponding linear combination of the basis vectors. If we were to use another pair of vectors as basis, we would need a different pair of scalars in the linear combination: we are _changing the coordinate system_.
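
To see what "a different pair of scalars" means in practice, the sketch below finds the scalars $m$ and $n$ such that $m\,\mathbf{a} + n\,\mathbf{b}$ equals a given vector. It is only an illustration: the target vector $(3, 2)$ is our own choice, and we lean on `numpy.linalg.solve`, which solves the small system of two equations for us.

```python
import numpy

a = numpy.array((-2, 1))
b = numpy.array((1, -3))
v = numpy.array((3, 2))  # coordinates in the i, j system (our example choice)

# Stack a and b side by side; solving this system gives the scalars m, n
# that satisfy m*a + n*b = v.
system = numpy.column_stack((a, b))
m, n = numpy.linalg.solve(system, v)

# Rebuild v from the new coordinates to confirm.
rebuilt = m * a + n * b

print(m, n)
print(rebuilt)
```

The pair $(m, n)$ plays the role of the coordinates of $\mathbf{v}$ when $\mathbf{a}$ and $\mathbf{b}$ are used as the basis.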

Let's see another situation... we'll make linear combinations of the vector $\mathbf{a}$, and a new vector, $\mathbf{d} = \left[ \begin{array}{c} -1 \\ 0.5  \end{array} \right]$:

In [None]:
d = numpy.array((-1,0.5))
vectors = []
for _ in range(30):
    m = randint(-10,10)
    n = randint(-10,10)
    vectors.append(m*a + n*d)
    
plot_vector(vectors)

*What's going on?*

Well, the new vector $\mathbf{d}$ happens to be a scaled version of the original vector $\mathbf{a}$ (here, $\mathbf{d} = 0.5\,\mathbf{a}$); we say that they are _collinear_. Thus, all linear combinations of $\mathbf{a}$ and $\mathbf{d}$ end up on one line, which is their span. Their combinations are not able to travel all over the plane!
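
One way to detect this situation without plotting is sketched below. The function `are_collinear` is a hypothetical helper of ours for 2D vectors: it computes $u_x v_y - u_y v_x$, which is zero exactly when one vector is a scaled version of the other (or when one of them is the zero vector).

```python
import numpy

def are_collinear(u, v, tol=1e-12):
    """Hypothetical 2D check: True if u and v lie on the same line through the origin."""
    return abs(u[0] * v[1] - u[1] * v[0]) < tol

a = numpy.array((-2, 1))
b = numpy.array((1, -3))
d = numpy.array((-1, 0.5))

print(are_collinear(a, d))  # True: d is 0.5 * a
print(are_collinear(a, b))  # False: a and b point along different lines
```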

##### Definition: Basis

> A **basis** for a vector space is a set of _linearly independent_ vectors that _span_ that space.

Plotting 30 vectors can result in a messy figure. When we want to visualize many vectors like this, we can simplify the plot by only showing the tip (head) of the vector, as a point on the plane. We'll do that from now on.

## What's a matrix?

In many books, they'll tell you that a matrix is a "table" of numbers, ordered in rows and columns. Maybe that's enough for some people, but you will get a kick out of _seeing_ what a matrix does!

Let's remember our friendly vectors from above:

$$
   \mathbf{a} = \left[ \begin{array}{c} -2 \\ 1  \end{array} \right], \quad  
   \mathbf{b} = \left[ \begin{array}{c} 1 \\ -3  \end{array} \right] 
$$

Our little experiment with 30 random linear combinations of $\mathbf{a}$ and $\mathbf{b}$ helped us visualize that they can span the 2D space, and nothing is stopping us from using them as a basis if we so desire.

Remember also our vector $\mathbf{c} = \left[ \begin{array}{c} 2 \\ 1  \end{array} \right]$. Choosing $\mathbf{i}$ and $\mathbf{j}$ as a basis, then $\mathbf{c} = 2\,\mathbf{i} + 1\,\mathbf{j}$.

Now imagine that we use the components of $\mathbf{c}$ to make a linear combination of $\mathbf{a}$ and $\mathbf{b}$:

$$
 2\,\mathbf{a} + 1\,\mathbf{b} =
 2\cdot\left[ \begin{array}{c} -2 \\ 1  \end{array} \right] +
 1\cdot\left[ \begin{array}{c} 1 \\ -3  \end{array} \right] = 
  \left[ \begin{array}{c} -3 \\ -1  \end{array} \right]
$$

This is a new vector; let's call it $\mathbf{c}^\prime$: 

- it has components $\left[ \begin{array}{c} 2 \\ 1  \end{array} \right]$ in the $\mathbf{a}$, $\mathbf{b}$ system of coordinates, and 
- it has components
$\left[ \begin{array}{c} -3 \\ -1  \end{array} \right]$ in the $\mathbf{i}$, $\mathbf{j}$ system of coordinates.

This will blow your mind. Arrange the vectors $\mathbf{a}$ and $\mathbf{b}$ as the columns of a matrix, and you'll see that:

$$
   \begin{bmatrix} -2 & 1 \\ 
                    1 & -3  \end{bmatrix}  
   \left[ \begin{array}{c} 2 \\ 1  \end{array} \right] =
  \left[ \begin{array}{c} -3 \\ -1  \end{array} \right]
$$

The matrix $ A=\begin{bmatrix} -2 & 1 \\ 
                    1 & -3  \end{bmatrix} $  when multiplied by the vector $\mathbf{c}$ gives the vector $\mathbf{c}^\prime$.
We say that the **matrix** $A$ is the **linear transformation** that maps vector $\mathbf{c}$ into $\mathbf{c}^\prime$.
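
We can confirm this with NumPy. In the sketch below, we build $A$ by placing $\mathbf{a}$ and $\mathbf{b}$ side by side as columns (using `numpy.column_stack`), and compare the matrix-vector product with the linear combination $2\,\mathbf{a} + 1\,\mathbf{b}$.

```python
import numpy

a = numpy.array((-2, 1))
b = numpy.array((1, -3))
c = numpy.array((2, 1))

# The columns of A are the vectors a and b.
A = numpy.column_stack((a, b))

# Matrix-vector product, using the @ operator.
c_prime = A @ c

# The same result, written as a linear combination of the columns.
combination = c[0] * a + c[1] * b

print(A)
print(c_prime)      # [-3 -1]
print(combination)  # [-3 -1]
```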


In [None]:
matrix = [[1,2], [2,1]]
matrix = numpy.array(matrix)
plot_linear_transformation(matrix)

$ M = \begin{bmatrix} 1 & 2 \\
                      2 & 1 \end{bmatrix} $

The basis vector $\mathbf{i}$ lands at $(1,2)^T$ and the basis vector $\mathbf{j}$ lands at $(2,1)^T$ after the transformation: these are the columns of $M$.

$$
\begin{aligned}
\mathbf{i} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}  &\Rightarrow  \begin{bmatrix} 1 \\ 2 \end{bmatrix} \\
\mathbf{j} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}  &\Rightarrow  \begin{bmatrix} 2 \\ 1 \end{bmatrix}
\end{aligned}
$$
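
A short numerical check of where the basis vectors land is sketched below. It also illustrates, for one example vector of our choosing, that knowing where $\mathbf{i}$ and $\mathbf{j}$ land is enough to know where any vector lands.

```python
import numpy

M = numpy.array([[1, 2], [2, 1]])
i = numpy.array((1, 0))
j = numpy.array((0, 1))

# The transformed basis vectors are the columns of M.
i_landed = M @ i
j_landed = M @ j

# Any vector follows its components: 3*i + 2*j lands at 3*M@i + 2*M@j.
vec = 3 * i + 2 * j
vec_landed = M @ vec
from_basis = 3 * i_landed + 2 * j_landed

print(i_landed, j_landed)
print(vec_landed, from_basis)
```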

- Matrix-matrix multiplication: a combination of two linear transformations

In [None]:
shear = numpy.array([[1,1], [0,1]])
rotation = numpy.array([[0,-1], [1,0]])

In [None]:
plot_linear_transformation(shear@rotation)

In [None]:
plot_linear_transformations(rotation, shear)  # the order of transformation: from right to left

Note: `shear@rotation != rotation@shear`, so the order of transformations is important. Matrix multiplication is not commutative.
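
The two products can be compared directly. The sketch below uses the same `shear` and `rotation` matrices, and also checks on one example vector (our choice) that applying `rotation` first and `shear` second gives the same result as applying the single matrix `shear@rotation`.

```python
import numpy

shear = numpy.array([[1, 1], [0, 1]])
rotation = numpy.array([[0, -1], [1, 0]])

# The two orders of multiplication give different matrices.
shear_after_rotation = shear @ rotation
rotation_after_shear = rotation @ shear
print(shear_after_rotation)
print(rotation_after_shear)

# Reading right to left: rotate v first, then shear the result.
v = numpy.array((1, 0))
step_by_step = shear @ (rotation @ v)
all_at_once = shear_after_rotation @ v
print(step_by_step, all_at_once)
```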

- Inverse of a matrix:

In [None]:
from numpy.linalg import inv

In [None]:
A = numpy.array([[1,2], [2,1]])
A_inv = inv(A)
plot_linear_transformations(A, A_inv)
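
Numerically, the inverse "undoes" the transformation: applying $A$ and then $A^{-1}$ brings every vector back to where it started. The sketch below checks this for the same matrix and for one example vector of our choosing; `numpy.allclose` is used because the inverse is computed in floating point.

```python
import numpy
from numpy.linalg import inv

A = numpy.array([[1, 2], [2, 1]])
A_inv = inv(A)

# Applying A and then its inverse is the identity transformation.
product = A_inv @ A
print(numpy.allclose(product, numpy.eye(2)))

# A vector transformed by A comes back after applying the inverse.
v = numpy.array((3, 2))
back = A_inv @ (A @ v)
print(back)
```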

## References

1. Vectors, what even are they? Essence of linear algebra, chapter 1. Video at https://youtu.be/fNk_zzaMoSs (2016), by Grant Sanderson.
2. Linear combinations, span, and basis vectors. Essence of linear algebra, chapter 2. Video at https://youtu.be/k7RM-ot2NWY (2016), by Grant Sanderson.

In [None]:
# Execute this cell to load the notebook's style sheet, then ignore it
from IPython.display import HTML
css_file = '../style/custom.css'
# read the style sheet and close the file afterwards
with open(css_file, "r") as f:
    css = f.read()
HTML(css)