_CREDITS: (based on Brian Mann's notes and Jack Benedetto's notebook, plus some other cool stuff of my own)_

In [None]:
%load_ext autoreload
%autoreload 2

%matplotlib inline
import numpy as np
from vectorplotter import VectorPlotter

## Context : Linear Algebra and Machine Learning

* Ranking web pages in order of importance
    * Solved as the problem of finding the eigenvector of the page score matrix
* Dimensionality reduction - Principal Component Analysis
* Movie recommendation
    * Use singular value decomposition (SVD) to break down user-movie into user-feature and movie-feature matrices, keeping only the top $k$-ranks to identify the best matches
* Topic modeling
    * Extensive use of SVD and matrix factorization can be found in Natural Language Processing, specifically in topic modeling and semantic analysis

## Objectives

- Perform Linear Algebra operations by hand: Multiply matrices, Add and subtract matrices, Transpose matrices, verify inverses.
- Perform linear algebra operations in numpy.

to the extent needed to understand this week's concepts in Data Science.


## Warm-up: meet Frank !

<img src="images/frank.png" width=100 align="left" style="margin-right:20px"/>

Frank comes in peace from planet _alwkudbzkfb_. He only recently learned English and has a vocabulary of roughly **200 words**. You have been chosen as an emissary of Mankind to **teach Mathematics to Frank**.

### <span style="color:red">QUESTION : In $\mathbb{R}$, how would you define $+$ for Frank ?</span>

# 1. From operations to vectors

The core of linear algebra is to define such **abstract mathematical structures** from their essential properties. 

Then, we can use these abstract structures and their properties to **derive systemic properties, theorems, rules**, using logic.

## 1.1. Operations in $\mathbb{R}$

We have seen above how one can **"define $+$"** in $\mathbb{R}$ from its properties:

> $x + y = y + x \; \; $ for any $x,y$ in $\mathbb{R}$ **(commutativity)**

> $x + (y + z) = (x + y) + z \; \; $ for any $x,y,z$ in $\mathbb{R}$ **(associativity)**

> there exists an element $0$ in $\mathbb{R}$ such that $x + 0 = 0 + x = x$ for any $x \in \mathbb{R}$ **(identity element)**

> for any $x \in \mathbb{R}$, there exists an element $-x \in \mathbb{R}$ such that $x + (-x) = 0$ **(inverse element)**


The same way, one can **"define $\times$"** in $\mathbb{R}$:

> $x \times y = y \times x \; \; $ for any $x,y$ in $\mathbb{R}$ **(commutativity)**

> $x \times (y \times z) = (x \times y) \times z \; \; $ for any $x,y,z$ in $\mathbb{R}$ **(associativity)**

> there exists an element $1$ in $\mathbb{R}$ such that $x \times 1 = 1 \times x = x$ for any $x \in \mathbb{R}$ **(identity element)**

> for any $x \in \mathbb{R}, x \neq 0$, there exists an element $x^{-1} \in \mathbb{R}$ such that $x \times x^{-1} = 1$ **(inverse element)**

There is also a property between those two:

> $x \times (y + z) = (x \times y) + (x \times z) \; \; $ for any $x,y,z$ in $\mathbb{R}$ **(distributivity)**

Last (but not least), when considering $\mathbb{R}$ with $+$ and $\times$, $\mathbb{R}$ has another property called **"closure"**:

> for any $x,y$ in $\mathbb{R}$, $x+y \in \mathbb{R}$

> for any $x,y$ in $\mathbb{R}$, $x \times y \in \mathbb{R}$

Together, these properties make $(\mathbb{R},+,\times)$ a **Field** (just name dropping today).

### <span style="color:red">QUESTION : Consider $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, do they also have the properties above ?</span>
<br/>
<details>
<summary>Click here to see (one) solution below</summary>
<br/>
$\mathbb{N}$ has closure, but not inverse element with $+$.
<br/>
$\mathbb{Z}$ has closure, but not inverse element with $\times$.
<br/>
$\mathbb{Q}$ has all of them.
</details>

## 1.2. Vectors (with values drawn from $\mathbb{R}$)

My teacher would write vectors this way:

$
\vec{x} =
  \begin{bmatrix}
    x_1 \\
    x_2 \\
    \vdots \\
    x_n
  \end{bmatrix}
$

But since we're in Python world now, a vector can also be represented by an array of real numbers:

$ \vec{x} = [x_1, x_2, \ldots, x_n] $

In [None]:
x = np.array([1,2,3,4])
print(x)
print(x.shape)

Geometrically, a vector specifies the coordinates of its tip when its tail is placed at the origin.

In [None]:
vp = VectorPlotter(figsize=(5, 5), limits=[-1,6])
vp.plot_vector([3,1], color='r')
vp.plot_vector([1,5], color='g')
vp.show()

## 1.3. Operations in a vector space

### <span style="color:red">QUESTION : What operations do you know that relate to vectors ?</span>


### Sum of two vectors

If we have two vectors $\vec{x}$ and $\vec{y}$ of the same length $(n)$, then

$$\vec{x} + \vec{y} = [x_1+y_1, x_2+y_2, \ldots, x_n+y_n]$$

In [None]:
x = np.array([1,2,3,4])
y = np.array([5,6,7,8])

print("{} + {} = {}".format(x, y, x+y))

In [None]:
u = np.array([3,1])
v = np.array([1,5])
w = u+v

vp = VectorPlotter(figsize=(5, 5), limits=[-1,7])
vp.plot_vector(u, color='r')
vp.plot_vector(v, color='g')
vp.plot_vector(v, color='gray', orig=u)
vp.plot_vector(w, color='b')
vp.show()

### Adding a constant to a vector adds the constant to each element

$$ a + \vec{x} = [a + x_1, a + x_2, \ldots, a + x_n] $$

In [None]:
a = 4
x = np.array([1,2,3,4])

print("{} + {} = {}".format(a, x, a+x))

### Multiplying a vector by a constant multiplies each term by the constant.

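$$ \lambda . \vec{x} = [\lambda x_1, \lambda x_2, \ldots, \lambda x_n] $$

Any arithmetic operator (`+`, `-`, `*`, `/`, `**`) can be used this way between a numpy array and a number: it is applied to each element of the array.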

In [None]:
l = 4
x = np.array([1,2,3,4])

print("{} * {} = {}".format(l, x, l*x))
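As a small sketch of the same element-wise behaviour with other operators (the variable names `shifted`, `halved` and `squared` are only illustrative):

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# Each operator is applied to every element of x in turn.
shifted = x - 1      # elements: 0, 1, 2, 3
halved = x / 2.0     # elements: 0.5, 1.0, 1.5, 2.0
squared = x ** 2     # elements: 1, 4, 9, 16

print(shifted)
print(halved)
print(squared)
```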

### "Linear Combination"

We call a _linear combination_ of a collection of vectors $(\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_m)$ any vector of the form

$$ \alpha_1 \cdot \vec{x}_1 + \alpha_2 \cdot \vec{x}_2 + \cdots + \alpha_m \cdot \vec{x}_m $$

where $\alpha_1, \alpha_2, \ldots, \alpha_m$ are scalars in $\mathbb{R}$.

In [None]:
a1=2
x1 = np.array([1,2,3,4])

a2=4
x2 = np.array([5,6,7,8])

print(a1*x1 + a2*x2)

## 1.4. Vector Space

These vectors, represented as arrays with values taken in $\mathbb{R}$, have a specific **dimension** (number of values, degrees of freedom, components, coordinates...). The set of all vectors with $n$ coordinates is denoted $\mathbb{R}^n$. It has many nice properties regarding addition, multiplication by a _scalar_, etc.

Considering $\mathbb{R}^n$ and addition between vectors $+$:

> $\vec{u} + \vec{v} = \vec{v} + \vec{u} \; \; $ for any $\vec{u},\vec{v}$ in $\mathbb{R}^n$ **(commutativity)**

> $\vec{u} + (\vec{v} + \vec{w}) = (\vec{u} + \vec{v}) + \vec{w} \; \; $ for any $\vec{u},\vec{v},\vec{w}$ in $\mathbb{R}^n$ **(associativity)**

> there exists a **vector** $\vec{0}$ in $\mathbb{R}^n$ such that $\vec{u} + \vec{0} = \vec{0} + \vec{u} = \vec{u}$ for any $\vec{u} \in \mathbb{R}^n$ **(identity element)**

> for any $\vec{u} \in \mathbb{R}^n$, there exists an element $-\vec{u} \in \mathbb{R}^n$ such that $\vec{u} + (-\vec{u}) = \vec{0}$ **(inverse element)**

Considering $\mathbb{R}^n$ and multiplication by a _scalar_ (value in $\mathbb{R}$):

> $1 . \vec{u} = \vec{u} \; \; $ for any $\vec{u}$ in $\mathbb{R}^n$ **(multiplicative identity)**

> $a . (\vec{u} + \vec{v}) = a.\vec{u} + a.\vec{v} \; \; $ for any $a \in \mathbb{R}$, any $\vec{u},\vec{v}$ in $\mathbb{R}^n$ **(distributivity of $.$ with $+$ of $\mathbb{R}^n$)**

> $(a + b) . \vec{u} = a.\vec{u} + b.\vec{u} \; \; $ for any $a,b$ in $\mathbb{R}$, any $\vec{u} \in \mathbb{R}^n$ **(distributivity of $.$ with $+$ of $\mathbb{R}$)**

> $a . (b . \vec{u}) = (a \times b).\vec{u} \; \; $ for any $a,b$ in $\mathbb{R}$, any $\vec{u} \in \mathbb{R}^n$ **(compatibility of $.$ with $\times$ of $\mathbb{R}$)**


Because of all these properties, we say that $(\mathbb{R}^n,+,.)$ is a vector space over $\mathbb{R}$.
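These properties can be spot-checked numerically. The sketch below is not a proof: it only tests each property on a few randomly drawn vectors of $\mathbb{R}^5$, with arbitrary scalars `a` and `b` chosen for illustration, and uses `np.allclose` to tolerate floating-point rounding.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the run is repeatable
n = 5
u, v, w = rng.randn(n), rng.randn(n), rng.randn(n)
a, b = 2.5, -1.5                # arbitrary scalars
zero = np.zeros(n)              # the zero vector of R^n

checks = {
    "commutativity": np.allclose(u + v, v + u),
    "associativity": np.allclose(u + (v + w), (u + v) + w),
    "identity element": np.allclose(u + zero, u),
    "inverse element": np.allclose(u + (-u), zero),
    "multiplicative identity": np.allclose(1 * u, u),
    "distributivity over vector +": np.allclose(a * (u + v), a * u + a * v),
    "distributivity over scalar +": np.allclose((a + b) * u, a * u + b * u),
    "compatibility with x": np.allclose(a * (b * u), (a * b) * u),
}
for name, ok in checks.items():
    print("{}: {}".format(name, ok))
```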

## 1.5. Dot product

If we have two vectors $\vec{x}$ and $\vec{y}$ of the same length $(n)$, then the _dot product_ is given by

$$\vec{x} \cdot \vec{y} = x_1y_1 + x_2y_2 + \cdots + x_ny_n$$

In [None]:
x = np.array([1, 2, 3, 4])
y = np.array([4, 3, 2, 1])
print(x)
print(y)
np.dot(x,y)

In [None]:
u = np.array([1,4])

vp = VectorPlotter(figsize=(6, 5), limits=[-10,10])
vp.plot_dotproduct(u)
vp.show()

If $\vec{x} \cdot \vec{y} = 0$ then $\vec{x}$ and $\vec{y}$ are *orthogonal* (this matches the intuitive notion of perpendicular).

In [None]:
w = np.array([1, 2])
v = np.array([-2, 1])
np.dot(w,v)

### Solutions to $\vec{\alpha} \cdot \vec{x} = 0$

For any given vector $
\vec{\alpha} =
  \begin{bmatrix}
    \alpha_1 \\
    \alpha_2 \\
    \vdots \\
    \alpha_n
  \end{bmatrix}
$, let's find all the vectors $\vec{x} =
  \begin{bmatrix}
    x_1 \\
    x_2 \\
    \vdots \\
    x_n
  \end{bmatrix}
$ such that $\vec{\alpha} \cdot \vec{x} = 0$.

### <span style="color:red">QUESTION : given $\vec{\alpha}$ can you find $\vec{x}$ such that $\vec{\alpha} \cdot \vec{x} = 0$ ?</span>

When you've found two such $\vec{x}$, $\vec{y}$...
- what happens if you multiply $\vec{x}$ by some constant $\lambda$ ?
- what happens if you add some constant $\lambda$ to every coordinate of $\vec{x}$ ?
- what happens if you add $\vec{x}$ and $\vec{y}$ ?
- what would be the solutions to $\vec{\alpha} \cdot \vec{x} = \beta$ ?

In [None]:
a = np.array([1,2,3,4])

x = np.array([0,0,0,0]) # find another ?
print(np.dot(a,x))

y = np.array([0,0,0,0]) # find another ?
print(np.dot(a,y))

In [None]:
np.dot(a,2*x)

In [None]:
np.dot(a,2+x)

In [None]:
np.dot(a,x+y)

Let's visualize the dot product of u.

In [None]:
u = np.array([1,4])

vp = VectorPlotter(figsize=(6, 5), limits=[-10,10])
vp.plot_dotproduct(u)
vp.show()

We can show that this set of solutions is a vector subspace, meaning that any linear combination of solutions to this equation is also a solution to the equation:

$$\vec{\alpha} \cdot \vec{x} = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n = 0$$

As long as $\vec{\alpha} \neq \vec{0}$, this set of solutions has dimension $n-1$.
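One possible way to see the $n-1$ dimensions, sketched under the assumption that the first coordinate of $\vec{\alpha}$ is non-zero: for each of the other coordinates, build a solution that is zero everywhere except in that coordinate and the first one. The names `alpha`, `basis` and `combo` are illustrative only.

```python
import numpy as np

alpha = np.array([1.0, 2.0, 3.0, 4.0])
n = len(alpha)

# Assumes alpha[0] != 0. For each position k >= 1, put alpha[0] at position k
# and -alpha[k] at position 0: the two products cancel in the dot product.
basis = []
for k in range(1, n):
    x = np.zeros(n)
    x[0] = -alpha[k]
    x[k] = alpha[0]
    basis.append(x)
basis = np.array(basis)  # shape (n-1, n), one solution per row

print(basis.dot(alpha))               # dot product of each row with alpha
print(np.linalg.matrix_rank(basis))   # number of independent solutions found

# A linear combination of solutions is still a solution.
combo = 2 * basis[0] - 3 * basis[1] + 0.5 * basis[2]
print(np.dot(alpha, combo))
```

Each row of `basis` has a zero dot product with `alpha`, and the rank of `basis` is $n-1 = 3$, so these rows are independent solutions.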

## 1.6. Norm and distance

The norm of a vector $\vec{x}$ is defined by

$$||\vec{x}|| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$

In [None]:
x = np.array([1,2,3,4])

print(x**2)
print(np.sqrt(np.sum(x**2)))
print(np.linalg.norm(x))

The squared norm of a vector is just the dot product of the vector with itself:
$$
||\vec{x}||^2 = \vec{x} \cdot \vec{x}
$$

In [None]:
print(np.linalg.norm(x)**2)
print(np.dot(x,x))

The distance between two vectors is the norm of the difference.
$$
d(\vec{x},\vec{y}) = ||\vec{x}-\vec{y}||
$$

In [None]:
y = np.array([4, 3, 2, 1])

np.linalg.norm(x-y)

_Cosine Similarity_ is the cosine of the angle between the two vectors, given by

$$\cos(\theta) = \frac{\vec{x} \cdot \vec{y}}{||\vec{x}|| \text{ } ||\vec{y}||}$$


In [None]:
x = np.array([1,2,3,4])
y = np.array([5,6,7,8])
np.dot(x,y)/(np.linalg.norm(x)*np.linalg.norm(y))
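The formula can be wrapped in a small helper. This is only a sketch: `cosine_similarity` is a hypothetical function written for this notebook (not a numpy function), and raising an error for a zero vector is one design choice among others, since the angle is undefined in that case.

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine of the angle between x and y (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    norms = np.linalg.norm(x) * np.linalg.norm(y)
    if norms == 0:
        # The angle with a zero vector is undefined.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return np.dot(x, y) / norms

x = np.array([1, 2, 3, 4])
y = np.array([5, 6, 7, 8])
print(cosine_similarity(x, y))

# Rescaling either vector by a positive constant leaves the angle unchanged.
print(cosine_similarity(3 * x, 0.5 * y))
```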

If both $\vec{x}$ and $\vec{y}$ are zero-centered (their mean has been subtracted), this calculation is the (Pearson) _correlation_ between $\vec{x}$ and $\vec{y}$.

In [None]:
x_centered = x - np.mean(x)
print(x_centered)
y_centered = y - np.mean(y)
print(y_centered)
np.dot(x_centered,y_centered)/(np.linalg.norm(x_centered)*np.linalg.norm(y_centered))
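As a cross-check, the cosine similarity of the centered vectors can be compared with numpy's own correlation coefficient, `np.corrcoef`. The values of `x` and `y` below are arbitrary, chosen only so that the correlation is not trivially 1.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0])
y = np.array([3.0, 1.0, 4.0, 1.0])

# Cosine similarity of the zero-centered vectors.
xc = x - x.mean()
yc = y - y.mean()
cos_centered = np.dot(xc, yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))

# np.corrcoef returns the 2x2 correlation matrix; entry [0, 1] is corr(x, y).
pearson = np.corrcoef(x, y)[0, 1]

print(cos_centered)
print(pearson)
```

Both printed values should agree up to floating-point rounding.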

### <span style="color:red">QUESTION: calculating cosine similarity</span>

- Draw a picture of the vectors [3, 4] and [2, 1] and what the cosine similarity is measuring.
- Calculate the cosine similarity of [3, 4] and [2, 1].

- What do two vectors of dimension n look like if they are **the most similar** possible? What is the cosine similarity between these two vectors?
- What do two vectors of dimension n look like if they are **the least similar** possible? What is the cosine similarity between these two vectors?
- If the values are all nonnegative, does this change your answers to the previous two questions?

In [None]:
u = np.array([3,4])
v = np.array([2,1])

vp = VectorPlotter(figsize=(5, 5), limits=[-1,5])
vp.plot_vector(u, color='r')
vp.plot_vector(v, color='g')
vp.show()