# Vectors & Spaces

Vectors are fundamental to Machine Learning. If you can build some insight into vectors, vector spaces and how to operate on them, then understanding the fundamentals of a lot of Machine Learning (ML) algorithms becomes a lot easier.

You can begin to understand how things as disparate as `k-means clusters` and exotic `deep neural networks` all relate to fundamentally representing, transforming, slicing, dicing and measuring the information content of vector spaces.

So it's a good idea to start here, and we'll try to do so with the minimum of maths.

## Vectors

When working in a programming language we often think about vectors as lists, and in some languages we have objects or classes that explicitly call themselves `vectors`, e.g. `std::vector` in the C++ standard library, but even they don't provide an interface that aligns with a mathematical vector.

In native JavaScript, the closest thing we have is the native `Array`, which as a data structure fits our needs for a vector, but it does not provide any of the mathematical operations that we need when treating it as a vector. We are going to develop some of our own here.

Typically we will see maths like: 

\begin{equation*}
y = Wx + b
\end{equation*}

where $y, x, b$ are assumed to be vectors and $W$ is a matrix. Decorations such as an arrow $\vec{y}$ or bold type $\mathbf{y}$ are sometimes also used to denote vectors.
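To make the notation concrete, here is a minimal sketch of $y = Wx + b$ using plain arrays. The helper name `affine` and the example values are made up for illustration, and we assume `W` is stored as an array of rows.

```javascript
// Hypothetical helper: computes y = Wx + b, with W stored as an array of rows
function affine(W, x, b) {
    return W.map((row, i) =>
        // dot product of row i with x, plus the i-th component of b
        row.reduce((sum, w, j) => sum + w * x[j], 0) + b[i]
    );
}

// Illustrative values only
const W = [[2, 0], [1, 3]];
const xVec = [1, 3];
const bVec = [1, -1];

console.log(affine(W, xVec, bVec)); // [ 3, 9 ]
```

Each output component is one row of `W` multiplied component-wise with `x`, summed, and then shifted by the matching component of `b`.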


In [1]:
// We can easily contain a vector within an array
x = [1, 3];

[ 1, 3 ]

#### Vector Spaces 1

A vector space is a mathematical concept that is quite hard to grasp in general. For our purposes, a vector space is defined by a dimension and an `inner product`, which we will talk about below.

We are lucky in that most of the time we work within a Euclidean vector space, which is the space that we are used to and that applies to the real world around us.

When reading ML material we will see notation like $R^N$ representing an N-dimensional space, that is spanned by N-dimensional vectors. So $R^2$ simply means 2 dimensional, $R^3$ means 3D, and so on.

Most of the time when we see $R^N$ the author is talking about Euclidean space, and the definitions we go through below apply.

#### Magnitude (L2 norm)
The length of a vector, or its magnitude, is given by the square root of the sum of its squared elements: $$\Vert x \Vert = \sqrt{\sum_{i=0}^{N-1} x_i^2}$$

This is also known as the `L2 Norm`

In [None]:
// simply for our 2 dimensional case
var X1 = Math.sqrt(Math.pow(x[0], 2) + Math.pow(x[1], 2))
console.log(X1)

// more generally for any length vector
var X2 = Math.sqrt(x.map(x => x*x).reduce((a,x) => a+x , 0))
console.log(X2)

##### L1 Norm

There is an alternate measure of vector magnitude that you may encounter, the `L1 norm`, which is simply the sum of the absolute values of the components.

$$\Vert x \Vert_1 = \sum_{i=0}^{N-1} \vert x_i \vert$$

In [None]:
var X_L1 = x.reduce((a,x) => a + Math.abs(x), 0);
console.log(X_L1)
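If we want to reuse the two norms, they can be wrapped as small functions. The names `l2Norm` and `l1Norm` are our own, not part of any library, and this sketch assumes plain arrays of numbers.

```javascript
// Hypothetical helpers for plain numeric arrays
const l2Norm = v => Math.sqrt(v.reduce((acc, vi) => acc + vi * vi, 0));
const l1Norm = v => v.reduce((acc, vi) => acc + Math.abs(vi), 0);

const v = [3, -4];
console.log(l2Norm(v)); // 5
console.log(l1Norm(v)); // 7
```

The same vector has a different magnitude under each norm, which is why it matters which one an author means. For short arrays, the built-in `Math.hypot(...v)` also returns the L2 norm.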

#### Unit Vectors

Unit vectors $u$ are vectors whose magnitude is 1.0, aka `unity`. They carry no length information. All unit vectors start at the origin and end on the unit circle, sphere or hypersphere depending on the dimension. So unit vectors vary only in angle.

$$\hat x = \frac{x}{\Vert x \Vert}$$

In [None]:
// compute a unit vector from any vector X just by dividing out by the magnitude
var u = x.map(x => x/X1)
console.log("unit vector", u)

// double check the magnitude of u
var U1 = Math.sqrt(u.map(x => x*x).reduce((a,x) => a+x , 0))
console.log("expect unity", U1)

#### Basis Vectors

A basis of a vector space is a set of vectors that define the primary axes of that space. In $R^2$ the basis is represented by 2 unit vectors, $\hat e_0, \hat e_1$. This is the standard basis in Cartesian space.

We can see that this generalises well to $R^N$, where we have unit vectors $\hat e_0, \hat e_1, ... , \hat e_{N-1}$ that are each all zeros except for a single component set to 1.

(Terminology note! ML folks call a vector of zeros with a single component set to 1 [one hot](https://en.wikipedia.org/wiki/One-hot), and sometimes specifically encode data into that representation.)
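As a rough sketch of how the standard basis generalises, the helper below builds one-hot vectors of any length. The name `oneHot` is hypothetical and chosen just for this example.

```javascript
// Hypothetical helper: a length-n vector of zeros with a 1 at position `index`
function oneHot(index, n) {
    return Array.from({ length: n }, (_, i) => (i === index ? 1 : 0));
}

// The standard basis of R^3 is the set of all one-hot vectors of length 3
const basis3 = [0, 1, 2].map(i => oneHot(i, 3));
console.log(basis3); // [ [ 1, 0, 0 ], [ 0, 1, 0 ], [ 0, 0, 1 ] ]
```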

In [None]:
// the standard basis of 2 dimensional space
var e0 = [1, 0];
var e1 = [0, 1];

#### Inner products or the dot product

The inner product of two vectors in Cartesian space is defined as: $$a \bullet b = \Vert a \Vert \Vert b \Vert \cos(\alpha) = \sum_{i=0}^{N-1}a_i b_i$$

So we take the product of each pair of corresponding components and sum them. Geometrically, this is the product of the magnitudes of the two vectors times the cosine of the angle between them. Not that intuitive.

But there are some important properties:

 - if two vectors are co-linear or parallel (pointing in the same direction) then $a \bullet b = \Vert a \Vert \Vert b \Vert$
 - if two unit vectors are co-linear or parallel (pointing in the same direction) then $a \bullet b = 1$
 - if two vectors are `orthogonal` then $a \bullet b = 0$
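These properties are easy to check numerically. The sketch below uses a hypothetical `dotProduct` helper and vectors picked so the arithmetic stays in whole numbers.

```javascript
// Hypothetical helper: sum of the component-wise products
function dotProduct(p, q) {
    var total = 0;
    for (var i = 0; i < p.length; i++) {
        total += p[i] * q[i];
    }
    return total;
}

// Parallel vectors: [6, 8] is [3, 4] scaled by 2, so the result is 5 * 10
console.log(dotProduct([3, 4], [6, 8])); // 50

// Parallel unit vectors: a unit vector dotted with itself
console.log(dotProduct([1, 0], [1, 0])); // 1

// Orthogonal vectors: [-4, 3] is [3, 4] rotated by 90 degrees
console.log(dotProduct([3, 4], [-4, 3])); // 0
```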


In [None]:
var a = [ -1, 4 ];
var b = [ 3, 2 ];

var dot = 0;
for (var i = 0; i < a.length; i++) {
    // accumulate the product of each pair of components
    dot += a[i]*b[i];
}
console.log(dot)

**Exercise**: what is the dot product of the standard basis in 2D?

In [None]:
// compute the dot product of e0 & e1 here, what do you expect it to be?
// is there a better native way than a for loop?






#### Projections

Something that the dot product also represents is a projection. Let's look at that equation again.

$$a \bullet b = \Vert a \Vert \Vert b \Vert \cos(\alpha) = \sum_{i=0}^{N-1}a_i b_i$$

so 

$$\frac{a \bullet b}{\Vert b \Vert} = \Vert a \Vert \cos(\alpha)$$

or in other words, dividing the dot product by $\Vert b \Vert$ gives the length of the projection of `a` onto `b`, and dividing by $\Vert a \Vert$ instead gives the projection of `b` onto `a`.

![dot product](https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Dot_Product.svg/300px-Dot_Product.svg.png)
[image: wikipedia](https://commons.wikimedia.org/wiki/File:Dot_Product.svg)

This becomes more interesting when we replace $b$ with a unit vector $\hat u$: $$a \bullet \hat u = \Vert a \Vert \cos(\alpha)$$

So we can project a vector along any direction by taking the dot product with a unit vector in that direction, as we will get a meaningful scalar value depending on the angle between them.
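Here is a minimal sketch of projecting onto a direction. The helper names `dotProd` and `unit` and the example vectors are our own, chosen so the numbers are easy to follow.

```javascript
// Hypothetical helpers
const dotProd = (p, q) => p.reduce((acc, pi, i) => acc + pi * q[i], 0);
const unit = v => {
    const magnitude = Math.sqrt(dotProd(v, v));
    return v.map(vi => vi / magnitude);
};

const vec = [2, 3];
const uHat = unit([4, 0]); // unit vector along the first axis: [ 1, 0 ]

// Scalar projection: how far `vec` extends along uHat
const scalar = dotProd(vec, uHat);
console.log(scalar); // 2

// Vector projection: that length laid back along uHat
console.log(uHat.map(ui => ui * scalar)); // [ 2, 0 ]
```

The scalar tells us how much of `vec` lies in the chosen direction, and scaling the unit vector by it turns that amount back into a vector.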

We can project a vector onto a given basis by taking the dot product with each of the basis vectors.

In [None]:
var c = [ -3, 7 ];

// each component is the dot product of c with one basis vector
var c_proj = [
    c[0]*e0[0] + c[1]*e0[1],
    c[0]*e1[0] + c[1]*e1[1]
];

console.log("c projected onto e0, e1 equals", c_proj, "which is c!")

So the standard basis is not that interesting. It acts like an identity function, and if we stack the vectors of the standard basis we get an identity matrix.

$$
I =
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix}
$$

and the operation we just did above was $c = Ic$

We can do more interesting operations with arbitrary matrices of course.

**Exercise**: try computing $y = Rc$, where $R$ is the 2D rotation matrix built in the cell below.

In [None]:
var theta = 30 * Math.PI / 180; // 30 degrees converted to radians
var r0 = [Math.cos(theta), -Math.sin(theta)], r1 = [Math.sin(theta), Math.cos(theta)]
var R = [r0, r1]

var y = 0; // write some code here, matrix math is awkward in native js!


console.log("c projected onto R equals", y, "which is c rotated by 30 degrees")


## Summary

 - Machine Learning under the hood is fundamentally about dealing with vectors and vector spaces
 - Information is represented in $R^N$, an N-dimensional space where in practice N can become very large, with 1000s of components
 - Bases are sets of vectors that represent the principal axes within a vector space. Bases are said to span the space. A basis is orthogonal if the inner products between its vectors are all `0`
 - Inner products can be used to project vectors onto one another, or determine the contribution of a vector in any particular arbitrary direction.
 - Inner products are a fundamental measure of `similarity` between a pair of vectors
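
One common way to turn the inner product into a similarity score is to divide out both magnitudes, which leaves just the cosine of the angle between the vectors. The sketch below assumes non-zero vectors, and the name `cosineSimilarity` is our own.

```javascript
// Hypothetical helper: cosine of the angle between p and q (both non-zero)
function cosineSimilarity(p, q) {
    const dot = p.reduce((acc, pi, i) => acc + pi * q[i], 0);
    const norm = v => Math.sqrt(v.reduce((acc, vi) => acc + vi * vi, 0));
    return dot / (norm(p) * norm(q));
}

console.log(cosineSimilarity([1, 0], [0, 1]));  // 0, orthogonal
console.log(cosineSimilarity([3, 4], [6, 8]));  // 1, same direction
console.log(cosineSimilarity([1, 0], [-2, 0])); // -1, opposite direction
```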

### Better Tools

Native JavaScript is very limited when it comes to the type of mathematical manipulations we need to make at a low level. Solving that is beyond the scope of this workshop, but if you do need to do some linear algebra at the JS level and can live with the performance, then libraries like these are worth looking at.

 - [Math.js](http://mathjs.org/)
 - [glMatrix](http://glmatrix.net/)

However, bear in mind that any library you use would need to be compatible with the ML library that you also want to use. At the end of the day, if you are trying to do ML it's best to find a high level library for the technique that you are interested in, which will hopefully be optimised to a level beyond that of a generic matrix library.


## Further Reading

 - [Vector spaces as models for Natural Language Processing (NLP)](https://en.wikipedia.org/wiki/Vector_space_model)
 - [Facebook's hyperbolic space for hierarchical representations](https://medium.com/towards-data-science/facebook-research-just-published-an-awesome-paper-on-learning-hierarchical-representations-34e3d829ede7)
 - [Differences between the L1 & L2 norms](http://www.chioka.in/differences-between-the-l1-norm-and-the-l2-norm-least-absolute-deviations-and-least-squares/)
 
 - [Recommended reading on linear algebra](http://www.cs.cmu.edu/~zkolter/course/linalg/linalg_notes.pdf) from the book "Machine Learning mit Python"