### Fundamentals
#### The Dot Product

Consider 2 vectors, $\vec{a}$ and $\vec{b}$, defined as:

<center>$\vec{a}$ = ($a_{1}$, $a_{2}$, $a_{3}$, ... , $a_{n}$)</center>

<center>$\vec{b}$ = ($b_{1}$, $b_{2}$, $b_{3}$, ... , $b_{n}$)</center>

where $a_{i}$ and $b_{i}$ (for <i>i</i> = 1, ..., <i>n</i>) are the components of the vectors (e.g. features of a document) and <i>n</i> is the dimension of the vectors.

The dot product is the sum of the products of the corresponding components of the 2 vectors:

<center>$\vec{a}$ $\cdot$ $\vec{b}$ = $a_{1}$$b_{1}$ + $a_{2}$$b_{2}$ + ... + $a_{n}$$b_{n}$ </center>

For example, in 2D:

<center>$\vec{a}$ = (0,4)</center>

<center>$\vec{b}$ = (6,0)</center>

<center>$\vec{a}$ $\cdot$ $\vec{b}$ = $0*6 + 4*0$ = $0$</center>
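The component-wise definition can be checked in a few lines of NumPy. This is only a minimal sketch: `dot_manual` is a hypothetical helper introduced here for illustration, and its result is compared against NumPy's own `np.dot`.

```python
import numpy as np

def dot_manual(a, b):
    """Sum of the products of matching components (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))

a = np.array([0, 4])
b = np.array([6, 0])

print(dot_manual(a, b))  # 0
print(np.dot(a, b))      # 0
```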


The <b>geometric definition</b> of the dot product is:

<center>$\vec{a}$ $\cdot$ $\vec{b}$ = $\|\vec{a}\|$ $\|\vec{b}\|$$\cos\theta$</center>

where $\theta$ is the angle between the 2 vectors and $\|\vec{a}\|$$\cos\theta$ is the scalar projection of $\vec{a}$ onto $\vec{b}$:


<img src="http://blog.christianperone.com/wp-content/uploads/2013/09/Dot_Product.png" height=200 width=200>

Now, what happens when vector $\vec{a}$ is orthogonal to vector $\vec{b}$, that is, when the angle between them is 90 degrees?

<img src="http://blog.christianperone.com/wp-content/uploads/2013/09/vectors1.gif" height=200 width=200>

Because $\cos 90^{\circ} = 0$, the projection of $\vec{a}$ onto $\vec{b}$ has zero length (there is no adjacent side on the triangle), and therefore the dot product of the 2 vectors is 0.
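As a rough numerical check, the sketch below compares the component-wise and geometric definitions for a pair of 2D vectors, then repeats the orthogonal example. The vectors (3, 4) and (6, 0) are illustrative values chosen here, and the angle is obtained with `np.arctan2`, which only works this directly in 2D.

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([6.0, 0.0])

# Angle of each vector from the x-axis; the difference is the angle between them.
theta = np.arctan2(a[1], a[0]) - np.arctan2(b[1], b[0])

algebraic = np.dot(a, b)
geometric = np.linalg.norm(a) * np.linalg.norm(b) * np.cos(theta)
print(np.isclose(algebraic, geometric))  # True

# Orthogonal case: (0, 4) and (6, 0) are 90 degrees apart.
u = np.array([0.0, 4.0])
v = np.array([6.0, 0.0])
angle_uv = np.arctan2(u[1], u[0]) - np.arctan2(v[1], v[0])
print(np.isclose(angle_uv, np.pi / 2))  # True
print(np.dot(u, v))                     # 0.0
```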

### Cosine Similarity
The cosine similarity between 2 vectors is the cosine of the angle between them. It is a measure of <b>orientation</b>, not magnitude, and is defined as:

<center> $\cos\theta$ = $\frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|\|\vec{b}\|}$ </center>

The result ranges from -1 (opposite directions) through 0 (orthogonal) to 1 (same direction), so the 2 items are compared by the angle between them regardless of their magnitudes.

Case examples:


<img src="http://blog.christianperone.com/wp-content/uploads/2013/09/cosinesimilarityfq1.png" height=800 width=800>
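The limiting cases, and the claim that magnitude does not matter, can be checked numerically. The sketch below uses a local helper named `cos_sim` (a hypothetical name, so the snippet stays self-contained) and vectors picked purely for illustration.

```python
import numpy as np

def cos_sim(a, b):
    # Dot product divided by the product of the norms, as in the formula above.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0])
b = np.array([2.0, 1.0])

# Scaling a vector changes its magnitude but not its direction.
print(np.isclose(cos_sim(a, b), cos_sim(10 * a, b)))  # True

# Same direction, orthogonal, and opposite direction.
x_axis = np.array([1.0, 0.0])
y_axis = np.array([0.0, 1.0])
print(np.isclose(cos_sim(a, 3 * a), 1.0))       # True
print(np.isclose(cos_sim(x_axis, y_axis), 0.0))  # True
print(np.isclose(cos_sim(a, -a), -1.0))         # True
```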


In [1]:
import numpy as np

def cosine_similarity(a,b):
    """
    Takes 2 non-zero vectors a, b of the same dimension and returns
    their cosine similarity (the cosine of the angle between them).
    """
    dot_product = np.dot(a,b)
    norm_a = np.linalg.norm(a)
    norm_b = np.linalg.norm(b)
    return dot_product / (norm_a * norm_b)

In [2]:
sentence_1 = "Ashley really loves dessert"
sentence_2 = "Hannah loves dessert too"
sentence_3 = "Ice cream is dessert"

In [7]:
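# Binary bag-of-words vectors: 1 if the word occurs in the sentence, else 0.
# Columns: [Ashley, really, loves, dessert, Hannah, too, Ice, cream, is]
sentence_1 = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0])
sentence_2 = np.array([0, 0, 1, 1, 1, 1, 0, 0, 0])
sentence_3 = np.array([0, 0, 0, 1, 0, 0, 1, 1, 1])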

In [8]:
print(cosine_similarity(sentence_1, sentence_2)) 
print(cosine_similarity(sentence_1, sentence_3)) 

0.5
0.25
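The vectors above were written out by hand. One way they could be derived from the sentences is a binary bag-of-words encoding. The sketch below assumes whitespace tokenization and a vocabulary ordered by first appearance; `vocabulary` and `to_vector` are hypothetical names that are not part of the cells above.

```python
import numpy as np

sentences = [
    "Ashley really loves dessert",
    "Hannah loves dessert too",
    "Ice cream is dessert",
]

# Build the vocabulary in order of first appearance (an assumption that
# happens to match the column order of the hand-written vectors).
vocabulary = []
for sentence in sentences:
    for word in sentence.split():
        if word not in vocabulary:
            vocabulary.append(word)

def to_vector(sentence):
    """Return 1 for each vocabulary word present in the sentence, else 0."""
    words = sentence.split()
    return np.array([1 if term in words else 0 for term in vocabulary])

vectors = [to_vector(s) for s in sentences]
for vector in vectors:
    print(vector)
```

Real text usually needs more careful tokenization (lowercasing, punctuation handling) than `str.split`, so treat this as an illustration of the encoding rather than a general-purpose vectorizer.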
