# Cosine Similarity
 - Understand the basic mathematics behind vector-based similarity
 - Understand the vector dot product
 - Understand the L2 norm

## The dot product:  $\mathbf{a} \cdot \mathbf{b}$
 - Reduces two vectors of equal length to a single number: the sum of the products of their corresponding components, $\sum_i a_i b_i$

In [None]:
def dot_product(a, b):
    "Return dot product of two vectors (given as sequences of numbers)"
    
    result = 0
    for i,a_i in enumerate(a):
        result += a_i * b[i]
    return result

In [None]:
dot_product([1,2],[1,2.1])
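
As a check on the arithmetic, the call above computes $1 \times 1 + 2 \times 2.1 = 5.2$. The sketch below shows one possible alternative formulation using `zip`; the name `dot_product_zip` and the explicit length check are assumptions added for illustration and are not part of the original function.

```python
import math

def dot_product_zip(a, b):
    "Return dot product of two equal-length vectors (hypothetical variant using zip)"
    # Assumed policy: reject vectors of different lengths instead of
    # silently ignoring the extra components.
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Multiply corresponding components and add up the products.
    return sum(a_i * b_i for a_i, b_i in zip(a, b))

result = dot_product_zip([1, 2], [1, 2.1])
# 1*1 + 2*2.1 = 5.2, compared with a tolerance because of floating-point rounding
print(math.isclose(result, 5.2))
```

Comparing with `math.isclose` rather than `==` avoids depending on the exact floating-point representation of the result.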

## The L2 norm: $||\mathbf{a}||$
 - Reduces a vector to a single number: its length in Euclidean space, $\sqrt{\sum_i a_i^2}$

In [None]:
import math
def l2_norm(a):
    "Return L2 norm of a vector"
    
    result = 0
    for a_i in a:
        result += a_i**2
    return math.sqrt(result)

In [None]:
l2_norm([1,2.1])
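
The L2 norm can also be read as the square root of a vector's dot product with itself, $||\mathbf{a}|| = \sqrt{\mathbf{a} \cdot \mathbf{a}}$. The following is a minimal sketch of that relationship; `l2_norm_via_dot` is a hypothetical helper name, and the 3-4-5 right triangle is chosen only because its length is easy to verify by hand.

```python
import math

def l2_norm_via_dot(a):
    "Return L2 norm computed as the square root of a . a (hypothetical helper)"
    # a . a is the sum of the squared components of a.
    return math.sqrt(sum(a_i * a_i for a_i in a))

length = l2_norm_via_dot([3, 4])
# sqrt(3*3 + 4*4) = sqrt(25)
print(length)  # 5.0

# Cross-check against the standard library's two-argument Euclidean length.
print(math.isclose(length, math.hypot(3, 4)))
```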

## Cosine similarity: $\mathit{cosine}(\mathbf{a},\mathbf{b})=\frac{\mathbf{a} \cdot \mathbf{b}}{||\mathbf{a}||\times||\mathbf{b}||}$
 - The similarity ranges from −1 (vectors pointing in exactly opposite directions) to 1 (vectors pointing in exactly the same direction, regardless of their lengths); 0 indicates orthogonality, and values in between indicate intermediate similarity or dissimilarity.

In [None]:
def cosine(a, b):
    "Return cosine similarity of two vectors"
    
    return dot_product(a,b)/(l2_norm(a)*l2_norm(b))

In [None]:
cosine([1,2],[1,2.1])
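
The three landmark values of the range (1, 0, and −1) can be illustrated with a self-contained sketch. The function name `cosine_similarity` is hypothetical, and raising `ValueError` for a zero vector is an assumed policy: the formula divides by the norms, so the similarity is undefined when either norm is 0.

```python
import math

def cosine_similarity(a, b):
    "Return cosine similarity of two vectors (sketch; zero vectors raise ValueError)"
    dot = sum(a_i * b_i for a_i, b_i in zip(a, b))
    norm_a = math.sqrt(sum(a_i * a_i for a_i in a))
    norm_b = math.sqrt(sum(b_i * b_i for b_i in b))
    # Assumed policy: the ratio is undefined if either vector has length 0.
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1, 2], [2, 4])   # close to 1
orthogonal = cosine_similarity([1, 0], [0, 1])       # 0
opposite = cosine_similarity([1, 2], [-1, -2])       # close to -1

print(same_direction, orthogonal, opposite)
```

Note that `[1, 2]` and `[2, 4]` have different lengths but the same direction, so their similarity is 1 up to floating-point rounding: cosine similarity ignores magnitude.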