# Vector Operations

In this lesson, we will examine more advanced operations on vectors. We will focus on the concepts of **dot product**, **vector norms**, and **cosine similarity**, which are frequently used in machine learning.

---

## 1. Dot Product
The **dot product** is the sum of the products of the corresponding elements of two vectors. It helps us understand the relationship between two vectors and answer questions like _"To what extent do two vectors point in the same direction?"_ or _"What is the angle between two vectors?"_

Let $\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$ be two vectors.

The dot product of these two vectors is denoted as $\mathbf{u} \cdot \mathbf{v}$ or $\mathbf{u}^T\mathbf{v}$ and is calculated as:

$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + \dots + u_nv_n = \sum_{i=1}^{n} u_i v_i$

**Notes:** 
* For the dot product to be defined, the two vectors must have the same dimension (the same number of elements).
* The dot product is **commutative**. This means the order of the vectors doesn't matter: $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$.
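
To make the definition and both notes concrete, here is a minimal pure-Python sketch of the sum formula. The helper name `dot` and the sample vectors are illustrative choices, not part of the lesson's NumPy code.

```python
def dot(u, v):
    """Dot product of two equal-length sequences, following the sum definition."""
    if len(u) != len(v):
        # The dot product is only defined for vectors of the same dimension
        raise ValueError("vectors must have the same number of elements")
    return sum(ui * vi for ui, vi in zip(u, v))

u = [1, 2, 3]
v = [4, 5, 6]

print(dot(u, v))               # 1*4 + 2*5 + 3*6 = 32
print(dot(u, v) == dot(v, u))  # True (commutative)

try:
    dot([1, 2, 3], [1, 2])     # mismatched dimensions
except ValueError as err:
    print("Error:", err)
```

The explicit length check is there because `zip` would otherwise silently stop at the shorter vector.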

### 1.1. Geometric Interpretation of the Dot Product
The dot product is related to the cosine of the angle between two vectors. If $\theta$ is the angle between the two vectors:

$\mathbf{u} \cdot \mathbf{v} = ||\mathbf{u}|| \cdot ||\mathbf{v}|| \cdot \cos(\theta)$

Here, $||\mathbf{u}||$ and $||\mathbf{v}||$ are the lengths (norms) of vectors $\mathbf{u}$ and $\mathbf{v}$, respectively. Norms are covered in Section 2.

*   If $\mathbf{u} \cdot \mathbf{v} = 0$ and neither vector is the zero vector, the vectors are perpendicular (**orthogonal**).
*   If $\mathbf{u} \cdot \mathbf{v} > 0$, the angle between the vectors is acute (they point in broadly the same direction).
*   If $\mathbf{u} \cdot \mathbf{v} < 0$, the angle between the vectors is obtuse (they point in broadly opposite directions).

**Example:**  

Let $\mathbf{u} = \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} 1 \\ 4 \\ 0 \end{bmatrix}$.

$\mathbf{u} \cdot \mathbf{v} = (2 \cdot 1) + (-1 \cdot 4) + (3 \cdot 0) = 2 - 4 + 0 = -2$

<br>

In [1]:
import numpy as np

u = np.array([2, -1, 3])
v = np.array([1, 4, 0])

# Method 1: np.dot()
dot_product1 = np.dot(u, v)

# Method 2: @ operator (Python 3.5 and later)
dot_product2 = u @ v

# Method 3: .dot() method
dot_product3 = u.dot(v) # or v.dot(u)

print("Dot Product (np.dot()):", dot_product1)
print("Dot Product (@):", dot_product2)
print("Dot Product (.dot()):", dot_product3)

Dot Product (np.dot()): -2
Dot Product (@): -2
Dot Product (.dot()): -2
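
The geometric formula from Section 1.1 can be rearranged to recover the angle itself: $\cos(\theta) = \dfrac{\mathbf{u} \cdot \mathbf{v}}{||\mathbf{u}|| \cdot ||\mathbf{v}||}$. The sketch below applies this to the example vectors. It assumes both vectors are nonzero and uses `np.linalg.norm` for the lengths; the variable names `cos_theta` and `theta_deg` are illustrative.

```python
import numpy as np

u = np.array([2, -1, 3])
v = np.array([1, 4, 0])

# Rearranged geometric formula: cos(theta) = (u . v) / (||u|| * ||v||)
cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Clipping guards against tiny floating-point overshoot outside [-1, 1]
theta_deg = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

print("cos(theta):", round(cos_theta, 4))
print("Angle (degrees):", round(theta_deg, 2))
```

Because the dot product is negative, the resulting angle is greater than 90 degrees, which agrees with the sign rule for obtuse angles.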


---

## 2. Vector Norms

A vector's **norm** can be thought of as the "length" or "magnitude" of that vector. There are different types of norms, but the most commonly used ones in machine learning are the **L1 norm** and the **L2 norm**. The norm of a vector is an answer to the question _"How big is this vector?"_

### 2.1. L1 Norm (Manhattan Norm | Taxicab Norm)
The **L1 norm** is the sum of the absolute values of the elements of a vector. The L1 norm answers the question, _"If I could only travel along grid lines (like a taxi in Manhattan), how far would I have to travel to get from the origin to the vector's endpoint?"_

The L1 norm of a vector $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$ is denoted as $||\mathbf{v}||_1$ and is calculated as:

$||\mathbf{v}||_1 = |v_1| + |v_2| + \dots + |v_n| = \sum_{i=1}^{n} |v_i|$  

**Example:**  

The L1 norm of the vector $\mathbf{v} = \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}$ is:

$||\mathbf{v}||_1 = |2| + |-1| + |3| = 2 + 1 + 3 = 6$

<br>

In [2]:
v = np.array([2, -1, 3])

# Method 1: Using np.linalg.norm()
l1_norm1 = np.linalg.norm(v, ord=1)

# Method 2: Manual calculation
l1_norm2 = np.sum(np.abs(v))

print("L1 Norm (np.linalg.norm()):", l1_norm1)
print("L1 Norm (manual):", l1_norm2)

L1 Norm (np.linalg.norm()): 6.0
L1 Norm (manual): 6
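
The two results print as `6.0` and `6` because they have different types, not different values. As a small check (variable names here are illustrative), `np.linalg.norm` computes in floating point, while the manual sum keeps the integer dtype of `v`:

```python
import numpy as np

v = np.array([2, -1, 3])

l1_from_norm = np.linalg.norm(v, ord=1)  # computed in floating point
l1_manual = np.sum(np.abs(v))            # stays in the integer dtype of v

print(isinstance(l1_from_norm, np.floating))  # True
print(isinstance(l1_manual, np.integer))      # True
print(l1_from_norm == l1_manual)              # True: same value, different types
```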


### 2.2. L2 Norm (Euclidean Norm)

The **L2 norm** is the square root of the sum of the squared elements of a vector. This is the Euclidean distance from the origin to the endpoint of the vector. It answers the question, _"If I could fly directly from the origin to the vector's endpoint in a straight line, how far would I travel?"_

The L2 norm of a vector $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$ is denoted as $||\mathbf{v}||_2$ or simply $||\mathbf{v}||$ and is calculated as:

$||\mathbf{v}||_2 = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2} = \sqrt{\sum_{i=1}^{n} v_i^2}$

**Example:**  

The L2 norm of the vector $\mathbf{v} = \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}$ is:

$||\mathbf{v}||_2 = \sqrt{2^2 + (-1)^2 + 3^2} = \sqrt{4 + 1 + 9} = \sqrt{14}$

<br>

In [3]:
v = np.array([2, -1, 3])

# Method 1: Using np.linalg.norm()
l2_norm1 = np.linalg.norm(v) # ord=2 is the default, no need to specify

# Method 2: Manual calculation (using dot product)
l2_norm2 = np.sqrt(np.dot(v, v))
# or
l2_norm3 = (np.sum(v**2))**0.5

print("L2 Norm (np.linalg.norm()):", l2_norm1)
print("L2 Norm (manual):", l2_norm2)
print("L2 Norm (manual 2):", l2_norm3)

L2 Norm (np.linalg.norm()): 3.7416573867739413
L2 Norm (manual): 3.7416573867739413
L2 Norm (manual 2): 3.7416573867739413
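
As a brief worked step, the dot-product method gives the same result because the squared L2 norm is the dot product of the vector with itself:

$||\mathbf{v}||_2^2 = \sum_{i=1}^{n} v_i^2 = \mathbf{v} \cdot \mathbf{v} \quad \Rightarrow \quad ||\mathbf{v}||_2 = \sqrt{\mathbf{v} \cdot \mathbf{v}}$

For the example vector, $\mathbf{v} \cdot \mathbf{v} = (2 \cdot 2) + (-1 \cdot -1) + (3 \cdot 3) = 14$, so $||\mathbf{v}||_2 = \sqrt{14} \approx 3.7417$.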


### 2.3. Comparison of L1 and L2 Norms

* **L1 Norm:**
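    * Requires less computation (no square root operation).
    * Tends to produce _sparse_ solutions (vectors in which some elements are exactly zero) when used as a regularization penalty. This is particularly useful in _feature selection_ problems.
    * More _robust_ to outliers, because each element contributes in proportion to its absolute value rather than its square.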

* **L2 Norm:**
    * More commonly used.
    * Smoother to optimize: the squared L2 norm is differentiable everywhere, whereas the L1 norm is not differentiable wherever an element is zero. This matters for gradient-based optimization algorithms.
    * Generally produces more stable solutions.
    * More sensitive to outliers than L1.
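
The outlier point can be illustrated with a small sketch. The `errors` vector below is hypothetical data made up for this illustration, with the last entry playing the role of an outlier. The code compares how much of each norm comes from that single entry.

```python
import numpy as np

# Hypothetical residuals; the last entry plays the role of an outlier
errors = np.array([1.0, 1.0, 1.0, 10.0])

l1 = np.linalg.norm(errors, ord=1)  # 1 + 1 + 1 + 10 = 13
l2 = np.linalg.norm(errors)         # sqrt(1 + 1 + 1 + 100) = sqrt(103)

# Share of each total that comes from the outlier alone
l1_share = np.abs(errors[-1]) / l1    # 10 / 13
l2_share = errors[-1] ** 2 / l2 ** 2  # 100 / 103

print(f"L1 norm: {l1:.3f}, outlier share: {l1_share:.1%}")
print(f"L2 norm: {l2:.3f}, outlier share of squared norm: {l2_share:.1%}")
```

In this example the outlier accounts for roughly 77% of the L1 norm but about 97% of the squared L2 norm, since squaring amplifies large elements.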