## Linear dependence and independence 

- A set of vectors is **linearly dependent** if at least one vector can be obtained as a linear combination of other vectors in the set. As you can see in the left pane, we can combine vectors $x$ and $y$ to obtain $z$. 

- The important points to remember are: `linearly dependent vectors contain redundant information, whereas linearly independent vectors do not.`
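One way to test this numerically is sketched below. It assumes the vectors are stacked as columns of a matrix, in which case the set is linearly dependent exactly when the matrix rank is smaller than the number of vectors. The values of `x`, `y`, and `z` and the helper `is_linearly_dependent` are illustrative, not taken from the figure.

```python
import numpy as np

# Illustrative vectors (assumed values): z is built as x + y
x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
z = x + y

def is_linearly_dependent(*vectors):
    """Return True if the given vectors are linearly dependent."""
    matrix = np.column_stack(vectors)
    # The rank counts the independent columns; a rank below the number
    # of vectors means at least one vector is redundant.
    return np.linalg.matrix_rank(matrix) < len(vectors)

print(is_linearly_dependent(x, y, z))  # True
print(is_linearly_dependent(x, y))     # False
```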

## 

<center>
<img src="./images/b-linear-independence.svg">
</center>

## Linear Dependence {background-iframe="https://www.geogebra.org/m/qbp6ymgw" background-interactive=true}

## Vector null space

Intuitively, the null space of a set of vectors is the set of **all linear combinations that "map" into the zero vector**.   
Consider a set of geometric vectors $\bf{w}$, $\bf{x}$, $\bf{y}$, and $\bf{z}$ as in **Fig. 8**.   
By inspection, we can see that vectors $\bf{x}$ and $\bf{z}$ are not parallel to each other, hence linearly independent.   
In contrast, vectors $\bf{w}$ and $\bf{y}$ can be obtained as linear combinations of $\bf{x}$ and $\bf{z}$, and are therefore dependent on them. 
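As a rough numerical illustration, the coefficients of those zero-producing combinations can be read off the singular value decomposition. This is a minimal sketch with assumed vectors (not the ones in the figure) and an assumed tolerance of `1e-10` for deciding which singular values count as zero.

```python
import numpy as np

# Illustrative 2-D vectors (assumed values), stacked as columns
x = np.array([1.0, 0.0])
z = np.array([0.0, 1.0])
w = x + z                        # w depends on x and z
A = np.column_stack([x, z, w])

# Rows of vt beyond the numerical rank span the null space of A
_, s, vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
null_basis = vt[rank:].T         # shape (3, 1): one basis vector

# The entries of a null-space vector are coefficients whose
# linear combination of the columns gives the zero vector
print(np.allclose(A @ null_basis, 0))  # True
```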

## {.smaller} 

<center>
<img src="./images/b-vector-null-space.svg">
</center>

## Vector norms

- The **norm** or the **length** of a vector is the distance between its "origin" and its "end".  

- Norms "map" vectors to non-negative values. In this sense, they are functions that assign a length $\lVert \bf{x} \rVert \in \mathbb{R}$ to a vector $\bf{x}$. To be valid, a norm has to satisfy the following properties (which can feel a bit abstract at first):

1. **`Absolutely homogeneous`**: $\forall \alpha \in \mathbb{R},  \lVert \alpha \bf{x} \rVert = \vert \alpha \vert \lVert \bf{x} \rVert$. In words: for all real-valued scalars, the norm scales proportionally with the absolute value of the scalar.
2. **`Triangle inequality`**:  $\lVert \bf{x} + \bf{y} \rVert \le \lVert \bf{x} \rVert + \lVert \bf{y} \rVert$. In words: in geometric terms, for any triangle the sum of the lengths of any two sides must be greater than or equal to the length of the third side. This is easy to see experimentally: grab a piece of rope, form triangles of different sizes, measure all the sides, and test this property.
3. **`Positive definite`**: $\lVert \bf{x} \rVert \ge 0$ and $\lVert \bf{x} \rVert = 0 \Longleftrightarrow \bf{x}= 0$. In words: the length of any $\bf{x}$ has to be non-negative (i.e., a vector can't have negative length), and a length of $0$ occurs only if $\bf{x}=0$.
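These properties can also be spot-checked numerically. The sketch below does so for the $L_2$ norm on two random vectors. The seed, the scalar `alpha`, and the small tolerance are arbitrary choices, and a check on a few samples is of course not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
x = rng.normal(size=3)
y = rng.normal(size=3)
alpha = -2.5                     # arbitrary scalar

norm = np.linalg.norm            # L2 norm by default for 1-D arrays

# 1. Absolute homogeneity: ||alpha x|| == |alpha| ||x||
homogeneous = np.isclose(norm(alpha * x), abs(alpha) * norm(x))

# 2. Triangle inequality: ||x + y|| <= ||x|| + ||y|| (tolerance for rounding)
triangle = norm(x + y) <= norm(x) + norm(y) + 1e-12

# 3. Positive definiteness: ||x|| >= 0, and the zero vector has norm 0
positive = norm(x) >= 0 and norm(np.zeros(3)) == 0

print(homogeneous, triangle, positive)
```

All three checks should print as true for any choice of `x`, `y`, and `alpha`.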

## 

<center> Fig. 9: Vector norms </center>

<center>
<img src="./images/b-l2-norm.svg">
</center>

## Euclidean norm

The Euclidean norm is one of the most popular norms in machine learning. It is so widely used that it is sometimes referred to simply as "the norm" of a vector. It is defined as:

$$
\lVert \bf{x} \rVert_2 := \sqrt{\sum_{i=1}^n x_i^2} = \sqrt{x^Tx} 
$$

Hence, in **two dimensions** the $L_2$ norm is:

$$
\lVert \bf{x} \rVert_2 = \sqrt{x_1^2 + x_2^2}, \quad \bf{x} \in \mathbb{R}^2
$$

This is equivalent to the formula for the hypotenuse of a right triangle with sides $x_1$ and $x_2$. 

The same pattern follows for higher dimensions of $\mathbb{R}^n$.
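As a small sanity check, the sketch below evaluates the definition in three equivalent ways for an illustrative vector (the values are an assumption chosen to give a round result).

```python
import numpy as np

x = np.array([3.0, 4.0])   # illustrative vector

from_sum = np.sqrt(np.sum(x ** 2))   # square root of the sum of squared elements
from_dot = np.sqrt(x @ x)            # square root of x^T x
hypotenuse = np.hypot(x[0], x[1])    # hypotenuse of a right triangle with legs x_1, x_2

print(from_sum, from_dot, hypotenuse)  # 5.0 5.0 5.0
```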



## Slide Title {background-iframe="https://www.geogebra.org/m/Scz8gTWv" background-interactive=true}


## Numpy norm 

```python
import numpy as np

x = np.array([[3], [4]])

# L2 (Euclidean) norm of the column vector
np.linalg.norm(x, 2)
```

```
5.0
```

## Manhattan norm

The Manhattan or $L_1$ norm gets its name by analogy with measuring distances while moving in Manhattan, NYC. Since Manhattan is laid out as a grid, the distance between any two points is measured by moving along vertical and horizontal lines (instead of diagonals as in the Euclidean norm). It is defined as:

$$
\lVert \bf{x} \rVert_1 := \sum_{i=1}^n \vert x_i \vert 
$$

Where $\vert x_i \vert$ is the absolute value. The $L_1$ norm is preferred when discriminating between elements that are exactly zero and elements that are small but not zero.   


## Numpy Manhattan norm 

::: {.nonincremental}


- In `NumPy` we compute the $L_1$ norm as:

:::

```python
x = np.array([[3], [-4]])

# L1 (Manhattan) norm: sum of absolute values
np.linalg.norm(x, 1)
```

```
7.0
```

## Slide Title {background-iframe="https://www.geogebra.org/m/vsj8akcb" background-interactive=true}


## Max norm

The max norm or infinity norm is simply the largest absolute value among the elements of the vector. It is defined as:

$$
\lVert \bf{x} \rVert_\infty := \max_i \vert x_i \vert 
$$

Where $\vert x_i \vert$ is the absolute value. For instance, for the vector $\bf{x} = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}$, $\lVert \bf{x} \rVert_\infty = 3$.
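Note that the maximum is taken over absolute values, so a negative element can determine the norm. A minimal sketch with an assumed example vector:

```python
import numpy as np

# Illustrative vector: the element of largest magnitude is negative
x = np.array([1.0, -5.0, 3.0])

max_norm = np.max(np.abs(x))       # largest absolute value among the elements
print(max_norm)                    # 5.0
print(np.linalg.norm(x, np.inf))   # 5.0
```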



## Numpy Max norm

::: {.nonincremental}

 - In `NumPy` we compute the $L_\infty$ norm as:
 
:::

```python
x = np.array([[3], [-4]])

# L-infinity (max) norm: largest absolute value
np.linalg.norm(x, np.inf)
```

```
4.0
```

# Vector inner product, length, and distance

## Inner Product
The inner product is usually written with a pair of angle brackets, $\langle \cdot,\cdot \rangle$.   
For instance, the scalar inner product is defined as:

$$
\langle x,y \rangle := x\cdot y
$$

In $\mathbb{R}^n$ the inner product is a dot product defined as:

$$
\Bigg \langle
\begin{bmatrix} 
x_1 \\ 
\vdots\\
x_n
\end{bmatrix},
\begin{bmatrix} 
y_1 \\ 
\vdots\\
y_n
\end{bmatrix}
\Bigg \rangle :=
x \cdot y = \sum_{i=1}^n x_iy_i
$$

## Length

**Length** is a concept from geometry.     
We say that geometric vectors have length and that vectors in $\mathbb{R}^n$ have norm. 
For instance, we can compute the length of a directed segment (i.e., geometrical vector) $\bf{x}$ by taking the square root of the inner product with itself as:

$$
\lVert x \rVert = \sqrt{\langle x,x \rangle} = \sqrt{x\cdot x} = \sqrt{\sum_{i=1}^n x_i^2}
$$

## Distance
**Distance** is a relational concept. It refers to the length (or norm) of the difference between two vectors. Hence, we use norms and lengths to measure the distance between vectors. Given the vectors $\bf{x}$ and $\bf{y}$, we define the distance $d(x,y)$ as:

$$
d(x,y) := \lVert x - y \rVert = \sqrt{\langle x - y, x - y \rangle}
$$

When the inner product $\langle x - y, x - y \rangle$ is the dot product, the distance equals the Euclidean distance.
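To make the link between inner product and distance concrete, here is a minimal sketch that computes the distance both ways, assuming the dot product as the inner product and two illustrative vectors.

```python
import numpy as np

# Illustrative vectors (assumed values)
x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])

diff = x - y
from_inner = np.sqrt(diff @ diff)    # sqrt(<x - y, x - y>) with the dot product
from_norm = np.linalg.norm(diff)     # L2 norm of the difference

print(from_inner, from_norm)  # 5.0 5.0
```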

## Numpy dot product 

```python
x, y = np.array([[-2], [2]]), np.array([[4], [-3]])

# Dot product of two column vectors: x^T y
x.T @ y
```

```
array([[-14]])
```

## Numpy Distance 

As with the inner product, usually, we can safely assume that **distance** stands for the Euclidean distance or $L_2$ norm unless otherwise noted. To compute the $L_2$ distance between a pair of vectors:

```python
# L2 norm of the difference between x and y
distance = np.linalg.norm(x - y, 2)
print(f'L_2 distance : {distance}')
```

```
L_2 distance : 7.810249675906656
```


## Vector angles and orthogonality {.smaller}

In machine learning, the **angle** between a pair of vectors is used as a **measure of vector similarity**.   

To understand angles let's first look at the **Cauchy–Schwarz inequality**. Consider a pair of non-zero vectors $\bf{x}$ and $\bf{y}$ $\in \mathbb{R}^n$. The Cauchy–Schwarz inequality states that:  

$$
\vert \langle x, y \rangle \vert \leq \Vert x \Vert \Vert y \Vert
$$

In words: *the absolute value of the inner product of a pair of vectors is less than or equal to the product of their lengths*. The only case where both sides of the expression are *equal* is when the vectors are collinear, for instance, when $\bf{x}$ is a scaled version of $\bf{y}$. In the 2-dimensional case, such vectors would lie along the same line. 
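Both cases can be checked numerically. The sketch below uses the dot product as the inner product. The vectors and the helper name `cauchy_schwarz_sides` are assumptions made for illustration.

```python
import numpy as np

def cauchy_schwarz_sides(x, y):
    """Return (|<x, y>|, ||x|| ||y||) using the dot product."""
    return abs(x @ y), np.linalg.norm(x) * np.linalg.norm(y)

x = np.array([1.0, 2.0])
y = np.array([5.0, 7.0])

# General case: the left side does not exceed the right side
lhs, rhs = cauchy_schwarz_sides(x, y)
print(lhs <= rhs)                 # True

# Collinear case: 3x is a scaled copy of x, so both sides coincide
lhs_c, rhs_c = cauchy_schwarz_sides(x, 3 * x)
print(np.isclose(lhs_c, rhs_c))   # True
```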

The definition of the angle between vectors can be thought of as a generalization of the **law of cosines** in trigonometry, which states that for a triangle with sides $a$, $b$, and $c$, where $\theta$ is the angle between sides $a$ and $b$, the sides are related as:

$$
c^2 = a^2 + b^2 - 2ab \cos \theta
$$

## 

<center> Fig. 10: Law of cosines and Angle between vectors </center>

<center>
<img src="./images/b-vector-angle.svg">
</center>

## Cosine angle {.smaller}

We can rewrite this expression in terms of vector lengths as: 

$$
\Vert x - y \Vert^2 = \Vert x \Vert^2 + \Vert y \Vert^2 - 2(\Vert x \Vert \Vert y \Vert) \cos \theta
$$

With a bit of algebraic manipulation, we can solve the previous equation for $\cos \theta$:

$$
\cos \theta = \frac{\langle x, y \rangle}{\Vert x \Vert \Vert y \Vert} 
$$
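For reference, the manipulation can be sketched step by step, assuming the inner product is the dot product. Expanding the squared length of the difference and equating it with the law of cosines gives:

$$
\begin{aligned}
\Vert x - y \Vert^2 &= \langle x - y, x - y \rangle = \Vert x \Vert^2 + \Vert y \Vert^2 - 2 \langle x, y \rangle \\
\Vert x \Vert^2 + \Vert y \Vert^2 - 2 \langle x, y \rangle &= \Vert x \Vert^2 + \Vert y \Vert^2 - 2 \Vert x \Vert \Vert y \Vert \cos \theta \\
\langle x, y \rangle &= \Vert x \Vert \Vert y \Vert \cos \theta
\end{aligned}
$$

Dividing both sides by $\Vert x \Vert \Vert y \Vert$, which is non-zero for non-zero vectors, isolates $\cos \theta$.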

And there we have a **definition for (cos) angle $\theta$**. Further, from the Cauchy–Schwarz inequality we know that $\cos \theta$ must be:

$$
-1 \leq \frac{\langle x, y \rangle}{\Vert x \Vert \Vert y \Vert} \leq 1  
$$

This range ($[-1, 1]$) follows necessarily, since the absolute value of the numerator is always smaller than or equal to the denominator.

## Numpy cosine 

In `NumPy`, we can compute the $\cos \theta$ between a pair of vectors as: 

```python
import numpy as np

x, y = np.array([[1], [2]]), np.array([[5], [7]])

# Translate the cos(theta) definition: inner product over the product of norms
cos_theta = (x.T @ y) / (np.linalg.norm(x, 2) * np.linalg.norm(y, 2))
print(f'cos of the angle = {np.round(cos_theta, 3)}')
```

```
cos of the angle = [[0.988]]
```


We get that $\cos \theta \approx 0.988$. Finally, to recover the value of $\theta$ we take the inverse of the cosine function (arccosine):

```python
# Inverse cosine returns the angle in radians
cos_inverse = np.arccos(cos_theta)
print(f'angle in radians = {np.round(cos_inverse, 3)}')
```

```
angle in radians = [[0.157]]
```


We obtain $\theta \approx 0.157$ radians. To go from radians to degrees we can use the following formula:

```python
# Convert radians to degrees: multiply by 180 / pi
degrees = cos_inverse * (180 / np.pi)
print(f'angle in degrees = {np.round(degrees, 3)}')
```

```
angle in degrees = [[8.973]]
```


We obtain $\theta \approx 8.973^{\circ}$.

## Orthogonality

- `Orthogonality` can be seen as a generalization of perpendicularity to vectors in any number of dimensions.

- We say that a pair of vectors $\bf{x}$ and $\bf{y}$ are **orthogonal** if their inner product is zero, $\langle x,y \rangle = 0$. The notation for a pair of orthogonal vectors is $\bf{x} \perp \bf{y}$. In the 2-dimensional plane, this corresponds to a pair of vectors forming a $90^{\circ}$ angle.
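Because the definition relies only on the inner product, it applies where angles cannot be drawn. A minimal sketch with two assumed 4-dimensional vectors:

```python
import numpy as np

# Illustrative 4-dimensional vectors (assumed values)
x = np.array([1.0, 2.0, -1.0, 0.0])
y = np.array([2.0, -1.0, 0.0, 5.0])

inner = x @ y                    # 1*2 + 2*(-1) + (-1)*0 + 0*5
print(np.isclose(inner, 0.0))    # True: x and y are orthogonal
```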


## 

<center> Fig. 11: Orthogonal vectors </center>

<center>
<img src="./images/b-orthogonal-vectors.svg">
</center>

## Numpy Operations

```python
x = np.array([[2], [0]])
y = np.array([[0], [2]])

# cos(theta) is zero when the inner product is zero
cos_theta = (x.T @ y) / (np.linalg.norm(x, 2) * np.linalg.norm(y, 2))
print(f'cos of the angle = {np.round(cos_theta, 3)}')
```

```
cos of the angle = [[0.]]
```


We see that these vectors are **orthogonal**, as $\cos \theta=0$. This corresponds to $\theta \approx 1.57$ radians, or $\theta = 90^{\circ}$:

```python
# Recover the angle in radians, then convert to degrees
cos_inverse = np.arccos(cos_theta)
degrees = cos_inverse * (180 / np.pi)
print(f'angle in radians = {np.round(cos_inverse, 3)}\nangle in degrees = {np.round(degrees, 3)}')
```

```
angle in radians = [[1.571]]
angle in degrees = [[90.]]
```
