<a href="https://colab.research.google.com/github/yuzonightly/AI-IDS/blob/main/machine_learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Artificial Intelligence

Tools:

- Google Colab.

Sources:

- Linear Algebra.
  - https://cs229.stanford.edu/summer2020/cs229-linalg.pdf
  - Range $\mathcal{R}$: https://en.wikipedia.org/wiki/Row_and_column_spaces


## Linear Algebra

### Symmetric Matrices

A matrix $A \in \mathbb{R}^{n \times n}$ is ***symmetric*** if $A = A^{T}$. $A$ is ***anti-symmetric*** if $A = -A^{T}$. For any matrix $A \in \mathbb{R}^{n \times n}$, $A + A^{T}$ is ***symmetric*** and $A - A^{T}$ is ***anti-symmetric***.
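
The two claims above can be checked numerically. The following is a minimal NumPy sketch; the matrix `A` and the variable names are hypothetical examples, not part of the original notes. It also shows a direct consequence: any square matrix is the sum of a symmetric and an anti-symmetric matrix, $A = \frac{1}{2}(A + A^{T}) + \frac{1}{2}(A - A^{T})$.

```python
import numpy as np

# Hypothetical example matrix; any square matrix works here.
A = np.array([[1.0, 2.0, 0.0],
              [4.0, 3.0, 5.0],
              [6.0, 1.0, 2.0]])

S = A + A.T  # should be symmetric: S == S^T
K = A - A.T  # should be anti-symmetric: K == -K^T

is_symmetric = np.allclose(S, S.T)
is_antisymmetric = np.allclose(K, -K.T)

# Halving both parts and adding them recovers the original matrix.
reconstructed = 0.5 * S + 0.5 * K

print(is_symmetric, is_antisymmetric, np.allclose(reconstructed, A))
```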

### The Trace

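The ***trace*** of a square matrix $A \in \mathbb{R}^{n \times n}$, denoted $\mathrm{tr}A$, is the sum of the diagonal elements in the matrix:

$$
\mathrm{tr}A = \sum^{n}_{i=1}A_{ii}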
$$
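
As a quick illustration, the sum can be written out directly and compared with NumPy's built-in `np.trace`. The matrix below is a hypothetical example.

```python
import numpy as np

# Hypothetical 2x2 example; the diagonal entries are 1 and 4.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Direct translation of the definition: sum of A[i, i] over i.
trace_manual = sum(A[i, i] for i in range(A.shape[0]))

# NumPy's built-in equivalent.
trace_numpy = np.trace(A)

print(trace_manual, trace_numpy)
```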

### Norms

The ***norm*** of a vector $||x||$ is a measure of the "length" of the vector. $\ell_{2}$ norm:

$$
||x||_{2} = \sqrt{\sum^{n}_{i=1}x_{i}^{2}}
$$

$\ell_{1}$ norm:

$$
||x||_{1} = \sum^{n}_{i=1}|x_{i}|
$$

$\ell_{\infty}$ norm:

$$
||x||_{\infty} = \max_{i}|x_{i}|
$$

$\ell_{p}$ norms are parameterized by $p \geq 1$ and defined as:

$$||x||_{p} = \left( \sum^{n}_{i=1}|x_{i}|^{p} \right)^{1/p}$$

Norms can also be defined for matrices, such as the ***Frobenius norm***:

$$
||A||_{F} = \sqrt{\mathrm{tr}(A^{T}A)}
$$
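
The norm definitions above translate almost line by line into NumPy. This is a minimal sketch: the vector `x`, the matrix `A`, and the helper `lp_norm` are hypothetical names introduced for illustration. In practice `np.linalg.norm` computes all of these directly.

```python
import numpy as np

x = np.array([3.0, -4.0])  # hypothetical example vector

l2 = np.sqrt(np.sum(x ** 2))   # l_2 norm
l1 = np.sum(np.abs(x))         # l_1 norm
linf = np.max(np.abs(x))       # l_infinity norm


def lp_norm(v, p):
    """l_p norm for p >= 1, written directly from the definition."""
    return np.sum(np.abs(v) ** p) ** (1.0 / p)


# Frobenius norm of a hypothetical example matrix, via the trace formula.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
frobenius = np.sqrt(np.trace(A.T @ A))

print(l2, l1, linf, lp_norm(x, 3), frobenius)
```

Each value can be compared against the library routine, for example `np.linalg.norm(x, 1)`, `np.linalg.norm(x, np.inf)`, or `np.linalg.norm(A, 'fro')`.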

### Linear Independence and Rank

A set of vectors $\{x_{1}, x_{2}, x_{3}, ... x_{n}\} \subset \mathbb{R}^{m}$ is ***(linearly) independent*** if no vector can be represented as a linear combination of the remaining vectors.

Conversely, if one vector in the set can be represented as a linear combination of the remaining vectors, then the vectors are said to be ***(linearly) dependent***. For example, if

$$
x_{n} = \sum_{i=1}^{n-1}\alpha_{i}x_{i}
$$

for some scalars $\alpha_{1},...,\alpha_{n-1} \in \mathbb{R}$, then the vectors $x_{1},...,x_{n}$ are linearly dependent.

The ***column rank*** of a matrix $A \in \mathbb{R}^{m \times n}$ is the size of the largest subset of columns of $A$ that forms a ***linearly independent*** set. The ***row rank*** is the size of the largest subset of rows of $A$ that forms a ***linearly independent*** set.

For any matrix $A$, the ***column rank*** and the ***row rank*** are equal, so both are referred to simply as the ***rank*** of $A$, denoted $rank(A)$. If $rank(A) = \min(m,n)$, then $A$ is said to be ***full rank***.
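
A small numerical check of these statements, using `np.linalg.matrix_rank`. The matrices `A` and `B` are hypothetical examples: the second row of `A` is twice the first, so its rows are linearly dependent.

```python
import numpy as np

# Hypothetical 3x3 matrix whose second row is 2 * first row.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

rank_A = np.linalg.matrix_rank(A)
rank_A_T = np.linalg.matrix_rank(A.T)  # row rank equals column rank

# A is full rank only if rank(A) == min(m, n).
A_is_full_rank = rank_A == min(A.shape)

# Hypothetical 3x2 matrix with two linearly independent columns.
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
B_is_full_rank = np.linalg.matrix_rank(B) == min(B.shape)

print(rank_A, rank_A_T, A_is_full_rank, B_is_full_rank)
```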

### The Inverse of a Square Matrix

The ***inverse*** of $A \in \mathbb{R}^{n \times n}$ is denoted $A^{-1}$:

$$
A^{-1}A = I = AA^{-1}
$$

If the inverse of $A$ exists, $A$ is called ***invertible*** or ***non-singular***; otherwise it is called ***non-invertible*** or ***singular***.

Consider the equation $Ax = b$, where $A \in \mathbb{R}^{n \times n}$, and $x,b \in \mathbb{R}^{n}$. If $A$ is invertible, then $x = A^{-1}b$.
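
The sketch below solves a small system both ways: explicitly through the inverse, as in the formula above, and with `np.linalg.solve`, which is usually preferred in practice because it avoids forming $A^{-1}$. The values of `A` and `b` are hypothetical.

```python
import numpy as np

# Hypothetical invertible matrix and right-hand side.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

A_inv = np.linalg.inv(A)

# Both products with the inverse give the identity.
I2 = np.eye(2)
inverse_ok = np.allclose(A_inv @ A, I2) and np.allclose(A @ A_inv, I2)

# x = A^{-1} b, computed explicitly and with a linear solver.
x_via_inverse = A_inv @ b
x_via_solve = np.linalg.solve(A, b)

print(inverse_ok, x_via_inverse, x_via_solve)
```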

### Orthogonal Matrices

Two vectors $x, y \in \mathbb{R}^{n}$ are ***orthogonal*** if $x^{T}y = 0$. A vector $x$ is ***normalized*** if $||x||_{2} = 1$. A square matrix $U \in \mathbb{R}^{n \times n}$ is ***orthogonal*** if all its columns are orthogonal to each other and are normalized (the columns are then called ***orthonormal***). It follows that:

$$
U^{T}U = I = UU^{T}
$$

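In other words, the inverse of an orthogonal matrix is its transpose. If $U \in \mathbb{R}^{m \times n}$ is not square ($n < m$) but its columns are still orthonormal, then $U^{T}U = I$, but $UU^{T} \ne I$. Multiplying a vector by an orthogonal matrix also leaves its Euclidean norm unchanged: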

$$
||Ux||_{2} = ||x||_{2},
$$

for any $x \in \mathbb{R}^{n}$ and any orthogonal $U \in \mathbb{R}^{n \times n}$.
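
These properties can be verified on a 2D rotation matrix, which is a standard example of an orthogonal matrix. This is a minimal sketch: the angle `theta`, the vector `x`, and the tall matrix `V` (a single orthonormal column taken from `U`) are hypothetical choices made for illustration.

```python
import numpy as np

theta = np.pi / 6  # hypothetical rotation angle
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
I2 = np.eye(2)

# U^T U = I = U U^T, so the inverse is the transpose.
is_orthogonal = np.allclose(U.T @ U, I2) and np.allclose(U @ U.T, I2)
inverse_is_transpose = np.allclose(np.linalg.inv(U), U.T)

# Multiplying by U preserves the Euclidean norm.
x = np.array([3.0, 4.0])
norm_preserved = np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))

# Non-square case: one orthonormal column, shape (2, 1).
V = U[:, :1]
left_identity = np.allclose(V.T @ V, np.eye(1))   # V^T V = I holds
right_identity = np.allclose(V @ V.T, np.eye(2))  # V V^T = I fails

print(is_orthogonal, inverse_is_transpose, norm_preserved,
      left_identity, right_identity)
```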

### Range and Nullspace of a Matrix

The ***span*** of a set of vectors $\{x_{1},...,x_{n}\}$  is the set of all vectors that can be expressed as a ***linear combination*** of $\{x_{1},...,x_{n}\}$:

$$
span(\{x_{1},...,x_{n}\}) = \left\{ v : v = \sum^{n}_{i=1}α_{i}x_{i}, α_{i} \in \mathbb{R} \right\}.
$$

If $\{x_{1},...,x_{n}\}$ is a set of $n$ ***linearly independent*** vectors, where each $x_{i} \in \mathbb{R}^{n}$, then $span(\{x_{1},...,x_{n}\}) = \mathbb{R}^{n}$. In other words, any vector in $\mathbb{R}^{n}$ can be expressed as a linear combination of $x_{1}$ through $x_{n}$.

The ***projection*** of a vector $y \in \mathbb{R}^{m}$ onto the span of $\{x_{1},...,x_{n}\}$ (with each $x_{i} \in \mathbb{R}^{m}$) is the vector $v \in span(\{x_{1},...,x_{n}\})$ that is as close as possible to $y$, as measured by the Euclidean norm:

$$
Proj(y; \{x_{1},...,x_{n}\}) = \arg\min_{v \in span(\{x_{1},...,x_{n}\})}||y - v||_{2}.
$$

The ***range***, or columnspace, of a matrix $A \in \mathbb{R}^{m \times n}$, denoted $\mathcal{R}(A)$, is the span of the columns of $A$:

$$
\mathcal{R}(A) = \{v \in \mathbb{R}^{m} : v = Ax,x \in \mathbb{R}^{n}\}.
$$

The ***projection*** of a vector $y \in \mathbb{R}^{m}$ onto the ***range*** of $A$, assuming $A$ is full rank and $n < m$ (so that $A^{T}A$ is invertible), is given by:

$$
Proj(y;A) = \arg\min_{v \in \mathcal{R}(A)}||v-y||_{2} = A(A^{T}A)^{-1}A^{T}y.
$$

If the matrix $A$ has only one column, $a \in \mathbb{R}^{m}$ (special case for a projection of a vector onto a line):

$$
Proj(y;a) = \frac{aa^{T}}{a^{T}a}y.
$$
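
Both projection formulas can be tried out numerically. In this minimal sketch, `A` and `y` are hypothetical values, with `A` chosen to be tall and full rank so that $A^{T}A$ is invertible. The result is cross-checked against `np.linalg.lstsq`, which solves the same minimization problem, and against the fact that the residual $y - Proj(y;A)$ is orthogonal to every column of $A$.

```python
import numpy as np

# Hypothetical tall, full-rank matrix (m = 3, n = 2) and target vector.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([6.0, 0.0, 0.0])

# Proj(y; A) = A (A^T A)^{-1} A^T y
proj = A @ np.linalg.inv(A.T @ A) @ A.T @ y

# Cross-check: least squares finds the coefficients of the closest v = A c.
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
proj_lstsq = A @ coeffs

# The residual is orthogonal to the columns of A.
residual = y - proj

# Single-column case: projection onto the line spanned by a.
a = A[:, 0]
proj_line = (np.outer(a, a) / (a @ a)) @ y

print(proj, proj_line)
```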

The ***nullspace*** of $A \in \mathbb{R}^{m \times n}$, denoted $\mathcal{N}(A)$, is the set of all vectors that give $0$ when multiplied by $A$:

$$
\mathcal{N}(A) = \{x \in \mathbb{R}^{n} : Ax = 0\}.
$$

$\mathcal{R}(A^{T})$ and $\mathcal{N}(A)$ are ***orthogonal complements***, denoted $\mathcal{R}(A^{T}) = \mathcal{N}(A)^{\bot}$. They share only the zero vector, every vector in one is orthogonal to every vector in the other, and together they span the entire space $\mathbb{R}^{n}$.
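
One way to see this numerically is through the singular value decomposition, a technique not covered in these notes and used here only as a convenient tool: the rows of `Vh` returned by `np.linalg.svd` that correspond to nonzero singular values span $\mathcal{R}(A^{T})$, and the remaining rows span $\mathcal{N}(A)$. The matrix `A` and the tolerance `tol` below are hypothetical choices.

```python
import numpy as np

# Hypothetical 2x3 matrix of rank 1 (second row is 2 * first row).
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

# Full SVD: Vh has shape (3, 3) and its rows are orthonormal.
U, s, Vh = np.linalg.svd(A)

tol = 1e-10  # assumed threshold for treating a singular value as zero
rank = int(np.sum(s > tol))

row_space_basis = Vh[:rank].T   # columns span R(A^T)
null_space_basis = Vh[rank:].T  # columns span N(A)

# Every nullspace vector is mapped to 0 by A.
maps_to_zero = np.allclose(A @ null_space_basis, 0.0)

# The two subspaces are orthogonal, and their dimensions add up to n.
orthogonal = np.allclose(row_space_basis.T @ null_space_basis, 0.0)
dims_add_up = row_space_basis.shape[1] + null_space_basis.shape[1] == A.shape[1]

print(rank, maps_to_zero, orthogonal, dims_add_up)
```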

### The Determinant

Consider a square matrix $A \in \mathbb{R}^{n \times n}$, written in terms of its rows:

$$
\begin{bmatrix}
- & a_{1}^{T} & -\\
- & a_{2}^{T} & -\\
- & \vdots & -\\
- & a_{n}^{T} & -\\
\end{bmatrix}.
$$

Consider $S \subset \mathbb{R}^{n}$ as a set of all possible linear combinations of the row vectors $a_{1}, a_{2}, a_{3},...,a_{n} \in \mathbb{R}^{n}$