# SVD Basics - Optimizations
Optimizations that make SVD "robust" include:
* Reducing sensitivity to outliers (large, sparse errors)
* Handling missing data without biasing the low-rank approximation
* Using alternative norms or iterative methods to approximate SVD

## Background

### Norms

**Norms**
* Functions that assign a non-negative real number to each element of a vector space and satisfy the following properties:
  * Non-negativity: always $\geq 0$, and $=0$ only for the zero vector.
  * Scalar multiplication: if an element is multiplied by a scalar $a$, its norm is multiplied by $|a|$.
  * Triangle inequality: the norm of a sum is $\leq$ the sum of the norms of the individual elements.
* Norms give abstract vector spaces a measurable notion of size. They generalize the idea of "distance" beyond the physical world.
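
The three properties can be spot-checked numerically. The sketch below is only an illustration, assuming NumPy and using the Euclidean norm on random vectors; the names `x`, `y`, `a`, and the small tolerance are choices made for this example, and a numerical check on one sample is not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)
y = rng.normal(size=5)
a = -3.0  # arbitrary scalar chosen for the example

norm = np.linalg.norm  # Euclidean norm by default for 1-D arrays

# Non-negativity: the norm is >= 0, and the zero vector has norm 0.
non_negative = bool(norm(x) >= 0 and norm(np.zeros(5)) == 0)

# Scalar multiplication: ||a x|| equals |a| * ||x||.
homogeneous = bool(np.isclose(norm(a * x), abs(a) * norm(x)))

# Triangle inequality: ||x + y|| <= ||x|| + ||y|| (small tolerance for rounding).
triangle = bool(norm(x + y) <= norm(x) + norm(y) + 1e-12)

print(non_negative, homogeneous, triangle)
```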

**$L_p$ vector norms**
* $L_2$ norm (Euclidean norm)
  * $||\overline{x}||_2 = \sqrt{\sum_{i=1}^n x_i^2}$
  * Size doesn't change if the vector is rotated.

* $L_1$ norm (Manhattan norm)
  * $||\overline{x}||_1 = \sum_{i=1}^n |x_i|$
  * Penalizes large values less heavily than the $L_2$ norm, since entries are not squared.

* $L_\infty$ norm (Infinity/Chebyshev norm)
  * $||\overline{x}||_\infty = \max_{1 \leq i \leq n} |x_i|$
  * Measures the largest absolute value in the vector.
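
As a small worked example of the three vector norms above, the sketch below assumes NumPy and an arbitrary example vector `x`. It also rotates `x` about the z-axis by an arbitrary angle `theta` to illustrate that the $L_2$ norm is unchanged by rotation.

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])  # example vector chosen for illustration

l2 = np.linalg.norm(x, ord=2)         # sqrt(3^2 + (-4)^2 + 0^2)
l1 = np.linalg.norm(x, ord=1)         # |3| + |-4| + |0|
linf = np.linalg.norm(x, ord=np.inf)  # max(|3|, |-4|, |0|)

# Rotation about the z-axis by an arbitrary angle.
theta = 0.7
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, -s, 0.0],
              [s,  c, 0.0],
              [0.0, 0.0, 1.0]])

l2_rotated = np.linalg.norm(R @ x, ord=2)  # matches l2 up to rounding
l1_rotated = np.linalg.norm(R @ x, ord=1)  # generally differs from l1

print(l2, l1, linf, l2_rotated, l1_rotated)
```

Comparing `l1` with `l1_rotated` shows that rotation invariance is specific to the $L_2$ norm.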

**Matrix norms**
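* Frobenius norm - measures the total energy of the matrix: $||A||_F = \sqrt{\sum_{i,j} a_{ij}^2}$, which equals the square root of the sum of squared singular values.
* Induced norm - $||A||_p = \max_{\overline{x} \neq 0} ||A\overline{x}||_p / ||\overline{x}||_p$; if $p=2$, it equals the largest singular value of the matrix.
* Nuclear norm - sum of the singular values from the SVD: $||A||_* = \sum_i \sigma_i$.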
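
The link between these matrix norms and the singular values can be checked directly. This is a minimal sketch assuming NumPy; the matrix `A` is an arbitrary example, and `np.linalg.norm` is called with `'fro'`, `2`, and `'nuc'` for the Frobenius, induced 2-norm, and nuclear norm.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])  # arbitrary example matrix

# Singular values only, returned in descending order.
sigma = np.linalg.svd(A, compute_uv=False)

fro = np.linalg.norm(A, 'fro')   # Frobenius norm
spec = np.linalg.norm(A, 2)      # induced 2-norm (spectral norm)
nuc = np.linalg.norm(A, 'nuc')   # nuclear norm

# Each norm expressed through the singular values.
fro_from_sigma = np.sqrt(np.sum(sigma ** 2))
spec_from_sigma = sigma[0]
nuc_from_sigma = np.sum(sigma)

print(fro, fro_from_sigma)
print(spec, spec_from_sigma)
print(nuc, nuc_from_sigma)
```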

## Outlier rejection