### SVD definition

$$ \large A_{[\text{m x n}]} = U_{[\text{m x m}]}\Sigma_{[\text{m x n}]}V^T_{[\text{n x n}]}$$
* $A$: Input data matrix
    * $\text{m x n}$ matrix (e.g., $m$ words, $n$ contexts: each element $A_{ij}$ measures the association between word $i$ and context $j$)
* $U$: Left singular vectors
    * $\text{m x m}$ matrix (rows are word vectors)
* $\Sigma$: Singular values
    * $\text{m x n}$ matrix (values on the diagonal are the singular values)
* $V^T$: Right singular vectors
    * $\text{n x n}$ matrix (columns are context vectors)
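
To make the shapes above concrete, here is a minimal sketch on a toy word-context matrix. The words and counts are made up for illustration and are not taken from any dataset. Note that `np.linalg.svd` returns the singular values as a 1-D array and returns $V^T$ rather than $V$, so $\Sigma$ has to be assembled by hand.

```python
import numpy as np

# Hypothetical association counts: 4 words (rows) x 3 contexts (columns).
words = ["cat", "dog", "car", "bus"]
A = np.array([
    [4.0, 3.0, 0.0],
    [3.0, 4.0, 0.0],
    [0.0, 1.0, 5.0],
    [0.0, 0.0, 4.0],
])
m, n = A.shape

# s is a 1-D array of singular values; Vt is V already transposed.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Place the singular values on the diagonal of an m x n matrix to obtain Sigma.
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

print(U.shape, Sigma.shape, Vt.shape)  # (4, 4) (4, 3) (3, 3)

# Row i of U is the vector for word i; column j of Vt is the vector for context j.
word_vectors = {w: U[i] for i, w in enumerate(words)}
context_vectors = Vt.T  # row j here equals column j of Vt
```

In this sketch, `word_vectors["cat"]` is the first row of `U`, matching the "rows are word vectors" reading above.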

#### SVD Theorem

Let $A \in \mathbb{R}^{\text{m x n}}$ be a rectangular matrix of rank $r \in [0, \min(m, n)]$. The SVD of $A$ is a decomposition of the form $$A_{[\text{m x n}]} = U_{[\text{m x m}]}\Sigma_{[\text{m x n}]}V^T_{[\text{n x n}]}$$ 
with an orthogonal matrix $U \in \mathbb{R}^{\text{m x m}}$ with column vectors $u_i, i = 1, \dots, m$ (*left-singular vectors*),  
and an orthogonal matrix $V \in \mathbb{R}^{\text{n x n}}$ with column vectors $v_j, j = 1, \dots, n$ (*right-singular vectors*).  
Moreover, $\Sigma$ is an $\text{m x n}$ matrix with $\Sigma_{ii} = \sigma_i \geqslant 0$ and $\Sigma_{ij} = 0, i \neq j$.  

Remarks:
* The diagonal entries $\sigma_i, i = 1, \dots, r$, of $\Sigma$ are called the *singular values*.  
* By convention, the singular values are ordered, i.e., $\sigma_1 \geqslant \sigma_2 \geqslant \dots \geqslant \sigma_r \geqslant 0$.


© 2021 M. P. Deisenroth, A. A. Faisal, C. S. Ong. Published by Cambridge University Press (2020).
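
The properties stated in the theorem and remarks can be checked numerically. The following is only a sketch, not part of the cited text: the matrix sizes, the seed, and the rank-2 construction are arbitrary choices, and the rank tolerance is an assumption modelled on the default used by `np.linalg.matrix_rank`.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility

# Build a 6 x 4 matrix of rank 2 as a product of 6 x 2 and 2 x 4 factors.
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
m, n = A.shape

U, s, Vt = np.linalg.svd(A, full_matrices=True)

# U and V are orthogonal: U^T U = I_m and V^T V = I_n (with V = Vt.T).
u_orthogonal = np.allclose(U.T @ U, np.eye(m))
v_orthogonal = np.allclose(Vt @ Vt.T, np.eye(n))

# Singular values are non-negative and sorted in descending order.
non_negative = bool(np.all(s >= 0))
descending = bool(np.all(np.diff(s) <= 0))

# The number of singular values above a small tolerance estimates the rank r.
tol = s.max() * max(m, n) * np.finfo(A.dtype).eps
r = int(np.sum(s > tol))

print(u_orthogonal, v_orthogonal, non_negative, descending)
print("rank from singular values:", r)
print("np.linalg.matrix_rank:", np.linalg.matrix_rank(A))
```

For this construction the four checks are expected to hold and both rank estimates should agree. In floating point, the trailing singular values are tiny rather than exactly zero, which is why a tolerance is used instead of comparing against 0.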

#### Python Implementation

import numpy as np
np.random.seed(42)

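A = np.random.rand(5, 10)  # random 5 x 10 data matrix
# np.linalg.svd returns the singular values as a 1-D array (S) and the
# already-transposed right factor, so V and Vhat below hold V^T, not V.
U, S, V = np.linalg.svd(A, full_matrices=True)  # full SVD: U (5, 5), S (5,), V (10, 10)
Uhat, Shat, Vhat = np.linalg.svd(A, full_matrices=False)  # economy SVD: Uhat (5, 5), Shat (5,), Vhat (5, 10)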
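
As a sanity check, the factors returned above can be multiplied back together to recover `A`. This is a minimal sketch that repeats the cell so it runs on its own; the names `Sigma`, `A_full`, and `A_econ` are introduced here for illustration. Because `S` is a 1-D array, the full SVD needs an explicit $5 \times 10$ matrix $\Sigma$, while the economy SVD only needs the square matrix `np.diag(Shat)`.

```python
import numpy as np

np.random.seed(42)
A = np.random.rand(5, 10)
m, n = A.shape

U, S, V = np.linalg.svd(A, full_matrices=True)            # V holds V^T, shape (10, 10)
Uhat, Shat, Vhat = np.linalg.svd(A, full_matrices=False)  # Vhat has shape (5, 10)

# Full SVD: place the singular values on the diagonal of an m x n matrix.
Sigma = np.zeros((m, n))
Sigma[:len(S), :len(S)] = np.diag(S)
A_full = U @ Sigma @ V

# Economy SVD: Sigma is square (5 x 5), so np.diag is enough.
A_econ = Uhat @ np.diag(Shat) @ Vhat

# Both products should reproduce A up to floating-point error.
print(np.allclose(A, A_full), np.allclose(A, A_econ))
```

The economy form drops the rows of $V^T$ that would only be multiplied by the zero columns of $\Sigma$, so it stores less while reconstructing the same matrix.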