In [1]:
import random
import numpy as np
import pandas as pd
import sklearn as skl

## Matrix Decompositions

### Spectral Decomposition

Given a **symmetric matrix** $A$, we can decompose it into the form $A = P\Lambda P^T$, where $P$ is a matrix whose columns are **eigenvectors**, and $\Lambda$ is a diagonal matrix whose diagonal entries are the corresponding **eigenvalues**, conventionally sorted in descending order from left to right.  
This is often called the **spectral decomposition**.

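Note: symmetry is required for this form. For a real symmetric matrix, the spectral theorem guarantees real eigenvalues and an orthogonal $P$ (so $P^{-1} = P^T$). A non-symmetric matrix that is diagonalizable can still be written as $A = P\Lambda P^{-1}$, but $P$ is generally not orthogonal and the eigenvalues may be complex.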

All **symmetric** matrices are square; suppose $A$ is $n \times n$.
Then $P$ and $\Lambda$ are $n \times n$.
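
As a quick check of the decomposition above, here is a minimal sketch that builds a small symmetric matrix and reconstructs it from $P$ and $\Lambda$. The random test matrix, the seed, and the variable names are illustrative assumptions. `np.linalg.eigh` returns eigenvalues in ascending order, so they are reordered to match the descending convention used here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an arbitrary symmetric 4x4 matrix (illustrative data only).
B = rng.normal(size=(4, 4))
A = (B + B.T) / 2

# eigh is meant for symmetric (Hermitian) matrices and returns
# eigenvalues in ascending order with eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(A)

# Reorder so the largest eigenvalue comes first.
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
P = eigvecs[:, order]
Lambda = np.diag(lam)

# Reconstruct A = P Lambda P^T.
A_rebuilt = P @ Lambda @ P.T
print(np.allclose(A, A_rebuilt))
```

Both `P` and `Lambda` come out as $4 \times 4$, matching the dimensions stated above.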

### Eigenvectors and Eigenvalues

The **eigenvectors** and **eigenvalues** have some useful properties.

**eigenvector** properties:
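- The columns of $P$ are normalized: each **eigenvector** $\vec{e}$ has unit length (a convention, since any nonzero multiple of an eigenvector is still an eigenvector)
- For a symmetric matrix, the **eigenvectors** can be chosen to be mutually orthogonal, so $P^T P = I$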

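The sum of all eigenvalues equals the trace of $A$. When $A$ is a covariance matrix, the trace is the total variance of the data (for a correlation matrix it is simply the number of variables).
Dividing an eigenvalue by that sum gives the proportion of total variance explained by the corresponding component, i.e. its "importance."
We refer to the eigenvectors that make up the columns of $P$ as components.  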
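
The properties in this section can be checked numerically. The sketch below assumes a covariance matrix computed from random data (the data, seed, and names are illustrative). It verifies that the eigenvectors are orthonormal, that the eigenvalues sum to the trace, and computes each component's share of the total.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: 50 observations of 3 variables.
X = rng.normal(size=(50, 3))
S = np.cov(X, rowvar=False)  # 3x3 symmetric covariance matrix

# Eigen-decomposition, reordered so eigenvalues are descending.
eigvals, P = np.linalg.eigh(S)
eigvals, P = eigvals[::-1], P[:, ::-1]

# Columns of P are unit length and mutually orthogonal: P^T P = I.
orthonormal = np.allclose(P.T @ P, np.eye(3))

# The eigenvalues sum to the trace, i.e. the total variance.
total_variance = eigvals.sum()
trace_matches = np.isclose(total_variance, np.trace(S))

# Share of the total variance carried by each component.
explained_ratio = eigvals / total_variance

print(orthonormal, trace_matches)
print(explained_ratio)
```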

### Singular Value Decomposition

Similar to the spectral decomposition, we can decompose a general matrix $A$ with dimensions $n \times p$.
That is, a matrix that is not necessarily symmetric (or positive definite), or even square.
In this case, we have $A = P \Lambda Q^T$, where $P$ ($n \times n$) and $Q$ ($p \times p$) are orthogonal but not necessarily equal, and $\Lambda$ ($n \times p$) holds the non-negative **singular values** on its diagonal.

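However, the two decompositions are related through the symmetric matrix $A^T A = Q \Lambda^T \Lambda Q^T$: its eigenvectors are the columns of $Q$, and its eigenvalues are the squared singular values of $A$. In particular, if the columns of $A$ are centered, the covariance matrix is $A^T A / (n - 1)$, so its eigenvectors are the columns of $Q$ and its eigenvalues are the squared singular values divided by $n - 1$. The same holds for the correlation matrix if the columns are also scaled to unit variance. Without that centering and scaling, the SVD of $A$ and the spectral decomposition of its correlation matrix generally do not share vectors or values.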
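
To see how the SVD relates to the spectral decomposition of a correlation matrix, here is a minimal sketch. It assumes the columns are standardized before the SVD is taken; the random data and variable names are illustrative and not taken from the notebook. Under that assumption, the eigenvectors of the correlation matrix agree with the right singular vectors up to sign, and the eigenvalues equal the squared singular values divided by $n - 1$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative data: n = 20 observations of 4 variables.
A = rng.normal(size=(20, 4))
n = A.shape[0]

# Standardize each column: subtract the mean, divide by the sample std.
Z = (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)

# Correlation matrix of the columns; equals Z^T Z / (n - 1).
R = np.corrcoef(A, rowvar=False)

# SVD of the standardized data: Z = U diag(s) Vt.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

# Spectral decomposition of R, eigenvalues reordered to descending.
eigvals, eigvecs = np.linalg.eigh(R)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Eigenvalues of R are the squared singular values of Z over (n - 1).
eigvals_match = np.allclose(eigvals, s**2 / (n - 1))

# Eigenvectors match the right singular vectors up to sign:
# the absolute dot product of each matching pair is 1.
alignment = np.abs(np.sum(eigvecs * Vt.T, axis=0))
vectors_match = np.allclose(alignment, 1.0)

print(eigvals_match, vectors_match)
```

Running the SVD on the raw, unstandardized matrix instead would generally not reproduce these matches.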

So, after performing SVD, we have $\Lambda$ with **singular values** in descending order.
The first component, then, is the most "important."
It best represents the variability within the original matrix $A$.
We can then take the first singular value $\sigma_1$, the first column of $P$, $\vec{p_1}$, and the first column of $Q$ written as a row vector, $\vec{q_1}^T$.
Then $\sigma_1 \vec{p_1} \vec{q_1}^T$ is a rank-one approximation of the original $A$. (For a symmetric $A$ with spectral decomposition $P \Lambda P^T$, the analogous term is $\lambda_1 \vec{e_1} \vec{e_1}^T$, with **eigenvalue** $\lambda_1$ and **eigenvector** $\vec{e_1}$.)

We can instead take, say, the first two terms, $\sigma_1 \vec{p_1} \vec{q_1}^T + \sigma_2 \vec{p_2} \vec{q_2}^T$, and get a better approximation of $A$, and so forth; summing all of the terms recovers $A$ exactly.
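
The successive approximations described above can be sketched as follows. This is a minimal illustration on random data; `rank_k_approx` is a hypothetical helper name introduced here, not part of NumPy. The approximation error, measured with the Frobenius norm, shrinks as more components are included.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative 6x4 matrix.
A = rng.normal(size=(6, 4))

# Reduced SVD: A = U diag(s) Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)


def rank_k_approx(k):
    """Sum of the first k terms s_i * u_i * v_i^T (hypothetical helper)."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


# Frobenius-norm error of each approximation, for k = 1 .. len(s).
errors = [np.linalg.norm(A - rank_k_approx(k)) for k in range(1, len(s) + 1)]
print(errors)
```

With all components included, the error drops to zero up to floating-point rounding.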

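Also note that when the decomposition is applied to a correlation matrix, each original (standardized) variable has variance 1, and the eigenvalues sum to the number of variables. An eigenvalue > 1 therefore indicates that the corresponding eigenvector captures more variance than a single original variable would.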
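
A small sketch of the eigenvalue > 1 rule of thumb, assuming it is applied to a correlation matrix. The data is random, with one pair of columns deliberately made correlated so that at least one component stands out; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative data: 100 observations of 5 variables,
# with column 1 made strongly correlated with column 0.
X = rng.normal(size=(100, 5))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=100)

# Correlation matrix and its eigenvalues, largest first.
R = np.corrcoef(X, rowvar=False)
eigvals = np.linalg.eigvalsh(R)[::-1]

# Each standardized variable contributes variance 1,
# so the eigenvalues sum to the number of variables.
n_vars = R.shape[0]

# Components that carry more variance than one original variable.
keep = eigvals > 1.0

print(eigvals)
print(f"sum = {eigvals.sum():.3f}, components with eigenvalue > 1: {keep.sum()}")
```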

In [9]:
# 5x5 matrix of random values in [0, 10)
mat = pd.DataFrame([[random.random() * 10 for j in range(5)] for i in range(5)])
print(f"Original matrix:\n{mat}\n")

# Pearson correlations between the columns of mat
correlations = mat.corr()
print(f"Correlation matrix:\n{correlations}\n")

# np.linalg.svd returns a tuple: (U, singular values, V^T)
mat_svd = np.linalg.svd(mat)
corr_svd = np.linalg.svd(correlations)
print(f"The matrix svd:\n{mat_svd[0]}\n{mat_svd[1]}\n{mat_svd[2]}\nThe correlation svd:\n{corr_svd[0]}\n{corr_svd[1]}\n{corr_svd[2]}")

Original matrix:
          0         1         2         3         4
0  0.445258  7.372919  7.252065  1.166756  6.950437
1  2.420977  6.700241  5.846414  8.957586  6.400166
2  2.270033  7.445409  5.215288  8.458579  4.378710
3  5.534517  8.919214  9.612851  9.155913  9.660121
4  2.159515  8.567953  2.231314  3.736846  0.330785

Correlation matrix:
          0         1         2         3         4
0  1.000000  0.618791  0.492711  0.723730  0.452455
1  0.618791  1.000000  0.106710 -0.020105 -0.041315
2  0.492711  0.106710  1.000000  0.290315  0.983677
3  0.723730 -0.020105  0.290315  1.000000  0.350071
4  0.452455 -0.041315  0.983677  0.350071  1.000000

The matrix svd:
[[-0.36577142 -0.73359366 -0.47609598 -0.31680506 -0.0319183 ]
 [-0.46212381  0.18810986  0.32575315 -0.45803957  0.6596521 ]
 [-0.42770267  0.37707254  0.06149814 -0.3997147  -0.71507543]
 [-0.63090949 -0.18092967  0.3145467   0.68195722 -0.07219689]
 [-0.26772465  0.50153342 -0.75132939  0.25494473  0.21747429]]
[30.5