# Introduction to Matrix Decomposition for Machine Learning

1. What is a Matrix Decomposition?
2. LU Matrix Decomposition.
3. QR Matrix Decomposition.
4. Cholesky Decomposition.
5. Introduction to Eigendecomposition.
    * Eigenvectors and Eigenvalues.
6. Calculation of Eigendecomposition.
    * Calculate Eigenvectors and Eigenvalues.
    * Confirm an Eigenvector and Eigenvalue.
    * Reconstruct Original Matrix.

## 1. What is a Matrix Decomposition?

* A matrix decomposition is a way of reducing a matrix into its constituent parts.
* It can simplify more complex matrix operations, which are then performed on the decomposed parts rather than on the original matrix itself.
* A common analogy for matrix decomposition is the factoring of numbers, such as the factoring of 10 into 2 x 5. For this reason, matrix decomposition is also called matrix factorization. As with factoring real values, there are many ways to decompose a matrix, hence there is a range of different matrix decomposition techniques.

## 2. LU Matrix Decomposition.

The LU decomposition is for **square matrices** and decomposes a matrix into L and U components:

A = L . U

* A is the square matrix that we wish to decompose.
* L is the lower triangular matrix.
* U is the upper triangular matrix.

A variation of this decomposition that is numerically more stable to solve in practice is called the **LUP decomposition, or the LU decomposition with partial pivoting**.

A = P . L . U

* The rows of the parent matrix are re-ordered to simplify the decomposition process and the additional P matrix specifies a way to permute the result or return the result to the original order.

* There are also other variations of the LU decomposition.
* The LU decomposition is often used to simplify the solving of systems of linear equations, such as **finding the coefficients in a linear regression**, as well as in calculating the determinant and inverse of a matrix.

In [1]:
from numpy import array
from scipy.linalg import lu

A = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(A)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [2]:
# LU decomposition
P, L, U = lu(A)
print(P)
print(L)
print(U)

[[0. 1. 0.]
 [0. 0. 1.]
 [1. 0. 0.]]
[[1.         0.         0.        ]
 [0.14285714 1.         0.        ]
 [0.57142857 0.5        1.        ]]
[[ 7.00000000e+00  8.00000000e+00  9.00000000e+00]
 [ 0.00000000e+00  8.57142857e-01  1.71428571e+00]
 [ 0.00000000e+00  0.00000000e+00 -1.58603289e-16]]


In [3]:
# reconstruct the original matrix
B = P.dot(L).dot(U)
print(B)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
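
The use of LU for solving linear systems and computing determinants, mentioned above, can be sketched as follows. This is a minimal illustration, not part of the original notebook: it assumes SciPy's `lu_factor()` and `lu_solve()` helpers, and the small matrix `A` and vector `b` are made up for the example (the 3 x 3 matrix used above is singular, so it cannot serve here).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# A nonsingular matrix and right-hand side, chosen for illustration only
A = np.array([[4.0, 3.0], [6.0, 3.0]])
b = np.array([10.0, 12.0])

# Factor once: lu_packed holds L and U together, piv records the row swaps
lu_packed, piv = lu_factor(A)

# Reuse the factorization to solve A . x = b
x = lu_solve((lu_packed, piv), b)
print(x)

# det(A) = (sign of the row permutation) * (product of U's diagonal)
swaps = np.count_nonzero(piv != np.arange(len(piv)))
det = (-1) ** swaps * np.prod(np.diag(lu_packed))
print(det)
```

Once the factorization exists, it can be reused for many right-hand sides without decomposing A again, which is the main practical appeal.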


## 3. QR Matrix Decomposition.

The QR decomposition is for m x n matrices **(not limited to square matrices)** and decomposes a matrix into Q and R components:

A = Q . R

* A is the matrix that we wish to decompose.
* Q is an orthogonal matrix with the size m x m.
* R is an upper triangular matrix with the size m x n.

The QR decomposition can be implemented in NumPy using the **qr()** function. By default, the function returns the Q and R matrices with smaller or **‘reduced’** dimensions, which is more economical. We can change this to return the expected sizes of m x m for Q and m x n for R by specifying the mode argument as **‘complete’**, although this is not required for most applications.

In [4]:
from numpy import array
from numpy.linalg import qr

A = array([[1, 2], [3, 4], [5, 6]])
print(A)

[[1 2]
 [3 4]
 [5 6]]


In [5]:
# QR decomposition
Q, R = qr(A, 'complete')
print(Q)
print(R)

[[-0.16903085  0.89708523  0.40824829]
 [-0.50709255  0.27602622 -0.81649658]
 [-0.84515425 -0.34503278  0.40824829]]
[[-5.91607978 -7.43735744]
 [ 0.          0.82807867]
 [ 0.          0.        ]]


In [6]:
# reconstruct the original matrix
B = Q.dot(R)
print(B)

[[1. 2.]
 [3. 4.]
 [5. 6.]]
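
To make the difference between the ‘reduced’ and ‘complete’ modes concrete, the sketch below compares the shapes returned by each for the same 3 x 2 matrix. The variable names (`Q_r`, `R_r`, `Q_c`, `R_c`) are chosen here for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])

# Default mode is 'reduced': Q is m x k and R is k x n, with k = min(m, n)
Q_r, R_r = np.linalg.qr(A)
print(Q_r.shape, R_r.shape)

# 'complete' mode returns the full m x m Q and m x n R
Q_c, R_c = np.linalg.qr(A, mode='complete')
print(Q_c.shape, R_c.shape)

# Both factorizations reproduce A up to floating-point error
print(np.allclose(Q_r @ R_r, A), np.allclose(Q_c @ R_c, A))

# The columns of Q are orthonormal: Q^T . Q is the identity
print(np.allclose(Q_c.T @ Q_c, np.eye(3)))
```

The reduced form drops the columns of Q that only ever multiply the zero rows of R, which is why it is enough for most applications.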


## 4. Cholesky Decomposition.

The Cholesky decomposition is for **square symmetric matrices** where **all eigenvalues are greater than zero**, so-called **positive definite matrices**:

A = L . L^T

* A is the matrix being decomposed.
* L is the lower triangular matrix.
* L^T is the transpose of L.

The decomposition can also be written as the product of the upper triangular matrix, for example:

A = U^T . U

* The Cholesky decomposition is used for **solving linear least squares for linear regression**, as well as **simulation and optimization methods**.
* **When decomposing symmetric positive definite matrices**, the Cholesky decomposition is nearly twice as efficient as the LU decomposition and should be preferred in these cases.
* The Cholesky decomposition can be implemented in NumPy by calling the **cholesky()** function. The function only returns L as we can easily access the L transpose as needed.

In [7]:
from numpy import array
from numpy.linalg import cholesky

A = array([[2, 1, 1], [1, 2, 1], [1, 1, 2]])
print(A)

[[2 1 1]
 [1 2 1]
 [1 1 2]]


In [8]:
# Cholesky decomposition
L = cholesky(A)
print(L)

[[1.41421356 0.         0.        ]
 [0.70710678 1.22474487 0.        ]
 [0.70710678 0.40824829 1.15470054]]


In [9]:
# reconstruct the original matrix
B = L.dot(L.T)
print(B)

[[2. 1. 1.]
 [1. 2. 1.]
 [1. 1. 2.]]
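
As a sketch of how the Cholesky decomposition can be used for linear least squares, the example below solves the normal equations (X^T . X) . b = X^T . y for a tiny regression problem. The data `X` and `y` are hypothetical, and `np.linalg.solve()` stands in for a dedicated triangular solver to keep the example NumPy-only.

```python
import numpy as np

# Hypothetical data: a column of ones (intercept) plus one feature
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([6.0, 5.0, 7.0, 10.0])

# Normal equations: (X^T . X) . b = X^T . y
XtX = X.T @ X
Xty = X.T @ y

# X^T . X is symmetric positive definite when X has full column rank
L = np.linalg.cholesky(XtX)

# Solve L . z = X^T . y, then L^T . b = z
# (a general solver is used here; a triangular solver would exploit the structure)
z = np.linalg.solve(L, Xty)
b = np.linalg.solve(L.T, z)
print(b)

# Cross-check against NumPy's least-squares routine
b_ref, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(b, b_ref))
```

Forming X^T . X squares the condition number of the problem, so for poorly conditioned data a QR-based solution is often the safer choice.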


## 5. Introduction to Eigendecomposition.

Eigendecomposition of a matrix is a type of decomposition that involves decomposing a **square matrix** into a set of eigenvectors and eigenvalues:

A . v = lambda . v

* A is the parent square matrix that we are decomposing.
* v is an eigenvector of the matrix.
* lambda is the lowercase Greek letter that represents the eigenvalue, a scalar.

A matrix can have at most one linearly independent eigenvector and eigenvalue for each dimension of the parent matrix. Every square matrix has eigenvalues, but **not all square matrices have a full eigendecomposition**: an n x n matrix needs n linearly independent eigenvectors, and some matrices can only be decomposed in a way that requires complex numbers.
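
Two small matrices, chosen here purely for illustration, show both situations: a rotation matrix whose eigenvalues are complex, and a shear matrix that does not have enough independent eigenvectors to be decomposed.

```python
import numpy as np

# 90-degree rotation: no real vector keeps its direction,
# so the eigenvalues come back as complex numbers
R = np.array([[0.0, -1.0], [1.0, 0.0]])
values_r, vectors_r = np.linalg.eig(R)
print(values_r)

# Shear: the eigenvalue 1 is repeated, but every eigenvector is a
# multiple of [1, 0], so the eigenvector matrix is (numerically) singular
S = np.array([[1.0, 1.0], [0.0, 1.0]])
values_s, vectors_s = np.linalg.eig(S)
print(values_s)
print(vectors_s)
print(abs(np.linalg.det(vectors_s)))  # close to zero: the columns are parallel
```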

The parent matrix can be shown to be a product of the eigenvectors and eigenvalues:

A = Q . diag(V) . Q^-1

* Q is a matrix comprised of the eigenvectors.
* diag(V) is a diagonal matrix comprised of the eigenvalues along the diagonal (sometimes represented with a capital lambda).
* Q^-1 is the inverse of the matrix comprised of the eigenvectors.

*Almost all vectors change direction, when they are multiplied by A. Certain exceptional vectors x are in the same direction as Ax. Those are the “eigenvectors”. Multiply an eigenvector by A, and the vector Ax is the number lambda times the original x. […] The eigenvalue lambda tells whether the special vector x is stretched or shrunk or reversed or left unchanged – when it is multiplied by A.*

Eigendecomposition can also be used to calculate the principal components of a matrix in the **Principal Component Analysis method or PCA** that can be used to reduce the dimensionality of data in machine learning.
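
A minimal sketch of that idea follows: the principal components are taken as the eigenvectors of the covariance matrix of the centered data. The data `X` is hypothetical, and `numpy.linalg.eigh()` (the eigensolver for symmetric matrices) is used here as an assumption of this example rather than something the text prescribes.

```python
import numpy as np

# Hypothetical data: 5 samples, 2 features
X = np.array([[2.0, 0.0], [0.0, 1.0], [4.0, 3.0], [6.0, 4.0], [8.0, 7.0]])

# Center each feature on zero
Xc = X - X.mean(axis=0)

# Covariance matrix of the features (symmetric, so eigh applies)
cov = np.cov(Xc, rowvar=False)

# eigh returns eigenvalues in ascending order; reorder by descending variance
values, vectors = np.linalg.eigh(cov)
order = np.argsort(values)[::-1]
values, vectors = values[order], vectors[:, order]

# Project onto the first principal component (2 features -> 1)
projected = Xc @ vectors[:, :1]
print(values)
print(projected.shape)
```

Each eigenvalue is the variance of the data along its eigenvector, so keeping the components with the largest eigenvalues preserves the most variance.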

### Eigenvectors and Eigenvalues

* **Eigenvectors** are conventionally normalized to unit vectors, which means that **their length or magnitude is equal to 1.0**. Any nonzero scalar multiple of an eigenvector is also an eigenvector, so the unit length is a convention rather than a requirement.
* They are often referred to as right eigenvectors, which simply means column vectors (as opposed to row vectors, or left eigenvectors). A right vector is a vector in the usual sense.
* **Eigenvalues** are coefficients applied to eigenvectors that **give the vectors their length or magnitude**. For example, a negative eigenvalue reverses the direction of the eigenvector as part of scaling it.
* A symmetric matrix that has only positive eigenvalues is referred to as a **positive definite matrix**, whereas if the eigenvalues are all negative, it is referred to as a **negative definite matrix**.
* *Decomposing a matrix in terms of its eigenvalues and its eigenvectors gives valuable insights into the properties of the matrix. Certain matrix calculations, like **computing the power of the matrix**, become much easier when we use the eigendecomposition of the matrix.*
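
The matrix power mentioned in the last point can be sketched as follows: since A = Q . diag(V) . Q^-1, raising A to the power k only requires raising each eigenvalue to the power k. The matrix `A` and exponent `k` are made up for the example.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
k = 5

values, Q = np.linalg.eig(A)

# A^k = Q . diag(lambda^k) . Q^-1: only the eigenvalues are raised to the power
A_k = Q @ np.diag(values ** k) @ np.linalg.inv(Q)
print(A_k)

# Cross-check against repeated multiplication
print(np.allclose(A_k, np.linalg.matrix_power(A, k)))
```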

## 6. Calculation of Eigendecomposition.

### Calculate Eigenvectors and Eigenvalues

In [10]:
from numpy import array
from numpy.linalg import eig

A = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
values, vectors = eig(A)
print(values)
print(vectors)

[ 1.61168440e+01 -1.11684397e+00 -9.75918483e-16]
[[-0.23197069 -0.78583024  0.40824829]
 [-0.52532209 -0.08675134 -0.81649658]
 [-0.8186735   0.61232756  0.40824829]]


### Confirm an Eigenvector and Eigenvalue

In [11]:
# confirm first eigenvector
B = A.dot(vectors[:, 0])
print(B)

C = vectors[:, 0] * values[0]
print(C)

# B must be equal to C

[ -3.73863537  -8.46653421 -13.19443305]
[ -3.73863537  -8.46653421 -13.19443305]
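
The same check can be extended to all eigenpairs at once. The sketch below uses `numpy.allclose()` so that the comparison tolerates floating-point rounding, and it also illustrates that the returned eigenvectors have unit length while any nonzero multiple of one is still an eigenvector.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
values, vectors = np.linalg.eig(A)

# A . Q compared column by column with Q scaled by each eigenvalue
# (broadcasting multiplies column i of vectors by values[i])
print(np.allclose(A @ vectors, vectors * values))

# Each returned eigenvector has unit length
print(np.linalg.norm(vectors, axis=0))

# Any nonzero multiple of an eigenvector is still an eigenvector
v = 3.0 * vectors[:, 0]
print(np.allclose(A @ v, values[0] * v))
```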


### Reconstruct Original Matrix

In [12]:
from numpy import diag
from numpy import dot
from numpy.linalg import inv

# create matrix from eigenvectors
Q = vectors
# create inverse of eigenvectors matrix
R = inv(Q)
# create diagonal matrix from eigenvalues
L = diag(values)
# reconstruct the original matrix
B = Q.dot(L).dot(R)
print(B)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]


In [13]:
print(A)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
