# Dimensionality Reduction: Matrix Factorization / Decomposition

Matrix decompositions are methods that reduce a matrix into constituent parts, making more complex matrix operations easier to calculate. Matrix decomposition methods, also called matrix factorization methods, are a foundation of numerical linear algebra, even for basic operations such as solving systems of linear equations, calculating the inverse, and calculating the determinant of a matrix. Complex operations can then be performed on the decomposed matrix rather than on the original matrix itself.

## Import libraries

In [1]:
import numpy as np
from scipy import linalg

## LU Decomposition

The LU decomposition is for square matrices and decomposes a matrix into L and U components, where A is the square matrix that we wish to decompose, L is a lower triangular matrix, and U is an upper triangular matrix.

`A = LU`

The LU decomposition is typically computed by Gaussian elimination and can fail for matrices that cannot be decomposed, or cannot be decomposed easily, for example when a zero pivot is encountered. A variation of this decomposition that is numerically more stable in practice is called the LUP decomposition, or the LU decomposition with partial pivoting.

`A = PLU`

The rows of the parent matrix are re-ordered to simplify the decomposition process, and the additional P matrix specifies a way to permute the result or return the result to the original order. There are also other variations of the LU decomposition.

The LU decomposition is often used to simplify the solving of systems of linear equations, such as finding the coefficients in a linear regression, as well as in calculating the determinant and inverse of a matrix.


In [2]:
# define a square matrix
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(A)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [3]:
# LU decomposition
P, L, U = linalg.lu(A)
print(P)
print(L)
print(U)

[[0. 1. 0.]
 [0. 0. 1.]
 [1. 0. 0.]]
[[1.         0.         0.        ]
 [0.14285714 1.         0.        ]
 [0.57142857 0.5        1.        ]]
[[7.         8.         9.        ]
 [0.         0.85714286 1.71428571]
 [0.         0.         0.        ]]


In [4]:
# reconstruct
B = P.dot(L).dot(U)
print(B)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
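
The uses mentioned earlier (solving a linear system and computing a determinant) can be sketched with SciPy's `lu_factor()` and `lu_solve()`. The matrix `A_sys` and vector `b` below are made up for illustration, since the matrix used above is singular; this is a minimal sketch rather than a recommended workflow.

```python
import numpy as np
from scipy import linalg

# a non-singular 3x3 system chosen for illustration (not from the example above)
A_sys = np.array([[4.0, 3.0, 2.0], [2.0, 1.0, 3.0], [3.0, 4.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

# factor once; lu_factor returns the packed LU matrix and the pivot indices
lu, piv = linalg.lu_factor(A_sys)

# reuse the factorization to solve A_sys x = b
x = linalg.lu_solve((lu, piv), b)
print(x)

# determinant: product of the diagonal of U, with the sign set by the row swaps
sign = (-1.0) ** np.sum(piv != np.arange(len(piv)))
det = sign * np.prod(np.diag(lu))
print(det, np.linalg.det(A_sys))
```

Factoring once and reusing the result is the main benefit when the same matrix must be solved against several right-hand sides.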


## QR Decomposition

The QR decomposition is for m x n matrices (not limited to square matrices) and decomposes a matrix into Q and R components, where A is the matrix that we wish to decompose, Q is an orthogonal matrix of size m x m, and R is an upper triangular matrix of size m x n.

`A = QR`

The QR decomposition is computed numerically, commonly with Householder reflections, and unlike the LU decomposition it exists for any real matrix. Like the LU decomposition, the QR decomposition is often used to solve systems of linear equations, although it is not limited to square matrices.

The QR decomposition can be implemented in NumPy using the qr() function. By default, the function returns the Q and R matrices with smaller or 'reduced' dimensions, which is more economical. We can change this to return the expected sizes of m x m for Q and m x n for R by specifying the mode argument as 'complete', although this is not required for most applications.

In [5]:
# define a 3x2 matrix
A = np.array([[1, 2], [3, 4], [5, 6]])
print(A)

[[1 2]
 [3 4]
 [5 6]]


In [6]:
# QR decomposition
Q, R = np.linalg.qr(A, mode='complete')
print(Q)
print(R)

[[-0.16903085  0.89708523  0.40824829]
 [-0.50709255  0.27602622 -0.81649658]
 [-0.84515425 -0.34503278  0.40824829]]
[[-5.91607978 -7.43735744]
 [ 0.          0.82807867]
 [ 0.          0.        ]]


In [7]:
# reconstruct
B = Q.dot(R)
print(B)

[[1. 2.]
 [3. 4.]
 [5. 6.]]
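
For comparison, the default 'reduced' mode and its use in a least squares solve might look like the following sketch. The right-hand side `b` is an arbitrary illustrative vector, and `np.linalg.lstsq()` is used only as a cross-check.

```python
import numpy as np

A_qr = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([1.0, 2.0, 2.0])  # illustrative right-hand side, not from the text

# default 'reduced' mode: Q1 is m x n and R1 is n x n
Q1, R1 = np.linalg.qr(A_qr)
print(Q1.shape, R1.shape)

# least squares solution of A_qr x ~ b: solve R1 x = Q1^T b
x = np.linalg.solve(R1, Q1.T.dot(b))
print(x)

# cross-check against NumPy's least squares routine
x_ref, *_ = np.linalg.lstsq(A_qr, b, rcond=None)
print(np.allclose(x, x_ref))
```

The reduced factors are enough here because the extra columns of the complete Q multiply the zero rows of R and do not affect the solution.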


## Cholesky Decomposition

The Cholesky decomposition is for square symmetric matrices where all eigenvalues are greater than zero, so-called positive definite matrices. For our interests in machine learning, we will focus on the Cholesky decomposition for real-valued matrices and ignore the cases when working with complex numbers.

`A = L.dot(L.T)`

Where A is the matrix being decomposed, L is a lower triangular matrix, and L^T is the transpose of L. The decomposition can also be written as the product of an upper triangular matrix (U) and its transpose, for example:

`A = (U.T).dot(U)`

The Cholesky decomposition is used for solving linear least squares for linear regression, as well as in simulation and optimization methods. When decomposing symmetric positive definite matrices, the Cholesky decomposition is nearly twice as efficient as the LU decomposition and should be preferred in these cases.

The Cholesky decomposition can be implemented in NumPy by calling the cholesky() function. The function only returns L, as we can easily obtain the transpose of L as needed.

In [8]:
# define a 3x3 matrix
A = np.array([[2, 1, 1], [1, 2, 1], [1, 1, 2]])
print(A)

[[2 1 1]
 [1 2 1]
 [1 1 2]]


In [9]:
# Cholesky decomposition
L = np.linalg.cholesky(A)
print(L)

[[1.41421356 0.         0.        ]
 [0.70710678 1.22474487 0.        ]
 [0.70710678 0.40824829 1.15470054]]


In [10]:
# reconstruct
B = L.dot(L.T)
print(B)

[[2. 1. 1.]
 [1. 2. 1.]
 [1. 1. 2.]]
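
As a small sketch of the positive definite requirement and the upper triangular form, the snippet below checks the eigenvalues, rebuilds the matrix from U = L^T, and shows what happens for a symmetric matrix that is not positive definite. The names `A_pd` and `A_indef` are illustrative, and `A_indef` is a made-up counterexample.

```python
import numpy as np

A_pd = np.array([[2.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 2.0]])

# all eigenvalues of a positive definite matrix are greater than zero
eigenvalues = np.linalg.eigvalsh(A_pd)
print(eigenvalues)

# the upper triangular form is simply the transpose of L
L_pd = np.linalg.cholesky(A_pd)
U_pd = L_pd.T
print(np.allclose(U_pd.T.dot(U_pd), A_pd))

# a symmetric matrix with a negative eigenvalue cannot be decomposed
A_indef = np.array([[1.0, 2.0], [2.0, 1.0]])
try:
    np.linalg.cholesky(A_indef)
    failed = False
except np.linalg.LinAlgError as err:
    failed = True
    print("Cholesky failed:", err)
```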


## Eigendecomposition

Perhaps the most used type of matrix decomposition is the eigendecomposition that decomposes a matrix into eigenvectors and eigenvalues. This decomposition also plays a role in methods used in machine learning, such as in Principal Component Analysis.

A vector is an eigenvector of a matrix if it satisfies the following equation.

`A.dot(v) = lambda * v`

This is called the eigenvalue equation, where A is the parent square matrix that we are decomposing, v is an eigenvector of the matrix, and lambda (the lowercase Greek letter λ) is the corresponding eigenvalue, a scalar.

Eigenvectors are conventionally normalized to unit vectors, which means that their length or magnitude is equal to 1.0; any nonzero scalar multiple of an eigenvector is also an eigenvector. They are often referred to as right vectors, which simply means a column vector (as opposed to a row vector or a left vector). Eigenvalues are coefficients applied to eigenvectors that give the vectors their length or magnitude. For example, a negative eigenvalue may reverse the direction of the eigenvector as part of scaling it. A symmetric matrix that has only positive eigenvalues is referred to as a positive definite matrix, whereas if the eigenvalues are all negative, it is referred to as a negative definite matrix.

A matrix could have one eigenvector and eigenvalue for each dimension of the parent matrix. Not all square matrices can be decomposed into eigenvectors and eigenvalues, and some can only be decomposed in a way that requires complex numbers. The parent matrix can be shown to be a product of the eigenvectors and eigenvalues.

$A=V\Lambda V^{-1}$

Where V is a matrix whose columns are the eigenvectors, $\Lambda$ is a diagonal matrix with the eigenvalues along the diagonal, and $V^{-1}$ is the inverse of V.

The eigendecomposition can be calculated in NumPy using the eig() function.


In [11]:
# define matrix
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(A)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [12]:
# calculate eigendecomposition
lambdas, V = np.linalg.eig(A)
print(lambdas)
print(V)

[ 1.61168440e+01 -1.11684397e+00 -9.75918483e-16]
[[-0.23197069 -0.78583024  0.40824829]
 [-0.52532209 -0.08675134 -0.81649658]
 [-0.8186735   0.61232756  0.40824829]]


In [13]:
# confirm eigenvector (A*v=lambda*v)
B = A.dot(V[:, 0])  # original matrix times first eigenvector
print(B)
C = lambdas[0] * V[:, 0]  # first eigenvalue times first eigenvector
print(C)

[ -3.73863537  -8.46653421 -13.19443305]
[ -3.73863537  -8.46653421 -13.19443305]
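
The single-eigenpair check above can be extended to all eigenpairs at once. This is a minimal sketch that relies on NumPy broadcasting, and it also confirms that the eigenvectors returned by eig() have unit length. The `_e` suffix on the names is only there to avoid clashing with the cells above.

```python
import numpy as np

A_e = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
lambdas_e, V_e = np.linalg.eig(A_e)

# columns of V_e are eigenvectors; broadcasting scales column i by lambdas_e[i]
all_pairs_ok = np.allclose(A_e.dot(V_e), V_e * lambdas_e)
print(all_pairs_ok)

# each column (eigenvector) has Euclidean norm 1
norms = np.linalg.norm(V_e, axis=0)
print(norms)
```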


In [14]:
# reconstruct matrix
V_inv = np.linalg.inv(V)
Lambda = np.diag(lambdas)
print(A)
print(np.dot(V, np.dot(Lambda, V_inv)))

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
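
For a real symmetric matrix, the eigenvectors can be chosen to be orthonormal, so the inverse of V in the reconstruction is just its transpose. The sketch below assumes the symmetric example matrix `S` (the same values as the Cholesky example) and uses `np.linalg.eigh()`, NumPy's routine for symmetric matrices.

```python
import numpy as np

S = np.array([[2.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 2.0]])

# eigh is intended for symmetric (Hermitian) matrices and returns real eigenvalues
w, Vs = np.linalg.eigh(S)

# the eigenvectors are orthonormal, so Vs^T Vs is the identity
print(np.allclose(Vs.T.dot(Vs), np.eye(3)))

# reconstruct S using the transpose instead of an explicit inverse
S_rebuilt = Vs.dot(np.diag(w)).dot(Vs.T)
print(np.allclose(S_rebuilt, S))
```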
