### Matrix Decomposition
    A matrix decomposition is a way of reducing a matrix into its constituent parts.
    Two widely used methods are the LU matrix decomposition and the QR matrix decomposition.

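### LU Decomposition
    The LU decomposition is for square matrices and decomposes a matrix into L and U components.

    A = L · U

    Where A is the square matrix that we wish to decompose, L is the lower triangular matrix and U is the upper triangular matrix.

    In practice the rows are usually reordered (pivoted) for numerical stability, which adds a permutation matrix P. This is the form that scipy's linalg.lu returns.

    A = P · L · U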

In [1]:
import numpy as np
from scipy import linalg

In [2]:
# LU Decomposition
# define square matrix
a = np.array([[1,2,3],
              [4,5,6],
              [7,8,9]])
# factorize; scipy also returns the permutation matrix P
P, L, U = linalg.lu(a)
print(P)
print(np.round(L,2))
print(np.round(U,2))
# reconstruct the original matrix
b = P.dot(L).dot(U)
b

[[0. 1. 0.]
 [0. 0. 1.]
 [1. 0. 0.]]
[[1.   0.   0.  ]
 [0.14 1.   0.  ]
 [0.57 0.5  1.  ]]
[[ 7.    8.    9.  ]
 [ 0.    0.86  1.71]
 [ 0.    0.   -0.  ]]


array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])
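
    The structure of the three factors can be checked directly. The following is a minimal sketch that reuses the same matrix; the flag names (l_is_lower, u_is_upper and so on) are illustrative and not part of the original notebook.

```python
import numpy as np
from scipy import linalg

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
P, L, U = linalg.lu(a)

# L is lower triangular with ones on its diagonal
l_is_lower = np.allclose(L, np.tril(L))
l_unit_diag = np.allclose(np.diag(L), 1.0)
# U is upper triangular
u_is_upper = np.allclose(U, np.triu(U))
# P is a permutation matrix, so its transpose is its inverse
p_is_perm = np.allclose(P.T @ P, np.eye(3))
# the product of the three factors recovers the original matrix
reconstructs = np.allclose(P @ L @ U, a)

print(l_is_lower, l_unit_diag, u_is_upper, p_is_perm, reconstructs)
```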

### QR Decomposition
    The QR decomposition is for m × n matrices (not limited to square matrices) and decomposes a matrix into Q and R components.

    A = Q · R

    Where
    A is the m × n matrix that we wish to decompose,
    Q is an orthogonal matrix with the size m × m, and
    R is an upper triangular matrix with the size m × n.
    

In [3]:
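# QR decomposition
# define rectangular matrix
a = np.array([[1,2],
              [3,4],
              [5,6]])
# factorize; 'complete' returns the full m × m Q and m × n R
Q, R = np.linalg.qr(a, mode='complete')
Q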

array([[-0.16903085,  0.89708523,  0.40824829],
       [-0.50709255,  0.27602622, -0.81649658],
       [-0.84515425, -0.34503278,  0.40824829]])

In [4]:
R

array([[-5.91607978, -7.43735744],
       [ 0.        ,  0.82807867],
       [ 0.        ,  0.        ]])

In [5]:
b=Q.dot(R)
b

array([[1., 2.],
       [3., 4.],
       [5., 6.]])
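
    As a rough illustration of the sizes described above, the sketch below compares the 'complete' and 'reduced' modes of np.linalg.qr and checks that Q is orthogonal and R is upper triangular. The variable names (Q_full, Q_red and so on) are chosen here for illustration only.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# 'complete' gives Q of size m x m and R of size m x n
Q_full, R_full = np.linalg.qr(a, mode='complete')
# 'reduced' drops the columns of Q (and rows of R) that only multiply zeros
Q_red, R_red = np.linalg.qr(a, mode='reduced')

print(Q_full.shape, R_full.shape)
print(Q_red.shape, R_red.shape)

# Q is orthogonal: its transpose is its inverse
q_orthogonal = np.allclose(Q_full.T @ Q_full, np.eye(3))
# R is upper triangular
r_upper = np.allclose(R_full, np.triu(R_full))
# both modes reconstruct the same matrix
same_product = np.allclose(Q_red @ R_red, a) and np.allclose(Q_full @ R_full, a)
print(q_orthogonal, r_upper, same_product)
```

    The reduced form is usually enough when only the product Q · R is needed, since the dropped parts contribute nothing to it.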

### Cholesky Decomposition
    The Cholesky decomposition is for square symmetric matrices whose eigenvalues are all greater than zero, so-called positive definite matrices.

    A = L · L^T
    Where A is the matrix being decomposed,
    L is the lower triangular matrix, and
    L^T is the transpose of L.

    The decomposition can also be written as the product of the upper triangular matrix, for example

    A = U^T · U

In [6]:
# Cholesky decomposition

# define symmetric matrix
a = np.array([[2,1,1],
              [1,2,1],
              [1,1,2]])
# factorize
L = np.linalg.cholesky(a)
L

array([[1.41421356, 0.        , 0.        ],
       [0.70710678, 1.22474487, 0.        ],
       [0.70710678, 0.40824829, 1.15470054]])

In [7]:
#reconstruct
b=L.dot(L.T)
b

array([[2., 1., 1.],
       [1., 2., 1.],
       [1., 1., 2.]])
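
    Positive definiteness is a statement about the eigenvalues, not about the individual entries. The sketch below is an illustration under that assumption: it checks the upper triangular form A = U^T · U for the matrix above, then tries a second, hypothetical matrix c whose entries are all positive but which has a negative eigenvalue.

```python
import numpy as np

a = np.array([[2, 1, 1],
              [1, 2, 1],
              [1, 1, 2]])

# all eigenvalues of a are positive, so a is positive definite
eigs_a = np.linalg.eigvalsh(a)
print(eigs_a)

# the upper triangular factor is simply the transpose of L
L = np.linalg.cholesky(a)
U = L.T
upper_form_ok = np.allclose(U.T @ U, a)
print(upper_form_ok)

# hypothetical counterexample: every entry is positive,
# but the eigenvalues are 3 and -1, so it is not positive definite
c = np.array([[1, 2],
              [2, 1]])
try:
    np.linalg.cholesky(c)
    cholesky_failed = False
except np.linalg.LinAlgError:
    cholesky_failed = True
print(cholesky_failed)
```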

### Eigendecomposition
    Eigendecomposition of a matrix is a type of decomposition that involves decomposing a square matrix into a set of eigenvectors and eigenvalues.

    A vector is an eigenvector of a matrix if it satisfies the following equation

    A · v = λ · v
    Where A is the parent square matrix that we are decomposing,
    v is the eigenvector of the matrix, and
    λ is the lowercase Greek letter lambda and represents the eigenvalue scalar.

    A = Q · Λ · Q^-1
    Where
    Q is a matrix comprised of the eigenvectors,
    Λ is the uppercase Greek letter lambda and is the diagonal matrix comprised of the eigenvalues, and
    Q^-1 is the inverse of Q. When A is symmetric, Q can be chosen orthogonal, so that Q^-1 = Q^T.

### Eigenvectors and Eigenvalues
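    Eigenvectors are conventionally normalized to unit vectors, which means that their length or magnitude is equal to 1.0. Any nonzero scalar multiple of an eigenvector is also an eigenvector, so this is a convention rather than a requirement.
    They are often referred to as right vectors, which simply means column vectors.

    An eigenvalue is the factor by which the matrix scales its eigenvector: multiplying v by A stretches or shrinks v by λ, and reverses its direction if λ is negative.

    Computing a power of the matrix becomes much easier when we use the eigendecomposition of the matrix.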
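
    To see why powers become easier: if A = Q · Λ · Q^-1, then in the product A · A · ... · A every inner Q^-1 · Q cancels, leaving A^k = Q · Λ^k · Q^-1, and Λ^k only requires raising each eigenvalue to the power k. The following is a small sketch of that idea; the 2 × 2 matrix and the exponent k are hypothetical values chosen for illustration.

```python
import numpy as np

# hypothetical example matrix and exponent
a = np.array([[2.0, 1.0],
              [1.0, 2.0]])
k = 5

# factorize
values, vectors = np.linalg.eig(a)
# raise only the eigenvalues to the k-th power
a_power = vectors @ np.diag(values ** k) @ np.linalg.inv(vectors)

# compare against repeated matrix multiplication
direct = np.linalg.matrix_power(a, k)
print(np.allclose(a_power, direct))
```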


In [8]:
# eigendecomposition
a = np.array([[1,2,3],
             [4,5,6],
             [7,8,9]])
values, vectors = np.linalg.eig(a)

In [9]:
values

array([ 1.61168440e+01, -1.11684397e+00, -9.75918483e-16])

In [10]:
vectors

array([[-0.23197069, -0.78583024,  0.40824829],
       [-0.52532209, -0.08675134, -0.81649658],
       [-0.8186735 ,  0.61232756,  0.40824829]])

In [11]:
# confirm eigenvector
a = np.array([[1,2,3],
             [4,5,6],
             [7,8,9]])
# factorize
values, vectors = np.linalg.eig(a)

# confirm first eigenvector
b = a.dot(vectors[:,0])
print(b)


c = vectors[:, 0] * values[0]
print(c)

[ -3.73863537  -8.46653421 -13.19443305]
[ -3.73863537  -8.46653421 -13.19443305]
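
    The same check can be done for every eigenpair at once. This is a minimal sketch relying on NumPy broadcasting: multiplying the eigenvector matrix by the 1-D array of eigenvalues scales column i by values[i]. It also confirms that np.linalg.eig returns eigenvectors of unit length.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
values, vectors = np.linalg.eig(a)

# column i of `vectors` pairs with values[i];
# broadcasting scales each column by its own eigenvalue
all_pairs_ok = np.allclose(a @ vectors, vectors * values)
print(all_pairs_ok)

# length of each eigenvector (one norm per column)
norms = np.linalg.norm(vectors, axis=0)
print(norms)
```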


In [12]:
# reconstruct matrix
a = np.array([[1,2,3],
             [4,5,6],
             [7,8,9]])
# factorize
values, vectors = np.linalg.eig(a)
# create matrix from eigenvectors
Q = vectors
# create inverse of eigenvectors matrix
R = np.linalg.inv(Q)
# create diagonal matrix from eigenvalues
L = np.diag(values)
# reconstruct the original matrix
b = Q.dot(L).dot(R)
b

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])
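
    For a symmetric matrix the eigenvector matrix can be chosen orthogonal, so its inverse is simply its transpose and no explicit inversion is needed. The sketch below assumes a symmetric input and uses np.linalg.eigh, NumPy's routine for symmetric (Hermitian) matrices; the matrix s is the symmetric example from the Cholesky section.

```python
import numpy as np

# symmetric matrix
s = np.array([[2, 1, 1],
              [1, 2, 1],
              [1, 1, 2]])

# eigh is specialized for symmetric matrices and returns orthonormal eigenvectors
values, Q = np.linalg.eigh(s)

# Q is orthogonal, so Q^T acts as the inverse
q_orthogonal = np.allclose(Q.T @ Q, np.eye(3))
# reconstruct with the transpose instead of the inverse
b = Q @ np.diag(values) @ Q.T

print(q_orthogonal)
print(np.allclose(b, s))
```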

### Singular-Value Decomposition

    The Singular-Value Decomposition, or SVD, is a matrix decomposition method for reducing a matrix to its constituent parts in order to make certain subsequent matrix calculations simpler.

    A = U · Σ · V^T
    Where A is the real m × n matrix that we wish to decompose,
    U is an m × m matrix,
    Σ is an m × n diagonal matrix, and
    V^T is the transpose of an n × n matrix V.

    The diagonal values in the Σ matrix are known as the singular values of the original matrix A.
    The columns of the U matrix are called the left-singular vectors of A.
    The columns of V are called the right-singular vectors of A.

In [13]:
# singular-value decomposition
a = np.array([[1,2],
             [3,4],
             [5,6]])
# factorize; note that the third return value is already V^T, not V
U, s, V = linalg.svd(a)
print(U)
print(s)
print(V)

[[-0.2298477   0.88346102  0.40824829]
 [-0.52474482  0.24078249 -0.81649658]
 [-0.81964194 -0.40189603  0.40824829]]
[9.52551809 0.51430058]
[[-0.61962948 -0.78489445]
 [-0.78489445  0.61962948]]


In [14]:
# reconstruct rectangular matrix from svd
a = np.array([[1,2],
             [3,4],
             [5,6]])
# factorize
U, s, V = linalg.svd(a)

# create sigma matrix
sigma = np.zeros((a.shape[0], a.shape[1]))
# populate sigma with diagonal matrix
sigma[:a.shape[1], :a.shape[1]] = np.diag(s)
# reconstruct matrix
b = U.dot(sigma.dot(V))
b

array([[1., 2.],
       [3., 4.],
       [5., 6.]])
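
    Padding sigma with zeros can be avoided by asking for the reduced SVD. The following is a minimal sketch using the full_matrices=False option of scipy's linalg.svd; the name Vh is used here to make explicit that the third return value is V^T.

```python
import numpy as np
from scipy import linalg

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# reduced SVD: U is m x k and Vh is k x n, with k = min(m, n)
U, s, Vh = linalg.svd(a, full_matrices=False)
print(U.shape, s.shape, Vh.shape)

# sigma is now a plain k x k diagonal matrix, no zero padding required
b = U @ np.diag(s) @ Vh
print(np.allclose(b, a))
```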

In [15]:
# reconstruct square matrix from svd
a = np.array([[1,2,3],
             [4,5,6],
             [7,8,9]])
# factorize
U, s, V = linalg.svd(a)

# create sigma matrix
sigma = np.diag(s)
# reconstruct matrix
b = U.dot(sigma.dot(V))
b

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])

### Pseudoinverse
    
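    The pseudoinverse is the generalization of the matrix inverse for square matrices to rectangular matrices where the number of rows and columns are not equal.

    A^+ = V · D^+ · U^T
    Where A^+ is the pseudoinverse of A,
    U and V come from the SVD of A, and
    D^+ is the pseudoinverse of the diagonal matrix Σ, formed by taking the reciprocal of each nonzero singular value and transposing the result.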

In [16]:
# pseudoinverse
a = np.array([[0.1, 0.2],
             [0.3, 0.4],
             [0.5, 0.6],
             [0.7, 0.8]])
b = np.linalg.pinv(a)
b

array([[-1.00000000e+01, -5.00000000e+00,  9.07607323e-15,
         5.00000000e+00],
       [ 8.50000000e+00,  4.50000000e+00,  5.00000000e-01,
        -3.50000000e+00]])

In [17]:
# pseudoinverse via svd
a = np.array([[0.1, 0.2],
             [0.3, 0.4],
             [0.5, 0.6],
             [0.7, 0.8]])

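# factorize; V holds V^T
U, s, V = linalg.svd(a)
# reciprocals of s (valid here because no singular value is zero)
d = 1.0 / s
# create m x n D matrix
D = np.zeros(a.shape)
# populate D with n x n diagonal matrix
D[:a.shape[1], :a.shape[1]] = np.diag(d)
# calculate pseudoinverse; V.T recovers V, and D.T is the n x m matrix D^+
b = V.T.dot(D.T).dot(U.T)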
b

array([[-1.00000000e+01, -5.00000000e+00,  9.07607323e-15,
         5.00000000e+00],
       [ 8.50000000e+00,  4.50000000e+00,  5.00000000e-01,
        -3.50000000e+00]])
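
    A few properties make the result easier to trust. The sketch below checks two of the defining identities of the pseudoinverse and, because this matrix has independent columns, that A^+ acts as a left inverse. It then uses A^+ to solve a least-squares problem and compares against np.linalg.lstsq; the right-hand side y is a hypothetical vector added for illustration.

```python
import numpy as np

a = np.array([[0.1, 0.2],
              [0.3, 0.4],
              [0.5, 0.6],
              [0.7, 0.8]])
a_pinv = np.linalg.pinv(a)

# defining identities of the pseudoinverse
cond1 = np.allclose(a @ a_pinv @ a, a)
cond2 = np.allclose(a_pinv @ a @ a_pinv, a_pinv)
# the columns of a are independent, so a_pinv is a left inverse
left_inverse = np.allclose(a_pinv @ a, np.eye(2))
print(cond1, cond2, left_inverse)

# hypothetical right-hand side for an overdetermined system a @ x = y
y = np.array([1.0, 2.0, 3.0, 4.0])
x_pinv = a_pinv @ y
x_lstsq, *_ = np.linalg.lstsq(a, y, rcond=None)
same_solution = np.allclose(x_pinv, x_lstsq)
print(same_solution)
```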

### Dimensionality Reduction

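    To reduce dimensionality, keep only the k largest singular values (the first k columns of Σ, written Σk) and the first k rows of V^T (written Vk^T).

    An approximation B of the original matrix A can then be reconstructed

    B = U · Σk · Vk^T

    In practice we often keep only a descriptive subset of the data called T. This is a dense summary of the matrix, or a projection.

    T = U · Σk

    The same transform can be calculated by applying Vk, the transpose of Vk^T, to the original matrix A, and it can be applied to other similar matrices as well.

    T = A · Vk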

In [25]:
# data reduction with svd
a = np.array([
    [1,2,3,4,5,6,7,8,9,10],
    [11,12,13,14,15,16,17,18,19,20],
    [21,22,23,24,25,26,27,28,29,30]
])
print(a)
# factorize
U, s, V = linalg.svd(a)
# create sigma matrix
sigma = np.zeros((a.shape[0], a.shape[1]))
# populate sigma with diagonal matrix
sigma[:a.shape[0], :a.shape[0]] = np.diag(s)
# select the number of components to keep
n_elements = 2
sigma = sigma[:, :n_elements]
# V holds V^T, so keep its first rows
V = V[:n_elements, :]
# reconstruct
b = U.dot(sigma.dot(V))
print('b=', b)
# transform via U and sigma
T = U.dot(sigma)
print('T=', T)
# the same transform via the original matrix
T = a.dot(V.T)
print('T=', T)


[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]
b= [[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
 [11. 12. 13. 14. 15. 16. 17. 18. 19. 20.]
 [21. 22. 23. 24. 25. 26. 27. 28. 29. 30.]]
T= [[-18.52157747   6.47697214]
 [-49.81310011   1.91182038]
 [-81.10462276  -2.65333138]]
T= [[-18.52157747   6.47697214]
 [-49.81310011   1.91182038]
 [-81.10462276  -2.65333138]]
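
    The reconstruction above is exact because this particular matrix has rank 2, so nothing is lost by keeping two components. As a rough sketch of what happens in general, the code below measures the reconstruction error for k = 1 and k = 2 and compares it with the discarded singular values (in the Frobenius norm the error equals the square root of the sum of their squares). The helper rank_k_approximation is a hypothetical name introduced here.

```python
import numpy as np
from scipy import linalg

a = np.array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])

# reduced SVD; Vh holds V^T
U, s, Vh = linalg.svd(a, full_matrices=False)

def rank_k_approximation(k):
    # keep the first k singular values and the matching vectors
    return U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

errors = {}
discarded = {}
for k in (1, 2):
    # Frobenius norm of the difference between a and its approximation
    errors[k] = np.linalg.norm(a - rank_k_approximation(k))
    # energy in the singular values that were dropped
    discarded[k] = np.sqrt(np.sum(s[k:] ** 2))
    print(k, errors[k], discarded[k])
```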


In [28]:
# The TruncatedSVD class is created by specifying the number of desired features or
# components to select, e.g. 2. The transform is fit (e.g. Vk^T is calculated) by calling the fit() function,
# and then applied to the original matrix by calling the transform() function.
# The result is the transform of A called T above. The signs of its columns may differ,
# because singular vectors are only defined up to sign.

# svd data reduction in scikit-learn
from sklearn.decomposition import TruncatedSVD
# define matrix
A = np.array([
  [1,2,3,4,5,6,7,8,9,10],
  [11,12,13,14,15,16,17,18,19,20],
  [21,22,23,24,25,26,27,28,29,30]])
print(A)
# create transform
svd = TruncatedSVD(n_components=2)
# fit transform
svd.fit(A)
# apply transform
result = svd.transform(A)
print(result)

[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]
[[18.52157747  6.47697214]
 [49.81310011  1.91182038]
 [81.10462276 -2.65333138]]
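
    The fitted Vk^T can be inspected through the components_ attribute of TruncatedSVD. The sketch below is an illustration, assuming scikit-learn is installed: it checks that transform() amounts to multiplying A by the transpose of components_. The random_state argument is set only to make the run repeatable, and float input is used as a precaution.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

A = np.array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]], dtype=float)

# create and fit the transform
svd = TruncatedSVD(n_components=2, random_state=0)
svd.fit(A)
result = svd.transform(A)

# components_ holds Vk^T: one row per component, one column per feature
components = svd.components_
print(components.shape)

# applying the transform is the same as multiplying by Vk
manual = A @ components.T
print(np.allclose(result, manual))
```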
