# Theory Review

In [1]:
import numpy as np

## Matrix Basics

![matrix](imgs/matrix.png)
![notation](imgs/notation.png)

In [2]:
a = np.array(
    [[1402,191],
     [1371,821],
     [949,1437]]
)

print(a)

[[1402  191]
 [1371  821]
 [ 949 1437]]


In [3]:
a[1,1]

821

In [65]:
a[0,0]

1402

## Matrix Addition & Subtraction

![add](imgs/add.png)
![subtract](imgs/subtract.png)

A matrix is a 2D array of real or complex numbers. A real matrix is one whose entries all belong to ℝ. Likewise, a complex matrix has entries in ℂ.
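As a small illustration of the real/complex distinction, NumPy records the kind of entries in the array's `dtype`. The variable names below are made up for this sketch.

```python
import numpy as np

# A real matrix: every entry is a real number (stored as float64 here).
real_matrix = np.array([[1.0, 2.5],
                        [0.0, -3.0]])

# A complex matrix: entries may carry an imaginary part (stored as complex128).
complex_matrix = np.array([[1 + 2j, 0],
                           [3j, 4]])

print(real_matrix.dtype, np.isrealobj(real_matrix))
print(complex_matrix.dtype, np.iscomplexobj(complex_matrix))
```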

In [37]:
matrix1 = np.array(
    [[0, 4],
     [2, 0]]
)

matrix2 = np.array(
    [[-1, 2],
     [1, -2]]
)

matrix_sum = matrix1 + matrix2

In [38]:
matrix_sum

array([[-1,  6],
       [ 3, -2]])
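Subtraction works the same way as addition, entry by entry. The sketch below reuses the two matrices from the cell above and also shows what happens when the shapes do not match; `mismatched` is a made-up array for the illustration.

```python
import numpy as np

matrix1 = np.array([[0, 4],
                    [2, 0]])
matrix2 = np.array([[-1, 2],
                    [1, -2]])

# Subtraction is element-wise, just like addition.
matrix_diff = matrix1 - matrix2
print(matrix_diff)

# A 2x2 and a 3x2 matrix cannot be added: NumPy raises a ValueError
# because the shapes cannot be broadcast together.
mismatched = np.ones((3, 2))
try:
    matrix1 + mismatched
except ValueError as err:
    print("cannot add:", err)
```

Note that NumPy broadcasting does accept some shape combinations that strict matrix addition would not (for example a 2x2 array plus a 1x2 array), so a shape check is worth doing explicitly when the math requires equal shapes.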

## Scalar Multiplication & Division

![scalar](imgs/scalar.png)


In [66]:
3 * a

array([[4206,  573],
       [4113, 2463],
       [2847, 4311]])

In [67]:
a / 3

array([[ 467.33333333,   63.66666667],
       [ 457.        ,  273.66666667],
       [ 316.33333333,  479.        ]])

## Matrix Multiplication

![multiplication](imgs/mult_1.png)
![multiplication](imgs/mult_2.png)
![multiplication](imgs/mult_3.png)

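To find the product AB, take the dot product of each row of A with each column of B. The number of columns of A must equal the number of rows of B.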


In [68]:
matrix1 = np.array(
    [[1, 4],
     [2, 0]]
)

matrix2 = np.array(
    [[-1, 2],
     [1, -2]]
)

matrix_prod = np.dot(matrix1, matrix2)

In [69]:
matrix_prod

array([[ 3, -6],
       [-2,  4]])

Matrix multiplication is not commutative: in general, AB ≠ BA.

In [70]:
matrix_prod_2 = np.dot(matrix2, matrix1)
matrix_prod_2

array([[ 3, -4],
       [-3,  4]])
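To make the row-by-column rule concrete, here is a minimal sketch that computes the product with explicit loops and compares it with NumPy. The helper `matmul_by_hand` is hypothetical and written only for illustration; `@` is Python's matrix-multiplication operator and gives the same result as `np.dot` for 2D arrays.

```python
import numpy as np

def matmul_by_hand(left, right):
    """Row-by-column product of two 2D arrays (illustrative, not fast)."""
    n_rows, inner = left.shape
    inner_right, n_cols = right.shape
    if inner != inner_right:
        raise ValueError("columns of left must equal rows of right")
    product = np.zeros((n_rows, n_cols), dtype=np.result_type(left, right))
    for i in range(n_rows):
        for j in range(n_cols):
            # Entry (i, j) is the dot product of row i of left and column j of right.
            product[i, j] = sum(left[i, k] * right[k, j] for k in range(inner))
    return product

matrix1 = np.array([[1, 4], [2, 0]])
matrix2 = np.array([[-1, 2], [1, -2]])

by_hand = matmul_by_hand(matrix1, matrix2)
print(by_hand)
print(np.array_equal(by_hand, matrix1 @ matrix2))            # same as NumPy's product
print(np.array_equal(matrix1 @ matrix2, matrix2 @ matrix1))  # order matters
```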

## Identity Matrix

![identity](imgs/identity.png)

In [71]:
np.identity(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [72]:
np.identity(3, dtype=int)

array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])

Show that for any square matrix A and the identity matrix I of the same size, AI = IA = A.

In [77]:
A = np.array(
    [[4,2,1],
     [4,8,3],
     [1,1,0]]
)

I = np.identity(3, dtype=int)
np.dot(A,I)

array([[4, 2, 1],
       [4, 8, 3],
       [1, 1, 0]])

In [78]:
A

array([[4, 2, 1],
       [4, 8, 3],
       [1, 1, 0]])

In [74]:
np.dot(A,I) == np.dot(I,A)

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]], dtype=bool)

In [75]:
np.dot(A,I) == A

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]], dtype=bool)
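The element-wise comparisons above can be collapsed into a single True/False with `np.array_equal`. The sketch also shows, as an assumption-free extension, that a non-square matrix needs identities of two different sizes; `B` is a made-up 3x2 example.

```python
import numpy as np

A = np.array([[4, 2, 1],
              [4, 8, 3],
              [1, 1, 0]])
I = np.identity(3, dtype=int)

# np.array_equal returns one boolean instead of an element-wise array.
print(np.array_equal(np.dot(A, I), A))
print(np.array_equal(np.dot(I, A), A))

# For a 3x2 matrix, the right identity is 2x2 and the left identity is 3x3.
B = np.arange(6).reshape((3, 2))
print(np.array_equal(np.dot(B, np.identity(2, dtype=int)), B))
print(np.array_equal(np.dot(np.identity(3, dtype=int), B), B))
```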

## Transposing a Matrix

At times it is useful to transpose a matrix for conformability. To multiply two matrices, the number of columns of the first must equal the number of rows of the second, and swapping the row and column dimensions of one of them can make the shapes compatible.

![transpose](imgs/transpose.png)

In [4]:
A = np.arange(6).reshape((3,2))

print("A is")
print(A)
print("")

print("The Transpose of A is")
print(A.T)

A is
[[0 1]
 [2 3]
 [4 5]]

The Transpose of A is
[[0 2 4]
 [1 3 5]]
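A short sketch of conformability, using the same 3x2 matrix: multiplying A by itself fails, while multiplying by its transpose works. The names `gram` and `outer` are chosen only for this example.

```python
import numpy as np

A = np.arange(6).reshape((3, 2))

# A is 3x2, so A times A is undefined: the inner dimensions 2 and 3 differ.
try:
    np.dot(A, A)
except ValueError as err:
    print("not conformable:", err)

# Transposing one factor makes the inner dimensions agree.
gram = np.dot(A.T, A)     # (2x3) times (3x2) gives 2x2
outer = np.dot(A, A.T)    # (3x2) times (2x3) gives 3x3
print(gram.shape, outer.shape)
print(gram)
```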


## Inverting a Matrix

AB = BA = I

Matrix inversion is the process of finding the matrix B that satisfies the equation above for a given invertible matrix A, where I is the identity matrix. B is called the inverse of A and is written A⁻¹.

![inverse matrix](imgs/inverse.gif)

In [None]:
matrix = np.array(
    [[1, 4],
     [2, 0]]
)

inverse = np.linalg.inv(matrix)

In [79]:
inverse

array([[ 0.   ,  0.5  ],
       [ 0.25 , -0.125]])

**A square matrix that is not invertible is called singular or degenerate. A square matrix is singular if and only if its determinant is 0.**
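The sketch below illustrates both cases. The `singular` matrix is a made-up example whose second row is twice the first; the invertible matrix is the one from the cell above. Because the inverse is computed in floating point, the check against the identity uses `np.allclose` rather than exact equality.

```python
import numpy as np

# Second row is twice the first, so the rows are linearly dependent.
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])

print(np.isclose(np.linalg.det(singular), 0.0))

# np.linalg.inv raises LinAlgError when it detects a singular matrix.
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as err:
    print("not invertible:", err)

# For an invertible matrix, A times its inverse recovers I up to rounding.
matrix = np.array([[1, 4],
                   [2, 0]])
inverse = np.linalg.inv(matrix)
print(np.allclose(np.dot(matrix, inverse), np.identity(2)))
print(np.allclose(np.dot(inverse, matrix), np.identity(2)))
```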

## The Determinant


![determinant](imgs/det_1.png)
![determinant](imgs/det_2.png)
![determinant](imgs/det_3.png)
![determinant](imgs/det_4.png)
![determinant](imgs/det_5.png)
![determinant](imgs/det_6.png)

In [80]:
matrix = np.array(
    [[1, 4],
     [2, 0]]
)

det = np.linalg.det(matrix)

In [81]:
det

-7.9999999999999982
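The result is not exactly -8 because `np.linalg.det` works in floating point (it goes through an LU factorization). A minimal sketch comparing it with the 2x2 rule ad - bc, where `det_by_hand` is a name introduced here:

```python
import numpy as np

matrix = np.array([[1, 4],
                   [2, 0]])

# 2x2 rule: det([[a, b], [c, d]]) = a*d - b*c
(a, b), (c, d) = matrix
det_by_hand = a * d - b * c          # 1*0 - 4*2
det_numpy = np.linalg.det(matrix)    # floating-point result

print(det_by_hand)
# Compare floating-point results with a tolerance, not with ==.
print(np.isclose(det_numpy, det_by_hand))
```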

Computing determinants for a stack of matrices:

In [83]:
a = np.array([ 
        [[1, 2], 
         [3, 4]], 
        
        [[1, 2], 
         [2, 1]],
        
        [[1, 3],
         [3, 1]]
    ])

a.shape

(3, 2, 2)

In [84]:
np.linalg.det(a)

array([-2., -3., -8.])

# Practical Uses

## Solving a Linear System

By hand, you can use Gaussian elimination. Solve the system of equations:
- 3x + y = 9
- x + 2y = 8

In [50]:
a = np.array(
    [[3,1], 
     [1,2]]
)

b = np.array(
    [9,8]
)

x = np.linalg.solve(a, b)
x

array([ 2.,  3.])
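For comparison, here is a sketch of the by-hand Gaussian elimination for this specific 2x2 system, followed by a check of the `np.linalg.solve` result against the original equations. The `augmented` array and the step order are one reasonable way to do it, not the only one.

```python
import numpy as np

# Augmented matrix [A | b] for  3x + y = 9  and  x + 2y = 8.
augmented = np.array([[3.0, 1.0, 9.0],
                      [1.0, 2.0, 8.0]])

# Eliminate x from the second row: R2 <- R2 - (1/3) * R1
augmented[1] -= (augmented[1, 0] / augmented[0, 0]) * augmented[0]

# Back substitution: solve the second row for y, then the first row for x.
y = augmented[1, 2] / augmented[1, 1]
x = (augmented[0, 2] - augmented[0, 1] * y) / augmented[0, 0]
print(x, y)

# Cross-check against np.linalg.solve and the original equations.
a = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
solution = np.linalg.solve(a, b)
print(np.allclose(solution, [x, y]))
print(np.allclose(np.dot(a, solution), b))
```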

## Linear Transformations

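A linear transformation is a map ℝⁿ → ℝᵐ that preserves vector addition and scalar multiplication. Any such map can be written as multiplication by an m × n matrix.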

![linear](linear.jpg)
![compgraphic](computergraphic.png)
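A small sketch of matrices acting as linear transformations. The `rotation` and `projection` matrices and the vectors `u` and `v` are examples picked for this illustration: a 90-degree rotation of the plane (ℝ² → ℝ²) and a map that drops the third coordinate (ℝ³ → ℝ²).

```python
import numpy as np

# A 90-degree counter-clockwise rotation of the plane (R^2 -> R^2).
rotation = np.array([[0, -1],
                     [1,  0]])

# A map from R^3 to R^2 that drops the z coordinate.
projection = np.array([[1, 0, 0],
                       [0, 1, 0]])

u = np.array([1, 0])
v = np.array([2, 3])

rotated_u = np.dot(rotation, u)
projected_point = np.dot(projection, np.array([4, 5, 6]))
print(rotated_u)
print(projected_point)

# Linearity: T(2u + 3v) equals 2T(u) + 3T(v).
lhs = np.dot(rotation, 2 * u + 3 * v)
rhs = 2 * np.dot(rotation, u) + 3 * np.dot(rotation, v)
print(np.array_equal(lhs, rhs))
```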

## Eigenvalues and Eigenvectors

- [Great math formula explanation](http://www.visiondummy.com/2014/03/eigenvalues-eigenvectors/)
- [Visual explanation of Eigenvectors and Eigenvalues](http://setosa.io/ev/eigenvectors-and-eigenvalues/)

In [5]:
# Consider the matrix A below.
# 1. Form the characteristic polynomial det(A - lambda * I).
# 2. Solve det(A - lambda * I) = 0; the roots are the eigenvalues.
# 3. For each eigenvalue, solve (A - lambda * I) v = 0 for the eigenvector v.

matrix = np.array(
    [[1, 2, 1],
     [6, -1, 0],
     [-1, -2, -1]]
)

eigvals, eigvecs = np.linalg.eig(matrix)
# Eigenvectors are the columns of eigvecs: eigvecs[:, i] pairs with eigvals[i].
print("The first two eigenvalues are {} and {}".format(eigvals[0], eigvals[1]))
print("Their eigenvectors are {} and {}".format(eigvecs[:, 0], eigvecs[:, 1]))

The first two eigenvalues are -3.999999999999999 and 2.9999999999999996
Their eigenvectors are [ 0.40824829 -0.81649658 -0.40824829] and [-0.48507125 -0.72760688  0.48507125]


In [7]:
eigvals

array([ -4.00000000e+00,   3.00000000e+00,   9.61673584e-17])
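A quick way to sanity-check an eigendecomposition is to verify the defining property A v = λ v for every pair. The sketch below does that, reading eigenvectors from the columns of `eigvecs`, and also compares the eigenvalues with the roots of the characteristic polynomial obtained from `np.poly`. The third eigenvalue printed above (about 9.6e-17) is zero up to rounding.

```python
import numpy as np

matrix = np.array([[1, 2, 1],
                   [6, -1, 0],
                   [-1, -2, -1]])

eigvals, eigvecs = np.linalg.eig(matrix)

# Eigenvectors are the COLUMNS of eigvecs: eigvecs[:, i] pairs with eigvals[i].
for i in range(len(eigvals)):
    vec = eigvecs[:, i]
    # Defining property: A v = lambda v (up to floating-point rounding).
    print(np.allclose(np.dot(matrix, vec), eigvals[i] * vec))

# The eigenvalues are the roots of the characteristic polynomial det(A - lambda*I).
char_poly = np.poly(matrix)   # coefficients, highest power first
print(np.allclose(np.sort(np.roots(char_poly)), np.sort(eigvals)))
```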

## Principal Components Analysis

In simple terms, principal component analysis (PCA) is a method of extracting important variables (in the form of components) from a large set of variables in a data set. It extracts a low-dimensional set of features from a high-dimensional data set with the aim of capturing as much information as possible. With fewer variables, visualization also becomes much more meaningful. PCA is most useful when dealing with data of 3 or more dimensions.

**Steps**
1. Eigendecomposition - compute eigenvectors and eigenvalues of the covariance or correlation matrix
2. Select Principal Components - sort eigenvalues and compute explained variance
3. Project onto New Feature Space - visualize transformed data on new subspace
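
The three steps above can be sketched with plain NumPy. The data here is synthetic (an assumption made purely for illustration), and `np.linalg.eigh` is used because a covariance matrix is symmetric. Treat this as a minimal sketch rather than a full PCA implementation.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 200 samples, 3 features,
# the first two of which are strongly correlated.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
data = np.hstack([
    latent + 0.1 * rng.normal(size=(200, 1)),
    2 * latent + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

# Step 1: eigendecomposition of the covariance matrix of the centered data.
centered = data - data.mean(axis=0)
covariance = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(covariance)   # eigh: for symmetric matrices

# Step 2: sort components by eigenvalue (largest first), compute explained variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained_variance_ratio = eigvals / eigvals.sum()
print(explained_variance_ratio)

# Step 3: project onto the first two principal components.
n_components = 2
projected = np.dot(centered, eigvecs[:, :n_components])
print(projected.shape)
```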

![component](imgs/components.png)


The first principal component is a linear combination of the original predictor variables that captures the maximum variance in the data set. It gives the direction of highest variability in the data. The larger the variability captured by the first component, the more information that component captures. No other component can have higher variability than the first principal component.

Principal components are uncorrelated with one another, so their directions are orthogonal (image above). The image is based on simulated data with 2 predictors. Notice that the component directions are orthogonal, as expected, which means the correlation between the components is zero.
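Both properties can be checked numerically. The sketch below simulates its own 2-predictor data (it is not the data behind the image) and confirms that the component directions are orthogonal and that the component scores have zero covariance.

```python
import numpy as np

# Simulated data with 2 correlated predictors (an assumption for illustration).
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = 0.8 * x1 + 0.3 * rng.normal(size=500)
data = np.column_stack([x1, x2])

centered = data - data.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))

# The two component directions are orthogonal: their dot product is ~0.
print(np.isclose(np.dot(eigvecs[:, 0], eigvecs[:, 1]), 0.0))

# The component scores are uncorrelated: off-diagonal covariance is ~0.
scores = np.dot(centered, eigvecs)
score_cov = np.cov(scores, rowvar=False)
print(np.isclose(score_cov[0, 1], 0.0))

# The largest eigenvalue is the variance captured by the first principal component.
print(np.isclose(score_cov.diagonal().max(), eigvals.max()))
```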

### Eigenfaces

So if we run principal component analysis on our face images, what we’re essentially doing is producing faces that capture most of the variance across the images, where the principal components are mutually orthogonal (so each one captures a different direction of variation in the faces). We’ll call these PCA faces “eigenfaces.”
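
A rough sketch of the mechanics, using random arrays as stand-ins for face images (real eigenfaces need aligned face photographs, which are not part of this notebook). It uses the SVD of the centered data, a common equivalent to the eigendecomposition of the covariance matrix; all sizes and names are assumptions for illustration.

```python
import numpy as np

# Stand-in data (an assumption): 50 random 8x8 grayscale "images".
rng = np.random.default_rng(2)
n_images, height, width = 50, 8, 8
images = rng.random((n_images, height, width))

# Flatten each image into a row vector and subtract the mean face.
flat = images.reshape(n_images, height * width)
mean_face = flat.mean(axis=0)
centered = flat - mean_face

# Rows of vt are the principal directions in pixel space,
# ordered by decreasing singular value.
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)

# Reshape the leading directions back into image form: these are the eigenfaces.
n_eigenfaces = 5
eigenfaces = vt[:n_eigenfaces].reshape(n_eigenfaces, height, width)
print(eigenfaces.shape)

# Eigenfaces are mutually orthogonal unit vectors in pixel space.
gram = np.dot(vt[:n_eigenfaces], vt[:n_eigenfaces].T)
print(np.allclose(gram, np.identity(n_eigenfaces)))
```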

![eigenfaces](eigenface.png)