# Principal Component Analysis

A small notebook of PCA exercises, written for the Oxford ML course.

In [1]:
using LinearAlgebra
using Statistics

## Initial small example

In [2]:
M = [1 1; 1 1]

2×2 Matrix{Int64}:
 1  1
 1  1

In [3]:
eigvals(M)

2-element Vector{Float64}:
 0.0
 2.0

In [4]:
eigvecs(M)

2×2 Matrix{Float64}:
 -0.707107  0.707107
  0.707107  0.707107

## Animal Size Data

In [5]:
A = [8.3 9.1 4.0 11.5; 3.7 5.1 1.0 6.7; 10.3 12.3 5.2 14.2]

3×4 Matrix{Float64}:
  8.3   9.1  4.0  11.5
  3.7   5.1  1.0   6.7
 10.3  12.3  5.2  14.2

In [6]:
SA = A' * A

4×4 Matrix{Float64}:
 188.67  221.09   90.46  266.5
 221.09  260.11  105.46  313.48
  90.46  105.46   44.04  126.54
 266.5   313.48  126.54  378.78

In [7]:
eigvals(SA)

4-element Vector{Float64}:
  -1.1671165167121752e-14
   0.29219181566468994
   2.030482260246855
 869.2773259240886

In [8]:
eigvecs(SA)

4×4 Matrix{Float64}:
  0.689835   0.379794    0.404069   -0.465416
  0.115457  -0.827826   -0.0488346  -0.546799
 -0.606347   0.0170623   0.763381   -0.222039
 -0.378339   0.412517   -0.50159    -0.659618
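
Since A has only 3 rows, A' * A has rank at most 3, so the smallest eigenvalue reported above (about -1.2e-14) is zero up to rounding error. The sketch below is an addition, not part of the original notebook: it wraps the matrix in `Symmetric` so that `eigen` returns real eigenvalues in ascending order, then picks the last column as the leading direction. The names `F`, `pc1`, and `share` are illustrative, and the sketch follows the notebook in using the uncentred product A' * A.

```julia
using LinearAlgebra

A  = [8.3 9.1 4.0 11.5; 3.7 5.1 1.0 6.7; 10.3 12.3 5.2 14.2]
SA = A' * A

# For a Symmetric matrix, eigen returns eigenvalues in ascending order,
# with the matching eigenvectors stored as columns.
F = eigen(Symmetric(SA))
λ_max = F.values[end]        # largest eigenvalue
pc1   = F.vectors[:, end]    # its eigenvector (defined only up to sign)

# Fraction of the trace of SA carried by the leading direction.
share = λ_max / sum(F.values)
```

Taking the last column rather than a hard-coded index 4 keeps the code valid if the number of columns in A changes.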

### Iterate towards eigenvectors by multiplication and renormalisation

In [9]:
# One step of power iteration: multiply by SA, then rescale to unit length
v0 = [1,1,1,1]
v0p = SA * v0
v1 = v0p / sqrt(v0p'*v0p)

4-element Vector{Float64}:
 0.46572260163976376
 0.5467648458890038
 0.22262016577234636
 0.6592351048096248

In [10]:
# Rank-1 approximation of SA from the approximate principal eigenvector and the largest eigenvalue
(v1 * v1') * eigvals(SA)[4]

4×4 Matrix{Float64}:
 188.544  221.353   90.126   266.886
 221.353  259.872  105.809   313.328
  90.126  105.809   43.0812  127.574
 266.886  313.328  127.574   377.78
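
The single multiply-and-renormalise step above can be repeated until the direction stops changing. The following is a minimal sketch of that idea, not the notebook's own code: the function name `power_iteration` and its `tol` and `maxiter` keywords are hypothetical choices made for this illustration. It assumes the matrix has a unique largest positive eigenvalue, which holds for SA.

```julia
using LinearAlgebra

# Hypothetical helper (not in the original notebook): power iteration with a
# convergence tolerance and an iteration cap.
function power_iteration(S::AbstractMatrix; tol=1e-12, maxiter=1000)
    v = normalize(ones(size(S, 1)))   # start from the all-ones direction
    for _ in 1:maxiter
        w = normalize(S * v)          # multiply, then rescale to unit length
        if norm(w - v) < tol          # stop once the direction has settled
            return w
        end
        v = w
    end
    return v                          # best estimate if the cap is reached
end

A  = [8.3 9.1 4.0 11.5; 3.7 5.1 1.0 6.7; 10.3 12.3 5.2 14.2]
SA = A' * A

v = power_iteration(SA)
λ = dot(v, SA * v)                    # Rayleigh quotient of a unit vector

# Reference values from the dense solver; eigenvectors match only up to sign.
F = eigen(Symmetric(SA))
v_ref = F.vectors[:, end]
```

Because the second-largest eigenvalue of SA is far smaller than the largest, the iteration settles after only a few steps here; matrices with closely spaced leading eigenvalues converge much more slowly.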

## Exercise Example 1

Calculate the principal eigenvector by repeated multiplication and renormalisation.

In [11]:
M1 = [0.29 0.58 0.8; 0.58 1.16 1.6; 0.8 1.6 2.21]

3×3 Matrix{Float64}:
 0.29  0.58  0.8
 0.58  1.16  1.6
 0.8   1.6   2.21

In [12]:
# Three rounds of power iteration: multiply by M1, then rescale to unit length
V0 = [1,1,1]
V0p = M1 * V0
V1 = V0p / sqrt(dot(V0p,V0p))
V1p = M1 * V1
V2 = V1p / sqrt(dot(V1p,V1p))
V2p = M1 * V2
V3 = V2p / sqrt(dot(V2p,V2p))

3-element Vector{Float64}:
 0.2814622663210985
 0.562924532642197
 0.7771067900790439

Estimate the eigenvalue from how much multiplication by M1 scales each component of the vector, averaged over the components.

In [13]:
EV1 = mean(M1 * V3 ./ V3)

3.6587700784757935
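
An alternative to averaging component-wise ratios is the Rayleigh quotient, v' M v / v' v, which gives a single estimate and avoids dividing by components of v that might be close to zero. The sketch below is an illustration added here, not the notebook's method; `iterate_direction` and `rayleigh` are hypothetical names.

```julia
using LinearAlgebra

# Hypothetical helper: a fixed number of multiply-and-renormalise steps,
# starting from the all-ones direction as in the cell above.
function iterate_direction(M::AbstractMatrix, steps::Integer)
    v = normalize(ones(size(M, 1)))
    for _ in 1:steps
        v = normalize(M * v)
    end
    return v
end

M1 = [0.29 0.58 0.8; 0.58 1.16 1.6; 0.8 1.6 2.21]
v  = iterate_direction(M1, 3)

# Rayleigh quotient: eigenvalue estimate for the current direction v.
rayleigh = dot(v, M1 * v) / dot(v, v)

# Reference value from the dense solver, for comparison.
λ_ref = maximum(eigvals(Symmetric(M1)))
```

For a symmetric matrix the error of the Rayleigh quotient shrinks with the square of the error in the vector, so it is usually a tighter estimate than the component-wise mean at the same iteration count.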

Now reconstruct the original matrix from the principal eigenvector and eigenvalue alone (a rank-1 approximation).

In [14]:
RecoM1 = (V3 * V3') * EV1

3×3 Matrix{Float64}:
 0.289851  0.579703  0.800269
 0.579703  1.15941   1.60054
 0.800269  1.60054   2.20951

How good is this? Look at the element-wise relative difference.

In [15]:
reldiff = (M1 .- RecoM1) ./ M1

3×3 Matrix{Float64}:
  0.000512237   0.000512237  -0.00033627
  0.000512237   0.000512237  -0.00033627
 -0.00033627   -0.00033627    0.000220443

In [16]:
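# Root-mean-square relative difference; mean already divides by the number
# of entries, so no further division by the matrix size is needed.
rms = round(sqrt(mean(reldiff .^ 2)), sigdigits=6)

0.000415057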
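
Another common way to judge a rank-1 reconstruction is the relative Frobenius-norm error, which yields one number for the whole matrix and does not divide by individual entries. This is a sketch under the assumption that the exact leading eigenpair from `eigen` is used in place of the iterated one; `rank1` and `rel_frobenius` are illustrative names.

```julia
using LinearAlgebra

M1 = [0.29 0.58 0.8; 0.58 1.16 1.6; 0.8 1.6 2.21]

# Leading eigenpair of the symmetric matrix (eigenvalues come back ascending).
F = eigen(Symmetric(M1))
λ = F.values[end]
v = F.vectors[:, end]

# Rank-1 reconstruction and its relative error; norm of a matrix is the
# Frobenius norm in Julia.
rank1 = λ * (v * v')
rel_frobenius = norm(M1 - rank1) / norm(M1)

# For a symmetric matrix, the residual norm equals the root of the summed
# squares of the discarded eigenvalues.
residual_from_spectrum = sqrt(sum(abs2, F.values[1:end-1]))
```

Comparing `norm(M1 - rank1)` with `residual_from_spectrum` is a quick consistency check that the reconstruction error comes entirely from the discarded eigenvalues.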