$A$ is a demeaned data matrix.

1. Compute the first, second and third principal axes of $A$.
2. Compute the first principal component of $A$.
3. Compute the total variance of $A$.
4. Compute the fraction of the total variance of $A$ captured by the first principal component, and the fraction captured by the first two principal components.

Import the modules we need.

In [1]:
%matplotlib inline

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Enter the entries of matrix $A$ into the list `a`, row by row.

In [2]:
a = [-27, 38, 37, 8, 22, 0, 14, -25, -35, -19, -23, -16, 36, 32, 14, 34, 39, 33, -40, -24, -24, -14, -3, -34, -2, -30, 3, 10, -26, 22]

`A_demeaned` is the $10 \times 3$ array holding matrix $A$, which is already demeaned (each column has mean zero).

In [3]:
A = np.array(a)
A_demeaned = A.reshape(10,3)
A_demeaned

array([[-27,  38,  37],
       [  8,  22,   0],
       [ 14, -25, -35],
       [-19, -23, -16],
       [ 36,  32,  14],
       [ 34,  39,  33],
       [-40, -24, -24],
       [-14,  -3, -34],
       [ -2, -30,   3],
       [ 10, -26,  22]])
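
The claim that $A$ is already demeaned can be checked directly. The following is a minimal sketch; the names `A_check`, `col_means` and `is_demeaned` are illustrative and not part of the original notebook, and the matrix is re-entered so the cell runs on its own.

```python
import numpy as np

# Same 10 x 3 matrix as above, re-entered so this check is self-contained.
A_check = np.array([
    [-27,  38,  37], [  8,  22,   0], [ 14, -25, -35], [-19, -23, -16],
    [ 36,  32,  14], [ 34,  39,  33], [-40, -24, -24], [-14,  -3, -34],
    [ -2, -30,   3], [ 10, -26,  22],
])

# A data matrix is demeaned when every column (variable) has mean zero.
col_means = A_check.mean(axis=0)
print(col_means)

is_demeaned = np.allclose(col_means, 0)
print(is_demeaned)
```

For a data matrix `X` that is not yet demeaned, subtracting the column means, `X - X.mean(axis=0)`, would produce one.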

Compute the covariance matrix `CA` of the demeaned matrix $A$, where $N$ is the number of rows (observations) of $A$:
$$C_A = \frac{1}{N}A^TA$$

In [4]:
N = A_demeaned.shape[0]     # number of rows (observations) of A
M = A_demeaned.shape[1]     # number of columns (variables) of A

CA = (1 / N) * A_demeaned.T @ A_demeaned

with np.printoptions(precision=2):
    print(CA)

[[570.2 251.7 209.1]
 [251.7 778.8 440. ]
 [209.1 440.  636. ]]
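
As a cross-check, the same matrix should come out of `np.cov` when it is told to treat columns as variables (`rowvar=False`) and to normalise by $N$ rather than the default $N-1$ (`bias=True`). This is a sketch with illustrative names (`A_cov`, `CA_manual`, `CA_numpy`), not part of the original notebook.

```python
import numpy as np

A_cov = np.array([
    [-27,  38,  37], [  8,  22,   0], [ 14, -25, -35], [-19, -23, -16],
    [ 36,  32,  14], [ 34,  39,  33], [-40, -24, -24], [-14,  -3, -34],
    [ -2, -30,   3], [ 10, -26,  22],
])
n_rows = A_cov.shape[0]

# Covariance from the formula C_A = (1/N) A^T A (valid because A is demeaned).
CA_manual = A_cov.T @ A_cov / n_rows

# NumPy's estimator: columns are variables, normalise by N instead of N - 1.
CA_numpy = np.cov(A_cov, rowvar=False, bias=True)

print(np.allclose(CA_manual, CA_numpy))
```

Note that calling `np.cov` with its defaults would divide by $N-1$ and give slightly larger entries than the formula used here.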


The $i^{th}$ principal axis of $A$ is the eigenvector $u_i$ of $C_A$ corresponding to the eigenvalue $\lambda_i$, with the eigenvalues ordered from largest to smallest.

In [5]:
# eigenvals holds the eigenvalues of the covariance matrix;
# the columns of eigenvects are the corresponding eigenvectors
eigenvals, eigenvects = np.linalg.eig(CA)

# order the eigenvalues from the largest to the smallest 
order = np.argsort(eigenvals)[::-1]
eigenvals = eigenvals[order]

# and put the eigenvectors in the same order
eigenvects = eigenvects[:, order]
print(eigenvals)
print(eigenvects)

[1299.88121785  423.55268294  261.56609921]
[[ 0.4091695   0.91223061  0.02038715]
 [ 0.69594175 -0.29754986 -0.65355119]
 [ 0.59012321 -0.28160148  0.7566077 ]]
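
Because a covariance matrix is symmetric, `np.linalg.eigh` is an alternative to `np.linalg.eig`. It returns the eigenvalues in ascending order, so they are reversed below. The sketch also checks the defining relation $C_A u_i = \lambda_i u_i$ and the orthonormality of the axes. The names `CA_sym`, `vals`, `vecs` and `residual` are illustrative, and the eigenvectors may differ from those printed above by an overall sign, which does not change the axes they span.

```python
import numpy as np

# Covariance matrix from the previous step, re-entered to keep this self-contained.
CA_sym = np.array([[570.2, 251.7, 209.1],
                   [251.7, 778.8, 440.0],
                   [209.1, 440.0, 636.0]])

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
vals, vecs = np.linalg.eigh(CA_sym)
vals = vals[::-1]          # largest eigenvalue first
vecs = vecs[:, ::-1]       # reorder the columns to match

# Each column u_i should satisfy C_A u_i = lambda_i u_i.
residual = CA_sym @ vecs - vecs * vals
print(np.abs(residual).max())

# The principal axes form an orthonormal set.
orthonormal = np.allclose(vecs.T @ vecs, np.eye(3))
print(orthonormal)
```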


First principal axis $u_1$ of $A$:

In [6]:
u_1 = eigenvects[:,0]
print(u_1)

[0.4091695  0.69594175 0.59012321]


Second principal axis $u_2$ of $A$:

In [7]:
u_2 = eigenvects[:,1]
print(u_2)

[ 0.91223061 -0.29754986 -0.28160148]


Third principal axis $u_3$ of $A$:

In [8]:
u_3 = eigenvects[:,2]
print(u_3)

[ 0.02038715 -0.65355119  0.7566077 ]


The first principal component of $A$ is the projection of the data onto the first principal axis, $Y_1 = Au_1$:

In [9]:
Y_1 = A_demeaned @ u_1
print(Y_1)

[ 37.23276857  18.58407452 -32.32448294 -33.2228521   45.26196298
  60.52755715 -47.23233906 -27.88038731 -19.92622189  -1.02007993]
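
The same projection can be applied to all axes at once, $Y = AU$, where the columns of $U$ are the principal axes. The sketch below, with illustrative names (`A_pc`, `U`, `Y`, `cov_Y`) and `np.linalg.eigh` in place of `np.linalg.eig`, checks that the components are uncorrelated and that the variance of the $i^{th}$ component equals $\lambda_i$.

```python
import numpy as np

A_pc = np.array([
    [-27,  38,  37], [  8,  22,   0], [ 14, -25, -35], [-19, -23, -16],
    [ 36,  32,  14], [ 34,  39,  33], [-40, -24, -24], [-14,  -3, -34],
    [ -2, -30,   3], [ 10, -26,  22],
])
n_obs = A_pc.shape[0]
C_pc = A_pc.T @ A_pc / n_obs

# Eigen-decomposition, reordered so the largest eigenvalue comes first.
lam, U = np.linalg.eigh(C_pc)
lam, U = lam[::-1], U[:, ::-1]

# Column i of Y is the i-th principal component (up to an overall sign).
Y = A_pc @ U

# The covariance of the components should be diagonal with entries lambda_i.
cov_Y = Y.T @ Y / n_obs
print(np.round(cov_Y, 2))
```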


The sum of the diagonal entries (the trace) of the covariance matrix $C_A$ gives the total variance:

In [10]:
tot_variance = np.diag(CA).sum()
print(tot_variance)

1985.0
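
The trace of a matrix equals the sum of its eigenvalues, so the total variance can also be obtained from the eigenvalues. A short sketch of that check, with illustrative names (`CA_tv`, `trace_total`, `eig_total`):

```python
import numpy as np

CA_tv = np.array([[570.2, 251.7, 209.1],
                  [251.7, 778.8, 440.0],
                  [209.1, 440.0, 636.0]])

# Total variance as the trace of the covariance matrix.
trace_total = np.trace(CA_tv)

# Total variance as the sum of the eigenvalues (eigvalsh: symmetric input).
eig_total = np.linalg.eigvalsh(CA_tv).sum()

print(trace_total, eig_total)
```

This identity is what justifies dividing eigenvalues by the trace when computing the fraction of variance captured by each component.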


Dividing the largest eigenvalue $\lambda_1$ by the total variance gives the fraction of the total variance captured by the first principal component:

In [11]:
fr1 = eigenvals[0]/tot_variance
fr1

0.6548519989167443

In [12]:
np.round(fr1,2)

0.65

Dividing the sum of the two largest eigenvalues, $\lambda_1 + \lambda_2$, by the total variance gives the fraction of the total variance captured by the first two principal components:

In [13]:
first_two = eigenvals[0] + eigenvals[1]
fr2 = first_two/tot_variance
fr2

0.8682286653848535

In [14]:
np.round(fr2,2)

0.87
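
Both fractions can be read off a single cumulative sum over the sorted eigenvalues. This is a minimal sketch with illustrative names (`eigvals_sorted`, `explained`, `cumulative`); the eigenvalues are copied from the output above.

```python
import numpy as np

# Eigenvalues of C_A, already ordered from largest to smallest.
eigvals_sorted = np.array([1299.88121785, 423.55268294, 261.56609921])

# Fraction of the total variance captured by each component on its own.
explained = eigvals_sorted / eigvals_sorted.sum()

# cumulative[k-1] is the fraction captured by the first k components.
cumulative = np.cumsum(explained)
print(np.round(cumulative, 2))
```

The first two entries of `cumulative` correspond to `fr1` and `fr2`, and the last entry is 1 because all three components together capture the full variance.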