
# 1.4.1 Singular Value Decomposition

Let A be any m by n matrix. Then there exist matrices U, S, and V (of sizes m \* m, m \* n, and n \* n, respectively) such that A = U \* S \* V^T, where V^T is the transpose of V. Here U and V are orthogonal, and S is diagonal with the nonnegative singular values of A on its diagonal. This technique, singular value decomposition (SVD), is especially useful when you want to reduce the size of a matrix and discard less important data, such as in image compression or when speeding up computations on large datasets.
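
As a quick check of the shapes and the orthogonality stated above, here is a minimal sketch on a non-square matrix. The 3 by 2 matrix `B` and the helper name `full_svd_check` are illustrative assumptions and are not part of the assignment.

```python
import numpy as np

def full_svd_check(B):
    """Return U, the m x n matrix S, and V^T for B (hypothetical helper)."""
    # full_matrices=True gives U of size m x m and V^T of size n x n.
    U, s, V_T = np.linalg.svd(B, full_matrices=True)
    m, n = B.shape
    # np.linalg.svd returns the singular values as a 1-D array,
    # so place them on the diagonal of an m x n matrix of zeros.
    S_matrix = np.zeros((m, n))
    S_matrix[:len(s), :len(s)] = np.diag(s)
    return U, S_matrix, V_T

B = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # example data, assumed
U, S_matrix, V_T = full_svd_check(B)

print(U.shape, S_matrix.shape, V_T.shape)      # sizes m x m, m x n, n x n
print(np.allclose(U.T @ U, np.eye(3)))         # U is orthogonal
print(np.allclose(V_T @ V_T.T, np.eye(2)))     # V is orthogonal
print(np.allclose(U @ S_matrix @ V_T, B))      # B = U * S * V^T
```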

In [1]:
import numpy as np

matrix_string = input("Please enter 4 numbers, separating by commas (i.e. 1,2,3,4): ")
m = np.array([float(x) for x in matrix_string.split(',')])
# Arrange the four entries row by row into a 2x2 matrix.
A = np.array([[m[0], m[1]], [m[2], m[3]]])

print("\nLet A be a 2x2 matrix with entries in R:")
print(A)

# I first tried to do this manually by following the process described above,
# but the ordering of the eigenvalues and eigenvectors returned by np.linalg.eig
# did not match the ordering used by np.linalg.svd, so np.linalg.svd is used directly.

U, S, V_T = np.linalg.svd(A)

print("\nNow, we need to find the eigenvectors of A * A^T, and we let these eigenvectors be the columns of U:\n", U)
print("\nWe will repeat this process with A^T * A, finding a matrix V, and take its transpose V^T:\n", V_T)
print("\nFinally, we can take the square roots of the eigenvalues corresponding to U or V as a diagonal matrix S:\n", np.diag(S))

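# np.linalg.svd returns the singular values as a 1-D array,
# so place them on the diagonal of a matrix shaped like A.
S_matrix = np.zeros_like(A)
np.fill_diagonal(S_matrix, S)
A_svd = U @ S_matrix @ V_T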

print("\nWe see that U * S * V^T = \n", A_svd)

if np.allclose(A, A_svd):
    print("\nIndeed, A = U *S * V^T.")



Please enter 4 numbers, separating by commas (i.e. 1,2,3,4): 9,2,6,7

Let A be a 2x2 matrix with entries in R:
[[9. 2.]
 [6. 7.]]

Now, we need to find the eigenvectors of A * A^T, and we let these eigenvectors be the columns of U:
 [[-0.70710678 -0.70710678]
 [-0.70710678  0.70710678]]

We will repeat this process with A^T * A, finding a matrix V, and take its transpose V^T:
 [[-0.85749293 -0.51449576]
 [-0.51449576  0.85749293]]

Finally, we can take the square roots of the eigenvalues corresponding to U or V as a diagonal matrix S:
 [[12.36931688  0.        ]
 [ 0.          4.12310563]]

We see that U * S * V^T = 
 [[9. 2.]
 [6. 7.]]

Indeed, A = U *S * V^T.
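
The manual route described in the printed steps can be sketched as follows, using the same matrix as the run above. This is a sketch under assumptions, not the notebook's method: it uses `np.linalg.eigh` (which returns eigenvalues in ascending order for symmetric matrices), sorts them in descending order to match `np.linalg.svd`, and assumes every singular value is nonzero so that U can be recovered from A \* V.

```python
import numpy as np

A = np.array([[9.0, 2.0], [6.0, 7.0]])

# Eigen-decomposition of the symmetric matrix A^T * A.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)

# eigh sorts eigenvalues in ascending order; svd sorts singular values
# in descending order, so reorder before comparing.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
V = eigvecs[:, order]

# Singular values are the square roots of the eigenvalues of A^T * A.
singular_from_eig = np.sqrt(np.clip(eigvals, 0.0, None))

# Recover U column by column from u_i = A v_i / s_i (assumes s_i != 0).
U_from_eig = (A @ V) / singular_from_eig

_, s, V_T = np.linalg.svd(A)
print(np.allclose(singular_from_eig, s))
# Eigenvectors are only determined up to sign, so compare absolute values.
print(np.allclose(np.abs(V.T), np.abs(V_T)))
print(np.allclose(U_from_eig @ np.diag(singular_from_eig) @ V.T, A))
```

Building U from A \* V, rather than from a separate eigen-decomposition of A \* A^T, keeps the signs of the columns of U and V consistent with each other.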


# 1.4.2 Low-Rank Matrix Approximations

As mentioned in the previous section, we can use SVD to produce a lossy compression of a matrix by replacing it with a matrix of lower rank. Suppose A is m by n. Describing A takes m\*n numbers, while a rank-k approximation of A requires only k(m + n) numbers. The approximation therefore saves space whenever k(m + n) < m\*n, and the savings grow as A gets larger relative to k.
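
To make the storage count concrete, here is a small sketch comparing the two counts. The function names and the 1000 by 800 example are illustrative assumptions. The count k(m + n) follows the text, which amounts to folding each singular value into one of its vectors.

```python
def full_storage(m, n):
    """Numbers needed to store every entry of an m by n matrix."""
    return m * n

def rank_k_storage(m, n, k):
    """Numbers needed for k vector pairs (each u_i has m entries, each v_i has n)."""
    return k * (m + n)

for m, n, k in [(2, 2, 1), (1000, 800, 50)]:
    print(m, n, k, full_storage(m, n), rank_k_storage(m, n, k))
# prints:
# 2 2 1 4 4
# 1000 800 50 800000 90000
```

For a 2 by 2 matrix the rank-1 form needs as many numbers as the matrix itself, so the benefit only appears for larger matrices.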

In [3]:
import numpy as np

matrix_string = input("Please enter 4 numbers, separating by commas (i.e. 1,2,3,4): ")
m = np.array([float(x) for x in matrix_string.split(',')])
# Arrange the four entries row by row into a 2x2 matrix.
A = np.array([[m[0], m[1]], [m[2], m[3]]])

print("\nLet A be a 2x2 matrix with entries in R:")
print(A)

U, S, V_T = np.linalg.svd(A)
print("\nAgain, we use singular value decomposition to produce matrices U, S, and V^T such that A = U * S * V^T.")
print("\nU:\n", U)
print("\nS:\n", S)
print("\nV^T:\n", V_T)
print("\nIn order to find a rank-k approximation of A, we take the sum from i=1 to k of s_i(u_i * v_i^T),\nwhere s_i are the singular values of S, and u_i, v_i^T are the singular vectors of U and V^T.")
# Keep only the first k1 columns of U, singular values, and rows of V^T.
k1 = 1
U_k1 = U[:, :k1]
S_k1 = np.diag(S[:k1])
V_T_k1 = V_T[:k1, :]
A_approx_k1 = U_k1 @ S_k1 @ V_T_k1

print("\nA rank-1 approximation of A only uses the largest singular value from S:\n", A_approx_k1)


k2 = 2
U_k2 = U[:, :k2]
S_k2 = np.diag(S[:k2])
V_T_k2 = V_T[:k2, :]
A_approx_k2 = U_k2 @ S_k2 @ V_T_k2

print("\nOn the other hand, the rank-2 approximation uses both singular values from S:\n", A_approx_k2)
print("\nSince our rank-2 approximation is the same rank as A, it is precisely the same as A.")



Please enter 4 numbers, separating by commas (i.e. 1,2,3,4): 98,2,389,8

Let A be a 2x2 matrix with entries in R:
[[ 98.   2.]
 [389.   8.]]

Again, we use singular value decomposition to produce matrices U, S, and V^T such that A = U * S * V^T.

U:
 [[-0.24429411 -0.96970118]
 [-0.96970118  0.24429411]]

S:
 [4.01239330e+02 1.49536687e-02]

V^T:
 [[-0.99978879 -0.02055182]
 [-0.02055182  0.99978879]]

In order to find a rank-k approximation of A, we take the sum from i=1 to k of s_i(u_i * v_i^T),
where s_i are the singular values of S, and u_i, v_i^T are the singular vectors of U and V^T.

A rank-1 approximation of A only uses the largest singular value from S:
 [[ 97.99970199   2.01449753]
 [389.00007508   7.99634768]]

On the other hand, the rank-2 approximation uses both singular values from S:
 [[ 98.   2.]
 [389.   8.]]

Since our rank-2 approximation is the same rank as A, it is precisely the same as A.
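
The size of the error in the rank-1 approximation can also be checked directly. The sketch below rebuilds the rank-1 matrix from the sum formula s_1(u_1 \* v_1^T) for the same input as the run above, and compares the error with the discarded singular value. It relies on the standard fact that the spectral-norm error of a truncated SVD equals the first discarded singular value, and the Frobenius-norm error equals the square root of the sum of squares of all discarded ones.

```python
import numpy as np

A = np.array([[98.0, 2.0], [389.0, 8.0]])
U, s, V_T = np.linalg.svd(A)

# Rank-1 approximation from the sum formula: s_1 * (u_1 v_1^T).
A1 = s[0] * np.outer(U[:, 0], V_T[0, :])

err_spectral = np.linalg.norm(A - A1, ord=2)
err_frobenius = np.linalg.norm(A - A1, ord="fro")

print(np.linalg.matrix_rank(A1))
print(np.isclose(err_spectral, s[1]))
print(np.isclose(err_frobenius, np.sqrt(np.sum(s[1:] ** 2))))
```

Because the second singular value here is small compared with the first, the rank-1 matrix is already very close to A.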


# 1.4.3 Principal Component Analysis

Suppose we have a high-dimensional dataset with many components. There may be a lot of variation in the data, and we would like to understand which components account for it. With so many components this is a difficult task, but principal component analysis (PCA) allows us to do just that. We can reduce the dimension of our data by projecting each vector onto the components that have the most influence on the data's variation.

In fact, PCA can be computed directly from the singular value decomposition of the mean-centered data, so we will use SVD in the example below.
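
The link between PCA and SVD can be sketched on a small dataset. The random 6 by 3 matrix `X` below is a hypothetical example (rows are samples, columns are features), and the sample covariance with divisor n - 1 is an assumed convention. The check is that the rows of V^T from the SVD of the centered data are eigenvectors of the covariance matrix, with eigenvalues s_i^2 / (n - 1).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))          # hypothetical data: 6 samples, 3 features
n_samples = X.shape[0]

# Center each column, then take the SVD of the centered data.
X_centered = X - X.mean(axis=0)
_, s, V_T = np.linalg.svd(X_centered, full_matrices=False)

# Sample covariance matrix of the features.
cov = X_centered.T @ X_centered / (n_samples - 1)

# Variance captured along each principal component.
variances = s ** 2 / (n_samples - 1)

# eigvalsh returns ascending eigenvalues; reverse to match the SVD ordering.
print(np.allclose(np.linalg.eigvalsh(cov)[::-1], variances))
# Each row of V^T (column of V) satisfies cov @ v_i = variance_i * v_i.
print(np.allclose(cov @ V_T.T, V_T.T * variances))

# Fraction of the total variance explained by each component.
explained = s ** 2 / np.sum(s ** 2)
print(explained)
```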

In [7]:
import numpy as np

matrix_string = input("Please enter 6 numbers, separating by commas (i.e. 1,2,3,4,5,6): ")
m = np.array([float(x) for x in matrix_string.split(',')])
A = np.array([[m[0], m[1], m[2]],
              [m[3], m[4], m[5]]])

print("\nLet A be a 2x3 matrix with entries in R:")
print(A)

A_mean = A - np.mean(A, axis=0)
print("\nFirst, we need to center the data by subtracting the mean of each column.")
print("This will simplify the calculations we have to do later and remove bias from columns with large magnitudes:\n", A_mean)

U, S, V_T = np.linalg.svd(A_mean)
print("\nWe'll use singular value decomposition on A to find matrices U, S, and V^T.")
print("\nU:\n", U)
print("\nS (Singular values):\n", S)
print("\nV^T (our principal components):\n", V_T)

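# The principal components are the rows of V^T; keep the first one.
V_T_1 = V_T[:1, :]
# Coordinates of each centered row along that component.
A_projected_1 = A_mean @ V_T_1.T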

print("\nSuppose we project our data onto the first principal component, i.e. setting k = 1:\n", A_projected_1)

# Map the projection back to the original space and add the column means back.
A_approx_1 = A_projected_1 @ V_T_1 + np.mean(A, axis=0)

print("\nWe can take our projection and multiply it by the first column of V_T to approximate A:\n", A_approx_1)

V_T_2 = V_T[:2, :]
A_projected_2 = A_mean @ V_T_2.T

print("\nNow, let's observe the difference when we set k = 2, i.e. projecting our matrix onto the first two principal components:\n", A_projected_2)

A_approx_2 = A_projected_2 @ V_T_2 + np.mean(A, axis=0)

print("\nLet's use this to approximate A with k=2:\n", A_approx_2)

V_T_3 = V_T[:3, :]
A_projected_3 = A_mean @ V_T_3.T

print("\nFinally, let's project the matrix onto all three principal components, i.e. setting k = 3:\n", A_projected_3)

A_approx_3 = A_projected_3 @ V_T_3 + np.mean(A, axis=0)

print("\nSince we used the same number of principal components as our original matrix, we yield an exact reconstruction of A:\n", A_approx_3)



Please enter 6 numbers, separating by commas (i.e. 1,2,3,4,5,6): 12,125,894,89,8,7

Let A be a 2x3 matrix with entries in R:
[[ 12. 125. 894.]
 [ 89.   8.   7.]]

First, we need to center the data by subtracting the mean of each column.
This will simplify the calculations we have to do later and remove bias from columns with large magnitudes:
 [[ -38.5   58.5  443.5]
 [  38.5  -58.5 -443.5]]

We'll use singular value decomposition on A to find matrices U, S, and V^T.

U:
 [[-0.70710678  0.70710678]
 [ 0.70710678  0.70710678]]

S (Singular values):
 [6.34975196e+02 4.01943669e-14]

V^T (our principal components):
 [[ 0.08574701 -0.13029091 -0.98776097]
 [-0.98776097  0.11853247 -0.10138207]
 [ 0.13029091  0.98436494 -0.11853247]]

Suppose we project our data onto the first principal component, i.e. setting k = 1:
 [[-448.99526724]
 [ 448.99526724]]

We can take our projection and multiply it by the first column of V_T to approximate A:
 [[ 12. 125. 894.]
 [ 89.   8.   7.]]

Now, let's o