# Computational Linear Algebra
## A great project on Singular Value Decomposition

### by Paul, Placida, and Sean

## Know your audience
- Third year students with a prior course in linear algebra
- Prior exposure to programming recommended but not required
- In person; we are going to tell ourselves that it's an active learning class
- LMS agnostic
- Access to Syzygy assumed

## Big Ideas and Essential Questions
- Main idea: Singular Value Decomposition is a very stable and fast algorithm with many applications in both math and industry, and it works with **any** matrix
- Core understanding -- the decomposition provides us with lots of useful information:
     - Four fundamental subspaces
     - Rank
     - Norm of a matrix
     - Condition number: how changes in matrix entries can affect solutions
     - Pseudoinverse, least squares
     - Principal component analysis
- Basic understanding: singular values are usually not nice numbers, so in practice this is a numerical algorithm
- Essential questions:
     - How to compute it!
     - How to use SVD to obtain all the information listed above

## Learning goals
- How to implement SVD in Python using NumPy
- Relationship between singular values and eigenvalues
- To be able to prove the SVD theorem
- What is so fundamental about the four fundamental subspaces?
- Why is SVD the right way to compute rank?
- How to apply the pseudoinverse to least squares solutions
- Geometric interpretation of singular values

## Learning plan
- Walk through the steps of the SVD algorithm in an example
- Confirm that the results match what the built-in algorithm produces
- Try the algorithm on a few more examples
- Define the four fundamental subspaces, show how to find them from SVD
- Do examples and confirm that SVD gives the same nullspace as "old" methods
- Also use SVD to determine rank in these examples
- Introduce pseudoinverse and have students work through an example

## Notebook design
- Code examples provided
- Ensure students understand how to interpret output by comparing algorithm output to initial worked example
- Students expected to modify/copy code for their own examples

# The Singular Value Decomposition

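For any $m\times n$ matrix $A\text{,}$ the matrices $A^TA$ and $AA^T$ are both symmetric and positive, meaning that $\mathbf{x}^TM\mathbf{x}\geq 0$ for every vector $\mathbf{x}$ (this property is often called positive semi-definite). (Exercise!) This means that we can define $\sqrt{A^TA}\text{,}$ even if $A$ itself is not symmetric or positive.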

- Since $A^TA$ is symmetric, we know that it can be diagonalized.
- Since $A^TA$ is positive, we know its eigenvalues are non-negative.
- This means we can define the singular values $\sigma_i = \sqrt{\lambda_i}$ for each $i=1,\ldots, n$, where $\lambda_1\geq\cdots\geq\lambda_n$ are the eigenvalues of $A^TA$.
- This works even if $A$ is not a square matrix!
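The claims in this list can be checked numerically before proving them. The following is a minimal sketch, assuming a random $4\times 3$ matrix (the names `M`, `G`, and the seed are illustrative choices, not part of the original notebook); it uses `np.linalg.eigvalsh`, which is intended for symmetric matrices.

```python
import numpy as np

# Hypothetical example: a random 4x3 matrix (the seed is arbitrary).
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 3))

G = M.T @ M                       # the 3x3 matrix M^T M
lam = np.linalg.eigvalsh(G)       # real eigenvalues, in ascending order

print(np.allclose(G, G.T))        # is G symmetric?
print(lam)                        # should be non-negative up to rounding

# Singular values as square roots of the eigenvalues, largest first.
sigma = np.sqrt(np.clip(lam, 0, None))[::-1]
print(sigma)
print(np.linalg.svd(M, compute_uv=False))   # built-in values for comparison
```

A numerical check like this is not a proof, but it is a quick way to catch a mistake in the statement being proved.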

The singular value decomposition has the form

$$A = P\Sigma_A Q^T,$$

where $\Sigma_A$ is a matrix containing the singular values of $A$. There are two conventions:
1. $\Sigma_A$ has the same size as $A$: its upper-left block is diagonal, with the non-zero singular values of $A$ on the diagonal, and all other entries are zero.
2. $\Sigma_A$ is truncated to include only the diagonal matrix of singular values.

The sizes of $P$ and $Q$ depend on which convention we choose. The function `np.linalg.svd` in `NumPy` returns the full matrices $P$ and $Q^T$ (note the transpose) by default, but does not return $\Sigma_A$; it only lists the singular values.
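To make the two conventions concrete, here is a small sketch using the $2\times 3$ matrix from the initial example in the next section. The variable names (`Sigma_full`, `Sigma_reduced`, and so on) are illustrative; `full_matrices=False` is the `np.linalg.svd` option that requests the truncated form.

```python
import numpy as np

A = np.array([[1, 0, 1], [0, 1, 2]])

# Full SVD: P is 2x2, Qt is 3x3, and s is a 1-D array of singular values.
P, s, Qt = np.linalg.svd(A)       # the third output is Q^T, not Q

# Convention 1: Sigma has the same shape as A.
Sigma_full = np.zeros(A.shape)
np.fill_diagonal(Sigma_full, s)

# Convention 2: truncated form with a square diagonal Sigma.
P_r, s_r, Qt_r = np.linalg.svd(A, full_matrices=False)
Sigma_reduced = np.diag(s_r)

print(P.shape, Sigma_full.shape, Qt.shape)
print(P_r.shape, Sigma_reduced.shape, Qt_r.shape)
print(np.allclose(P @ Sigma_full @ Qt, A))
print(np.allclose(P_r @ Sigma_reduced @ Qt_r, A))
```

Both products should reproduce $A$; only the shapes of the factors differ.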

## 1. Initial example

Let $A = \begin{bmatrix}1&0&1\\0&1&2\end{bmatrix}$. Compute $A^TA$, and find the singular values of $A$ by determining the eigenvalues of $A^TA$.

First, let's load the required libraries.

In [None]:
import numpy as np
import scipy.linalg as la

Next, let's define our matrix $A$ as a NumPy array, and compute $B=A^TA$.

In [None]:
A = np.array([[1,0,1],[0,1,2]])
B = (A.T)@A
print(A)
print(B)

Next, let's find the eigenvalues of $B$.

In [None]:
eigvals = la.eig(B)[0].real
print(eigvals)

It looks like our eigenvalues are $6$, $1$, and $0$ (up to rounding error). Now, let's get the singular values.

In [None]:
singvals = []
for ev in eigvals:
    # clamp tiny negative rounding errors to zero before taking the root
    singvals.append(np.sqrt(max(ev, 0)))
print(singvals)
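The same computation can be written without a loop. This is a sketch of one alternative, not the notebook's method: it swaps in `np.linalg.eigvalsh`, which assumes a symmetric matrix and returns real eigenvalues in ascending order, so the result is reversed to list the largest singular value first.

```python
import numpy as np

B = np.array([[1, 0, 1], [0, 1, 2], [1, 2, 5]])   # the matrix A^T A

eigvals_sym = np.linalg.eigvalsh(B)           # real eigenvalues, ascending
eigvals_sym = np.clip(eigvals_sym, 0, None)   # remove tiny negative rounding errors
singvals_vec = np.sqrt(eigvals_sym)[::-1]     # largest singular value first
print(singvals_vec)
```

The values should be close to $\sqrt{6}$, $1$, and $0$.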

The matrix $Q$ is an orthogonal $n\times n$ matrix whose columns are an orthonormal basis of eigenvectors for $A^TA\text{.}$ The matrix $P$ is an orthogonal $m\times m$ matrix whose columns are an orthonormal basis of $\mathbb{R}^m\text{.}$ (The first $r$ columns of $P$ are given by $\frac{1}{\sigma_i}A\mathbf{q}_i\text{,}$ where $\mathbf{q}_i$ is the eigenvector of $A^TA$ corresponding to the positive singular value $\sigma_i\text{,}$ and $r$ is the number of positive singular values.)

First, let's get the eigenvectors.

In [None]:
eigvects = la.eig(B)[1].real
print(eigvects)

The columns of this matrix are the eigenvectors of $A^TA$. Since $A^TA$ is symmetric and its eigenvalues are distinct, we know that these eigenvectors are orthogonal; the `la.eig` command returns eigenvectors normalized to unit length, so we can proceed directly to forming the matrix $Q$. In fact, we don't have to proceed anywhere. This **is** the matrix $Q$!

In [None]:
Q = eigvects
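It is worth confirming that this matrix really is orthogonal and that its columns really are eigenvectors. A minimal sketch of such a check, rebuilt from scratch so that it runs on its own (the names `w` and `Q_check` are illustrative):

```python
import numpy as np
import scipy.linalg as la

A = np.array([[1, 0, 1], [0, 1, 2]])
B = A.T @ A

w, Q_check = la.eig(B)
w = w.real                  # eigenvalues of a symmetric matrix are real
Q_check = Q_check.real

# Orthogonality: Q^T Q should be the identity.
print(np.allclose(Q_check.T @ Q_check, np.eye(3)))

# Eigenvector equation: B q_j = lambda_j q_j for every column j.
# Multiplying by w scales column j of Q_check by w[j].
print(np.allclose(B @ Q_check, Q_check * w))
```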

For later reference, we want to extract the eigenvectors, which are the columns of $Q$. The command `Q[:,i]` will extract column `i`, but it returns a one-dimensional array rather than a column, so we also need to reshape it as a $3\times 1$ column. We do this as follows:

In [None]:
q1 = Q[:,0].reshape(3,1)
q2 = Q[:,1].reshape(3,1)
q3 = Q[:,2].reshape(3,1)
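As an aside, indexing with a list keeps the column dimension, so the reshape can be avoided. The sketch below uses a stand-in matrix `Q_demo` (a hypothetical name) so that it runs on its own.

```python
import numpy as np

# Stand-in matrix for illustration; any 3x3 array behaves the same way.
Q_demo = np.arange(9.0).reshape(3, 3)

col_1d = Q_demo[:, 0]                       # shape (3,): one-dimensional
col_reshaped = Q_demo[:, 0].reshape(3, 1)   # shape (3, 1), via reshape
col_sliced = Q_demo[:, [0]]                 # shape (3, 1), via a list index

print(col_1d.shape, col_reshaped.shape, col_sliced.shape)
print(np.array_equal(col_reshaped, col_sliced))
```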

Next, we want to construct the matrix $P$. The columns of $P$ are eigenvectors of $AA^T$, so we proceed as above.

In [None]:
C = A@(A.T)
Ceval,Cevec = la.eig(C)

In [None]:
P = Cevec
print(P)

Note: the columns of $P$ must be listed in order of decreasing eigenvalue, to match the order of the singular values. We have not yet checked that `la.eig` returned them in that order.

We want to see if this works! We're going to check two things:
1. Whether or not $P\Sigma_AQ^T$ is equal to $A$
2. Whether the results we found here agree with the built-in `np.linalg.svd` command.

First, we need to construct $\Sigma_A$. Let's just do that by hand.

In [None]:
sigA = np.array([[singvals[0],0,0],[0,singvals[1],0]])
print(sigA)

In [None]:
P@sigA@Q.T

It didn't work! Two things could have gone wrong. First, we need to ensure that the order of our eigenvectors for both $P$ and $Q$ is consistent with the decreasing order of singular values. Let's check how the eigenvalues are ordered for $P$.

In [None]:
print(Ceval.real)

Oh, dang! Our vectors were in the wrong order! Let's look at the eigenvectors again.

In [None]:
print(Cevec.real)

Can we swap the columns without manually writing out the entries? Swapping rows is easy, so let's try this: transpose to turn columns into rows, swap the rows, and then transpose back.

In [None]:
Cevec2 = Cevec.T
Cevec3 = np.array([Cevec2[1],Cevec2[0]])
Cevec4 = Cevec3.T
print(Cevec4)
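For larger matrices, swapping by hand does not scale. One possible alternative, sketched here under the assumption that the eigenvalues are real, is to sort with `np.argsort` and use the resulting index array to reorder the columns directly (the names `order`, `Ceval_sorted`, and `P_sorted` are illustrative).

```python
import numpy as np
import scipy.linalg as la

A = np.array([[1, 0, 1], [0, 1, 2]])
C = A @ A.T
Ceval, Cevec = la.eig(C)

order = np.argsort(Ceval.real)[::-1]   # indices of the eigenvalues, largest first
Ceval_sorted = Ceval.real[order]
P_sorted = Cevec.real[:, order]        # reorder the columns to match

print(Ceval_sorted)
print(P_sorted)
```

This works no matter which order `la.eig` happens to return.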

Success! Now, let's see if we get back the matrix $A$.

In [None]:
P = Cevec4
print(P@sigA@Q.T)
print(np.round(P@sigA@Q.T))

Hooray!!!! It worked! But there is one other thing that could go wrong: each eigenvector is determined only up to sign. Change the signs on one column, and you might not get back your matrix. How can we make sure everything matches up?
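To see the sign problem in action, here is a short sketch. It starts from factors that are known to work (taken from `np.linalg.svd`, purely for convenience) and flips the sign of one column of $P$; the names `P_flipped` and `Qt_flipped` are illustrative.

```python
import numpy as np

A = np.array([[1, 0, 1], [0, 1, 2]])
P_ok, s, Qt_ok = np.linalg.svd(A)
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, s)

# Flip one column of P: its columns are still orthonormal eigenvectors of A A^T.
P_flipped = P_ok.copy()
P_flipped[:, 0] *= -1

print(np.allclose(P_ok @ Sigma @ Qt_ok, A))        # True
print(np.allclose(P_flipped @ Sigma @ Qt_ok, A))   # False

# Flipping the matching column of Q (row of Q^T) restores the product.
Qt_flipped = Qt_ok.copy()
Qt_flipped[0, :] *= -1
print(np.allclose(P_flipped @ Sigma @ Qt_flipped, A))   # True
```

So the signs of the columns of $P$ and $Q$ cannot be chosen independently.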

### Alternative construction of $P$

Following the text by Keith Nicholson, we get the columns of $P$ using the formula
$$p_i = \frac{1}{\lVert Aq_i\rVert}Aq_i,$$
where $p_i,q_i$ represent the $i$th columns of $P$ and $Q$, respectively.
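A short calculation shows why this construction works. Since $q_i$ is a unit eigenvector of $A^TA$ with eigenvalue $\lambda_i = \sigma_i^2$,

$$\lVert Aq_i\rVert^2 = (Aq_i)^T(Aq_i) = q_i^T(A^TAq_i) = \lambda_i\, q_i^Tq_i = \sigma_i^2,$$

so the normalizing factor is just $\lVert Aq_i\rVert = \sigma_i$, and $p_i = \frac{1}{\sigma_i}Aq_i$ whenever $\sigma_i > 0$. The same calculation with two different indices $i\neq j$ gives

$$(Aq_i)^T(Aq_j) = q_i^T(A^TAq_j) = \lambda_j\, q_i^Tq_j = 0,$$

so the vectors $Aq_i$ are automatically orthogonal to each other.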

Earlier, we turned the eigenvectors of $A^TA$ into column vectors called `q1,q2,q3`. The first two of these correspond to the non-zero singular values. Let's multiply them by $A$, and then normalize.

In [None]:
p1 = A@q1
p2 = A@q2
print(p1)
print(p2)

In [None]:
p1n = (1/np.linalg.norm(p1))*p1
p2n = (1/np.linalg.norm(p2))*p2
print(p1n)
print(p2n)

We could reshape these into rows, put them into an `np.array`, and then take the transpose.
Or, we could take advantage of the `hstack` function:

In [None]:
P = np.hstack((p1n,p2n))
print(P)

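That's the same matrix $P$ as before, possibly up to the sign of each column! Built this way, the signs of the columns of $P$ automatically match those of $Q$.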
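The remaining check is to compare against the built-in `np.linalg.svd`. The sketch below rebuilds the factors from scratch so that it runs on its own; it swaps in `np.linalg.eigh` (which assumes a symmetric matrix) for `la.eig`, and the names `Q_hand`, `P_hand`, `sigma`, and `r` are illustrative. Since singular vectors are only determined up to sign, the comparison uses absolute values.

```python
import numpy as np

A = np.array([[1, 0, 1], [0, 1, 2]])

# Eigenvectors of A^T A, sorted by decreasing eigenvalue.
lam, Q_hand = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]
lam, Q_hand = lam[order], Q_hand[:, order]
sigma = np.sqrt(np.clip(lam, 0, None))

r = 2                                        # non-zero singular values (eigenvalues 6, 1, 0)
P_hand = (A @ Q_hand[:, :r]) / sigma[:r]     # column i is A q_i / sigma_i

Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, sigma[:r])
print(np.allclose(P_hand @ Sigma @ Q_hand.T, A))

# Built-in result for comparison.
U, s, Vt = np.linalg.svd(A)
print(np.allclose(s, sigma[:r]))
print(np.allclose(np.abs(U), np.abs(P_hand)))              # same columns up to sign
print(np.allclose(np.abs(Vt[:r]), np.abs(Q_hand[:, :r].T)))
```

Comparing absolute values is only a reasonable shortcut here because the singular values are distinct; with repeated singular values, the singular vectors are not unique even up to sign.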