# Computational Linear Algebra
## A great project on Singular Value Decomposition

### by Paul, Placida, and Sean

## Know your audience
- Third year students with a prior course in linear algebra
- Prior exposure to programming recommended but not required
- In person; we are going to tell ourselves that it's an active learning class
- LMS agnostic
- Access to Syzygy assumed

## Big Ideas and Essential Questions
- Main idea: the Singular Value Decomposition can be computed with stable, efficient algorithms, has many applications in both math and industry, and exists for **any** matrix
- Core understanding -- the decomposition provides us with lots of useful information:
     - Four fundamental subspaces
     - Rank
     - Norm of a matrix
     - Condition number: how changes in matrix entries can affect solutions
     - Pseudoinverse, least squares
     - Principal component analysis
- Basic understanding: singular values are usually not nice numbers, so in practice the SVD is computed numerically
- Essential questions:
     - How to compute it!
     - How to use SVD to obtain all the information listed above

## Learning goals
- How to implement SVD in Python using NumPy
- Relationship between singular values and eigenvalues
- To be able to prove the SVD theorem
- What is so fundamental about the four fundamental subspaces?
- Why is SVD the right way to compute rank?
- How to apply the pseudoinverse to least squares solutions
- Geometric interpretation of singular values

## Learning plan
- Walk through the steps of the SVD algorithm in an example
- Confirm that the results match what the built-in algorithm produces
- Try the algorithm on a few more examples
- Define the four fundamental subspaces, show how to find them from SVD
- Do examples and confirm that SVD gives the same nullspace as "old" methods
- Also use SVD to determine rank in these examples
- Introduce pseudoinverse and have students work through an example

## Notebook design
- Code examples provided
- Ensure students understand how to interpret output by comparing algorithm output to initial worked example
- Students expected to modify/copy code for their own examples

# The Singular Value Decomposition

For any $m\times n$ matrix $A\text{,}$ the matrices $A^TA$ and $AA^T$ are both symmetric and positive semi-definite. (Exercise!) This means that we can define $\sqrt{A^TA}\text{,}$ even if $A$ itself is not symmetric or positive semi-definite.

- Since $A^TA$ is symmetric, we know that it can be orthogonally diagonalized.
- Since $A^TA$ is positive semi-definite, we know its eigenvalues $\lambda_1,\ldots,\lambda_n$ are non-negative.
- This means we can define the singular values $\sigma_i = \sqrt{\lambda_i}$ for each $i=1,\ldots, n$.
- This works even if $A$ is not a square matrix!
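
As a quick sanity check of the claims above, here is a minimal sketch. The matrix `M` and the names `G`, `lam` and `sigma` are made up for illustration and are not part of the worked example. It checks that $M^TM$ is symmetric with non-negative eigenvalues for a non-square $M$, and that the square roots of those eigenvalues agree with the singular values NumPy reports.

```python
import numpy as np

# An arbitrary 3x2 matrix (hypothetical, chosen only for illustration).
M = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])

G = M.T @ M  # the 2x2 matrix M^T M

# eigvalsh is meant for symmetric matrices; it returns real eigenvalues
# in ascending order.
lam = np.linalg.eigvalsh(G)

# Square roots of the eigenvalues, largest first. The clip guards against
# tiny negative values caused by rounding.
sigma = np.sqrt(np.clip(lam, 0, None))[::-1]

is_symmetric = np.allclose(G, G.T)
all_nonnegative = bool(np.all(lam >= -1e-12))
matches_numpy = np.allclose(sigma, np.linalg.svd(M, compute_uv=False))

print(is_symmetric, all_nonnegative, matches_numpy)
```

The same check can be repeated with $MM^T$, which is $3\times 3$ here and picks up one extra zero eigenvalue.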

The singular value decomposition has the form

$$A = P\Sigma_A Q^T,$$

where $\Sigma_A$ is a matrix containing the singular values of $A$. There are two conventions:
1. $\Sigma_A$ has the same size as $A$, and its upper-left block is diagonal, with diagonal entries given by the singular values of $A$; all other entries are zero.
2. $\Sigma_A$ is truncated to include only the diagonal matrix of singular values.

The sizes of $P$ and $Q$ depend on which convention we choose. By default, `np.linalg.svd` in NumPy returns the full matrix $P$ and the transpose $Q^T$ (not $Q$ itself), but it does not return $\Sigma_A$; it only lists the singular values, as a one-dimensional array.
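
The two conventions can be compared directly in NumPy. The sketch below uses illustrative variable names and assembles $\Sigma_A$ from the array of singular values in both ways. Note that with `full_matrices=False`, NumPy keeps $\min(m,n)$ singular values, including any that are zero.

```python
import numpy as np

A = np.array([[1, 0, 1],
              [0, 1, 2]])
m, n = A.shape

# Convention 1: full factors, Sigma has the same shape as A.
P_full, s, Qt_full = np.linalg.svd(A)  # the third output is Q^T
Sigma_full = np.zeros((m, n))
Sigma_full[:len(s), :len(s)] = np.diag(s)

# Convention 2: reduced factors, Sigma is a square diagonal matrix.
P_red, s_red, Qt_red = np.linalg.svd(A, full_matrices=False)
Sigma_red = np.diag(s_red)

print(P_full.shape, Sigma_full.shape, Qt_full.shape)
print(P_red.shape, Sigma_red.shape, Qt_red.shape)

# Both conventions reproduce A.
full_ok = np.allclose(P_full @ Sigma_full @ Qt_full, A)
reduced_ok = np.allclose(P_red @ Sigma_red @ Qt_red, A)
print(full_ok, reduced_ok)
```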

## 1. Initial example

Let $A = \begin{bmatrix}1&0&1\\0&1&2\end{bmatrix}$. Compute $A^TA$, and find the singular values of $A$ by determining the eigenvalues of $A^TA$.

First, let's load the required libraries.

In [1]:
import numpy as np
import scipy.linalg as la

Next, let's define our matrix $A$ as a NumPy array, and compute $B=A^TA$.

In [2]:
A = np.array([[1,0,1],[0,1,2]])
B = (A.T)@A
print(A)
print(B)

[[1 0 1]
 [0 1 2]]
[[1 0 1]
 [0 1 2]
 [1 2 5]]


Next, let's find the eigenvalues of $B$.

In [19]:
eigvals = la.eig(B)[0].real
print(eigvals)

[6.00000000e+00 1.00000000e+00 4.59534891e-17]


It looks like our eigenvalues are $6, 1$ and $0$: the third value, about $4.6\times 10^{-17}$, is zero up to rounding error. Now, let's get the singular values.

In [20]:
singvals = []
for ev in eigvals:
    singvals.append(np.sqrt(ev))
print(singvals)

[2.449489742783178, 1.0, 6.778900286091395e-09]
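
An alternative sketch of the same computation uses `la.eigh`, which is designed for symmetric matrices, instead of the `la.eig` call used in this notebook. It returns real eigenvalues in ascending order, so we reorder them, and we clip before taking square roots, since rounding can produce a tiny negative number where the exact eigenvalue is $0$. The names `evals`, `V` and `singvals_sorted` are chosen here for illustration.

```python
import numpy as np
import scipy.linalg as la

A = np.array([[1, 0, 1],
              [0, 1, 2]])
B = A.T @ A

# eigh assumes a symmetric matrix; eigenvalues come back real and ascending,
# with the matching unit eigenvectors as the columns of V.
evals, V = la.eigh(B)

# Reorder so that the largest eigenvalue comes first.
order = np.argsort(evals)[::-1]
evals, V = evals[order], V[:, order]

# Clip tiny negative rounding errors to zero before taking square roots.
singvals_sorted = np.sqrt(np.clip(evals, 0, None))
print(singvals_sorted)
```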


The matrix $Q$ is an orthogonal $n\times n$ matrix whose columns are an orthonormal basis of eigenvectors for $A^TA\text{,}$ ordered so that the corresponding eigenvalues decrease. The matrix $P$ is an orthogonal $m\times m$ matrix whose columns are an orthonormal basis of $\mathbb{R}^m\text{.}$ (The first $r$ columns of $P$ are given by $\mathbf{p}_i = \frac{1}{\sigma_i}A\mathbf{q}_i\text{,}$ where $\mathbf{q}_i$ is the eigenvector of $A^TA$ corresponding to the positive singular value $\sigma_i\text{,}$ and $r$ is the number of positive singular values.)
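
The recipe for the first $r$ columns of $P$ can be sketched as follows. This is one possible implementation under stated assumptions, not the notebook's own: it uses `la.eigh`, and the tolerance `tol` used to decide which singular values count as positive is a judgment call. The tolerance has to be fairly generous, because taking a square root turns a rounding error of size $10^{-16}$ into one of size $10^{-8}$.

```python
import numpy as np
import scipy.linalg as la

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 2.0]])
m, n = A.shape

# Orthonormal eigenvectors of A^T A (columns of Q), eigenvalues decreasing.
evals, Q = la.eigh(A.T @ A)
order = np.argsort(evals)[::-1]
evals, Q = evals[order], Q[:, order]
sigma = np.sqrt(np.clip(evals, 0, None))

# r = number of singular values treated as positive (assumed tolerance).
tol = 1e-6
r = int(np.sum(sigma > tol))

# Column i of P is A q_i / sigma_i; broadcasting divides column i by sigma[i].
P = (A @ Q[:, :r]) / sigma[:r]

# Here r == m, so P is already square. If r < m, the remaining columns would
# have to be filled in with an orthonormal basis of the orthogonal complement.
Sigma = np.zeros((m, n))
Sigma[:r, :r] = np.diag(sigma[:r])

P_is_orthogonal = np.allclose(P.T @ P, np.eye(r))
reconstructs_A = np.allclose(P @ Sigma @ Q.T, A)
print(r, P_is_orthogonal, reconstructs_A)
```

Building $P$ from $Q$ in this way keeps the signs of the two sets of vectors compatible, which is not guaranteed when $P$ is computed separately from $AA^T$.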

First, let's get the eigenvectors.

In [21]:
eigvects = la.eig(B)[1].real
print(eigvects)

[[-1.82574186e-01 -8.94427191e-01 -4.08248290e-01]
 [-3.65148372e-01  4.47213595e-01 -8.16496581e-01]
 [-9.12870929e-01 -6.94835567e-17  4.08248290e-01]]


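Since $A^TA$ is symmetric, eigenvectors belonging to distinct eigenvalues are orthogonal; to get an orthogonal matrix, they also need to have length 1. (The vectors returned by `la.eig` are already normalized, so the next step is only a safeguard.)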

In [22]:
qvects = []
for vec in eigvects:
    qvects.append((1/np.linalg.norm(vec))*vec)
print(qvects)

[array([-0.18257419, -0.89442719, -0.40824829]), array([-0.36514837,  0.4472136 , -0.81649658]), array([-9.12870929e-01, -6.94835567e-17,  4.08248290e-01])]


In [23]:
Q = np.vstack((qvects[0],qvects[1],qvects[2])).T
print(Q)

[[-1.82574186e-01 -3.65148372e-01 -9.12870929e-01]
 [-8.94427191e-01  4.47213595e-01 -6.94835567e-17]
 [-4.08248290e-01 -8.16496581e-01  4.08248290e-01]]


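What just happened there? The `vstack` command stacks the arrays, one on top of the other, and then we take the transpose.
A word of caution: `la.eig` returns the eigenvectors as the *columns* of `eigvects`, but a `for` loop over a 2D array visits its *rows*. Because `eigvects` is an orthogonal matrix, its rows are also unit vectors, so the normalization changed nothing, and we end up with `Q = eigvects.T`. The rows of this array are the eigenvectors, so the array `Q` actually stores $Q^T$, which is the factor that appears in $A = P\Sigma_A Q^T$.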
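
Since it is easy to mix up rows and columns here, the following sketch checks directly which slices of the array returned by `la.eig` satisfy $B\mathbf{v} = \lambda\mathbf{v}$. The names `V`, `columns_ok` and `rows_ok` are introduced only for this illustration.

```python
import numpy as np
import scipy.linalg as la

B = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 2.0],
              [1.0, 2.0, 5.0]])
evals, V = la.eig(B)
evals = evals.real

# Column i of V is an eigenvector for evals[i] ...
columns_ok = all(np.allclose(B @ V[:, i], evals[i] * V[:, i]) for i in range(3))

# ... but row i of V is generally not.
rows_ok = all(np.allclose(B @ V[i, :], evals[i] * V[i, :]) for i in range(3))

# Iterating over V.T visits the columns of V one at a time.
eigenvectors = [v for v in V.T]

print(columns_ok, rows_ok)
```

For this matrix the first check passes and the second fails.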

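Next, we need the matrix $P$. One way is to do the same thing, but with $AA^T$. Be careful: eigenvectors are only determined up to sign, so columns found this way may need a sign change before they are compatible with $Q$. Computing $\frac{1}{\sigma_i}A\mathbf{q}_i$ directly avoids this issue.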

In [8]:
C = A@(A.T)
Ceval,Cevec = la.eig(C)
print(C)
print(Ceval)
print(Cevec)

[[2 2]
 [2 5]]
[1.+0.j 6.+0.j]
[[-0.89442719 -0.4472136 ]
 [ 0.4472136  -0.89442719]]


In [36]:
# The eigenvectors are the columns of Cevec, so slice columns, not rows.
p1 = Cevec[:,0].reshape(2,1)
p2 = Cevec[:,1].reshape(2,1)
print(p1)
print(p2)

[[-0.89442719]
 [ 0.4472136 ]]
[[-0.4472136 ]
 [-0.89442719]]


In [33]:
P = np.hstack((p2,p1))
print(P)

[[-0.4472136  -0.89442719]
 [-0.89442719  0.4472136 ]]


Note: we want the columns in order of decreasing eigenvalue, so we had to swap the order of the eigenvectors above.

We want to see if this works! We're going to check two things:
1. Whether or not $P\Sigma_AQ^T$ is equal to $A$
2. Whether the results we found here agree with the built-in `np.linalg.svd` command.

First, we need to construct $\Sigma_A$. Let's just do that by hand.

In [11]:
sigA = np.array([[singvals[0],0,0],[0,singvals[1],0]])
print(sigA)

[[2.44948974+0.j 0.        +0.j 0.        +0.j]
 [0.        +0.j 1.        +0.j 0.        +0.j]]


The array `Q` built above has the eigenvectors of $A^TA$ as its rows, so it already stores $Q^T$ and no further transpose is needed.

In [32]:
print(np.allclose(P@sigA@Q, A))

True

In [13]:
Pnp,Snp,Qnp = np.linalg.svd(A)

In [14]:
print(Pnp)
print(Snp)
print(Qnp)

[[-0.4472136  -0.89442719]
 [-0.89442719  0.4472136 ]]
[2.44948974 1.        ]
[[-1.82574186e-01 -3.65148372e-01 -9.12870929e-01]
 [-8.94427191e-01  4.47213595e-01  2.62514530e-16]
 [-4.08248290e-01 -8.16496581e-01  4.08248290e-01]]


In [17]:
Pnp@sigA@Qnp

array([[ 1.00000000e+00+0.j, -1.23617226e-16+0.j,  1.00000000e+00+0.j],
       [-3.95312514e-16+0.j,  1.00000000e+00+0.j,  2.00000000e+00+0.j]])

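The factors returned by NumPy reproduce $A$ up to rounding error: the entries of size $10^{-16}$ would be exactly zero in exact arithmetic. Notice that `Qnp` was not transposed, since the third array returned by `np.linalg.svd` is already $Q^T$. If a hand-built product fails the same check, the usual suspects are reading eigenvectors from the rows (rather than the columns) of the array returned by `la.eig`, or a sign mismatch between the columns of $P$ and $Q$.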
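
When comparing hand-built factors with the ones NumPy returns, keep in mind that singular vectors are only determined up to sign (when the singular values are distinct), so two correct answers need not match entry by entry. The sketch below illustrates this. The helper `same_up_to_sign` is hypothetical, written for this example, and is not a NumPy function.

```python
import numpy as np

def same_up_to_sign(U, V, tol=1e-8):
    """Return True if each column of U equals the matching column of V or its negative."""
    return all(
        np.allclose(u, v, atol=tol) or np.allclose(u, -v, atol=tol)
        for u, v in zip(U.T, V.T)
    )

A = np.array([[1, 0, 1],
              [0, 1, 2]])
P_np, s_np, Qt_np = np.linalg.svd(A)

# Flipping the sign of a left singular vector together with the matching
# right singular vector gives another valid decomposition.
P_alt = P_np.copy()
Qt_alt = Qt_np.copy()
P_alt[:, 0] *= -1
Qt_alt[0, :] *= -1

Sigma = np.zeros(A.shape)
Sigma[:2, :2] = np.diag(s_np)

still_valid = np.allclose(P_alt @ Sigma @ Qt_alt, A)
identical = np.allclose(P_alt, P_np)
equal_up_to_sign = same_up_to_sign(P_alt, P_np)
print(still_valid, identical, equal_up_to_sign)
```

Flipping only one of the two vectors would break the product, which is why $P$ and $Q$ have to be built compatibly.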