# Activity 7: SVD

In [None]:
import numpy as np
from numpy import linalg as LA

## Exercise 1: SVD
Let $A$ be an $m\times n$ matrix of rank $r$.
Recall that the *singular values* of $A$ are the positive square roots $\sigma_i=\sqrt{\lambda_i}>0$ of the nonzero eigenvalues $\lambda_i$ of $K=A^T A$.
A *singular value decomposition* (SVD) of $A$ is a factorization $A=P\Sigma Q^T$ where $P$ is an $m\times r$ matrix with orthonormal columns, $\Sigma$ is an $r\times r$ diagonal matrix with the singular values of $A$ as its diagonal entries, and $Q^T$ is $r\times n$ with orthonormal rows (i.e. $Q$ is $n\times r$ with orthonormal columns).
By convention, we take $\Sigma$ to have its diagonal entries ordered from greatest to least.
It is Theorem 8.63 of Olver-Shakiban that every matrix has an SVD.
The proof of that theorem contains a recipe for finding the SVD of $A$:
- Let $\vec q_1,\dots,\vec q_r$ be orthonormal eigenvectors of $K=A^TA$ such that each $\vec q_i$ corresponds to $\lambda_i=\sigma_i^2$.
- Then, set $\vec p_i=\frac{A \vec q_i}{\sigma_i}$.
- Let $P$ be the matrix with $\vec p_i$ as its $i$th column, and let $Q^T$ be the matrix with $\vec q_i$ as its $i$th row.
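
For orientation, the shapes and orthonormality conditions in the definition above can be checked against NumPy's built-in `np.linalg.svd`. This is only a reference point, not the recipe from the proof. One caveat: with `full_matrices=False` NumPy returns $\min(m,n)$ singular values, which matches the rank-$r$ convention used here only when the matrix has full rank, as the example matrix below does.

```python
import numpy as np

# Example matrix (the same one used in the testing cell of this activity).
A = np.array([[1, 2, 3, 4], [2, 3, 4, 5]], dtype="float64")

# Reduced SVD: U is m x k, s has length k, Vt is k x n, with k = min(m, n).
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Singular values come back as a 1-D array in descending order.
print(s)

# The columns of U and the rows of Vt are orthonormal.
print(np.allclose(U.T @ U, np.eye(U.shape[1])))
print(np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0])))

# The three factors multiply back to A.
print(np.allclose(U @ np.diag(s) @ Vt, A))
```

The signs of matching columns of `U` and rows of `Vt` may differ from those produced by another method, since negating both leaves the product unchanged.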

**Exercise:** Below, give a function `my_SVD` which produces an SVD for any input matrix $A$.
While you can use `symmetric_eigensolver` from last activity to do so, you might run into accuracy issues if you do. 
Instead, consider using the equivalent built-in function, `np.linalg.eigh` (imported for convenience as `LA.eigh`).
`LA.eigh` gives output in the same format as `symmetric_eigensolver`, but with the eigenvalues in ascending rather than descending order.
You may be tempted to use `np.count_nonzero` to determine the rank, but there is an issue with that: while `LA.eigh` is significantly more accurate than our `symmetric_eigensolver`, it still is not perfect and may return very small nonzero values instead of zero.
Instead, you should take an optional `tolerance` argument in your input and only take the eigenvalues greater than `tolerance`, ignoring the rest.
Your function should output a tuple `(P,Sigma, Qt)` where `Qt` is the matrix $Q^T$.
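
The behavior of `LA.eigh` described above can be seen on a small rank-deficient Gram matrix. The sketch below only illustrates the output ordering and the round-off issue; the tolerance value `1e-10` and the variable names are illustrative choices, not requirements of the exercise.

```python
import numpy as np
from numpy import linalg as LA

A = np.array([[1, 2, 3, 4], [2, 3, 4, 5]], dtype="float64")
K = A.T @ A  # 4 x 4 symmetric Gram matrix; A has rank 2, so K does too

# eigh returns eigenvalues in ascending order; column i of `vecs`
# is an eigenvector for vals[i].
vals, vecs = LA.eigh(K)
print(vals)  # the two smallest entries are zero only up to round-off

# Reverse both outputs together to get descending order.
vals_desc = vals[::-1]
vecs_desc = vecs[:, ::-1]

# Count the eigenvalues that exceed a tolerance (illustrative value).
tolerance = 1e-10
rank_estimate = int(np.sum(vals_desc > tolerance))
print(rank_estimate)
```

Note that the eigenvalues and the eigenvector columns must be reordered with the same permutation, otherwise the pairing between them is lost.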

<details>
<summary>
<b>
    Hints:
</b>
    (Click here to open)
</summary>

- One helpful built-in function is `np.diag`, which converts vectors to diagonal matrices and vice-versa. (It lives at the top level of NumPy, not in `np.linalg`, so `LA.diag` will not work.)
- You may want to make an entire function (say, `count_nonzero_up_to_tolerance`) which counts the number of entries in a vector exceeding `tolerance` in order to determine rank.

</details>
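
A quick illustration of the diagonal helper mentioned in the hints. In NumPy it is available at the top level as `np.diag`, and its behavior depends on whether the input is one- or two-dimensional.

```python
import numpy as np

v = np.array([3.0, 1.0])

# 1-D input: build a square matrix with v on the diagonal.
D = np.diag(v)
print(D)

# 2-D input: extract the diagonal as a 1-D array.
d = np.diag(D)
print(d)
```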


In [2]:
def my_SVD():
    #YOUR CODE HERE
    return

In [None]:
#Testing:
A=np.array([[1,2,3,4],[2,3,4,5]],"float64")
P,Sigma,Qt=my_SVD(A)
P,Sigma,Qt
#Desired Output: (up to possibly simultaneously negating columns of P,Qt) 
#(array([[-0.59693053,  0.80229293],
#        [-0.80229293, -0.59693053]]),
# array([[9.15211593, 0.        ],
#        [0.        , 0.48864503]]),
# array([[-0.24054726, -0.3934325 , -0.54631774, -0.69920298],
#        [-0.80133452, -0.38106544,  0.03920365,  0.45947273]]))

In [None]:
#Further testing:
np.dot(np.dot(P,Sigma),Qt)
#what should the desired output be?

## Exercise 2: Pseudoinverses.

Recall that the pseudoinverse of a matrix $A$ with SVD $A=P\Sigma Q^T$ is $A^+=Q\Sigma^{-1}P^T$.
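
As a reference for what a pseudoinverse should satisfy, the sketch below uses NumPy's built-in `np.linalg.pinv` (not the `my_SVD`-based construction asked for in this exercise) and checks the identities $AA^+A=A$ and $A^+AA^+=A^+$ on the example matrix from Exercise 1.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [2, 3, 4, 5]], dtype="float64")
A_plus = np.linalg.pinv(A)  # shape (4, 2)

print(A_plus.shape)

# Defining identities of the pseudoinverse.
print(np.allclose(A @ A_plus @ A, A))
print(np.allclose(A_plus @ A @ A_plus, A_plus))

# A has full row rank, so A @ A_plus is the 2 x 2 identity ...
print(np.allclose(A @ A_plus, np.eye(2)))
# ... but A_plus @ A is only a projection of rank 2, not the 4 x 4 identity.
print(np.allclose(A_plus @ A, np.eye(4)))
```

The same checks can be run on the output of your own function once it is written.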

**Exercise: (a)** Below, use `my_SVD` to give the pseudoinverse of an input matrix `A`. Do **not** use built-in functions to compute an inverse for $\Sigma$.

In [None]:
def my_pseudoinv():
    #YOUR CODE HERE
    return

In [None]:
#testing 
A1=np.array([[1,3],[1,4]])
A1_plus=my_pseudoinv(A1)
np.dot(A1,A1_plus)
#What should we expect for output?

In [None]:
#testing 
A=np.array([[1,2,3,4],[2,3,4,5]],"float64")
my_pseudoinv(A)

Pseudoinverses are useful in part because they assist in optimization problems. 
In particular, consider the least squares problem $A\vec x =\vec b$ and let $\vec x^*=A^+\vec b$. 
When $A$ has rank $n$, so that $A^TA$ is invertible, Lemma 8.68 gives $\vec x^*=(A^TA)^{-1}A^T\vec b$. More generally, $\vec x^*$ always solves the normal equations $A^TA\vec x=A^T\vec b$, since $Q^TQ=I$ gives $A^TA\vec x^*=A^TA A^+\vec b=(Q\Sigma^2Q^T)(Q\Sigma^{-1}P^T) \vec b=Q\Sigma P^T \vec b=A^T \vec b$.
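
The normal-equation identity can be checked numerically. The sketch below uses the built-in `np.linalg.pinv` in place of a hand-written pseudoinverse, applied to the matrix `A2` from the testing cell further down. That matrix has rank 2 (its third column equals $-0.2$ times the first minus $0.4$ times the second), so $A^TA$ is not invertible there, yet $A^+\vec b$ still solves the normal equations. The cutoff `rcond=1e-10` is an illustrative choice.

```python
import numpy as np

A2 = np.array([
    [1, 2, -1],
    [3, -4, 1],
    [-1, 3, -1],
    [2, -1, 0],
], dtype="float64")
b = np.array([1, 0, -1, 2], dtype="float64")

# Third column is a combination of the first two, so rank(A2) = 2.
print(np.allclose(A2[:, 2], -0.2 * A2[:, 0] - 0.4 * A2[:, 1]))

# Candidate minimizer x* = A^+ b; singular values below the relative
# cutoff rcond are treated as zero.
x_star = np.linalg.pinv(A2, rcond=1e-10) @ b

# Normal equations: A^T A x* should equal A^T b.
print(np.allclose(A2.T @ A2 @ x_star, A2.T @ b))

# Size of the residual at x*.
print(np.linalg.norm(A2 @ x_star - b))
```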

**Exercise (b):** Below, give a function `least_squares2` which takes as input a matrix `A`, a vector `b`, and an optional `tolerance`, and solves the least squares problem $A\vec x=\vec b$. It should pass the tolerance to `my_SVD` and return the minimizer $\vec x^*$ together with the minimum value $\left \lVert A\vec x^*-\vec b\right \rVert$.

In [None]:
def least_squares2():
    #YOUR CODE HERE
    return

In [None]:
#testing
A2=np.array([
    [1,2,-1],
    [3,-4,1],
    [-1,3,-1],
    [2,-1,0]
],"float64")
b=np.array([1,0,-1,2],"float64")
least_squares2(A2,b)
#desired output:
#(array([ 0.56666667,  0.13333333, -0.16666667]), 1.7320508075688772)

## Exercise 3: Condition Numbers and Hilbert's Revenge
Recall our folly with *Hilbert Matrices* earlier in the semester.
These were the matrices $H_n=\left(\frac1{i+j-1}\right)_{i,j=1}^n$.
We saw that they were particularly resistant to analysis by LU, which was attributed (rather off-handedly) to being "ill-conditioned."
Now, to finally come back around and put a ribbon on that, let's explore ill-conditionedness in better detail.
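
As a reminder of what the formula for $H_n$ produces, here is one possible way to build it with NumPy broadcasting. The name `hilbert_sketch` is hypothetical and this is not necessarily how your earlier `hilbert` function was written.

```python
import numpy as np

def hilbert_sketch(n):
    """Return the n x n matrix with entries 1/(i+j-1) for 1-based i, j."""
    idx = np.arange(1, n + 1, dtype="float64")
    # idx[:, None] is a column of i values, idx[None, :] a row of j values.
    return 1.0 / (idx[:, None] + idx[None, :] - 1.0)

H3 = hilbert_sketch(3)
print(H3)
```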

**Definition:** The condition number of a rank-$r$ matrix $A$ is the ratio of the greatest singular value to the least, that is $\sigma_1/\sigma_r$.
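
To make the definition concrete, the sketch below computes $\sigma_1/\sigma_r$ for the full-rank example matrix from Exercise 1 using built-in NumPy routines and compares it with `np.linalg.cond`. It assumes the matrix has full rank; for a rank-deficient matrix the last singular value returned by `np.linalg.svd` is round-off rather than $\sigma_r$, which is exactly why a tolerance matters.

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [2, 3, 4, 5]], dtype="float64")

# Singular values only, returned in descending order.
s = np.linalg.svd(A, compute_uv=False)

# Ratio of the largest to the smallest singular value.
kappa = s[0] / s[-1]
print(kappa)

# For a full-rank matrix this agrees with NumPy's 2-norm condition number.
print(np.isclose(kappa, np.linalg.cond(A)))
```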

**Exercise: (a)** Below, write a function which uses `my_SVD` to compute the condition number of an input matrix.
Your function should again take a tolerance argument which is passed to `my_SVD` (this is very important here).


In [None]:
def condition_number():
    #YOUR CODE HERE
    return

**(b)** Copy or rewrite or otherwise re-load your `hilbert` function from before. 
By any method you like, generate a list of condition numbers of $H_n$ with $n=2,\dots,20$ using `tolerance=10**-10`.
What do you notice? Is this in line with your expectations, or does it run counter to them?
What happens if you change the tolerance? How do your results seem to respond?

In [None]:

def hilbert():
    #YOUR CODE HERE
    return

In [None]:
#YOUR CODE HERE TO GENERATE CONDITION NUMBERS

(This is a markdown cell to write down your observations)

**(c)** It is known that for every $n\geq 2$, $H_n$ is invertible (and hence full rank). 
Rewrite your code for part (b) to also print the observed rank in addition to the observed condition numbers (possibly also playing around with `tolerance`).
Do your results agree with what is known theoretically? 

In [None]:
def condition_number_and_rank():
    #YOUR CODE HERE
    return

(This is a markdown cell to write down your observations)


**(d)** Note that for a symmetric matrix $K$, the singular values $\sigma_i$ are exactly the absolute values of the nonzero eigenvalues of $K$. Since $H_n$ is positive definite, its singular values are exactly its eigenvalues. Use this fact and `LA.eigh` to attempt once more to compute the actual condition numbers of $H_n$ for whichever values of $n$ you are able to. Do your results make sense? If not, where do they seem to diverge from what you expect? Write a few words about your observations. What do you think is happening?
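
A small check of how eigenvalues and singular values of a symmetric matrix are related. For a positive definite matrix the two coincide; for a general symmetric matrix the singular values are the absolute values of the eigenvalues. The two test matrices below are illustrative choices.

```python
import numpy as np
from numpy import linalg as LA

K_indef = np.array([[2.0, 0.0], [0.0, -3.0]])   # eigenvalues -3 and 2
K_posdef = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3

for K in (K_indef, K_posdef):
    eigvals = LA.eigh(K)[0]                # ascending order
    sing = LA.svd(K, compute_uv=False)     # descending order
    print(eigvals, sing)
    # Sorted absolute eigenvalues match the singular values.
    print(np.allclose(np.sort(np.abs(eigvals))[::-1], sing))
```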

In [None]:
def condition_number_symmetric():
    #YOUR CODE HERE
    return

(This is a markdown cell to write down your observations)
