# 14 Linear Algebra: Singular Value Decomposition (Students 1)

One can always decompose a matrix $\mathsf{A}$ as
\begin{gather}
\mathsf{A} = \mathsf{U}\,\text{diag}(w_j)\,\mathsf{V}^{T}\\
\mathsf{U}^T \mathsf{U} = \mathsf{U} \mathsf{U}^T = 1\\
\mathsf{V}^T \mathsf{V} = \mathsf{V} \mathsf{V}^T = 1
\end{gather}
where the $w_j$ are the _singular values_.

The inverse (if it exists) can be directly calculated from the SVD:
$$
\mathsf{A}^{-1} = \mathsf{V} \text{diag}(1/w_j) \mathsf{U}^T
$$
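
As a quick numerical check of the two formulas above, the following sketch uses `np.linalg.svd` on a small invertible matrix. The matrix `A_demo` is a hypothetical example chosen for illustration and is not part of the original notes.

```python
import numpy as np

# Hypothetical 3x3 example matrix (an assumption, not from the notes).
A_demo = np.array([[4.0, 0.0, 1.0],
                   [2.0, 3.0, 0.0],
                   [0.0, 1.0, 5.0]])

# np.linalg.svd returns U, the singular values w (in descending order),
# and V^T (not V).
U, w, VT = np.linalg.svd(A_demo)

# Reconstruct A = U diag(w_j) V^T.
A_rebuilt = U @ np.diag(w) @ VT

# Inverse from the SVD: A^{-1} = V diag(1/w_j) U^T.
# This is only valid when every w_j is non-zero.
A_inv = VT.T @ np.diag(1.0 / w) @ U.T

print(np.allclose(A_rebuilt, A_demo))
print(np.allclose(A_inv @ A_demo, np.eye(3)))
```

Note that NumPy hands back $\mathsf{V}^T$ directly, so $\mathsf{V}$ is `VT.T`.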

## Solving ill-conditioned coupled linear equations 

In [None]:
import numpy as np

### Non-singular matrix 

In [None]:
A = np.array([
        [1, 2, 3],
        [3, 2, 1],
        [-1, -2, -6],
    ])
b = np.array([0, 1, -1])
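
One possible way to solve $\mathsf{A}\mathbf{x} = \mathbf{b}$ for this non-singular matrix is to apply the SVD inverse formula directly. The sketch below repeats the definitions of `A` and `b` so that it runs on its own; the comparison against `np.linalg.solve` is added here only as a sanity check.

```python
import numpy as np

A = np.array([
        [1, 2, 3],
        [3, 2, 1],
        [-1, -2, -6],
    ])
b = np.array([0, 1, -1])

# Decompose A = U diag(w) V^T.
U, w, VT = np.linalg.svd(A)

# x = V diag(1/w_j) U^T b; dividing by w is safe because no w_j is zero here.
x = VT.T @ ((U.T @ b) / w)

# Ratio of largest to smallest singular value (the condition number).
condition_number = w.max() / w.min()

print(x)
print(np.allclose(A @ x, b))
print(np.allclose(x, np.linalg.solve(A, b)))
```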

### Singular matrix

In [None]:
C = np.array([
     [ 0.87119148,  0.9330127,  -0.9330127],
     [ 1.1160254,   0.04736717, -0.04736717],
     [ 1.1160254,   0.04736717, -0.04736717],
    ])
b1 = np.array([ 2.3674474,  -0.24813392, -0.24813392])
b2 = np.array([0, 1, 1])
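
For a singular matrix at least one $w_j$ is zero (numerically: negligible), so $1/w_j$ cannot be used as is. A common remedy is to replace $1/w_j$ by zero for those singular values. The sketch below does this with a hypothetical helper `solve_svd` and a relative threshold `rel_tol`; both the helper name and the threshold value are assumptions made for illustration.

```python
import numpy as np

C = np.array([
     [ 0.87119148,  0.9330127,  -0.9330127],
     [ 1.1160254,   0.04736717, -0.04736717],
     [ 1.1160254,   0.04736717, -0.04736717],
    ])
b1 = np.array([ 2.3674474,  -0.24813392, -0.24813392])
b2 = np.array([0, 1, 1])

def solve_svd(A, b, rel_tol=1e-10):
    """Hypothetical helper: solve A x = b via SVD, dropping negligible w_j."""
    U, w, VT = np.linalg.svd(A)
    keep = w > rel_tol * w.max()      # singular values treated as non-zero
    w_inv = np.zeros_like(w)
    w_inv[keep] = 1.0 / w[keep]       # 1/w_j is set to 0 for negligible w_j
    return VT.T @ (w_inv * (U.T @ b))

x1 = solve_svd(C, b1)
x2 = solve_svd(C, b2)

print(np.linalg.svd(C, compute_uv=False))   # one value is negligibly small
print(np.allclose(C @ x1, b1), np.allclose(C @ x2, b2))
```

Both right-hand sides have equal second and third components, matching the two identical rows of `C`, so exact solutions exist even though `C` has no inverse.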

## SVD for fewer equations than unknowns
$M$ equations for $N$ unknowns with $M < N$:
* no unique solutions (underdetermined)
* $N-M$ dimensional family of solutions
* SVD: at least $N-M$ zero or negligible $w_j$: the columns of $\mathsf{V}$ corresponding to these zero $w_j$ span the null space of $\mathsf{A}$; adding any linear combination of them to a particular solution gives another solution.
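
The structure of the solution family can be illustrated with a small underdetermined system. The matrix `A_under`, the right-hand side `b_under`, and the factor `2.5` below are hypothetical values chosen for illustration. With the default `full_matrices=True`, `np.linalg.svd` returns a full $N \times N$ matrix $\mathsf{V}^T$ but only $M$ singular values, so the null-space directions are the rows of `VT` beyond the rank.

```python
import numpy as np

# Hypothetical system: M = 2 equations, N = 3 unknowns.
A_under = np.array([[1.0, 2.0, 3.0],
                    [0.0, 1.0, 1.0]])
b_under = np.array([1.0, 2.0])

U, w, VT = np.linalg.svd(A_under)           # U: 2x2, w: length 2, VT: 3x3
rank = int(np.sum(w > 1e-12 * w.max()))

# Particular solution built from the non-zero singular values only.
x_particular = VT[:rank].T @ ((U.T @ b_under)[:rank] / w[:rank])

# Remaining rows of V^T (columns of V) span the null space of A_under.
null_basis = VT[rank:].T                    # shape (N, N - rank)

# Any multiple of a null-space vector can be added to the solution.
x_other = x_particular + 2.5 * null_basis[:, 0]

print(np.allclose(A_under @ x_particular, b_under))
print(np.allclose(A_under @ x_other, b_under))
```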

## SVD for more equations than unknowns
$M$ equations for $N$ unknowns with $M > N$:
* no exact solutions in general (overdetermined)
* but: SVD can provide the best solution in the least-squares sense
  $$
  \mathbf{x} = \mathsf{V}\, \text{diag}(1/w_j)\, \mathsf{U}^{T}\, \mathbf{b}
  $$
  where 
  * $\mathbf{x}$ is the $N$-dimensional vector of the unknowns,
  * $\mathsf{V}$ is an $N \times N$ matrix,
  * the $w_j$ form a square $N \times N$ diagonal matrix,
  * $\mathsf{U}$ is an $M \times N$ matrix (and $\mathsf{U}^T$ is an $N \times M$ matrix), and
  * $\mathbf{b}$ is the $M$-dimensional vector of the given values.
  
This solution is the $\mathbf{x}$ that minimizes the norm of the residual

$$
r := |\mathsf{A}\mathbf{x} - \mathbf{b}|.
$$
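
A minimal sketch of the least-squares solution for an overdetermined system, assuming a hypothetical $4 \times 2$ matrix `A_over` and right-hand side `b_over`. Passing `full_matrices=False` to `np.linalg.svd` yields the reduced decomposition with an $M \times N$ matrix $\mathsf{U}$; the result is compared with `np.linalg.lstsq` as a cross-check.

```python
import numpy as np

# Hypothetical system: M = 4 equations, N = 2 unknowns.
A_over = np.array([[1.0, 0.0],
                   [1.0, 1.0],
                   [1.0, 2.0],
                   [1.0, 3.0]])
b_over = np.array([0.1, 0.9, 2.2, 2.8])

# Reduced SVD: U is 4x2, w has length 2, VT is 2x2.
U, w, VT = np.linalg.svd(A_over, full_matrices=False)

# x = V diag(1/w_j) U^T b
x_ls = VT.T @ ((U.T @ b_over) / w)

# Norm of the residual |A x - b| at the least-squares solution.
r = np.linalg.norm(A_over @ x_ls - b_over)

# Reference solution from NumPy's own least-squares routine.
x_ref, *_ = np.linalg.lstsq(A_over, b_over, rcond=None)

print(x_ls, r)
print(np.allclose(x_ls, x_ref))
```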


### Linear least-squares fitting 

This is the *linear least-squares fitting problem*: Given data points $(x_i, y_i)$, fit to a linear model $y(x)$, which can be any linear combination of functions of $x$.

For example: 
$$
y(x) = a_1 + a_2 x + a_3 x^2 + \dots + a_M x^{M-1}
$$
or in general
$$
y(x) = \sum_{k=1}^M a_k X_k(x)
$$

The goal is to determine the coefficients $a_k$.

Define the **merit function**
$$
\chi^2 = \sum_{i=1}^N \left[ \frac{y_i - \sum_{k=1}^M a_k X_k(x_i)}{\sigma_i}\right]^2
$$
(sum of squared deviations, weighted with standard deviations $\sigma_i$ on the $y_i$).

Best parameters $a_k$ are the ones that *minimize $\chi^2$*.

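Define the *design matrix* $\mathsf{A}$ ($N \times M$, $N \geq M$), the vector of measurements $\mathbf{b}$ ($N$-dim), and the parameter vector $\mathbf{a}$ ($M$-dim). Note that $N$ now counts the data points (equations) and $M$ the parameters (unknowns), the reverse of the convention in the previous section: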

\begin{align}
A_{ij} &= \frac{X_j(x_i)}{\sigma_i}\\
b_i &= \frac{y_i}{\sigma_i}\\
\mathbf{a} &= (a_1, a_2, \dots, a_M)
\end{align}


The minimum occurs where the derivative with respect to each parameter vanishes:
$$
0 = \frac{\partial\chi^2}{\partial a_k} = -2 \sum_{i=1}^N {\sigma_i}^{-2} \left[ y_i - \sum_{j=1}^M a_j X_j(x_i) \right] X_k(x_i), \quad 1 \leq k \leq M
$$
These are $M$ coupled equations (the *normal equations*)
\begin{align}
\sum_{j=1}^{M} \alpha_{kj} a_j &= \beta_k\\
\mathsf{\alpha}\mathbf{a} &= \mathsf{\beta}
\end{align}
with the $M \times M$ matrix
\begin{align}
\alpha_{kj} &= \sum_{i=1}^N \frac{X_j(x_i) X_k(x_i)}{\sigma_i^2}\\
\mathsf{\alpha} &= \mathsf{A}^T \mathsf{A}
\end{align}
and the vector of length $M$
\begin{align}
\beta_{k} &= \sum_{i=1}^N \frac{y_i X_k(x_i)}{\sigma_i^2}\\
\mathsf{\beta} &= \mathsf{A}^T \mathbf{b}
\end{align}

The inverse of $\mathsf{\alpha}$ is related to the uncertainties in the parameters:
$$
\mathsf{C} := \mathsf{\alpha}^{-1}
$$
in particular
$$
\sigma^2(a_i) = C_{ii}
$$
(and the off-diagonal $C_{ij}$ are the covariances).
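
To make the definitions concrete, here is a sketch that builds the design matrix, $\mathsf{\alpha}$, $\mathsf{\beta}$ and the covariance matrix for a straight-line model $y = a_1 + a_2 x$. The data points and the uncertainty of 0.2 are hypothetical values invented for illustration. The normal equations are solved here by explicit matrix inversion purely to show the relationships; this is not meant as the recommended numerical method.

```python
import numpy as np

# Hypothetical data with uniform uncertainties (assumptions, not from the notes).
x_data = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_data = np.array([1.1, 2.9, 5.2, 7.1, 8.8])
sigma_data = np.full_like(x_data, 0.2)

# Basis functions X_1(x) = 1 and X_2(x) = x.
# Design matrix A_ij = X_j(x_i) / sigma_i and b_i = y_i / sigma_i.
A_design = np.column_stack([np.ones_like(x_data), x_data]) / sigma_data[:, None]
b_vec = y_data / sigma_data

# Normal equations: alpha = A^T A (M x M) and beta = A^T b (length M).
alpha = A_design.T @ A_design
beta = A_design.T @ b_vec

# Covariance matrix C = alpha^{-1}; its diagonal holds the variances.
C_cov = np.linalg.inv(alpha)
a_fit = C_cov @ beta
a_err = np.sqrt(np.diag(C_cov))

print(a_fit)
print(a_err)
```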

#### Solution of the linear least-squares fitting problem with SVD
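We need to solve the $M$ coupled equations (the *normal equations*)
\begin{align}
\sum_{j=1}^{M} \alpha_{kj} a_j &= \beta_k\\
\mathsf{\alpha}\mathbf{a} &= \mathsf{\beta}
\end{align}

Equivalently, one solves the overdetermined system $\mathsf{A}\mathbf{a} = \mathbf{b}$ ($N$ equations for $M$ unknowns) in the least-squares sense: SVD of the design matrix $\mathsf{A}$ finds the $\mathbf{a}$ that minimizes
$$
\chi^2 = |\mathsf{A}\mathbf{a} - \mathbf{b}|^2
$$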

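The variances of the fitted parameters are
$$
\sigma^2(a_j) = \sum_{i=1}^{M} \left(\frac{V_{ji}}{w_i}\right)^2
$$
where $\mathsf{V}$ and the $w_i$ come from the SVD of the design matrix $\mathsf{A}$.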
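
The variance formula can be checked numerically against the diagonal of $(\mathsf{A}^T\mathsf{A})^{-1}$. The sketch below assumes a hypothetical straight-line fit; the data values and the uncertainty of 0.1 are invented for illustration.

```python
import numpy as np

# Hypothetical data for a straight-line fit y = a_1 + a_2 x.
x_data = np.linspace(0.0, 1.0, 6)
y_data = np.array([0.9, 1.6, 2.1, 2.4, 3.2, 3.5])
sigma_data = np.full_like(x_data, 0.1)

# Design matrix and weighted measurement vector.
A_design = np.column_stack([np.ones_like(x_data), x_data]) / sigma_data[:, None]
b_vec = y_data / sigma_data

# Reduced SVD of the design matrix (not of alpha).
U, w, VT = np.linalg.svd(A_design, full_matrices=False)
V = VT.T

# Parameters: a = V diag(1/w) U^T b
a_fit = V @ ((U.T @ b_vec) / w)

# Variances: sigma^2(a_j) = sum_i (V_ji / w_i)^2.
# V / w divides column i of V by w_i; axis=1 then sums over i.
var_svd = np.sum((V / w)**2, axis=1)

# Cross-check against the diagonal of the inverse of alpha = A^T A.
var_normal = np.diag(np.linalg.inv(A_design.T @ A_design))

print(np.allclose(var_svd, var_normal))
```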

#### Example
Synthetic data with noise:

In [None]:
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
matplotlib.style.use('ggplot')

In [None]:
def signal(x, noise=0):
    r = np.random.uniform(-noise, noise, len(x))
    return 3*np.sin(x) - 2*np.sin(3*x) + np.sin(4*x) + r

In [None]:
X = np.linspace(-5, 5, 100)
Y = signal(X, noise=5)

In [None]:
plt.plot(X, Y, 'r-', X, signal(X, noise=0), 'k--')

In [None]:
def fitfunc(x, a):
    return a[0]*np.cos(x) + a[1]*np.sin(x) + \
           a[2]*np.cos(2*x) + a[3]*np.sin(2*x) + \
           a[4]*np.cos(3*x) + a[5]*np.sin(3*x) + \
           a[6]*np.cos(4*x) + a[7]*np.sin(4*x)

def basisfuncs(x):
    return np.array([np.cos(x), np.sin(x), 
                     np.cos(2*x), np.sin(2*x), 
                     np.cos(3*x), np.sin(3*x), 
                     np.cos(4*x), np.sin(4*x)])

In [None]:
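M = 8
sigma = 1.
alpha = np.zeros((M, M))
beta = np.zeros(M)
# accumulate alpha_kj = sum_i X_k(x_i) X_j(x_i) / sigma^2
for x in X:
    Xk = basisfuncs(x)
    for k in range(M):
        for j in range(M):
            alpha[k, j] += Xk[k]*Xk[j] / sigma**2
# accumulate beta_k = sum_i y_i X_k(x_i) / sigma^2
for x, y in zip(X, Y):
    beta += y * basisfuncs(x) / sigma**2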

#### Solve with SVD 
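
One possible way to carry out this step is sketched below. To run on its own, the block rebuilds the synthetic data and computes `alpha` and `beta` in vectorized form from the design matrix rather than with explicit loops. The seeded random generator and the relative threshold `1e-12` are assumptions made for reproducibility and illustration. The result is stored in `a_values`, the name used by the plotting cell.

```python
import numpy as np

def basisfuncs(x):
    return np.array([np.cos(x), np.sin(x),
                     np.cos(2*x), np.sin(2*x),
                     np.cos(3*x), np.sin(3*x),
                     np.cos(4*x), np.sin(4*x)])

# Synthetic noisy data; the fixed seed is an assumption for reproducibility.
rng = np.random.default_rng(0)
X = np.linspace(-5, 5, 100)
Y = 3*np.sin(X) - 2*np.sin(3*X) + np.sin(4*X) + rng.uniform(-5, 5, len(X))
sigma = 1.0

# Design matrix (100 x 8) and weighted data, then alpha = A^T A, beta = A^T b.
A_design = basisfuncs(X).T / sigma
b_vec = Y / sigma
alpha = A_design.T @ A_design
beta = A_design.T @ b_vec

# Solve alpha a = beta via SVD, setting 1/w_j to 0 for negligible w_j.
U, w, VT = np.linalg.svd(alpha)
keep = w > 1e-12 * w.max()
w_inv = np.zeros_like(w)
w_inv[keep] = 1.0 / w[keep]
a_values = VT.T @ (w_inv * (U.T @ beta))

print(a_values)
```

The noise-free signal corresponds to coefficients 3, -2 and 1 for $\sin x$, $\sin 3x$ and $\sin 4x$ and zero for the other basis functions, so the fitted `a_values` can be compared against those values.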

... and plot

In [None]:
plt.plot(X, fitfunc(X, a_values), 'b-')
plt.plot(X, signal(X, noise=0), 'k--')