# Inverse Problems Exercises: 2022s s06 (non-physics)
https://www.umm.uni-heidelberg.de/miism/

## Notes
* Please **DO NOT** change the name of the `.ipynb` file. 
* Please **DO NOT** import extra packages to solve the tasks. 
* Please put the `.ipynb` file directly into the `.zip` archive without any intermediate folder. 

## Please provide your personal information
* full name (Name): 

Maximilian Richter

## D05b: Pseudo-inverse

In [None]:
import numpy as np
import matplotlib.pyplot as plt

from scipy.linalg import solve_sylvester

In [None]:
file_gaussian = 'file_gaussian.npz'
with np.load(file_gaussian) as data:
    f_true = data['f_true']
    A_psf = data['A_psf']
    list_gn = data['list_gn']

### Imaging model
The imaging model can be represented by
$$
g = h \otimes f_\text{true}
= Af_\text{true}
= \mathcal{F}^{-1}\{ \mathcal{F}\{h\} \mathcal{F}\{f_\text{true}\} \},
$$
$$
g' = g + \epsilon.
$$
* $f_\text{true}$ is the input signal
* $h$ is the point spread function (kernel)
* $\otimes$ is the convolution operator
* $A$ is the Toeplitz matrix of $h$
* $\mathcal{F}$ and $\mathcal{F}^{-1}$ are the Fourier transform operator and inverse Fourier transform operator
* $\epsilon$ is the additive Gaussian noise
* $g$ is the filtered signal
* $g'$ is the noisy signal
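
The equivalence between the matrix form and the Fourier form can be checked numerically. The following is a minimal sketch, assuming periodic (circular) boundary conditions, under which $A$ is a circulant matrix (a special Toeplitz matrix). The kernel `h`, the signal `f`, and the size `n` are made-up illustration values, not the data from `file_gaussian.npz`.

```python
import numpy as np

n = 8
rng = np.random.default_rng(0)
f = rng.standard_normal(n)                        # hypothetical input signal
h = np.exp(-0.5 * (np.arange(n) - n // 2) ** 2)   # hypothetical Gaussian kernel
h /= h.sum()

# Circulant system matrix: A[i, j] = h[(i - j) mod n], i.e. column j is h shifted by j
A = np.stack([np.roll(h, j) for j in range(n)], axis=1)

g_matrix = A @ f                                              # g = A f
g_fourier = np.fft.ifft(np.fft.fft(h) * np.fft.fft(f)).real   # g = F^-1{F{h} F{f}}

print(np.allclose(g_matrix, g_fourier))
```

With other boundary conditions (for example zero padding), $A$ is still Toeplitz, but the plain FFT product no longer reproduces $Af$ at the borders.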

### Downsampling
* Implement the downsampling matrix
$$D_\text{ds} = \begin{bmatrix} 
1 & 0 & 0 & 0 & 0 & ... \\
0 & 0 & 1 & 0 & 0 & ... \\
0 & 0 & 0 & 0 & 1 & ... \\
  &   &   & ... & &     \end{bmatrix}_{n/2 \times n}$$
* Given the size $n_\text{ds}$
* Implement the function `get_downsampling_matrix()` (using `numpy.array`)

Prepare the data with downsampling
* Downsample the signal in `list_gn[1]` and save the output in the variable `gn_ds` (as `numpy.array`)
* Calculate the system matrix with downsampling with `A_psf` and save the output in the variable `A_ds` (as `numpy.array`)
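
As a small illustration of what $D_\text{ds}$ does, the sketch below builds the matrix for a made-up size $n = 6$ and checks that multiplying by it keeps every second sample. The helper name `downsampling_matrix_sketch` is hypothetical and only used here; it is not the graded `get_downsampling_matrix()`.

```python
import numpy as np

def downsampling_matrix_sketch(n):
    """Return the (n // 2, n) matrix that selects samples 0, 2, 4, ..."""
    n_half = n // 2
    D = np.zeros((n_half, n))
    rows = np.arange(n_half)
    D[rows, 2 * rows] = 1.0   # row k picks sample 2k
    return D

x = np.arange(6.0)            # toy signal [0, 1, 2, 3, 4, 5]
D = downsampling_matrix_sketch(x.size)
print(D @ x)                  # [0. 2. 4.]
```

The same matrix applied from the left to a system matrix keeps every second row, which is how a downsampled system matrix can be obtained.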

In [None]:
def get_downsampling_matrix(n):
    """ Create downsampling matrix.

    :param n: Size of the input signal
    :returns: 2d matrix of size (n/2, n)
    """
    n_half = n // 2
    matrix = np.zeros((n_half, n))
    # Row k selects sample 2k; index-based assignment also works for odd n
    rows = np.arange(n_half)
    matrix[rows, 2 * rows] = 1
    return matrix

# Build the downsampling matrix once and reuse it for the signal and the system matrix
D_ds = get_downsampling_matrix(A_psf.shape[0])
gn_ds = D_ds @ list_gn[1]
A_ds = D_ds @ A_psf
print(D_ds)

In [None]:
# This cell contains hidden tests.


### Pseudo-inverse solutions
In the framework of Tikhonov regularization, a general solution can be written with a pseudo-inverse $A^{-I}$:
$$
\tilde f = A^{-I} g.
$$
$A^{-I}$ is a matrix satisfying the following expression:
$$
\alpha_1 (A^T A) A^{-I} + A^{-I} (\alpha_2 A A^T + \alpha_3 \operatorname{cov}(g)) = (\alpha_1 + \alpha_2) A^T,
$$
where $\alpha_1$, $\alpha_2$, $\alpha_3$ are weighting parameters. In particular,
* when $\alpha_1 = 1$, $\alpha_2 = 0$, $\alpha_3 = \lambda$, $\operatorname{cov}(g) = I$, it is the damped least-squares solution
$$
A^{-I} = (A^T A + \lambda I) ^{-1} A^T ,
$$
* when $\alpha_1 = 0$, $\alpha_2 = 1$, $\alpha_3 = \lambda$, $\operatorname{cov}(g) = I$, it is the damped minimum-length solution
$$
A^{-I} = A^T (A A^T + \lambda I) ^{-1} .
$$
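
Both special cases can be checked against the general equation numerically. The sketch below uses a small random matrix (an assumption for illustration, not `A_psf`), inserts each closed form into the Sylvester-type equation with $\operatorname{cov}(g) = I$, and verifies that the residual vanishes. The helper `sylvester_residual` is a hypothetical name.

```python
import numpy as np

def sylvester_residual(A, X, alpha, cov_g):
    """Residual of a1 (A^T A) X + X (a2 A A^T + a3 cov_g) - (a1 + a2) A^T."""
    a1, a2, a3 = alpha
    lhs = a1 * (A.T @ A) @ X + X @ (a2 * (A @ A.T) + a3 * cov_g)
    return lhs - (a1 + a2) * A.T

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))   # toy system matrix: m = 4 data, n = 6 unknowns
m, n = A.shape
lam = 0.1

# Damped least squares: (A^T A + lam I)^{-1} A^T
X_dls = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T)
# Damped minimum length: A^T (A A^T + lam I)^{-1}
X_dml = A.T @ np.linalg.inv(A @ A.T + lam * np.eye(m))

res_dls = sylvester_residual(A, X_dls, (1, 0, lam), np.eye(m))
res_dml = sylvester_residual(A, X_dml, (0, 1, lam), np.eye(m))
print(np.abs(res_dls).max(), np.abs(res_dml).max())  # both close to zero
print(np.allclose(X_dls, X_dml))
```

For $\lambda > 0$ the two closed forms give the same matrix (push-through identity); they differ only in which matrix, $n \times n$ or $m \times m$, has to be inverted.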

Implement the pseudo-inverse calculation
* Given the system matrix $A$
* Given the covariance $\operatorname{cov}(g)$
* Given the parameter set $(\alpha_1, \alpha_2, \alpha_3)$
* Solve the Sylvester equation (using `scipy.linalg.solve_sylvester()`)
* Implement the function `solve_pseudo_inverse()` (using `numpy.array`)

Calculate the pseudo-inverse solutions
* Calculate the solutions for the downsampled data `gn_ds` and `A_ds`
* Use $\operatorname{cov}(g) = I$
* Use 9 different parameter sets $(\alpha_1, \alpha_2, \alpha_3)$ as follows:
$$
\begin{array}{lll}
 (1, 0, 0),    &(0.5, 0.5, 0),    &(0, 1, 0) \\
 (1, 0, 0.01), &(0.5, 0.5, 0.01), &(0, 1, 0.01) \\
 (1, 0, 0.1),  &(0.5, 0.5, 0.1),  &(0, 1, 0.1) \\
\end{array}
$$
* Save the pseudo-inverse matrices in the variable `list_A_inv` (as `list` of `numpy.array`)
* Save the pseudo-inverse solutions in the variable `list_f_inv` (as `list` of `numpy.array`)

Display the result
* Plot the outputs in `list_f_inv` in the subplots of `axs`, in the same order as the parameter sets
* Plot the noisy signal `gn_ds` at the corresponding space coordinates in each subplot
* Plot the input signal `f_true` in each subplot
* Show the legend in each subplot
* Show the case information in the titles of the subplots

In [None]:
def solve_pseudo_inverse(A, cov_g, alpha):
    """ 
    :param A: System matrix.
    :param cov_g: Covariance of g.
    :param alpha: Array of 3 alpha values.
    :returns: Pseudo-inverse matrix.
    """
    # Sylvester equation a X + X b = c, solved for X = A^{-I}
    a = alpha[0] * (A.T @ A)
    b = alpha[1] * (A @ A.T) + alpha[2] * cov_g
    c = (alpha[0] + alpha[1]) * A.T
    return solve_sylvester(a, b, c)
    
fig, axs = plt.subplots(3, 3, figsize=(15, 15))
fig.suptitle('Pseudo-inverse solutions')

alphas = [[1, 0, 0],    [0.5, 0.5, 0],    [0, 1, 0],
          [1, 0, 0.01], [0.5, 0.5, 0.01], [0, 1, 0.01],
          [1, 0, 0.1],  [0.5, 0.5, 0.1],  [0, 1, 0.1]]
g_cov = np.eye(A_ds.shape[0])  # cov(g) = I, one row/column per downsampled sample
list_A_inv = []
list_f_inv = []

for i in range(3):
    for j in range(3):
        A_inv = solve_pseudo_inverse(A_ds, g_cov, alphas[i*3+j])
        f_inv = A_inv @ gn_ds

        list_A_inv.append(A_inv)
        list_f_inv.append(f_inv)
        
        axs[i,j].plot(f_true, label="$f_{true}$", color="black")
        # Downsampled sample k sits at the original coordinate 2k
        axs[i,j].plot(2 * np.arange(gn_ds.shape[0]), gn_ds, label="$g_n$ ds")
        axs[i,j].plot(f_inv, label="$f_{inv}$")
        axs[i,j].set_title("$\\alpha_1={}, \\alpha_2={}, \\alpha_3={}$".format(alphas[i*3+j][0], alphas[i*3+j][1], alphas[i*3+j][2]))
        axs[i,j].legend()



In [None]:
# This cell contains hidden tests.


In [None]:
# This cell contains hidden tests.


### Resolution matrices and covariance matrix
The related matrices, i.e. the data resolution matrix $N$, the model resolution matrix $R$ and the covariance matrix $\operatorname{cov}(f)$, are defined as follows:
$$
\begin{align*}
N &= A A^{-I} \\
R &= A^{-I} A \\
\operatorname{cov}(f) &= A^{-I} \operatorname{cov}(g) (A^{-I})^T \\
\end{align*}
$$
* Given $A$, $A^{-I}$, $\operatorname{cov}(g)$
* Implement the function `get_data_resolution_matrix()` (using `numpy.array`)
* Implement the function `get_model_resolution_matrix()` (using `numpy.array`)
* Implement the function `get_cov_f()` (using `numpy.array`)
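
To get a feeling for these definitions, the sketch below evaluates them for a toy underdetermined problem (a made-up random `A`, not the exercise data) with the undamped minimum-length inverse $A^T (A A^T)^{-1}$. In that case the data are reproduced exactly ($N = I$), while the model is only partially resolved ($R \neq I$). The noise level in `cov_g` is an assumed value.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))           # 3 data, 5 unknowns
A_inv = A.T @ np.linalg.inv(A @ A.T)      # minimum-length inverse (toy case)
cov_g = 0.01 * np.eye(3)                  # assumed uncorrelated data noise

N = A @ A_inv                             # data resolution, shape (3, 3)
R = A_inv @ A                             # model resolution, shape (5, 5)
cov_f = A_inv @ cov_g @ A_inv.T           # covariance of the estimate, shape (5, 5)

print(np.allclose(N, np.eye(3)))          # data resolution is the identity
print(np.allclose(R, np.eye(5)))          # model resolution is not
print(np.round(np.trace(R), 6))           # trace of R equals the rank of A
```

Here $R$ is a projection onto the row space of $A$, so only as many model combinations as there are independent data can be resolved.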

Calculate the matrices
* Calculate the matrices for the pseudo-inverse matrices in `list_A_inv`
* Save the data resolution matrices $N$ in the variable `list_N` (as `list` of `numpy.array`)
* Save the model resolution matrices $R$ in the variable `list_R` (as `list` of `numpy.array`)
* Save the covariance matrices $\operatorname{cov}(f)$ in the variable `list_cov_f` (as `list` of `numpy.array`)

Display the result
* Plot the matrices in `list_N` as images in the subplots of `axs_N`, in the same order as the parameter sets
* Plot the matrices in `list_R` as images in the subplots of `axs_R`, in the same order as the parameter sets
* Plot the matrices in `list_cov_f` as images in the subplots of `axs_cov_f`, in the same order as the parameter sets
* Show a colorbar for each subplot
* Show the case information in the titles of the subplots

In [None]:
def get_data_resolution_matrix(A, A_I):
    """
    :param A: System matrix.
    :param A_I: Pseudo-inverse matrix.
    :returns: Data resolution matrix.
    """
    return A@A_I

def get_model_resolution_matrix(A, A_I):
    """
    :param A: System matrix.
    :param A_I: Pseudo-inverse matrix.
    :returns: Model resolution matrix.
    """
    return A_I@A

def get_cov_f(cov_g, A_I):
    """
    :param cov_g: Covariance of g.
    :param A_I: Pseudo-inverse matrix.
    :returns: Covariance of f.
    """
    return A_I@cov_g@A_I.T

list_N = [get_data_resolution_matrix(A_ds, A_inv) for A_inv in list_A_inv]
list_R = [get_model_resolution_matrix(A_ds, A_inv) for A_inv in list_A_inv]
list_cov_f = [get_cov_f(g_cov, A_inv) for A_inv in list_A_inv]

fig, axs_N = plt.subplots(3, 3, figsize=(15, 15))
fig.suptitle('data resolution matrix (N)')

for i in range(3):
    for j in range(3):
        im = axs_N[i,j].imshow(list_N[i*3+j], cmap="turbo")
        plt.colorbar(im, ax=axs_N[i,j],fraction=0.046, pad=0.04)
        axs_N[i,j].set_title("$\\alpha_1={}, \\alpha_2={}, \\alpha_3={}$".format(alphas[i*3+j][0], alphas[i*3+j][1], alphas[i*3+j][2]))

    
fig, axs_R = plt.subplots(3, 3, figsize=(15, 15))
fig.suptitle('model resolution matrix (R)')
fig.patch.set_facecolor('yellow')
fig.patch.set_alpha(0.3)

for i in range(3):
    for j in range(3):
        im = axs_R[i,j].imshow(list_R[i*3+j], cmap="turbo")
        plt.colorbar(im, ax=axs_R[i,j],fraction=0.046, pad=0.04)
        axs_R[i,j].set_title("$\\alpha_1={}, \\alpha_2={}, \\alpha_3={}$".format(alphas[i*3+j][0], alphas[i*3+j][1], alphas[i*3+j][2]))
    
fig, axs_cov_f = plt.subplots(3, 3, figsize=(15, 15))
fig.suptitle('covariance cov(f)')

for i in range(3):
    for j in range(3):
        im = axs_cov_f[i,j].imshow(list_cov_f[i*3+j], cmap="turbo")
        plt.colorbar(im, ax=axs_cov_f[i,j],fraction=0.046, pad=0.04)
        axs_cov_f[i,j].set_title("$\\alpha_1={}, \\alpha_2={}, \\alpha_3={}$".format(alphas[i*3+j][0], alphas[i*3+j][1], alphas[i*3+j][2]))
    
# * some points
#   - Check whether the colorbar is shown
#   - Check whether the titles are correct

In [None]:
# This cell contains hidden tests.


In [None]:
# This cell contains hidden tests.


### Characteristics
The characteristics, i.e. the spread of a matrix $\operatorname{spread}()$ and the signal size $\operatorname{size}()$, are defined as follows:
$$
\begin{align*}
\operatorname{spread}(B) &= \| B - I \|_2^2 = \sum_{ij}(B_{ij} - \delta_{ij})^2 \\
\operatorname{size}(f) &= \sum_i \operatorname{cov}(f)_{ii} \\
\end{align*}
$$
* Implement the function `get_spread()` (using `numpy.array`)
* Implement the function `get_size()` (using `numpy.array`)
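
A quick sanity check of both definitions on hand-made matrices (illustrative values only; `spread_sketch` and `size_sketch` are hypothetical names, not the graded functions):

```python
import numpy as np

def spread_sketch(B):
    """Sum of squared deviations of B from the identity."""
    return np.sum((B - np.eye(B.shape[0])) ** 2)

def size_sketch(cov_f):
    """Sum of the variances on the diagonal of cov(f)."""
    return np.trace(cov_f)

B = np.array([[0.5, 0.5],
              [0.0, 1.0]])
cov_f = np.array([[2.0, 0.3],
                  [0.3, 1.0]])

print(spread_sketch(np.eye(3)))   # 0.0
print(spread_sketch(B))           # 0.5
print(size_sketch(cov_f))         # 3.0
```

The spread is zero only for the identity matrix, and the size ignores the off-diagonal covariances.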

Calculate the characteristics
* Calculate the characteristics for the matrices in `list_N`, `list_R`, `list_cov_f`
* Save the spread $\operatorname{spread}(N)$ in the variable `list_spread_N` (as `list`)
* Save the spread $\operatorname{spread}(R)$ in the variable `list_spread_R` (as `list`)
* Save the size $\operatorname{size}(f)$ in the variable `list_size_f` (as `list`)
* Save the case index corresponding to the minimal $\operatorname{spread}(N)$ in the variable `idx_spread_N` (as `scalar`)
* Save the case index corresponding to the minimal $\operatorname{spread}(R)$ in the variable `idx_spread_R` (as `scalar`)
* Save the case index corresponding to the minimal $\operatorname{size}(f)$ in the variable `idx_size_f` (as `scalar`)

Display the result
* Plot the value pairs in `list_spread_N` and `list_spread_R` as 2D scatter points in different colors in the left subplot of `axs`
* Plot the value pairs in `list_spread_R` and `list_size_f` as 2D scatter points in different colors in the middle subplot of `axs`
* Plot the value pairs in `list_size_f` and `list_spread_N` as 2D scatter points in different colors in the right subplot of `axs`
* Show the legend in each subplot
* Show the case information and point values in the legend
* Highlight the cases corresponding to the minimal values in the legend

In [None]:
def get_spread(B):
    """ 
    :param B: Input matrix.
    :returns: Spread of the input matrix.
    """
    return np.sum((B-np.eye(B.shape[0]))**2)

def get_size(cov_f):
    """ 
    :param cov_f: Covariance of f.
    :returns: Size of f.
    """
    return np.trace(cov_f)
    
fig, axs = plt.subplots(1, 3, figsize=(15, 5))
fig.suptitle('Characteristics')

list_spread_N = [get_spread(N) for N in list_N]
list_spread_R = [get_spread(R) for R in list_R]
list_size_f = [get_size(cov_f) for cov_f in list_cov_f]

# Case indices of the minimal characteristics (scalars, as required by the task)
idx_spread_N = int(np.argmin(list_spread_N))
idx_spread_R = int(np.argmin(list_spread_R))
idx_size_f = int(np.argmin(list_size_f))

case_idx = np.arange(len(alphas))  # one color per parameter set

def make_labels(values_x, values_y, idx_min):
    # Legend label "(a1, a2, a3), x, y"; cases listed in idx_min are marked with "<- min"
    return ["({}, {}, {}), {:.2f}, {:.2f}{}".format(*alphas[i], values_x[i], values_y[i],
                                                    " <- min" if i in idx_min else "")
            for i in case_idx]

labels1 = make_labels(list_spread_N, list_spread_R, (idx_spread_N, idx_spread_R))
labels2 = make_labels(list_spread_R, list_size_f, (idx_spread_R, idx_size_f))
labels3 = make_labels(list_size_f, list_spread_N, (idx_size_f, idx_spread_N))

scatter0 = axs[0].scatter(list_spread_N, list_spread_R, c=case_idx, cmap="turbo")
scatter1 = axs[1].scatter(list_spread_R, list_size_f, c=case_idx, cmap="turbo")
scatter2 = axs[2].scatter(list_size_f, list_spread_N, c=case_idx, cmap="turbo")

axs[0].legend(handles=scatter0.legend_elements()[0], labels=labels1)
axs[1].legend(handles=scatter1.legend_elements()[0], labels=labels2)
axs[2].legend(handles=scatter2.legend_elements()[0], labels=labels3)

axs[0].set_title("$\\alpha_1,\\alpha_2,\\alpha_3, Spread(N), Spread(R)$")
axs[1].set_title("$\\alpha_1,\\alpha_2,\\alpha_3, Spread(R), Size(f)$")
axs[2].set_title("$\\alpha_1,\\alpha_2,\\alpha_3, Size(f), Spread(N)$")

In [None]:
# This cell contains hidden tests.


### Why are the results with $\lambda = 0.01$ the best according to the characteristics?

The spread measures how far a matrix deviates from the identity matrix. For $\lambda = 0.01$, both $N$ and $R$ come closest to the identity matrix. A large $\operatorname{size}(f)$ means that the solution has a large variance, i.e. it fits the noise (overfitting); it is smallest, and therefore best, for $\lambda = 0.01$.