# Inverse Problems Exercises: 2022s s05 (non-physics)
https://www.umm.uni-heidelberg.de/miism/

## Notes
* Please **DO NOT** change the name of the `.ipynb` file. 
* Please **DO NOT** import extra packages to solve the tasks. 
* Please put the `.ipynb` file directly into the `.zip` archive without any intermediate folder. 

## Please provide your personal information
* full name (Name): 

Maximilian Richter

## D04c: Gradient descent

In [None]:
import numpy as np
import matplotlib.pyplot as plt

from scipy.optimize import fminbound

In [None]:
file_gaussian = 'file_gaussian.npz'
with np.load(file_gaussian) as data:
    f_true = data['f_true']
    A_psf = data['A_psf']
    list_gn = data['list_gn']

### Imaging model
The imaging model can be represented by
$$
g = h \otimes f_\text{true}
= Af_\text{true}
= \mathcal{F}^{-1}\{ \mathcal{F}\{h\} \mathcal{F}\{f_\text{true}\} \},
$$
$$
g' = g + \epsilon.
$$
* $f_\text{true}$ is the input signal
* $h$ is the point spread function (kernel)
* $\otimes$ is the convolution operator
* $A$ is the Toeplitz matrix of $h$
* $\mathcal{F}$ and $\mathcal{F}^{-1}$ are the Fourier transform operator and inverse Fourier transform operator
* $\epsilon$ is the additive Gaussian noise
* $g$ is the filtered signal
* $g'$ is the noisy signal
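
The equivalence of the three forms of $g$ above can be checked numerically. The following is a minimal sketch, assuming periodic (circular) boundary conditions, in which case $A$ is a circulant matrix (a special case of a Toeplitz matrix). The kernel `h`, the signal `f`, and the helper `circulant_from_kernel` are hypothetical and are not part of the exercise data.

```python
import numpy as np

def circulant_from_kernel(h):
    """Build the circulant matrix A with A[i, j] = h[(i - j) mod n]."""
    n = h.shape[0]
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return h[idx]

rng = np.random.default_rng(0)
n = 8
h = np.exp(-0.5 * (np.arange(n) - n // 2) ** 2)  # toy Gaussian kernel (assumption)
h /= h.sum()
f = rng.standard_normal(n)                       # toy input signal (assumption)

A = circulant_from_kernel(h)
g_matrix = A @ f                                                  # g = A f
g_fourier = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(f)))  # g = F^-1{F{h} F{f}}

# Both routes compute the same circular convolution.
print(np.allclose(g_matrix, g_fourier))
```

The matrix form is convenient for the closed form solutions below, while the Fourier form is cheaper for long signals.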

### Mean squared error
Implement the mean squared error (MSE)
$$
\operatorname{MSE}(f)=\frac{1}{n}\sum_{i=1}^n(f_i - f_{\text{true},i})^2
$$
* Given the input signal $f$
* Given the true signal $f_\text{true}$
* Implement the function `mean_squared_error()` (using `numpy.array`)
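
As a quick worked example of the formula, consider two made-up vectors of length 3 (the names `f_example` and `f_reference` are hypothetical). The squared differences are $0$, $0$, and $4$, so the mean is $4/3$.

```python
import numpy as np

f_example = np.array([1.0, 2.0, 3.0])    # made-up estimate
f_reference = np.array([1.0, 2.0, 5.0])  # made-up reference signal

# Squared differences are [0, 0, 4]; their mean is 4/3.
mse_example = np.mean((f_example - f_reference) ** 2)
print(mse_example)
```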

In [None]:
def mean_squared_error(f, f_true):
    """ Compute the mean squared error with respect to the true signal.

    :param f: Input signal.
    :param f_true: True signal.
    :returns: Mean squared error.
    """
    return np.mean((f - f_true)**2)

In [None]:
# This cell contains hidden tests.


### Difference matrix
Implement the difference matrix $D_\text{diff}$
$$D_\text{diff} = \begin{bmatrix} 
1 & 0 & 0 & 0 & \cdots & 0 & -1 \\
-1 & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & -1 & 1 & 0 & \cdots & 0 & 0 \\
\vdots &   &   & \ddots &   &   & \vdots \\
0 & 0 & 0 & 0 & \cdots & -1 & 1 \end{bmatrix}$$
* Given the size $n_\text{diff}$
* Implement the function `get_diff_matrix()` (using `numpy.array`)
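
To see what this matrix does, it helps to apply it to a short vector: row $i$ computes $f_i - f_{i-1}$, and the first row wraps around to the last element. The following is a minimal sketch of one possible construction using `np.roll`; the name `circular_diff_matrix` is hypothetical and this is not necessarily the intended solution.

```python
import numpy as np

def circular_diff_matrix(n):
    """Return the n x n matrix D with (D f)_i = f_i - f_{i-1}, indices taken mod n."""
    identity = np.eye(n)
    # Rolling the columns by -1 puts ones on the subdiagonal and in the top-right corner.
    return identity - np.roll(identity, -1, axis=1)

D = circular_diff_matrix(4)
f = np.array([1.0, 4.0, 9.0, 16.0])

print(D)
print(D @ f)  # same values as f - np.roll(f, 1)
```

Every row sums to zero, so a constant signal is mapped to the zero vector.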

In [None]:
def get_diff_matrix(n):
    """ Compute the matrix of circular differences between neighboring
    elements of a vector of size n.

    :param n: Size of the target vector.
    :returns: Matrix with shape (n, n), which calculates the difference.
    """
    diff_matrix = np.eye(n)
    diff_matrix[1:,:-1] += np.eye(n-1)*-1
    diff_matrix[0,n-1] = -1
    return diff_matrix

In [None]:
# This cell contains hidden tests.


### Tikhonov regularization
Implement the objective function with Tikhonov regularization
$$
L(f) = \|Af - g'\|_2^2 + \lambda\|D'f\|_2^2
$$
* Given the input signal $f$
* Given the system matrix $A$
* Given the measurement $g'$
* Given the regularization matrix $D'$
* Given the regularization parameter $\lambda$
* Implement the function `objective_tikhonov()` (using `numpy.array`)

Implement the closed form solution of the regularized objective function
$$
\tilde f = (A^T A + \lambda D'^T D')^{-1} A^T g' = A_\lambda^{PI} g'
$$
* Given the system matrix $A$
* Given the measurement $g'$
* Given the regularization matrix $D'$
* Given the regularization parameter $\lambda$
* Implement the function `solution_tikhonov()` (using `numpy.array`)
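
The closed form follows from setting the gradient $2A^T(Af - g') + 2\lambda D'^T D' f$ to zero, which gives the normal equations $(A^T A + \lambda D'^T D') f = A^T g'$. The following is a minimal sketch that checks this on a small random problem; the matrices and vectors are made up for illustration and are not the exercise data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))                 # hypothetical system matrix
D = np.eye(n) - np.roll(np.eye(n), -1, axis=1)  # circular difference matrix
g = rng.standard_normal(n)                      # hypothetical measurement
lb = 0.1

# Solve the normal equations (A^T A + lb D^T D) f = A^T g.
f_closed = np.linalg.solve(A.T @ A + lb * D.T @ D, A.T @ g)

# The gradient of the objective should vanish at the minimizer.
gradient = 2 * A.T @ (A @ f_closed - g) + 2 * lb * D.T @ D @ f_closed
print(np.max(np.abs(gradient)))
```

Using `np.linalg.solve` avoids forming the explicit inverse, which is usually the more stable choice numerically.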

In [None]:
def objective_tikhonov(f, A, g, D, lb):
    """ Compute the objective function with Tikhonov regularization.
    
    :param f: Current estimate of the signal.
    :param A: 2D matrix of the linear problem.
    :param g: Observed signal.
    :param D: 2D matrix in the regularization term.
    :param lb: Regularization parameter.
    :returns: Objective function value.
    """
    return np.sum((A@f - g)**2) + lb * np.sum((D@f)**2)


def solution_tikhonov(A, g, D, lb):
    """ Compute the estimate of the true signal with Tikhonov regularization.

    Use a regularization term to suppress noise.

    :param A: 2d matrix A of the linear problem.
    :param g: Observed signal.
    :param D: 2D matrix in the regularization term.
    :param lb: Regularization parameter.
    :returns: Estimate of the true signal.
    """
    # Solve the normal equations instead of forming the explicit inverse.
    return np.linalg.solve(A.T @ A + lb * D.T @ D, A.T @ g)

In [None]:
# This cell contains hidden tests.


In [None]:
# This cell contains hidden tests.


### Gradient magnitude solution
The gradient magnitude solution is the solution with $D' = D_\text{diff}$
* Calculate the closed form solution for the noisy signals in `list_gn`
* Return the outputs with $\lambda$ of $0.1$, $0.01$, $0.001$, respectively 
* Save the solutions in the variable `list_f_closed` (as `list` of `numpy.array`)
* Save the corresponding objective values in the variable `list_L_closed` (as `list` of scalars)

Display the result
* Plot the outputs in `list_f_closed` in the subplots of `axs`, in the same order as the parameter options
* Show the cases with the same noisy signal in the same subplot column
* Show the cases with the same $\lambda$ in the same subplot row
* Plot the corresponding noisy signal in each subplot
* Plot the input signal `f_true` in each subplot
* Show the legend in each subplot
* Show the case information in the subplot titles
* Show the mean squared error of each output compared to `f_true` in the subplot titles
* Show the objective function value of each output in the subplot titles

In [None]:
fig, axs = plt.subplots(3, 3, figsize=(15, 15))
fig.suptitle('Gradient magnitude solution (closed form)')

lbs = [0.1, 0.01, 0.001]
list_f_closed = []
list_L_closed = []

D_diff = get_diff_matrix(A_psf.shape[0])  # build the regularization matrix once

for i in range(3):
    for j in range(3):
        f_closed = solution_tikhonov(A_psf, list_gn[j], D_diff, lbs[i])
        MSE = mean_squared_error(f_closed, f_true)
        objective = objective_tikhonov(f_closed, A_psf, list_gn[j], D_diff, lbs[i])

        # Store the unrounded values; round only for display in the title.
        list_f_closed.append(f_closed)
        list_L_closed.append(objective)

        axs[i,j].plot(f_closed, label="$f_{est}$")
        axs[i,j].plot(list_gn[j], label="g")
        axs[i,j].plot(f_true, label="$f_{true}$", color="black")

        # Raw string so that \lambda is not treated as an escape sequence.
        axs[i,j].set_title(r"$g_{}$, $\lambda ={}$, $MSE ={:.3g}$, $L(f) ={:.3g}$".format(j, lbs[i], MSE, objective))
        axs[i,j].legend()

In [None]:
# This cell contains hidden tests.


### Gradient descent technique
Gradient descent is an optimization method for finding an $f$ that minimizes the objective function $L(f)$.
One iterative update is given by
$$
f^{(i+1)} = f^{(i)} - s_i \nabla L(f^{(i)}),
$$
where $s_i$ is the optimal step size of the one-dimensional optimization problem
$$
s_i = \arg\min_{s\in \mathbb{R}^+} L(f^{(i)} - s \nabla L(f^{(i)})).
$$

Implement the iterative gradient descent updates
* Given the objective function $L(f)$
* Given the gradient of the objective function $\nabla L(f)$
* Given the initial value $f^{(0)}$
* Given the number of iterations $n$
* Estimate the optimal step size $s_i$ in $[0, 10]$ (using `scipy.optimize.fminbound()`)
* Return the final value $f^{(n)}$ as the first output
* Return the history array of objective values $[L(f^{(0)}), ..., L(f^{(n)})]$ as the second output
* Implement the function `solve_gradient_descent_ls()` (using `numpy.array`)
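
A single update with line search can be illustrated on a toy problem. For $L(f) = \|f - c\|_2^2$ the gradient is $2(f - c)$, so $L(f - s \nabla L(f)) = (1 - 2s)^2 \|f - c\|_2^2$ and the optimal step size is $s = 1/2$, which reaches the minimum in one step. The following is a minimal sketch of that single step; the vector `c` and the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import fminbound

c = np.array([1.0, -2.0, 3.0])  # hypothetical minimizer of the toy objective

def toy_objective(f):
    return np.sum((f - c) ** 2)

def toy_gradient(f):
    return 2 * (f - c)

f_current = np.zeros(3)
direction = toy_gradient(f_current)

# One-dimensional search for the step size s in [0, 10].
step = fminbound(lambda s: toy_objective(f_current - s * direction), 0, 10)
f_next = f_current - step * direction

print(step, toy_objective(f_current), toy_objective(f_next))
```

For the Tikhonov and total variation objectives the optimal step size changes from iteration to iteration, which is why the search is repeated in every update.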

In [None]:
def solve_gradient_descent_ls(objective_function, gradient_function, f0, n):
    """ Minimize an objective function by gradient descent with line search.

    :param objective_function: Objective function of f.
    :param gradient_function: Gradient of the objective function of f.
    :param f0: Starting values for initializing the parameters f.
    :param n: Number of iterative gradient updates.
    :returns: Final f and an array of n + 1 objective values in the optimization history.
    """
    f = f0
    objective_values = [objective_function(f)]
    for _ in range(n):  # n updates give n + 1 history entries, including L(f0)
        grad = gradient_function(f)  # evaluate the gradient once per iteration
        # Line search: step size in [0, 10] that minimizes L along the negative gradient.
        s_min = fminbound(lambda s: objective_function(f - s * grad), 0, 10)
        f = f - s_min * grad
        objective_values.append(objective_function(f))
    return f, np.array(objective_values)

In [None]:
# This cell contains hidden tests.


In [None]:
# This cell contains hidden tests.


### Tikhonov regularization with gradient descent 
Implement the gradient of the objective function with Tikhonov regularization
$$
\nabla L(f) = 2 A^T (Af - g') + 2 \lambda D'^T D'f
$$
* Given the input signal $f$
* Given the system matrix $A$
* Given the measurement $g'$
* Given the regularization matrix $D'$
* Given the regularization parameter $\lambda$
* Implement the function `gradient_tikhonov()` (using `numpy.array`)
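
An analytic gradient like this one can be sanity-checked against central finite differences of the objective. The following is a minimal sketch on a small random problem; all matrices and vectors are made up for illustration, and the helper `objective` is defined locally rather than taken from the exercise.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))                 # hypothetical system matrix
D = np.eye(n) - np.roll(np.eye(n), -1, axis=1)  # circular difference matrix
g = rng.standard_normal(n)                      # hypothetical measurement
f = rng.standard_normal(n)                      # point at which the gradient is checked
lb = 0.01

def objective(x):
    return np.sum((A @ x - g) ** 2) + lb * np.sum((D @ x) ** 2)

analytic = 2 * A.T @ (A @ f - g) + 2 * lb * D.T @ D @ f

# Central differences along each coordinate direction.
eps = 1e-6
numeric = np.array([
    (objective(f + eps * e) - objective(f - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

print(np.max(np.abs(analytic - numeric)))
```

Because the objective is quadratic, the central difference is exact up to rounding, so the two vectors should agree closely.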

The gradient magnitude solution is the solution with $D' = D_\text{diff}$
* Calculate the solution by gradient descent for the noisy signals in `list_gn`
* Return the outputs with $\lambda$ of $0.1$, $0.01$, $0.001$, respectively, with $f^{(0)}=0$, $n=20$
* Save the solutions in the variable `list_f_gd` (as `list` of `numpy.array`)
* Save the corresponding objective value history in the variable `list_L_gd` (as `list` of `numpy.array`)

Display the result
* Plot the outputs in `list_f_gd` in the subplots of `axs`, in the same order as the parameter options
* Show the cases with the same noisy signal in the same subplot column
* Show the cases with the same $\lambda$ in the same subplot row
* Plot the corresponding noisy signal in each subplot
* Plot the input signal `f_true` in each subplot
* Show the legend in each subplot
* Show the case information in the subplot titles
* Show the mean squared error of each output compared to `f_true` in the subplot titles
* Show the objective function value of each output in the subplot titles

In [None]:
def gradient_tikhonov(f, A, g, D, lb):
    """ Compute the gradient of the objective function with Tikhonov regularization.
    
    :param f: Current estimate of the signal.
    :param A: 2D matrix of the linear problem.
    :param g: Observed signal.
    :param D: 2D matrix in the regularization term.
    :param lb: Regularization parameter.
    :returns: Gradient value of the objective function.
    """
    return 2 * A.T @ (A @ f - g) + 2 * lb  * D.T @ D @ f

fig, axs = plt.subplots(3, 3, figsize=(15, 15))
fig.suptitle('Gradient magnitude solution (gradient descent)')

lbs = [0.1, 0.01, 0.001]
f0 = np.zeros(f_true.shape)
n = 20

list_f_gd = []
list_L_gd = []

D_diff = get_diff_matrix(A_psf.shape[0])  # build the regularization matrix once

for i in range(3):
    for j in range(3):
        objective = lambda x: objective_tikhonov(x, A_psf, list_gn[j], D_diff, lbs[i])
        grad = lambda x: gradient_tikhonov(x, A_psf, list_gn[j], D_diff, lbs[i])

        f_gd, L_gd = solve_gradient_descent_ls(objective, grad, f0, n)

        # Keep the float value; int() would truncate small errors to 0.
        MSE = mean_squared_error(f_gd, f_true)

        axs[i,j].plot(f_gd, label="$f_{est}$", color="black")
        axs[i,j].plot(list_gn[j], label="g")
        axs[i,j].plot(f_true, label="$f_{true}$")

        # Raw string so that \lambda is not treated as an escape sequence.
        axs[i,j].set_title(r"$g_{}$, $\lambda={}$, MSE={:.3g}, $L(f)={:.3g}$".format(j, lbs[i], MSE, L_gd[-1]))
        axs[i,j].legend()

        list_f_gd.append(f_gd)
        list_L_gd.append(L_gd)

In [None]:
# This cell contains hidden tests.


### Optimization history
Display the result
* Plot the arrays in `list_L_gd` as solid lines in the subplots of `axs`, in the same order as the parameter options
* Plot the values in `list_L_closed` as horizontal dashed lines in the subplots of `axs`, in the same order as the parameter options
* Show the cases with the same noisy signal in the same subplot
* Use log scaling on the y axis of the subplots
* Show the legend in each subplot
* Show the case information in the subplot titles
* Show the mean squared error of each output compared to `f_true` in the subplot titles
* Show the objective function value of each output in the subplot titles

In [None]:
fig, axs = plt.subplots(1, 3, figsize=(15, 5))
fig.suptitle('Gradient magnitude solution (gradient descent)')
fig.subplots_adjust(top=0.7)
colors = ["blue", "orange", "green"]
title = ["$g_{0}$", "$g_{1}$", "$g_{2}$"]

for i in range(3):
    for j in range(3):
        k = i*3 + j  # index of the case (lambda index i, signal index j)
        axs[j].plot(list_L_gd[k], label=r"GD, $\lambda={}$".format(lbs[i]), color=colors[i])
        axs[j].hlines(list_L_closed[k], 0, len(list_L_gd[k]) - 1, label=r"closed form, $\lambda={}$".format(lbs[i]), linestyle="--", color=colors[i])
        # Compare the solution of this case only, not the whole list.
        MSE = mean_squared_error(list_f_gd[k], f_true)

        title[j] = title[j] + "\n" + rf"$\lambda$={lbs[i]}: mse={MSE:.3g}, obj={list_L_gd[k][-1]:.3g}"
        axs[j].set_title(title[j])
        axs[j].set_yscale("log")
        axs[j].legend()

### Total variation
The objective function with total variation is
$$
L(f) = \|Af - g'\|_2^2 + \lambda\|\nabla f\|_1
$$
The gradient of the objective function with total variation is
$$
\nabla L(f) 
\approx 2 A^T (Af - g') + \lambda \nabla \sum_{j=1}^{n+1} \sqrt{(f_j - f_{j-1})^2 + \beta^2}
= 2 A^T (Af - g') + \lambda \begin{bmatrix} 
r_1 \\
\vdots \\
r_i \\
\vdots \\
r_n \end{bmatrix},
$$
where $1 \gg \beta^2 > 0$ and
$$
r_i = \frac{f_i - f_{i-1}}{\sqrt{(f_i - f_{i-1})^2 + \beta^2}}
- \frac{f_{i+1} - f_{i}}{\sqrt{(f_{i+1} - f_{i})^2 + \beta^2}}
$$
with $f_{0} = 0$ and $f_{n+1} = 0$.

* Given the input signal $f$
* Given the system matrix $A$
* Given the measurement $g'$
* Given the regularization parameter $\lambda$
* Implement the objective function `objective_tv()` (using `numpy.array`)
* Implement the gradient of the objective function with $\beta^2 = 0.001$ `gradient_tv()` (using `numpy.array`)
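
The vector $r$ is the gradient of the smoothed total variation term, which replaces each $|d|$ by $\sqrt{d^2 + \beta^2}$ so that it is differentiable at $d = 0$. The following is a minimal sketch that checks $r$ against central finite differences, assuming zero boundaries on both sides so that a signal of length $n$ has $n + 1$ neighbor differences. The names `smoothed_tv` and `smoothed_tv_gradient` are hypothetical, and only the regularization term is checked, not the data term.

```python
import numpy as np

def smoothed_tv(f, beta_sq):
    """Sum of sqrt(d^2 + beta_sq) over all neighbor differences d, with zero boundaries."""
    d = np.diff(np.pad(f, 1))  # n + 1 differences, including both boundaries
    return np.sum(np.sqrt(d ** 2 + beta_sq))

def smoothed_tv_gradient(f, beta_sq):
    """Vector r with r_i = t_i - t_{i+1}, where t = d / sqrt(d^2 + beta_sq)."""
    d = np.diff(np.pad(f, 1))
    t = d / np.sqrt(d ** 2 + beta_sq)
    return t[:-1] - t[1:]

rng = np.random.default_rng(3)
f = rng.standard_normal(6)  # made-up test signal
beta_sq = 0.001

analytic = smoothed_tv_gradient(f, beta_sq)

# Central differences along each coordinate direction.
eps = 1e-6
numeric = np.array([
    (smoothed_tv(f + eps * e, beta_sq) - smoothed_tv(f - eps * e, beta_sq)) / (2 * eps)
    for e in np.eye(f.shape[0])
])

print(np.max(np.abs(analytic - numeric)))
```

The vectorized form computes all $n$ entries at once, including the last one, which depends on the right boundary value.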

In [None]:
def objective_tv(f, A, g, lb):
    """ Compute the objective function with total variation regularization.

    :param f: Current estimate of the signal.
    :param A: 2d Matrix A of the linear problem.
    :param g: Observed signal.
    :param lb: Regularization strength of TV.
    :returns: Objective function value.
    """
    return np.sum((A@f - g)**2) + lb * np.sum(np.abs(get_diff_matrix(f.shape[0]) @ f))

def gradient_tv(f, A, g, lb):
    """ Compute the approximate gradient of the total variation objective.

    :param f: Current estimate of the signal.
    :param A: 2d Matrix A of the linear problem.
    :param g: Observed signal.
    :param lb: Regularization strength of TV.
    :returns: Gradient value of the objective function.
    """
    beta_sq = 0.001  # beta^2 as specified in the task
    # Differences between neighbors with zero boundaries: n + 1 values.
    d = np.diff(np.pad(f, 1))
    t = d / np.sqrt(d**2 + beta_sq)
    # r_i = t_i - t_{i+1} for every element, including the last one.
    r = t[:-1] - t[1:]

    return 2*A.T@(A @ f - g) + lb * r

In [None]:
# This cell contains hidden tests.


### Total variation with gradient descent 
Solve the objective function with total variation by gradient descent 
* Calculate the solution by gradient descent for the noisy signals in `list_gn`
* Return the outputs with $\lambda$ of $0.1$, $0.01$, $0.001$, respectively, with $f^{(0)}=0$, $n=20$
* Save the solutions in the variable `list_f_tv` (as `list` of `numpy.array`)
* Save the corresponding objective value history in the variable `list_L_tv` (as `list` of `numpy.array`)

Display the result
* Plot the outputs in `list_f_tv` in the subplots of `axs`, in the same order as the parameter options
* Show the cases with the same noisy signal in the same subplot column
* Show the cases with the same $\lambda$ in the same subplot row
* Plot the corresponding noisy signal in each subplot
* Plot the input signal `f_true` in each subplot
* Show the legend in each subplot
* Show the case information in the subplot titles
* Show the mean squared error of each output compared to `f_true` in the subplot titles
* Show the objective function value of each output in the subplot titles

In [None]:
fig, axs = plt.subplots(3, 3, figsize=(15, 15))
fig.suptitle('Total variation solution (gradient descent)')

lbs = [0.1, 0.01, 0.001]
f0 = np.zeros(f_true.shape)
n = 20

list_f_tv = []
list_L_tv = []

for i in range(3):
    for j in range(3):
        objective = lambda x: objective_tv(x, A_psf, list_gn[j], lbs[i])
        grad = lambda x: gradient_tv(x, A_psf, list_gn[j], lbs[i])

        f_tv, L_tv = solve_gradient_descent_ls(objective, grad, f0, n)

        # Keep the float value; int() would truncate small errors to 0.
        MSE = mean_squared_error(f_tv, f_true)
        
        axs[i,j].plot(f_tv, label="$f_{est}$")
        axs[i,j].plot(list_gn[j], label="g")
        axs[i,j].plot(f_true, color="black", label="$f_{true}$")
        
        # Raw string so that \lambda is not treated as an escape sequence.
        axs[i,j].set_title(r"$g_{}$, $\lambda={}$, MSE={:.3g}, $L(f)={:.3g}$".format(j, lbs[i], MSE, L_tv[-1]))
        axs[i,j].legend()

        list_f_tv.append(f_tv)
        list_L_tv.append(L_tv)

In [None]:
# This cell contains hidden tests.


### Optimization history
Display the result
* Plot the arrays in `list_L_tv` as solid lines in the subplots of `axs`, in the same order as the parameter options
* Show the cases with the same noisy signal in the same subplot
* Use log scaling on the y axis of the subplots
* Show the legend in each subplot
* Show the case information in the subplot titles
* Show the mean squared error of each output compared to `f_true` in the subplot titles
* Show the objective function value of each output in the subplot titles

In [None]:
fig, axs = plt.subplots(1, 3, figsize=(15, 5))
fig.suptitle('Total variation solution (gradient descent)')
fig.subplots_adjust(top=0.7)

title = ["$g_{0}$", "$g_{1}$", "$g_{2}$"]

for i in range(3):
    for j in range(3):
        k = i*3 + j  # index of the case (lambda index i, signal index j)
        # Use the total variation solutions, not the Tikhonov ones in list_f_gd.
        MSE = mean_squared_error(list_f_tv[k], f_true)
        axs[j].plot(list_L_tv[k], label=r"$\lambda={}$".format(lbs[i]))
        title[j] = title[j] + "\n" + rf"$\lambda$={lbs[i]}: mse={MSE:.3g}, obj={list_L_tv[k][-1]:.3g}"
        axs[j].set_title(title[j])
        axs[j].set_yscale("log")
        axs[j].legend()