# Linear inverse problems in function spaces

Many of the notions discussed in the finite-dimensional setting can be extended to the infinite-dimensional setting. We will focus in this chapter on inverse problems where $K$ is a *bounded linear operator*. 

## Well-posedness
We may again wonder whether the equation $Ku = f$ is well-posed.
When $f \not\in \mathcal{R}(K)$, a solution doesn't exist and it makes sense to define the *minimum residual* solution $\widetilde{u}$ which satisfies

$$
\|K\widetilde{u} - f\|_{\mathcal{V}} \leq \|Kv - f\|_{\mathcal{V}},\quad\forall v \in \mathcal{U}.
$$

This is analogous to the least-squares solution we introduced previously. If the null-space of $K$ is non-trivial, we can construct infinitely many such solutions. We call the one with the smallest norm the *minimum-norm* solution. Note, however, that we have not yet proven that such a solution exists in general, nor do we have a constructive way of finding it. We will return to this issue in the next chapter when analysing variational problems. For now, we'll assume that $\mathcal{U}$ and $\mathcal{V}$ are Hilbert spaces and continue our exploration of the infinite-dimensional setting.

We will show that

* A minimum-residual solution only exists if $f \in \mathcal{R}(K)^\perp \oplus \mathcal{R}(K)$
* It obeys the *normal equations* $K^*Ku = K^*f$
* The pseudo-inverse $K^{\dagger}: \mathcal{R}(K)^\perp \oplus \mathcal{R}(K) \rightarrow \mathcal{U}$ is unique and obeys certain useful relations:
    * $KK^\dagger K = K$
    * $K^\dagger K K^\dagger = K^\dagger$
    * $K^\dagger K = I - P_{\mathcal{N}(K)}$
    * $K^\dagger$ is continuous if $\mathcal{R}(K)$ is closed.
* The minimum-norm solution is unique and is given in terms of the pseudo-inverse $\widetilde{u} = K^{\dagger}f$
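
In finite dimensions these relations can be checked numerically. The following is a minimal sketch, assuming a randomly generated rank-deficient matrix stands in for $K$ (so that $K^* = K^T$) and using NumPy's `np.linalg.pinv`. The matrix sizes, the random seed and the tolerance `1e-10` are illustrative choices, not part of the theory above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rank-deficient 6x4 matrix (rank 2), so the null space of K is non-trivial.
K = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
K_pinv = np.linalg.pinv(K, rcond=1e-10)

# Orthogonal projector onto the null space of K, built from the SVD.
_, s, Vt = np.linalg.svd(K)
rank = int(np.sum(s > 1e-10 * s[0]))
N = Vt[rank:].T                      # orthonormal basis of N(K)
P_null = N @ N.T

f = rng.standard_normal(6)           # generic data, not in the range of K
u_min = K_pinv @ f                   # candidate minimum-norm solution

check_1 = np.allclose(K @ K_pinv @ K, K)
check_2 = np.allclose(K_pinv @ K @ K_pinv, K_pinv)
check_3 = np.allclose(K_pinv @ K, np.eye(4) - P_null)
check_normal = np.allclose(K.T @ K @ u_min, K.T @ f)

print(check_1, check_2, check_3, check_normal)
```

In this sketch `u_min` has no component in the null space of `K`, which is what singles it out among all minimum-residual solutions.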

## Compact operators
To analyse this in detail, we use the singular value decomposition (SVD) of compact operators:

$$
Kw = \sum_{j=1}^{\infty} \sigma_j \langle w, u_j\rangle_{\mathcal{U}}v_j,
$$

where $\{u_j\}$ and $\{v_j\}$ are orthonormal bases of $\mathcal{N}(K)^\perp$ and $\overline{\mathcal{R}(K)}$, respectively, and
$\{\sigma_j\}$ is a null-sequence. We call $\{(u_j, v_j, \sigma_j)\}$ the singular system of $K$.
The pseudo-inverse of $K$ is now expressed as

$$
K^{\dagger}f = \sum_{j=1}^{\infty} \sigma_j^{-1} \langle f, v_j\rangle_{\mathcal{V}}u_j.
$$

For $f \in \overline{\mathcal{R}(K)}$, we have that $f \in \mathcal{R}(K)$ if and only if the Picard condition holds:

````{admonition} Picard condition
```{math}
:label: picard
\sum_{j=1}^{\infty} \frac{|\langle f, v_j\rangle_{\mathcal{V}}|^2}{\sigma_j^2} < \infty.
```
````

## Regularisation 

### Truncation and Tikhonov regularisation

In the previous section we saw that the pseudo-inverse of a compact operator is not bounded (continuous). To counter this, we introduce the regularized pseudo-inverse:

$$
K_{\alpha}^{\dagger}f = \sum_{j=1}^{\infty} g_{\alpha}(\sigma_j) \langle f, v_j\rangle_{\mathcal{V}} u_j,
$$

where $g_{\alpha}$ determines the type of regularization used. For Tikhonov regularisation we let

$$
g_{\alpha}(s) = \frac{s}{s^2 + \alpha}= \frac{1}{s + \alpha/s}.
$$

For a truncated SVD we let

$$
g_{\alpha}(s) = \begin{cases} s^{-1} & \text{if}\, s > \alpha \\ 0 & \text{otherwise} \end{cases}.
$$
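
For a matrix, both filters can be applied directly to the singular values returned by `np.linalg.svd`. The sketch below is one possible finite-dimensional illustration. The helper names `tikhonov_filter`, `tsvd_filter` and `regularised_pinv` are hypothetical, and the Vandermonde test matrix and the value of `alpha` are arbitrary assumptions.

```python
import numpy as np

def tikhonov_filter(s, alpha):
    """g_alpha(s) = s / (s^2 + alpha)."""
    s = np.asarray(s, dtype=float)
    return s / (s**2 + alpha)

def tsvd_filter(s, alpha):
    """g_alpha(s) = 1/s if s > alpha, and 0 otherwise."""
    s = np.asarray(s, dtype=float)
    out = np.zeros_like(s)
    keep = s > alpha
    out[keep] = 1.0 / s[keep]
    return out

def regularised_pinv(K, alpha, g):
    """Apply the filter g to the singular values of the matrix K."""
    # NumPy returns K = range_vecs @ diag(s) @ domain_vecs_T.
    range_vecs, s, domain_vecs_T = np.linalg.svd(K, full_matrices=False)
    return domain_vecs_T.T @ (g(s, alpha)[:, None] * range_vecs.T)

# Illustrative, badly conditioned test matrix (an assumption, not from the text).
K = np.vander(np.linspace(0.0, 1.0, 8), 5, increasing=True)
alpha = 1e-3

R_tik = regularised_pinv(K, alpha, tikhonov_filter)
R_tsvd = regularised_pinv(K, alpha, tsvd_filter)
norm_tik = np.linalg.norm(R_tik, 2)
norm_tsvd = np.linalg.norm(R_tsvd, 2)
print(norm_tik, norm_tsvd)
```

Both regularised inverses are bounded: the Tikhonov filter never exceeds $1/(2\sqrt{\alpha})$ (its maximum, attained at $s = \sqrt{\alpha}$), and the truncated filter never exceeds $1/\alpha$.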

Given noisy data $f^{\delta} = Ku + e$ with $\|e\| \leq \delta$, we can now study the effect of regularisation by studying the error. Introducing $u^{\delta,\alpha} = K_{\alpha}^\dagger f^{\delta}$, the total reconstruction error is now given by

$$
\|u^{\delta,\alpha} - u\| = \|K_{\alpha}^\dagger f^{\delta} - u\| \leq \|(I - K_{\alpha}^\dagger K)\|\|u\| + \delta \|K_{\alpha}^\dagger\|,
$$

in which we recognise the *bias* and *variance* contributions. We can re-write this in terms of the relative error as

$$
\frac{\|u^{\delta,\alpha} - u\|}{\|u\|} \leq \|I - K_{\alpha}^\dagger K\| + \frac{\delta}{\|f\|} \|K\|\|K_{\alpha}^\dagger\|,
$$

where $f = Ku$ denotes the noise-free data and we used $\|f\| \leq \|K\|\|u\|$, so that $\delta/\|f\|$ is the relative noise level. Note, however, that these upper bounds may be useless in practice; a more detailed analysis incorporating the type of noise and the class of images $u$ that we are interested in is needed.

Two main questions arise:

* For $\delta = 0$, does the bias converge to zero as $\alpha \downarrow 0$?
* How should one choose $\alpha$ for a given $\delta$ to minimize the total error?
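
The error bound and the trade-off behind the second question can be explored numerically. The following sketch assumes a synthetic $50 \times 50$ matrix with prescribed singular values $1/j$, a random ground truth and Tikhonov regularisation. All of these are illustrative choices, and the printed numbers depend on the random seed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic ill-conditioned matrix K = Q1 diag(sigma) Q2^T (an assumption).
n = 50
Q1, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = 1.0 / np.arange(1, n + 1)
K = Q1 @ np.diag(sigma) @ Q2.T

u_true = rng.standard_normal(n)
delta = 1e-2
e = rng.standard_normal(n)
e *= delta / np.linalg.norm(e)          # noise with ||e|| = delta
f_delta = K @ u_true + e

rows = []
for alpha in 10.0 ** np.arange(-8, 1):
    g = sigma / (sigma**2 + alpha)      # Tikhonov filter
    R = Q2 @ np.diag(g) @ Q1.T          # regularised pseudo-inverse
    error = np.linalg.norm(R @ f_delta - u_true)
    bias = np.linalg.norm(np.eye(n) - R @ K, 2) * np.linalg.norm(u_true)
    variance = delta * np.linalg.norm(R, 2)
    rows.append((alpha, error, bias, variance))
    print(f"alpha={alpha:.0e}  error={error:.3e}  bias={bias:.3e}  variance={variance:.3e}")
```

In such an experiment the bias term grows with $\alpha$ while the variance term shrinks, so the bound on the total error is smallest for some intermediate value of $\alpha$.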


### Generalised Tikhonov regularisation

We have seen in the finite-dimensional setting that Tikhonov regularization may be defined through a variational problem:

$$
\min_{u} \|Ku - f\|^2 + \alpha \|u\|^2.
$$

It turns out we can do the same in the infinite-dimensional setting as long as we use the correct (Hilbert-space) norm. Generalised Tikhonov regularisation is defined in a similar manner through the variational problem

$$
\min_{u} \|Ku - f\|^2 + \alpha \|Lu\|^2,
$$

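where $L$ is a linear operator. In many applications, $L$ is a differential operator, which is unbounded (and hence not compact) on $L^2$ and is therefore only defined on a subspace of sufficiently smooth functions. This can be used to impose smoothness on the solution.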
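
After discretisation, the generalised Tikhonov problem is an ordinary least-squares problem for a stacked system. The sketch below is a minimal illustration under several assumptions: $K$ is replaced by a rectangle-rule discretisation of integration on $[0,1]$, $L$ by a forward-difference matrix, and the ground truth, noise level and `alpha` are arbitrary.

```python
import numpy as np

n = 100
h = 1.0 / n
x = h * np.arange(1, n + 1)

# Illustrative choices (assumptions): K discretises integration on [0, 1],
# L is a scaled forward-difference matrix approximating d/dx.
K = h * np.tril(np.ones((n, n)))
L = (np.eye(n, k=1) - np.eye(n))[:-1] / h     # shape (n - 1, n)

rng = np.random.default_rng(2)
u_true = np.sin(2 * np.pi * x)
f = K @ u_true + 1e-3 * rng.standard_normal(n)

alpha = 1e-6
# min ||K u - f||^2 + alpha ||L u||^2 written as one stacked least-squares problem.
A = np.vstack([K, np.sqrt(alpha) * L])
b = np.concatenate([f, np.zeros(L.shape[0])])
u_alpha, *_ = np.linalg.lstsq(A, b, rcond=None)

# The minimiser satisfies (K^T K + alpha L^T L) u = K^T f.
residual = (K.T @ K + alpha * L.T @ L) @ u_alpha - K.T @ f
print(np.linalg.norm(residual))
```

Solving the stacked system avoids forming $K^TK + \alpha L^TL$ explicitly; the last two lines only verify that the computed minimiser satisfies the corresponding normal equations.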

## Examples 

### Sequence operator
Consider the operator $K:\ell^2 \rightarrow \ell^2$, given by

$$
	u = (u_1,u_2,\ldots) \mapsto (u_1,\frac{1}{2}u_2,\frac{1}{3}u_3,\ldots),
$$

i.e. we have an infinite matrix operator of the form

$$
\left(Ku\right)_i = \sum_{j=1}^\infty a_{ij} u_j = i^{-1} u_i, \quad a_{ij} = i^{-1}\delta_{ij}, \quad i = 1,2,\ldots
$$

The operator is obviously linear. To show that it is bounded we'll compute its norm:

$$
\|K\| = \sup_{u \neq 0} \frac{\|K(u)\|_{\ell^2}}{\|u\|_{\ell^2}}.
$$

We find ...

To show that the operator is compact, we explicitly construct its singular system.

Now consider obtaining a solution for $f_j = j^{-1}$. We would naively set $u_j = j \cdot f_j$, but the situation is a little bit more subtle.


Next, consider obtaining a solution for $f_j = j^{-2}$.
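
A quick numerical experiment hints at the difference between these two data sequences. The sketch below computes the squared $\ell^2$ norm of the first $n$ entries of the naive candidate $u_j = j \cdot f_j$; the helper name `naive_solution_norm_sq` is hypothetical, and the experiment only illustrates the behaviour of partial sums, it proves nothing.

```python
import numpy as np

def naive_solution_norm_sq(f, n):
    """Squared l2 norm of the first n entries of u_j = j * f_j."""
    j = np.arange(1, n + 1, dtype=float)
    return float(np.sum((j * f(j)) ** 2))

for n in (10, 1_000, 100_000):
    slow = naive_solution_norm_sq(lambda j: 1.0 / j, n)      # f_j = 1/j
    fast = naive_solution_norm_sq(lambda j: 1.0 / j**2, n)   # f_j = 1/j^2
    print(n, slow, fast)
```

For $f_j = j^{-1}$ the candidate is $u_j = 1$, so the partial sums grow like $n$ and the candidate does not lie in $\ell^2$. For $f_j = j^{-2}$ the candidate is $u_j = j^{-1}$, whose partial sums stay below $\pi^2/6$.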


### Differentiation
Consider 

$$
Ku(x) = \int_0^x u(y)\mathrm{d}y.
$$

Given $f(x) = Ku(x)$ we would naively let $u(x) = f'(x)$. Let's analyse this in more detail.

The operator can be expressed as

$$
Ku(x) = \int_0^1 k(x,y)u(y)\mathrm{d}y,
$$

with $k(x,y) = H(x-y)$, where $H$ denotes the Heaviside step function. The adjoint is found to be

$$
K^*f(y) = \int_0^1 k(x,y) f(x)\mathrm{d}x = \int_y^1 f(x)\mathrm{d}x.
$$

It is readily verified that the operator is bounded. Indeed, using the Cauchy-Schwarz inequality we find that for any $u$ we have

$$
\|Ku\|^2 = \int_0^1 \left(\int_0^1 k(x,y) u(y) \mathrm{d}y\right)^2\mathrm{d}x \leq \int_0^1 \|k(x,\cdot)\|^2\cdot \|u\|^2\mathrm{d}x.
$$

Computing for fixed $x$

$$
\|k(x,\cdot)\|^2 = \int_0^1 k(x,y)^2 \mathrm{d}y = \int_0^x \mathrm{d}y = x,
$$

we get

$$
\|Ku\|^2 \leq \|u\|^2 \int_0^1 x\, \mathrm{d}x = \frac{1}{2}\|u\|^2.
$$

Because the kernel is square-integrable, the operator is compact as well.

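To derive the singular system, we first need to compute the eigenpairs $(\lambda_k, v_k)$ of $K^*K$. The singular system is then given by $(\sigma_k, u_k, v_k) = (\sqrt{\lambda_k}, (\sqrt{\lambda_k})^{-1}Kv_k, v_k)$. Note that in this example the $v_k$ live in the domain of $K$ and the $u_k$ in its range, so their roles are swapped relative to the expansion in the section on compact operators.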

We find

$$
K^*Kv(y) = \int_y^1 \int_0^x v(z) \, \mathrm{d}z\mathrm{d}x = \lambda v(y).
$$

At $y = 1$ this yields $v(1) = 0$. Differentiating, we find

$$
\lambda v'(y) = -\int_0^y v(z)\mathrm{d}z,
$$

which yields $v'(0) = 0$. Differentiating once again, we find

$$
\lambda v''(x) = -v(x).
$$

The general solution to this differential equation is

$$
v(x) = a\sin(x/\sqrt{\lambda}) + b\cos(x/\sqrt{\lambda}).
$$

Using the boundary condition at $x = 0$ we find that $a = 0$. Using the boundary condition at $x = 1$ we get

$$
b\cos(1/\sqrt{\lambda}) = 0,
$$

which yields $\lambda_k = 1/((k + 1/2)^2\pi^2)$, $k = 0, 1, \ldots$. We choose $b$ to normalize $\|v_k\| = 1$. We thus find the singular system:

$$
\sigma_k = 1/((k+1/2)\pi), \quad u_k(x) = \sqrt{2}\sin(\sigma_k^{-1} x), \quad v_k(x) = \sqrt{2}\cos(\sigma_k^{-1} x).
$$
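
The analytic singular values can be compared with those of a discretised operator. The following sketch assumes a right-endpoint rectangle rule on a uniform grid with `n = 500` points; both the quadrature rule and the grid size are illustrative choices, and the agreement is only approximate (the relative discrepancy for the leading values is of the order $1/(2n)$).

```python
import numpy as np

n = 500
h = 1.0 / n

# Right-endpoint rectangle rule for (Ku)(x_i) = int_0^{x_i} u(y) dy, x_i = i * h.
K_h = h * np.tril(np.ones((n, n)))

s_numeric = np.linalg.svd(K_h, compute_uv=False)   # sorted in decreasing order
k = np.arange(5)
s_analytic = 1.0 / ((k + 0.5) * np.pi)

for kk, num, ana in zip(k, s_numeric, s_analytic):
    print(f"k={kk}  numeric={num:.5f}  analytic={ana:.5f}")
```

The largest singular value, $\sigma_0 = 2/\pi$, is consistent with the bound $\|K\| \leq 1/\sqrt{2}$ obtained above from the Cauchy-Schwarz inequality.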

The operator can thus be expressed as

$$
Ku(x) = \sum_{k=0}^\infty \frac{\langle u, v_k\rangle}{(k+1/2)\pi} u_k(x),
$$

and the pseudo-inverse by

$$
K^{\dagger}f(x) = \sum_{k=0}^\infty (k+1/2)\pi \langle f, u_k\rangle v_k(x).
$$

We can now study the ill-posedness of the problem by looking at the Picard condition

$$
\|K^\dagger f\|^2 = \pi^2\sum_{k=0}^\infty f_k^2 (k+1/2)^2,
$$

where $f_k = \langle f, u_k\rangle$ are the (generalized) Fourier coefficients of $f$.
For this infinite sum to converge, we need strong requirements on $f_k$; for example, $f_k = 1/k$ does not suffice to make the sum converge. This is quite surprising since such an $f$ is square-integrable. Faster decay is needed: $f_k = \mathcal{O}(1/k^2)$, for instance, is enough to satisfy the Picard condition. Effectively this means that $f'$ needs to be square-integrable. This makes sense since we saw earlier that $u(x) = f'(x)$ is the solution to $Ku = f$.
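
The different behaviour of the two decay rates shows up in the partial sums of the Picard series. In the sketch below the coefficients are taken as $f_k = 1/(k+1)$ and $f_k = 1/(k+1)^2$; the shift by one is an assumption made only to avoid dividing by zero at $k = 0$, and the helper name `picard_partial_sum` is hypothetical.

```python
import numpy as np

def picard_partial_sum(coeff, n):
    """pi^2 * sum_{k<n} f_k^2 (k + 1/2)^2 for coefficients f_k = coeff(k)."""
    k = np.arange(n, dtype=float)
    return float(np.pi**2 * np.sum(coeff(k) ** 2 * (k + 0.5) ** 2))

def slow_decay(k):
    return 1.0 / (k + 1.0)          # f_k ~ 1/k

def fast_decay(k):
    return 1.0 / (k + 1.0) ** 2     # f_k ~ 1/k^2

for n in (10, 100, 1_000, 10_000):
    print(n, picard_partial_sum(slow_decay, n), picard_partial_sum(fast_decay, n))
```

With the slowly decaying coefficients the partial sums grow roughly linearly in $n$, whereas with the faster decay they level off.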

The SVD gives us a different viewpoint on the example shown before as well. Take measurements $f^{\delta} = Ku + \delta\sin(\delta^{-1}x)$, where $\delta = \sigma_k$ for some $k$. The error $K^\dagger f^{\delta} - u$ is then given by

$$
K^\dagger K u - u + \delta K^{\dagger}\sin(\delta^{-1}\cdot).
$$

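Because $K$ is injective we have $K^\dagger K u = u$. Moreover, $\delta^{-1} = \sigma_k^{-1}$ and $\sin(\sigma_k^{-1}x) = u_k(x)/\sqrt{2}$ is a multiple of a singular vector of $K$ with $K^\dagger u_k = \sigma_k^{-1}v_k$, so the error simplifies to

$$
\cos(\sigma_k^{-1}x).
$$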

Thus, the reconstruction error does not go to zero as $\delta\downarrow 0$, even though the error in the data does.
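
The sizes of the data perturbation and of its derivative can be compared numerically. The sketch below is a rough illustration that assumes a midpoint quadrature rule on $[0,1]$ with `n = 200_000` points and evaluates the $L^2$ norms of $\delta\sin(x/\delta)$ and of its derivative $\cos(x/\delta)$ for a few values $\delta = \sigma_k$. The helper name `l2_norm` and the chosen indices `k` are hypothetical.

```python
import numpy as np

def l2_norm(values):
    """Midpoint-rule approximation of the L2 norm on [0, 1]."""
    return float(np.sqrt(np.mean(values**2)))

n = 200_000
x = (np.arange(n) + 0.5) / n            # midpoints of a uniform grid on [0, 1]

results = []
for k in (0, 5, 50):
    delta = 1.0 / ((k + 0.5) * np.pi)   # delta = sigma_k
    data_error = delta * np.sin(x / delta)
    recon_error = np.cos(x / delta)     # derivative of the data perturbation
    results.append((delta, l2_norm(data_error), l2_norm(recon_error)))
    print(f"delta={delta:.4e}  data error={results[-1][1]:.4e}  "
          f"reconstruction error={results[-1][2]:.4f}")
```

The norm of the data perturbation scales with $\delta$, while the norm of its derivative stays at $1/\sqrt{2}$ for every $k$.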

For Tikhonov regularisation, the eigenvalues of $K_{\alpha}^\dagger K$ are given by $(1 + \alpha \sigma_k^{-2})^{-1}$, with $\sigma_k = (\pi(k + 1/2))^{-1}$. The bias is thus given by

$$
\|I - K_{\alpha}^\dagger K\| = \sup_{k} \left|1 - (1 + \alpha \sigma_{k}^{-2})^{-1}\right|.
$$

Likewise, the variance is given by

$$
\|K\|\|K_{\alpha}^\dagger\| = \max_{k}\frac{\sigma_0}{\sigma_k + \alpha \sigma_{k}^{-1}}.
$$
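
Both expressions are easy to evaluate for a truncated singular system. The following sketch assumes the first `n_terms = 10_000` singular values $\sigma_k = (\pi(k+1/2))^{-1}$ and a few arbitrary values of $\alpha$; the function names `tikhonov_bias` and `tikhonov_variance` are hypothetical, and the truncation means the supremum over $k$ is only approximated.

```python
import numpy as np

n_terms = 10_000                        # truncation of the singular system (assumption)
k = np.arange(n_terms)
sigma = 1.0 / ((k + 0.5) * np.pi)       # sigma[0] = 2/pi is the largest value

def tikhonov_bias(alpha):
    """max_k |1 - (1 + alpha / sigma_k^2)^(-1)| over the truncated system."""
    return float(np.max(np.abs(1.0 - 1.0 / (1.0 + alpha / sigma**2))))

def tikhonov_variance(alpha):
    """sigma_0 * max_k 1 / (sigma_k + alpha / sigma_k) over the truncated system."""
    return float(sigma[0] * np.max(1.0 / (sigma + alpha / sigma)))

for alpha in (1e-2, 1e-4, 1e-6):
    print(f"alpha={alpha:.0e}  bias={tikhonov_bias(alpha):.6f}  "
          f"variance={tikhonov_variance(alpha):.3f}")
```

Because $\sigma_k \to 0$, the bias term approaches $1$ for every fixed $\alpha > 0$, which illustrates why such worst-case bounds say little without further assumptions on $u$. The variance term is bounded by $\sigma_0/(2\sqrt{\alpha})$.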



## Exercises