# Exercise 05: Solving differential equation-based Bayesian inverse problems using CUQIpy

Here we build a Bayesian inverse problem in which the forward model is a partial differential equation (PDE) model, specifically a 1D heat problem.

## Learning objectives of this notebook:
- Solve a PDE-based Bayesian problem in CUQIpy.
- Parametrize the Bayesian parameters (e.g. KL expansion, non-linear maps).
- Get introduced to the CUQIpy PDE classes.

## Table of contents: 
* [1. Loading the PDE test problem](#PDE_model)
* [2. Building and solving the Bayesian inverse problem](#inverse_problem)
* [3. Parametrizing the Bayesian parameters to enforce positivity](#mapped_geometries)
* [4. (Optional) parametrizing the Bayesian parameters via step function expansion](#step_function)
* [5. (Optional) elaboration: the PDEmodel class](#PDE_model_elaborate)


##  1. Loading the PDE test problem <a class="anchor" id="PDE_model"></a>

We first import the required standard Python packages:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from copy import deepcopy

From the `cuqi` package, we import the classes that we use in this exercise:

In [None]:
from cuqi.geometry import Continuous1D, MappedGeometry, KLExpansion
from cuqi.pde import SteadyStateLinearPDE
from cuqi.model import PDEModel
from cuqi.distribution import GaussianCov, Posterior, Gaussian, Cauchy_diff
from cuqi.sampler import CWMH, NUTS, pCN, MetropolisHastings
from cuqi.testproblem import Heat_1D
from cuqi.problem import BayesianProblem
from cuqi.samples import CUQIarray
from cuqi.operator import FirstOrderFiniteDifference

We then load the test problem `Heat_1D`, which provides a one-dimensional (1D) time-dependent heat model. The unknown Bayesian parameters for this model describe the initial heat profile. The data are the temperature measurements everywhere in the domain at the final time step.
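To make the forward map concrete, here is a minimal NumPy sketch of a 1D heat problem that takes an initial profile to the temperature at the final time. It is an illustration only, not the `Heat_1D` implementation: it assumes the equation $u_t = u_{xx}$ with zero Dirichlet boundaries and an explicit Euler time stepper, none of which is stated in this notebook, and the function name `heat_forward` is hypothetical.

```python
import numpy as np

def heat_forward(u0, L=1.0, T=0.2):
    """Toy forward map: initial profile u0 -> temperature at time T.

    Assumes u_t = u_xx on (0, L) with zero Dirichlet boundaries, solved with
    explicit Euler on the interior nodes (an assumption for illustration).
    """
    u = np.asarray(u0, dtype=float).copy()
    n = u.size
    dx = L / (n + 1)
    # Choose the step count so that dt stays below the stability limit dx^2 / 2
    n_steps = int(np.ceil(T / (0.4 * dx**2)))
    dt = T / n_steps
    for _ in range(n_steps):
        padded = np.concatenate(([0.0], u, [0.0]))  # append zero boundary values
        u = u + dt * (padded[2:] - 2.0 * u + padded[:-2]) / dx**2
    return u

N = 50
x = np.linspace(0.0, 1.0, N + 2)[1:-1]  # interior grid nodes
u0 = np.sin(np.pi * x)                  # hypothetical initial heat profile
uT = heat_forward(u0)                   # noise-free "data" at the final time
```

For this particular initial profile the exact solution is $e^{-\pi^2 T}\sin(\pi x)$, so the final temperature is a strongly damped copy of the initial one. That damping is what makes recovering the initial profile from final-time data an ill-posed problem.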

We can explore the initialization parameters (and hence what can be passed to `get_components` method) of the `Heat_1D` test problem by calling `Heat_1D?`. We choose the following set up for the test problem:

In [None]:
Heat_1D?

In [None]:
N = 50   # number of finite difference nodes            
L = 1    # Length of the domain
T = 0.2  # Final time

model, data, problemInfo = Heat_1D.get_components(dim=N, endpoint=L, max_time=T, field_type = 'KL')

Let's take a look at what we obtain from the test problem. We view the `model`:

In [None]:
model

Note that here we choose the domain geometry to be of type 'KL'. This represents the initial heat profile in terms of a KL expansion (try `KLExpansion?` for more information):
$$ u(x) = \sum_i p_i \left(\frac{1}{i}\right)^{\text{decay}} \sin\left(\frac{i L x}{\pi}\right) $$
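The expansion above can be evaluated directly with NumPy. The sketch below implements the formula exactly as written in this notebook. The function name `sine_expansion`, the default `decay=1.5`, and the choice to start the index at $i = 1$ are assumptions for illustration, and the actual `KLExpansion` class may differ in details.

```python
import numpy as np

def sine_expansion(p, x, L=1.0, decay=1.5):
    """Evaluate u(x) = sum_i p_i (1/i)^decay sin(i L x / pi), i = 1..len(p)."""
    p = np.asarray(p, dtype=float)
    i = np.arange(1, p.size + 1)                # mode index, starting at 1
    weights = p * (1.0 / i) ** decay            # coefficients damped by (1/i)^decay
    basis = np.sin(np.outer(i, x) * L / np.pi)  # shape (n_modes, n_points)
    return weights @ basis

x = np.linspace(0.0, 1.0, 50)
u = sine_expansion([1.0, 0.0, 2.0], x)  # three hypothetical coefficients
```

The factor $(1/i)^{\text{decay}}$ shrinks the contribution of the higher modes, so the coefficients $p_i$ of high-frequency terms have less influence on the profile.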

In [None]:
KLExpansion?

We can look at the data:

In [None]:
data

And the `problemInfo`:

In [None]:
problemInfo

Now let's plot the exact solution of this inverse problem, followed by the exact and noisy data:

In [None]:
problemInfo.exactSolution.plot()

In [None]:
l_exact_data = problemInfo.exactData.plot()
l_noisy_data = data.plot()
plt.legend([l_exact_data[0],l_noisy_data[0]] , ['exact data', 'noisy data'])

## 2. Building and solving the Bayesian inverse problem <a class="anchor" id="inverse_problem"></a>

Here we want to define the prior, the likelihood, and the posterior distribution. We start by defining a prior random field by discretizing a covariance function. The covariance function is defined as:

In [None]:
var = 10
lc = 0.2
p = 2
C_YY = lambda x1, x2: var*np.exp( -(1/p) * (abs( x1 - x2 )/lc)**p )

To build the prior, we discretize the covariance function to obtain the covariance matrix `sigma_prior`:

In [None]:
x = model.domain_geometry.grid
XX, YY = np.meshgrid(x, x, indexing='ij')
sigma_prior = C_YY(XX, YY)
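As a quick sanity check on this construction, the following standalone sketch rebuilds the same matrix on a stand-in uniform grid and draws a few prior realizations with NumPy. The grid, the random seed, and the small diagonal "jitter" term are assumptions added here for numerical robustness and are not part of the notebook's setup.

```python
import numpy as np

var, lc, p = 10, 0.2, 2
cov = lambda x1, x2: var * np.exp(-(1 / p) * (np.abs(x1 - x2) / lc) ** p)

x = np.linspace(0.0, 1.0, 50)  # stand-in for model.domain_geometry.grid
XX, YY = np.meshgrid(x, x, indexing="ij")
sigma = cov(XX, YY)            # entry (i, j) is the covariance between x[i] and x[j]

# A tiny diagonal jitter (an assumption) guards against round-off making the
# very smooth covariance matrix numerically indefinite.
jitter = 1e-8 * var * np.eye(x.size)
rng = np.random.default_rng(0)
draws = rng.multivariate_normal(np.zeros(x.size), sigma + jitter, size=3)
```

The matrix is symmetric with `var` on its diagonal. The correlation length `lc` controls how quickly the off-diagonal entries decay, and hence how smooth the realizations are.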

We define the prior distribution as

In [None]:
mean = 0
prior = GaussianCov(mean*np.ones(N), sigma_prior, geometry= model.domain_geometry)

***
#### Try yourself (optional)
* create prior samples (~1 line).
* plot the 95% confidence interval of the prior samples (~1 line).
* look at the 95% confidence interval of the PDE model solution to quantify the forward uncertainty (~2 lines).
***

We then set up the likelihood. We obtain information about the noise distribution from `problemInfo.infoString`:

In [None]:
SNR = 200
sigma_likelihood = np.linalg.norm(problemInfo.exactData)/SNR
likelihood = Gaussian(mean=model, std=sigma_likelihood, corrmat=np.eye(N), geometry=model.range_geometry)
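To see what this choice of noise level means in practice, the sketch below applies the same formula to a stand-in signal and measures the resulting relative noise. The signal and the random seed are hypothetical. Because the standard deviation is applied to each of the `N` data points, the relative noise norm comes out near $\sqrt{N}/\text{SNR}$ rather than $1/\text{SNR}$.

```python
import numpy as np

rng = np.random.default_rng(1)
exact = np.sin(np.linspace(0.0, np.pi, 50))  # stand-in for problemInfo.exactData
SNR = 200
sigma = np.linalg.norm(exact) / SNR          # same formula as in the cell above

noisy = exact + sigma * rng.standard_normal(exact.size)
relative_noise = np.linalg.norm(noisy - exact) / np.linalg.norm(exact)
expected = np.sqrt(exact.size) / SNR         # approximate expected relative noise
```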

Now that we have all the components we need, we can create the posterior distribution:

In [None]:
posterior =  Posterior(likelihood, prior, data)
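Conceptually, the posterior combines the two ingredients through Bayes' rule: up to an additive constant, the log-posterior is the log-likelihood plus the log-prior. The sketch below shows this for a toy problem. The forward map, the independent Gaussian prior (the notebook uses a correlated one), and all function names are assumptions for illustration and do not reflect how `Posterior` is implemented.

```python
import numpy as np

def log_gaussian(r, std):
    """Log-density of independent zero-mean Gaussians with common std, evaluated at r."""
    r = np.asarray(r, dtype=float)
    return -0.5 * np.sum((r / std) ** 2) - r.size * np.log(std * np.sqrt(2.0 * np.pi))

def log_posterior(x, forward, y, noise_std, prior_std):
    """Unnormalized log-posterior: log-likelihood plus log-prior."""
    return log_gaussian(y - forward(x), noise_std) + log_gaussian(x, prior_std)

forward = lambda x: np.cumsum(x) / x.size  # hypothetical smoothing forward map
x_true = np.array([1.0, -0.5, 2.0])
y = forward(x_true)                        # noise-free data for the illustration

lp_true = log_posterior(x_true, forward, y, noise_std=0.1, prior_std=3.0)
lp_far = log_posterior(x_true + 5.0, forward, y, noise_std=0.1, prior_std=3.0)
```

Parameters that reproduce the data and are plausible under the prior score higher (`lp_true > lp_far`). A sampler only ever needs such differences in log-posterior, which is why the normalizing constant can be ignored.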

We can now sample the posterior. Let's try the preconditioned Crank-Nicolson (pCN) sampler:

In [None]:
MySampler = pCN(posterior,1)
posterior_samples,_ ,_ = MySampler.sample_adapt(10000, 1000)
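For intuition about what a pCN sampler does, here is a minimal standalone sketch for a zero-mean Gaussian prior. It is not the `cuqi.sampler.pCN` implementation: the function name `pcn`, the step-size parameter `beta`, and the toy one-dimensional problem are assumptions. Whether `beta` corresponds to the second argument passed to `pCN` above is also an assumption.

```python
import numpy as np

def pcn(log_like, prior_chol, n_samples, beta, rng):
    """Minimal pCN sampler for a zero-mean Gaussian prior with covariance C = R R^T."""
    dim = prior_chol.shape[0]
    x = np.zeros(dim)
    ll = log_like(x)
    samples = np.empty((n_samples, dim))
    accepted = 0
    for k in range(n_samples):
        xi = prior_chol @ rng.standard_normal(dim)      # draw from the prior
        x_new = np.sqrt(1.0 - beta**2) * x + beta * xi  # prior-preserving proposal
        ll_new = log_like(x_new)
        # The acceptance probability depends on the likelihood ratio only
        if np.log(rng.uniform()) < ll_new - ll:
            x, ll = x_new, ll_new
            accepted += 1
        samples[k] = x
    return samples, accepted / n_samples

# Toy conjugate problem: prior N(0, 1), one observation y = x + noise, noise std 0.5
y_obs, noise_std = 1.0, 0.5
log_like = lambda x: -0.5 * np.sum((y_obs - x) ** 2) / noise_std**2
rng = np.random.default_rng(0)
samples, acc_rate = pcn(log_like, np.eye(1), n_samples=5000, beta=0.5, rng=rng)
posterior_mean = samples[1000:].mean()  # analytic posterior mean is 0.8
```

Because the proposal leaves the prior invariant, the prior cancels from the acceptance ratio. This is the reason pCN remains usable when the discretization of the unknown is refined.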

Let's look at the samples:

In [None]:
posterior_samples.plot_ci(95, exact = problemInfo.exactSolution)
print([posterior])
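The interval shown by `plot_ci(95, ...)` can be understood as a pointwise percentile band over the samples. The sketch below computes such a band with NumPy for stand-in samples. The sample array and its layout `(n_samples, n_parameters)` are assumptions, and the exact computation inside `plot_ci` may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=1.0, size=(4000, 5))  # stand-in: (n_samples, n_params)

lower, upper = np.percentile(samples, [2.5, 97.5], axis=0)   # pointwise 95% bounds
mean = samples.mean(axis=0)                                  # pointwise sample mean
coverage = np.mean((samples >= lower) & (samples <= upper))  # fraction inside the band
```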

We can also look at the samples in the KL expansion coefficient space:

In [None]:
prior.sample(500).plot_ci(95, plot_par = True, color = 'r')
posterior_samples.plot_ci(95, plot_par = True, color = 'b')
plt.xticks(np.arange(prior.dim)[::5]);

## 3. Parametrizing the Bayesian parameters to enforce positivity <a class="anchor" id="mapped_geometries"></a> 

Here we introduce the concept of mapped geometries. In many inverse problems, the forward model needs to be parametrized through a possibly nonlinear function. For example, in this 1D heat example, we want to enforce positivity of the thermal conductivity. We can use the parametrization $c = e^\kappa$, where $c$ is the thermal conductivity and $\kappa$ is the vector of Bayesian parameters.

In `CUQIpy`, this can be achieved through a `MappedGeometry` object. Let's update the exact solution and the domain geometry to test this idea:



In [None]:
mapped_exact_solution = np.log(problemInfo.exactSolution)
KL_geometry = model.domain_geometry
mapped_model = deepcopy(model)
mapped_model.domain_geometry = MappedGeometry(KL_geometry, map = lambda x : np.exp(x))
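The idea behind the mapping can be sketched without `CUQIpy`: the sampler works with an unconstrained parameter $\kappa$, and the forward model only ever sees $e^\kappa$, which is positive by construction. The forward map below is a hypothetical stand-in, not the heat model.

```python
import numpy as np

forward = lambda c: np.cumsum(c) / c.size              # hypothetical forward map on the positive field c
mapped_forward = lambda kappa: forward(np.exp(kappa))  # same map, parametrized by kappa

rng = np.random.default_rng(0)
kappa = rng.normal(size=10)  # unconstrained Gaussian parameters
c = np.exp(kappa)            # strictly positive for any real kappa
```

A Gaussian prior can therefore still be placed on $\kappa$, while the quantity entering the PDE stays positive. To compare against a known positive field, apply `np.log` to it first, as done for `mapped_exact_solution` above.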

We again build the posterior distribution:

In [None]:
mapped_likelihood = Gaussian(mean=mapped_model, std=sigma_likelihood, corrmat=np.eye(N), geometry=model.range_geometry)
mapped_posterior =  Posterior(mapped_likelihood, prior, data)


And sample:

In [None]:
MySampler2 = pCN(mapped_posterior,1)
mapped_posterior_samples,_ ,_ = MySampler2.sample_adapt(10000, 1000)

Then plot the confidence interval:

In [None]:
mapped_posterior_samples.plot_ci(95, exact = problemInfo.exactSolution)

## 4. (Optional) parametrizing the Bayesian parameters via step function expansion <a class="anchor" id="step_function"></a>

Here we explore a different parameterization, where the thermal conductivity is represented by a step function with 3 degrees of freedom. The code for this problem will look like:

In [None]:
step_model, step_data, step_problemInfo = Heat_1D.get_components(dim=N, endpoint=L, max_time=T, field_type = 'Step')
step_prior = Gaussian(np.ones(3),1, geometry = step_model.domain_geometry)
step_likelihood = Gaussian(mean=step_model, std=sigma_likelihood, corrmat=np.eye(N), geometry=model.range_geometry)
step_posterior =  Posterior(step_likelihood, step_prior, step_data)
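A step-function parametrization with three degrees of freedom can be pictured as follows. This sketch assumes three equal-width steps on $[0, L]$. The function name `step_expansion` is hypothetical, and how the 'Step' field type of `Heat_1D` places its steps is not stated in this notebook.

```python
import numpy as np

def step_expansion(p, x, L=1.0):
    """Piecewise-constant function on [0, L] with len(p) equal-width steps."""
    p = np.asarray(p, dtype=float)
    # Index of the step containing each x; the right endpoint x = L joins the last step
    idx = np.minimum((np.asarray(x) / L * p.size).astype(int), p.size - 1)
    return p[idx]

x = np.linspace(0.0, 1.0, 50)
u = step_expansion([1.0, 3.0, 2.0], x)                    # field on the grid
probe = step_expansion([1.0, 3.0, 2.0], [0.1, 0.5, 0.9])  # one point per step
```

Each entry of the parameter vector sets the height of one step, so a prior on three numbers induces a prior on the whole profile.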

In [None]:
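# `mapped_samples` is not defined until you complete the exercise below;
# uncomment the next line once you have generated it.
# mapped_samples.plot_ci(95, exact=np.log(step_problemInfo.exactSolution))

step_problemInfo.exactSolution.plot()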

Try it yourself:
* Use the pCN sampler to generate, say, 10000 posterior samples and view the 100% confidence interval (~3 lines). Use `sample_adapt`.
* Try to enforce positivity of the posterior samples via a `MappedGeometry` and run the pCN sampler again (similar to part 3).

## 5. (Optional) elaboration: the PDEmodel class <a class="anchor" id="PDE_model_elaborate"></a>

Let's explore the model for PDE problems.

Try it yourself:

* View: `model`, `model.pde`, `model.pde.PDE_form`

We can create our own PDE model, for example for a simple Poisson equation with zero boundary conditions:

In [None]:
n_poisson = 1000
L = 1
dx = L/(n_poisson-1)
diff_operator = FirstOrderFiniteDifference(n_poisson,bc_type='zero').get_matrix().todense()/dx
source_term = np.zeros(n_poisson)
source_term[int(n_poisson/2)] = 1/dx 

poisson_form = lambda x: (diff_operator.T@diff_operator, x* source_term)
CUQI_pde = SteadyStateLinearPDE(poisson_form)
CUQI_pde.assemble(5)
sol = CUQI_pde.solve()
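The same kind of problem can be reproduced with plain NumPy to see what assembling and solving amount to. This sketch assumes a forward-difference matrix with zero boundary values, which may not match the matrix returned by `FirstOrderFiniteDifference` exactly. It also uses a smaller grid, and the helper name `solve_poisson` is hypothetical.

```python
import numpy as np

n = 101
dx = 1.0 / (n - 1)
# Forward-difference matrix with zero boundary values, shape (n + 1, n)
D = (np.eye(n + 1, n) - np.eye(n + 1, n, k=-1)) / dx
A = D.T @ D                # discrete negative Laplacian, tridiag(-1, 2, -1) / dx^2
source = np.zeros(n)
source[n // 2] = 1.0 / dx  # point source in the middle of the domain

def solve_poisson(magnitude):
    """Solve A u = magnitude * source."""
    return np.linalg.solve(A, magnitude * source)

u5 = solve_poisson(5.0)
u10 = solve_poisson(10.0)
```

Because the equation is linear in the source, `u10` is exactly twice `u5`, and both are tent-shaped with their peak at the source location.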

In [None]:
plt.plot(np.linspace(dx,L,n_poisson,endpoint=False),sol)

Try it yourself:

* Double the magnitude of the source term by editing the line `CUQI_pde.assemble(5)` above. Look at the solution.