# Exercise 05: Solving differential equation-based Bayesian inverse problems using CUQIpy

Here we build a Bayesian inverse problem in which the forward model is a partial differential equation (PDE) model, specifically a 1D heat problem.

**Try to at least run through parts 1 to 4 before working on the optional exercises.**

## Learning objectives of this notebook:
- Solve a PDE-based Bayesian inverse problem using CUQIpy.
- Parametrize the Bayesian parameters (e.g. KL expansion, non-linear maps).
- Get introduced to CUQIpy's PDE class.

## Table of contents: 
* [1. Loading the PDE test problem](#PDE_model)
* [2. Building and solving the Bayesian inverse problem](#inverse_problem)
* [3. Parametrizing the Bayesian parameters via a general mapping to enforce positivity](#mapped_geometries)
* [4. Parametrizing the Bayesian parameters via step function expansion](#step_function)
* [5. Parametrizing the Bayesian parameters via KL expansion](#KL_expansion) ★
* [6. Observe on part of the domain](#Partial_Observation) ★
* [7. Elaboration: the PDEmodel class](#PDE_model_elaborate) ★


##  1. Loading the PDE test problem <a class="anchor" id="PDE_model"></a>

We first import the standard Python packages we need:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from math import floor
import sys
sys.path.append("../../CUQIpy/")

From the `cuqi` package, we import the classes that we use in this exercise:

In [None]:
from cuqi.geometry import MappedGeometry, KLExpansion
from cuqi.pde import SteadyStateLinearPDE
from cuqi.distribution import GaussianCov, Posterior, Gaussian, JointDistribution
from cuqi.sampler import pCN, MetropolisHastings, CWMH
from cuqi.testproblem import Heat_1D
from cuqi.operator import FirstOrderFiniteDifference

We load the test problem `Heat_1D`, which provides a one-dimensional (1D) time-dependent heat model with zero boundary conditions. The model is discretized using finite differences.

The PDE is given by:

$$ \frac{\partial u(x,t)}{\partial t} - c^2 \Delta_x u(x,t)   = f(x,t), \;\text{in}\;\Omega=[0,L] $$
$$u(0,t)= u(L,t)= 0 $$

where $u(x,t)$ is the temperature and $c^2$ is the thermal diffusivity (assumed to be 1 here). We assume the source term $f$ is zero. The unknown Bayesian parameter for this test problem is the initial heat profile $\theta(x):=u(x,0)$. The data $\mathbf{d}$ are the temperature measurements everywhere in the domain at the final time $T$.
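
To make the forward map from $\theta$ to the final temperature concrete, here is a minimal NumPy sketch of one way to discretize this problem: central finite differences in space on the interior nodes and implicit Euler in time. The function name `heat_forward` and the parameter `n_time_steps` are illustrative assumptions, and the scheme that `Heat_1D` uses internally may differ.

```python
import numpy as np

def heat_forward(theta, L=1.0, T=0.05, n_time_steps=100):
    """Map an initial profile on the interior nodes to the temperature at time T.

    Assumes c^2 = 1, zero source term and zero boundary values.
    """
    theta = np.asarray(theta, dtype=float)
    N = theta.size
    dx = L / (N + 1)          # spacing when the two boundary nodes are excluded
    dt = T / n_time_steps
    # Second-difference matrix with zero Dirichlet boundary conditions
    laplacian = (np.diag(-2.0 * np.ones(N))
                 + np.diag(np.ones(N - 1), 1)
                 + np.diag(np.ones(N - 1), -1)) / dx**2
    # Implicit Euler: (I - dt * laplacian) u_new = u_old
    step_matrix = np.eye(N) - dt * laplacian
    u = theta.copy()
    for _ in range(n_time_steps):
        u = np.linalg.solve(step_matrix, u)
    return u

# Check against a case with a known solution: sin(pi x) decays like exp(-pi^2 t)
N = 30
x_interior = np.linspace(0.0, 1.0, N + 2)[1:-1]
theta = np.sin(np.pi * x_interior)
u_final = heat_forward(theta, L=1.0, T=0.05)
u_exact = np.exp(-np.pi**2 * 0.05) * theta
```

The forward map is linear in $\theta$ and strongly smoothing: higher-frequency components of the initial profile decay faster, which is why sharp features such as steps are hard to recover from the final-time data.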

We load `Heat_1D` using its `get_components` method. To explore the initialization parameters of `Heat_1D` (the same parameters that can be passed to `get_components`), call `Heat_1D?`. We choose the following setup for the test problem: the number of finite difference nodes `N`, the length of the domain `L`, and the final time `T`.

In [None]:
N = 30  # number of finite difference nodes            
L = 1    # Length of the domain
T = 0.05  # Final time

We choose the initial condition (the exact solution for the Bayesian problem) to be a step function with three pieces.

In [None]:
myExactSolution = np.zeros(N)
myExactSolution[:floor(N/3)+1] = 1
myExactSolution[floor(N/3)+1:floor(2*N/3)] = 2
myExactSolution[floor(2*N/3):] = 3

And now we load the `Heat_1D` problem providing our own exact solution:

In [None]:
model, data, problemInfo = Heat_1D.get_components(dim=N, endpoint=L, max_time=T, exactSolution=myExactSolution)

Let's take a look at what we obtain from the test problem. We view the `model`:

In [None]:
model

We can look at the returned `data`:

In [None]:
data

And the `problemInfo`:

In [None]:
problemInfo

Now let's plot the exact solution of this inverse problem and the exact and noisy data:

In [None]:
problemInfo.exactSolution.plot()
problemInfo.exactData.plot()
data.plot()
plt.legend(['exact solution', 'exact data', 'noisy data']);

Note that the values of the initial solution and the data at 0 and $L$ are not included in this plot.

## 2. Building and solving the Bayesian inverse problem <a class="anchor" id="inverse_problem"></a>

Here we want to define the prior $p(x)$, the data distribution $p(y|x)$ and the posterior distribution $p(x|y)$. We start by defining a simple Gaussian prior:

In [None]:
mean = 0
std = 1.2
x = Gaussian(mean*np.ones(N), std, geometry= model.domain_geometry) # The prior distribution


#### Try it yourself (optional):
* Create prior samples (~1 line).
* Plot the 95% credible interval of the prior samples (~1 line).
* Look at the 95% credible interval of the PDE model solution to quantify the forward uncertainty (~2 lines).


In [None]:
# Your code here



To define the data distribution $p(y|x)$, we first estimate the noise level. Because here we know the exact data, we can estimate the noise level as follows:

In [None]:
sigma_data = np.std(problemInfo.exactData - data)*np.ones(model.range_dim) # noise level

We then define the data distribution:

In [None]:

y = Gaussian(mean=model, std=sigma_data, geometry=model.range_geometry)

Now that we have all the components we need, we can create the joint distribution $p(x,y)$, from which the posterior distribution can be created by setting $y=\texttt{data}$:

The joint distribution $p(x,y)$:

In [None]:
joint =  JointDistribution([y, x])
print(joint)

We then set $y=\texttt{data}$:

In [None]:
posterior = joint(y=data)
print(posterior)


We convert the joint distribution to an object of type posterior (this is a temporary hack and in the near future samplers will be able to sample `JointDistributions` directly):

In [None]:
posterior = posterior._reduce_to_single_density() #TODO: eventually remove this line
print(posterior)

We can now sample the posterior. Let's try the preconditioned Crank-Nicolson (pCN) sampler (~60 seconds):

In [None]:
MySampler = pCN(posterior)
posterior_samples = MySampler.sample_adapt(20000)

Let's look at the $95\%$ credible interval:

In [None]:
posterior_samples.plot_ci(95, exact = problemInfo.exactSolution)

We can see that the mean reconstruction of the initial solution matches the general trend of the exact solution to some extent, but it does not capture the piecewise constant nature of the exact solution. Also, if we wish to assume that the initial solution must be positive, we see that the obtained mean and the credible interval have negative values.
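
For intuition about the sampler used above, the following is a minimal NumPy sketch of a pCN iteration for a zero-mean Gaussian prior with covariance $\text{std}^2 I$, applied to a toy one-parameter problem whose posterior mean is known. The function `pcn_sample` and its arguments are hypothetical. This is not CUQIpy's implementation, which may differ in its details (for example, `sample_adapt` tunes the step size).

```python
import numpy as np

def pcn_sample(log_likelihood, prior_std, x0, n_samples, beta=0.2, seed=0):
    """pCN sampling of a posterior whose prior is N(0, prior_std^2 I)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    loglike_x = log_likelihood(x)
    samples = np.empty((n_samples, x.size))
    n_accepted = 0
    for k in range(n_samples):
        # The proposal mixes the current state with a fresh draw from the prior
        xi = prior_std * rng.standard_normal(x.size)
        x_prop = np.sqrt(1.0 - beta**2) * x + beta * xi
        loglike_prop = log_likelihood(x_prop)
        # The acceptance ratio involves only the likelihood, not the prior
        if np.log(rng.uniform()) < loglike_prop - loglike_x:
            x, loglike_x = x_prop, loglike_prop
            n_accepted += 1
        samples[k] = x
    return samples, n_accepted / n_samples

# Toy problem: y = x + noise, prior N(0, 1), noise std 0.5, observed y = 1.
# The exact posterior mean is 1 / (1 + 0.5^2) = 0.8.
y_obs = np.array([1.0])
noise_std = 0.5

def log_likelihood(x):
    return -0.5 * np.sum((y_obs - x) ** 2) / noise_std**2

samples, acceptance_rate = pcn_sample(
    log_likelihood, prior_std=1.0, x0=np.zeros(1), n_samples=20000, beta=0.5
)
posterior_mean = samples[2000:].mean()  # discard the first 2000 samples as burn-in
```

Because the proposal preserves the Gaussian prior, the prior density cancels in the acceptance ratio. This is what makes pCN well suited to high-dimensional discretized functions.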

## 3. Parametrizing the Bayesian parameters via a general mapping to enforce positivity <a class="anchor" id="mapped_geometries"></a> 

Here we introduce the concept of mapped geometries. In many inverse problems, the forward model input needs to be parametrized through a possibly nonlinear function. For example, in this 1D heat example, let's assume that we want to enforce positivity of the initial condition $u(x,0)$. We can use the parametrization $u(x,0) = e^{\theta(x)}$, where $\theta$ is the Bayesian parameter (the log of the initial condition).

In `CUQIpy`, this can be achieved through a `MappedGeometry` object.
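
The effect of such a parametrization can be illustrated with plain NumPy, independently of CUQIpy: samples of the unconstrained parameter $\theta$ take both signs, while the mapped initial condition $e^{\theta}$ is always positive. The variable names below are illustrative, and the prior standard deviation 1.2 is the value used in section 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 30

# Gaussian prior samples of the unconstrained parameter theta (one sample per row)
theta_samples = 1.2 * rng.standard_normal((1000, n_nodes))

# Mapped samples: the initial condition that the forward model would see
u0_samples = np.exp(theta_samples)

print("min of theta samples:", theta_samples.min())   # negative values occur
print("min of mapped samples:", u0_samples.min())     # strictly positive
```

Note that a zero-mean Gaussian prior on $\theta$ corresponds to a log-normal prior on the initial condition, with median 1 at every node.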

First let's create the mapping function:

In [None]:
map = lambda x : np.exp(x)

Then let's create a new version of the `Heat_1D` problem components in which we pass the `map` as a parameter to the `Heat_1D.get_components` method:  

In [None]:
model, data, problemInfo = Heat_1D.get_components(dim=N, endpoint=L, max_time=T, exactSolution = myExactSolution, map = map)

Then we repeat the same steps we followed in [section 2](#inverse_problem) to create the posterior distribution from the prior, the data distribution and the data.

In [None]:
# Prior
x = Gaussian(mean*np.ones(N), std, geometry= model.domain_geometry)

# Data distribution
sigma_data = np.std(problemInfo.exactData - data)*np.ones(model.range_dim) # noise level
y = Gaussian(mean=model, std=sigma_data, geometry=model.range_geometry)

# Joint distribution and posterior
joint =  JointDistribution([y, x])
posterior = joint(y=data)
posterior = posterior._reduce_to_single_density()

We create a sampler object to sample the posterior distribution (~65 seconds):

In [None]:
MySampler = pCN(posterior)
posterior_samples = MySampler.sample_adapt(20000)

And we then plot the credible interval:

In [None]:
posterior_samples.plot_ci(95, exact = problemInfo.exactSolution)

The parametrization `map = lambda x : np.exp(x)` ensures that the inferred initial condition is positive, although the Bayesian parameters themselves (the log of the initial condition) remain unconstrained. The solution itself is still not satisfactory. In the next section, we try to improve the solution by incorporating more prior knowledge about the parameters.

## 4. Parametrizing the Bayesian parameters via step function expansion <a class="anchor" id="step_function"></a> 

One way to improve the solution of this Bayesian problem is to use better prior information. Here we assume that the initial condition is a step function with three pieces. This also simplifies the Bayesian problem because we now have only three Bayesian parameters to infer.

To test this case we pass `field_type='Step'` to `Heat_1D.get_components`, which creates a `StepExpansion` domain geometry for the model when initializing the `Heat_1D` test problem.
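
Conceptually, a step expansion maps a few coefficients to a piecewise constant function on the grid. The sketch below shows one way to do this with NumPy, using a hypothetical helper `step_expansion` that splits the grid into equally sized blocks. The exact placement of the step boundaries in CUQIpy's `StepExpansion` may differ.

```python
import numpy as np

def step_expansion(coefficients, n_grid):
    """Return a piecewise constant field with one coefficient per block of nodes."""
    coefficients = np.asarray(coefficients, dtype=float)
    # Index of the block that each grid node belongs to
    block_index = np.arange(n_grid) * coefficients.size // n_grid
    return coefficients[block_index]

# Three parameters describe a field on 30 nodes
field = step_expansion([1.0, 2.0, 3.0], n_grid=30)
print(field)
```

With this parametrization the sampler explores a 3-dimensional space instead of a 30-dimensional one, and every sample is a step function by construction.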

In [None]:
N=30
n_steps = 3 # Number of steps in the StepExpansion geometry. 
#model, data, problemInfo = Heat_1D.get_components(dim=N, endpoint=L, max_time=T,field_type='Step', exactSolution = myExactSolution, n_steps=n_steps)
model, data, problemInfo = Heat_1D.get_components(dim=N, endpoint=L, max_time=T,field_type='Step', n_steps=n_steps)

Let's look at `model.domain_geometry` in this case:

In [None]:
model.domain_geometry

We then continue to create the Bayesian problem (prior, data distribution and posterior) with a prior of dimension 3.

In [None]:
# Prior
x = Gaussian(mean*np.ones(n_steps), std, geometry= model.domain_geometry)

# Data distribution
sigma_data = np.std(problemInfo.exactData - data)*np.ones(model.range_dim) # noise level
y = Gaussian(mean=model, std=sigma_data, geometry=model.range_geometry)

And the posterior:

In [None]:
joint =  JointDistribution([y, x])
posterior = joint(y=data)
posterior = posterior._reduce_to_single_density()

We then sample the posterior using pCN (~60 seconds):

In [None]:
MySampler = pCN(posterior)
posterior_samples = MySampler.sample_adapt(20000)

Let's take a look at the posterior:

In [None]:
posterior_samples.plot_ci(95, exact = problemInfo.exactSolution)
posterior_samples.shape

We show the trace plot: a plot of the kernel density estimate (left) and the chains (right) of the 3 variables:

In [None]:
posterior_samples.plot_trace()

We show the pair plot of the 2D marginal posterior distributions:

In [None]:
posterior_samples.plot_pair()

From the pair plot, we can see a clear correlation between the variables.

We compute the effective sample size (ESS) which approximately gives the number of independent samples in the chain:

In [None]:
import arviz as az
az.ess(posterior_samples.to_arviz_inferencedata())
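
To see what the ESS measures, here is a simplified NumPy estimate based on the sample autocorrelation, compared on an independent chain and a strongly correlated one. The function `effective_sample_size` is a hypothetical helper. ArviZ uses a more refined estimator, so its numbers will not match this sketch exactly.

```python
import numpy as np

def effective_sample_size(chain):
    """Rough ESS estimate: n / (1 + 2 * sum of the leading positive autocorrelations)."""
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    centered = chain - chain.mean()
    # Autocovariance at lags 0, 1, ..., n-1
    autocov = np.correlate(centered, centered, mode="full")[n - 1:] / n
    rho = autocov / autocov[0]
    tau = 1.0
    for lag in range(1, n):
        if rho[lag] <= 0:      # truncate at the first non-positive autocorrelation
            break
        tau += 2.0 * rho[lag]
    return n / tau

rng = np.random.default_rng(0)
n = 5000
iid_chain = rng.standard_normal(n)

# AR(1) chain: each sample depends strongly on the previous one
ar_chain = np.empty(n)
ar_chain[0] = 0.0
for k in range(1, n):
    ar_chain[k] = 0.9 * ar_chain[k - 1] + rng.standard_normal()

ess_iid = effective_sample_size(iid_chain)
ess_ar = effective_sample_size(ar_chain)
print(ess_iid, ess_ar)
```

The correlated chain has the same length but a much smaller ESS, so estimates computed from it are less accurate than the raw number of samples suggests.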

#### Try it yourself (optional):
* For this step function parametrization, try to enforce positivity of the posterior samples by passing `map = lambda x : np.exp(x)` to the `Heat_1D.get_components` method. Then run the pCN sampler again (similar to part 3).

In [None]:
# Your code here



## 5. Parametrizing the Bayesian parameters via KL expansion <a class="anchor" id="KL_expansion"></a> ★

Here we explore the Bayesian inversion for a more general exact solution. We parametrize the Bayesian parameters using a Karhunen–Loève (KL) expansion. This represents the inferred initial heat profile as a linear combination of sine functions:
$$ u(x,0) = \sum_i \theta_i \left(\frac{1}{i}\right)^{\text{decay}} \sin\left(\frac{i \pi x}{L}\right), $$
where $\theta_i$ are the Bayesian parameters.
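
As an illustration of such an expansion, the NumPy sketch below evaluates a weighted sine series on a grid. It assumes the basis functions $\sin(i \pi x / L)$, which vanish at both boundaries, and an arbitrary decay rate of 1.5. The helper `sine_expansion` is hypothetical, and the normalization and decay used by CUQIpy's `KLExpansion` may differ.

```python
import numpy as np

def sine_expansion(theta, x, L=1.0, decay=1.5):
    """Evaluate sum_i theta_i * (1/i)^decay * sin(i*pi*x/L) at the points x."""
    theta = np.asarray(theta, dtype=float)
    i = np.arange(1, theta.size + 1)
    weights = theta / i**decay                    # higher modes are damped
    basis = np.sin(np.outer(i, x) * np.pi / L)    # shape (n_modes, n_points)
    return weights @ basis

x = np.linspace(0.0, 1.0, 101)

# A single active coefficient reproduces the first basis function
first_mode = sine_expansion([1.0, 0.0, 0.0], x)

# A random coefficient vector gives a smooth field with zero boundary values
rng = np.random.default_rng(0)
random_field = sine_expansion(rng.standard_normal(20), x)
```

The decay factor makes high-frequency modes contribute less, so fields drawn from a Gaussian prior on the coefficients are smooth.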

Let's load the `Heat_1D` test case and pass `field_type = 'KL'`, which behind the scenes will set the domain geometry of the model to be a KL expansion geometry (`KLExpansion`):

In [None]:
N=35
model, data, problemInfo = Heat_1D.get_components(dim=N, endpoint=L, max_time=T, field_type = 'KL' )

Now we inspect the `model.domain_geometry`:

In [None]:
model.domain_geometry

And the exact solution and the data:

In [None]:
problemInfo.exactSolution.plot()
problemInfo.exactData.plot()
data.plot()
plt.legend(['exact solution', 'exact data', 'noisy data']);

Note that the exact solution here is a general signal that is not constructed from the basis functions. We define the prior $p(x)$:

In [None]:
sigma_prior = 9*np.ones(model.domain_dim) #1, 9
x = GaussianCov(mean*np.ones(N), sigma_prior, geometry= model.domain_geometry)

We define the data distribution:

In [None]:
sigma_data = np.std(problemInfo.exactData - data)*np.ones(model.range_dim) # noise level
y = Gaussian(mean=model, std=sigma_data, geometry=model.range_geometry)  # data distribution; conditioned on the data below

And the posterior distribution:

In [None]:
joint =  JointDistribution([y, x])
posterior = joint(y=data)
posterior = posterior._reduce_to_single_density()

We sample the posterior. Here we use the component-wise Metropolis-Hastings (CWMH) sampler (~90 seconds):

In [None]:

MySampler = CWMH(posterior, x0=np.ones(N))
posterior_samples = MySampler.sample_adapt(1000)

And plot the $95\%$ credible interval (you can try plotting different credible intervals, e.g. $80\%$):

In [None]:
posterior_samples.plot_ci(95, exact = problemInfo.exactSolution)

The credible interval can have zero width at some locations, where the upper and lower limits seem to intersect and switch order (the upper limit becomes the lower one and vice versa). To look into what actually happens here, we plot some samples:

In [None]:
posterior_samples_burnthin = posterior_samples.burnthin(0,10)
for i, s in enumerate(posterior_samples_burnthin):
    model.domain_geometry.plot(s)

The samples seem to paint a different picture than the credible interval plot. Note that the credible interval above is computed in the parameter space of the domain geometry and then converted to the function space for plotting. Alternatively, we can convert the samples to function values first, and then compute and plot the credible interval.

Convert samples to function values:

In [None]:
funvals_samples = posterior_samples.funvals

Then plot the credible interval computed from the function values:

In [None]:
funvals_samples.plot_ci(95, exact = problemInfo.exactSolution)

We can see that the credible interval now reflects what the sample plot shows and no longer has locations where the upper and lower bounds intersect.
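
The difference between the two ways of computing the interval can be reproduced in a small NumPy experiment with two sine basis functions. All names and numbers below are illustrative assumptions, not CUQIpy internals.

```python
import numpy as np

rng = np.random.default_rng(0)
x_grid = np.linspace(0.0, 1.0, 101)
basis = np.vstack([np.sin(np.pi * x_grid), np.sin(2 * np.pi * x_grid)])  # (2, 101)

# Coefficient samples: the second coefficient is much more uncertain than the first
coeff_samples = rng.standard_normal((2000, 2)) * np.array([0.1, 1.0])

# (a) Percentiles in parameter space, then map the two bounds to functions
lower_coeff, upper_coeff = np.percentile(coeff_samples, [2.5, 97.5], axis=0)
lower_param, upper_param = lower_coeff @ basis, upper_coeff @ basis

# (b) Map every sample to a function first, then take pointwise percentiles
function_samples = coeff_samples @ basis                                  # (2000, 101)
lower_fun, upper_fun = np.percentile(function_samples, [2.5, 97.5], axis=0)

print("bounds cross in (a):", np.any(lower_param > upper_param))
print("bounds cross in (b):", np.any(lower_fun > upper_fun))
```

In (a) the bounds switch order wherever a basis function with a wide coefficient interval is negative, because multiplying by a negative value reverses the ordering of the two percentiles. In (b) the percentiles are taken pointwise on function values, so the lower bound can never exceed the upper bound.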

Let's look at the effective sample size (ESS):

In [None]:
az.ess(posterior_samples.to_arviz_inferencedata())

We note that the ESS varies considerably among the variables. We can view the trace plot for, let's say, the first and the second variables:

In [None]:
posterior_samples.plot_trace([0,1])

A third way of looking at the credible intervals is to look at the credible intervals of the expansion coefficients $\theta_i$. We plot these for both the prior and the posterior samples by passing the flag `plot_par=True` to the `plot_ci` function:

The prior:

In [None]:
plt.figure()
x.sample(1000).plot_ci(95, plot_par=True)
plt.xticks(np.arange(x.dim)[::5]);

The posterior:

In [None]:
posterior_samples.plot_ci(95, plot_par=True)
plt.xticks(np.arange(x.dim)[::5]);

## 6. Observe on part of the domain <a class="anchor" id="Partial_Observation"></a> ★

## 7. Elaboration: the PDEmodel class <a class="anchor" id="PDE_model_elaborate"></a> ★

Let's explore the model for PDE problems.

#### Try it yourself (optional):

* View: `model`, `model.pde`, `model.pde.PDE_form`

We can, for example, create our own PDE model for a simple Poisson equation with zero boundary conditions. We first create the difference operator using the CUQIpy operator `FirstOrderFiniteDifference`.

In [None]:
n_poisson = 1000 #Number of nodes
L = 1 # Length of the domain
dx = L/(n_poisson-1) # grid spacing
diff_operator = FirstOrderFiniteDifference(n_poisson,bc_type='zero').get_matrix().todense()/dx

We then construct the source term (point source):

In [None]:
source_term = np.zeros(n_poisson)
source_term[int(n_poisson/2)] = 1/dx 

We create the PDE form, which consists of the differential operator and the right-hand side and is a function of the Bayesian parameter x.

In [None]:
poisson_form = lambda x: (diff_operator.T@diff_operator, x* source_term)

We create the CUQIpy PDE model, in this case a `SteadyStateLinearPDE` model.

In [None]:
CUQI_pde = SteadyStateLinearPDE(poisson_form)

The model `CUQI_pde` has three main methods:

1. `assemble`, which assembles the differential operator and the right-hand side given the Bayesian parameter x.
2. `solve`, which solves the PDE.
3. `observe`, which for now returns the solution of the PDE, but is to be generalized to apply observation operators to the PDE solution (e.g. extracting the final temperature at specific or random points).
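
The assemble/solve/observe pattern described above can be mimicked in a few lines of plain NumPy. The class `TinySteadyStatePDE` below is a hypothetical stand-in written for illustration only. It is not CUQIpy's `SteadyStateLinearPDE`, and its method signatures and return values differ (for example, `solve` here returns only the solution). The difference matrix is also built by hand rather than with `FirstOrderFiniteDifference`.

```python
import numpy as np

class TinySteadyStatePDE:
    """Illustrative stand-in for a steady-state linear PDE wrapper."""

    def __init__(self, pde_form):
        # pde_form maps a parameter to (differential operator, right-hand side)
        self.pde_form = pde_form

    def assemble(self, parameter):
        self.operator, self.rhs = self.pde_form(parameter)

    def solve(self):
        return np.linalg.solve(self.operator, self.rhs)

    def observe(self, solution):
        # Observe the full solution; a real model could subsample it here
        return solution

n = 50                      # number of interior nodes
dx = 1.0 / (n + 1)
# Difference matrix of shape (n+1, n) with zero boundary values built in
D = (np.eye(n + 1, n) - np.eye(n + 1, n, k=-1)) / dx
source = np.zeros(n)
source[n // 2] = 1.0 / dx   # point source in the middle of the domain

pde = TinySteadyStatePDE(lambda x: (D.T @ D, x * source))

pde.assemble(5)
solution_5 = pde.observe(pde.solve())
pde.assemble(10)
solution_10 = pde.observe(pde.solve())
```

Since the parameter only scales the right-hand side of a linear problem, doubling it doubles the solution, and the solution peaks at the location of the point source.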

In the following we assemble and solve this Poisson problem.

In [None]:
CUQI_pde.assemble(5)
sol, info = CUQI_pde.solve()

And plot the solution:

In [None]:
plt.plot(np.linspace(dx,L,n_poisson,endpoint=False),sol)

#### Try it yourself (optional):

* Double the magnitude of the source term by editing the line `CUQI_pde.assemble(5)` above. Look at the solution.

In [None]:
# Your code here

