
<div hidden>

$\gdef\dd{\mathrm{d}}$

</div>

<div hidden>

$\gdef\abs#1{\left\vert#1\right\vert}$

</div>

<div hidden>

$\gdef\ve#1{\bm{#1}}$
</div>

<div hidden>

$\gdef\mat#1{\mathbf{#1}}$
</div>


# Sampling with CUQIpy

In this notebook, we explore uncertainty quantification for inverse problems through Bayesian inference using CUQIpy. We focus on the sampling capabilities of CUQIpy for two target distributions:
- a bi-variate "donut" distribution,
- and a posterior distribution for a 1D Poisson-based Bayesian inverse problem (BIP).

The former is used for illustrative purposes and is not associated with an inverse problem, while the latter is a more realistic example of a BIP.

## <font color=#CD853F> Contents of this notebook: </font>

## <font color=#CD853F> Learning objectives: </font> <a name="r-learning-objectives"></a>


The sections marked with ★ are optional and can be skipped if you are short on time.

## References
[1] Gelman, Andrew, et al. "Bayesian workflow." arXiv preprint arXiv:2011.01808 (2020) https://arxiv.org/abs/2011.01808.

[2] Riis, Nicolai AB, et al. "CUQIpy--Part I: computational uncertainty quantification for inverse problems in Python." arXiv preprint arXiv:2305.16949 (2023) https://arxiv.org/abs/2305.16949.


In [None]:
from cuqi.distribution import DistributionGallery, Gaussian, JointDistribution
from cuqi.testproblem import Poisson1D
from cuqi.problem import BayesianProblem
import inspect
import numpy as np
import matplotlib.pyplot as plt
from cuqi.sampler import MH, CWMH
import time
import scipy.stats as sps


In [None]:

def plot2d(val, x1_min, x1_max, x2_min, x2_max, N2=201):
    # plot
    pixelwidth_x = (x1_max-x1_min)/(N2-1)
    pixelwidth_y = (x2_max-x2_min)/(N2-1)

    hp_x = 0.5*pixelwidth_x
    hp_y = 0.5*pixelwidth_y

    extent = (x1_min-hp_x, x1_max+hp_x, x2_min-hp_y, x2_max+hp_y)

    plt.imshow(val, origin='lower', extent=extent)
    plt.colorbar()


def plot_pdf_2D(distb, x1_min, x1_max, x2_min, x2_max, N2=201):
    # Evaluate the pdf of distb on an N2-by-N2 grid and plot it
    ls1 = np.linspace(x1_min, x1_max, N2)
    ls2 = np.linspace(x2_min, x2_max, N2)
    grid1, grid2 = np.meshgrid(ls1, ls2)
    distb_pdf = np.zeros((N2,N2))
    for ii in range(N2):
        for jj in range(N2):
            distb_pdf[ii,jj] = np.exp(distb.logd(np.array([grid1[ii,jj], grid2[ii,jj]])))
    plot2d(distb_pdf, x1_min, x1_max, x2_min, x2_max, N2)

## <font color=#CD853F> The "donut" distribution </font> <a name="r-donut"></a>

In CUQIpy, we provide a set of bi-variate distributions for illustrative purposes. One of these is the "donut" distribution, a bi-variate distribution whose density is concentrated on a ring. Up to an additive constant, its log-density is:

$$
\begin{aligned}
\log p(\mathbf{x}) = - \frac{1}{\sigma_\text{donut}^2} \left( \left\| \mathbf{x} \right\| - r_\text{donut} \right)^2 + \text{const},
\end{aligned}
$$

where $\mathbf{x} = (x_1, x_2)$ is a 2D vector, $\left\| \mathbf{x} \right\|$ is the Euclidean norm of $\mathbf{x}$, $r_\text{donut}$ is the radius of the donut, and $\sigma_\text{donut}$ is a scalar value that controls the width of the "donut".
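
As a rough illustration of the log-density above, it can be evaluated in a few lines of plain Python. The function name `donut_logd` and the default values of `radius` and `sigma` are assumptions chosen for illustration; they are not necessarily the values used by `DistributionGallery("donut")`.

```python
import math

def donut_logd(x1, x2, radius=3.0, sigma=0.5):
    """Unnormalized log-density of the donut: -(||x|| - r)^2 / sigma^2.

    `radius` and `sigma` are hypothetical defaults chosen for illustration.
    """
    norm_x = math.hypot(x1, x2)  # Euclidean norm of (x1, x2)
    return -((norm_x - radius) ** 2) / sigma**2

# The log-density is largest on the circle of radius r and decays away from it.
on_ring = donut_logd(3.0, 0.0)
at_origin = donut_logd(0.0, 0.0)
print(on_ring, at_origin)
```

Because the density depends on $\mathbf{x}$ only through $\left\| \mathbf{x} \right\|$, it is rotationally symmetric, which is what produces the ring shape.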

To load the "donut" distribution, we use the following:

In [None]:

target_donut = DistributionGallery("donut")

print(target_donut)

We can plot the distribution probability density function (pdf):

In [None]:
plot_pdf_2D(target_donut, -4, 4, -4, 4)


## <font color=#CD853F> A 1D Poisson-based BIP </font> <a name="r-donut"></a>

##### <font color=#8B4513> The forward model </font> <a name="r-forward-model"></a>

Consider a heat conductive rod of length $L = \pi$ with a varying conductivity (the conductivity of the rod changes from point to point). We fix the temperature at the end-points of the rod and apply a heat source distributed along the length of the rod. We wait until the rod reaches an equilibrium temperature distribution. The equilibrium temperature of the rod is modelled using the second order steady-state PDE as

$$
\left\{
\begin{aligned}
& \dfrac{\dd}{\dd \xi}\left(u(\xi) \dfrac{\dd y(\xi)}{\dd \xi}\right) = -f(\xi), \quad & \xi\in (0,L) \\
& y(0) = y(L) = 0.
\end{aligned}
\right.
$$
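Here, $y$ represents the temperature distribution along the rod, $u(\xi)$ is the unknown conductivity of the rod and $f(\xi)$ is a deterministic heat source. In this notebook, the source consists of four Gaussian-shaped spikes:

$$
\begin{aligned}
	f(\xi) = w_s \sum_{i=1}^{4} \frac{1}{\sigma_s\sqrt{2\pi}} \exp\left( -\frac{ (\xi - \xi_i)^2} {2\sigma_s^2} \right),
\end{aligned}
$$

with spike locations $\xi_i \in \{0.2L, 0.4L, 0.6L, 0.8L\}$, weight $w_s = 0.8$ and width $\sigma_s = 0.05$.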

To ensure that the conductivity of the rod is positive, we parameterize $u$ by a random variable $x$ as follows:
 
$$
 \begin{aligned}
 u( \cdot  ) = \exp( x( \cdot  ) )
 \end{aligned}
$$
where $x$ is not necessarily positive.
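
To make the forward map concrete, the following is a minimal finite-difference sketch of the boundary value problem above, using the standard library only. It is not the discretization used by CUQIpy's `Poisson1D`; the function name `solve_poisson_1d`, its arguments and the evaluation of the conductivity at cell midpoints are assumptions made for illustration.

```python
import math

def solve_poisson_1d(log_cond, source, length=math.pi, n_cells=100):
    """Solve d/dxi( exp(x) dy/dxi ) = -f on (0, length) with y(0) = y(length) = 0.

    log_cond(xi) returns x(xi), the log-conductivity; source(xi) returns f(xi).
    Returns the temperature at the n_cells + 1 equally spaced grid nodes.
    """
    h = length / n_cells
    # Conductivity u = exp(x) evaluated at the cell midpoints.
    u_mid = [math.exp(log_cond((i + 0.5) * h)) for i in range(n_cells)]
    m = n_cells - 1  # number of interior nodes
    # Tridiagonal system for the interior nodes (node k = i + 1).
    lower = [-u_mid[i] / h**2 for i in range(m)]
    diag = [(u_mid[i] + u_mid[i + 1]) / h**2 for i in range(m)]
    upper = [-u_mid[i + 1] / h**2 for i in range(m)]
    rhs = [source((i + 1) * h) for i in range(m)]
    # Thomas algorithm: forward elimination ...
    for i in range(1, m):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    # ... followed by back substitution.
    y = [0.0] * m
    y[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        y[i] = (rhs[i] - upper[i] * y[i + 1]) / diag[i]
    return [0.0] + y + [0.0]  # add the boundary values

# Constant conductivity u = 1 and constant source f = 1.
temperature = solve_poisson_1d(lambda xi: 0.0, lambda xi: 1.0, n_cells=50)
print(max(temperature))
```

For $u = 1$ and $f = 1$ the exact solution is $y(\xi) = \xi(L - \xi)/2$, which gives a quick sanity check of the sketch.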

Let us load the forward model that maps the random variable $x$ to the temperature distribution $y$ in CUQIpy. We will use the following parameters:
* `dim` : number of discretization points for the rod
* `L` : length of the rod
* `f` : a function that represents the heat source

In [None]:
dim = 201
L = np.pi

The source term consists of spikes at four locations `xs`, each with weight `ws` and width `sigma_s`:

In [None]:
xs = np.array([0.2, 0.4, 0.6, 0.8])*L
ws = 0.8
sigma_s = 0.05
def f(t):
    # Sum of four Gaussian spikes centered at xs, evaluated on the grid t
    s = np.zeros(np.shape(t))
    for i in range(4):
        s += ws * sps.norm.pdf(t, loc=xs[i], scale=sigma_s)
    return s

Let us plot the source term for visualization:

In [None]:
temp_grid = np.linspace(0, L, dim-1)
plt.plot(temp_grid, f(temp_grid))

Then we can load the 1D Poisson forward model as follows:

In [None]:
A, _, _ = Poisson1D(dim=dim, 
                    endpoint=L,
                    field_type='KL',
                    field_params={'num_modes': 10} ,
                    map=lambda x: np.exp(x), 
                    source=f).get_components()

We print the forward model to see its details.

In [None]:
A

Let us look at the `pde` property of the forward model:

In [None]:
A.pde

We can look at the domain and range geometries of the forward model.

In [None]:
print(A.domain_geometry)
print(A.range_geometry)

And inspect the domain geometry further. Let us look at the mapping in the `MappedGeometry` object.

In [None]:
print(inspect.getsource(A.domain_geometry.map))

We can extract the underlying geometry of the `MappedGeometry` object.

In [None]:
underlying_geometry = A.domain_geometry.geometry
print(underlying_geometry)

The underlying geometry represents a Karhunen–Loève (KL) expansion of a random field. Let us look at some of the properties of this `underlying_geometry` such as the number of modes in the KL expansion and the grid on which the KL expansion basis functions are defined.
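
To illustrate the idea of a KL-type parameterization, here is a small standard-library sketch that maps a coefficient vector to a field and then to a positive conductivity. The sine basis, the algebraic decay of the mode weights, and the names `kl_field`, `conductivity` and `decay_rate` are assumptions made for illustration; the basis and normalization used by CUQIpy's geometry may differ.

```python
import math

def kl_field(coeffs, grid, length, decay_rate=1.5):
    """Evaluate sum_k coeffs[k-1] * k**(-decay_rate) * sin(k*pi*xi/length), k = 1, 2, ...

    The sine basis and the decay of the weights are illustrative assumptions.
    """
    values = []
    for xi in grid:
        total = 0.0
        for k, c in enumerate(coeffs, start=1):
            total += c * k ** (-decay_rate) * math.sin(k * math.pi * xi / length)
        values.append(total)
    return values

def conductivity(coeffs, grid, length):
    """Map the field x to a positive conductivity u = exp(x)."""
    return [math.exp(v) for v in kl_field(coeffs, grid, length)]

grid = [i * math.pi / 10 for i in range(11)]
print(conductivity([1.0, -0.5, 0.25], grid, math.pi))
```

A handful of coefficients thus describes a whole function on the grid, which is why the sampler only has to explore a low-dimensional space.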

In [None]:
print(A.domain_geometry.geometry.num_modes)
print(A.domain_geometry.geometry.grid)

The range geometry is of type `Continuous1D` which represents a 1D continuous signal/field defined on a grid. We can view the grid:

In [None]:
print(A.range_geometry.grid)

Additionally, the properties `domain_dim` and `range_dim` of the forward model represent the dimension of the input and output of the forward model, respectively.

In [None]:
print(A.domain_dim)
print(A.range_dim)

##### <font color=#8B4513> The BIP: the prior </font> <a name="r-forward-model"></a>

We now build a posterior distribution based on this forward model. The unknown $\mathbf{x}$ represents the coefficients in the KL expansion. We assume that the prior distribution of $\mathbf{x}$ is an i.i.d. Gaussian distribution with mean $0$ and variance $\sigma_x^2$.

In [None]:
sigma_x = 30
x = Gaussian(0, sigma_x**2, geometry=A.domain_geometry)


Let us assume that the true solution we want to infer is a sample from `x`. Note: we fix the random seed for reproducibility.

In [None]:
np.random.seed(12)
x_true = x.sample()

##### <font color=#8B4513> Exercise </font> <a name="r-forward-model"></a>
- Visualize `x_true` in the KL coefficient space. Hint: try `x_true.plot(plot_par=True)`.
- Visualize `x_true` in the corresponding function space (after forming the linear combination of the KL basis functions weighted by the KL coefficients and then applying the exponential mapping). Hint: all this can be achieved in one line with `x_true.plot(plot_par=False)`.

In [None]:
# your code here

##### <font color=#8B4513> The BIP: the likelihood </font> <a name="r-forward-model"></a>

We assume the data we obtain is a noisy measurement of the temperature $y$ over the interval $[0, L]$ in all grid points. The measurements form a vector $\mathbf{y}$. The noise is assumed to be additive Gaussian noise with mean $0$ and variance $\sigma_y^2$.

$$
\mathbf{y} = \mathbf{A}(\mathbf{x}) + \boldsymbol{\epsilon}, \quad \text{where} \quad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma_y^2 \mathbf{I}).
$$
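
Under this noise model, the log-likelihood of a candidate $\mathbf{x}$ depends only on the misfit between the observed data and the model prediction $\mathbf{A}(\mathbf{x})$. A minimal sketch in plain Python, where the function name `gaussian_loglike` and its arguments are names chosen here for illustration:

```python
import math

def gaussian_loglike(y_obs, y_pred, sigma_y):
    """log p(y_obs | x) for y = A(x) + eps with eps ~ N(0, sigma_y^2 I).

    y_pred is the model prediction A(x) for the candidate x.
    """
    m = len(y_obs)
    misfit = sum((yo - yp) ** 2 for yo, yp in zip(y_obs, y_pred))
    return -0.5 * m * math.log(2.0 * math.pi * sigma_y**2) - 0.5 * misfit / sigma_y**2

# A prediction that matches the data has a higher log-likelihood.
print(gaussian_loglike([1.0, 2.0], [1.0, 2.0], 0.1))
print(gaussian_loglike([1.0, 2.0], [1.1, 1.9], 0.1))
```

A small $\sigma_y$ penalizes misfit heavily, so the posterior concentrates on parameters that reproduce the data closely.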

We define the data distribution $p(\mathbf{y} | \mathbf{x})$ in `CUQIpy` in this case as

In [None]:
sigma_y = np.sqrt(0.001)
y = Gaussian(A(x), sigma_y**2, geometry=A.range_geometry)

We create synthetic data to test solving our BIP. We denote this data by $\mathbf{y}_{\text{obs}}$; it is a particular realization of the observed data in a setup where the KL coefficients are `x_true`. To create this data in `CUQIpy`, we use the following:

In [None]:
y_obs = y(x=x_true).sample()

Let us plot, in the same figure, the true conductivity field corresponding to `x_true`, the noise-free data `y_true`, i.e. `A(x_true)`, and the noisy data `y_obs`.

In [None]:
y_obs.plot(label='y_obs')
A(x_true).plot(label='y_true')
x_true.plot(label='x_true')
plt.legend()

##### <font color=#8B4513> The BIP: the posterior distribution  (the high level approach: using the BayesianProblem class)</font> 

The posterior distribution of the Bayesian inverse problem in this case is given by

$$
\begin{align*}
p(\mathbf{x} \mid \mathbf{y}=\mathbf{y}_\mathrm{obs}) \propto L(\mathbf{x} \mid \mathbf{y}=\mathbf{y}_\mathrm{obs})p(\mathbf{x}),
\end{align*}
$$

where we use the notation $L(\mathbf{x} \mid \mathbf{y}=\mathbf{y}_\mathrm{obs}) := p(\mathbf{y}=\mathbf{y}_\mathrm{obs} \mid \mathbf{x})$ for the likelihood function to emphasize that, in the context of the posterior where $\mathbf{y}$ is fixed to $\mathbf{y}_\mathrm{obs}$, it is a function of $\mathbf{x}$ and not of $\mathbf{y}$. In CUQIpy we sometimes use the short-hand printing style `L(x|y)` for brevity.



The simplest way to sample a Bayesian inverse problem in CUQIpy is to use the [BayesianProblem class](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.problem/cuqi.problem.BayesianProblem.html#cuqi.problem.BayesianProblem).

Using the BayesianProblem class, one can easily define and sample from the posterior distribution of a Bayesian inverse problem by providing the distributions for the parameters and data and subsequently setting the observed data.

In [None]:
BP_poisson = BayesianProblem(x, y)      # Create Bayesian problem
BP_poisson.set_data(y=y_obs)           # Provide observed data

In the above example, we provided our assumptions about the data generating process by defining the distributions for the parameters and data and provided the observed data for the problem. `CUQIpy` internally created the posterior distribution using the provided distributions and data. 

We can use this object to sample from the posterior distribution using the `UQ` method, which we will experiment with in this exercise:



##### <font color=#8B4513> Exercise </font> <a name="r-forward-model"></a>
Use the `UQ` method of the `BP_poisson` object to sample the posterior distribution. The `UQ` method returns a `Samples` object; store the result in a variable called `BP_poisson_samples`.

In [None]:
# your code here

In the previous exercise we saw that `CUQIpy` automatically selected a sampler, preconditioned Crank-Nicolson (`pCN`) in this case, and sampled the posterior distribution. Additionally, the credible interval for the parameter $\mathbf{x}$ and the posterior mean were plotted and compared to the ground truth (`x_true`).
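
For intuition about what a pCN sampler does, here is a minimal standard-library sketch for a posterior whose prior is $\mathcal{N}(\mathbf{0}, \sigma_x^2 \mathbf{I})$. It is a simplified illustration, not CUQIpy's `pCN` implementation; the function name `pcn_sample` and the parameter names `beta` and `prior_std` are chosen here.

```python
import math
import random

def pcn_sample(loglike, dim, prior_std, n_samples, beta=0.3, seed=0):
    """Preconditioned Crank-Nicolson sampler for a prior N(0, prior_std^2 I).

    Proposal: x' = sqrt(1 - beta^2) * x + beta * xi with xi ~ N(0, prior_std^2 I).
    The proposal preserves the prior, so the acceptance ratio only involves
    the likelihood.
    """
    rng = random.Random(seed)
    x = [0.0] * dim
    ll = loglike(x)
    chain, n_accept = [], 0
    for _ in range(n_samples):
        proposal = [math.sqrt(1.0 - beta**2) * xi + beta * rng.gauss(0.0, prior_std)
                    for xi in x]
        ll_prop = loglike(proposal)
        # Accept with probability min(1, L(proposal) / L(x)).
        if math.log(1.0 - rng.random()) < ll_prop - ll:
            x, ll = proposal, ll_prop
            n_accept += 1
        chain.append(list(x))
    return chain, n_accept / n_samples

# Toy problem: prior N(0, 1), one observation y_obs = 1 with unit noise variance.
chain, rate = pcn_sample(lambda x: -0.5 * (1.0 - x[0]) ** 2, dim=1,
                         prior_std=1.0, n_samples=5000, beta=0.5)
print("acceptance rate:", rate)
```

For this toy problem the posterior is Gaussian with mean 0.5 and variance 0.5, so the sample mean and variance of the chain can be checked against known values.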

**Note about visualizing the credible interval**:
With the `UQ` method, the credible interval is computed for the KL coefficients and then mapped to the function space and plotted. We can also compute the credible interval directly on the function values. We will revisit this at a later stage.


In the next section, we show how to define the posterior distribution more explicitly. 

##### <font color=#8B4513> The BIP: the posterior distribution  (the low level approach: using the JointDistribution)</font> 


To define the posterior distribution explicitly in CUQIpy, we first define the joint distribution $p(\mathbf{y},\mathbf{x})$, then we supply the observed data to create the conditional distribution $p(\mathbf{x} \mid \mathbf{y}=\mathbf{y}_\mathrm{obs})$.


Let us first define the joint distribution $p(\mathbf{y},\mathbf{x})$ in CUQIpy. We use the following:

In [None]:
# Define joint distribution p(y,x)
joint = JointDistribution(y, x)

Calling `print` on the joint distribution gives a nice overview matching the mathematical description of the joint distribution.

In [None]:
print(joint)

CUQIpy can automatically derive the posterior distribution for any joint distribution when we pass the observed data as an argument to the "call" (condition) method of the joint distribution.  This is done as follows:

In [None]:
posterior = joint(y=y_obs) # Condition p(x,y) on y=y_obs. Applies Bayes' rule automatically


We can now inspect the posterior distribution by calling `print` on it. Notice that the posterior equation matches the mathematical expression we showed above.

In [None]:
print(posterior)


##### <font color=#8B4513> Exercise </font> <a name="r-forward-model"></a>
- The posterior is essentially just another CUQIpy distribution. Have a look at the [Posterior class](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.distribution/cuqi.distribution.Posterior.html) in the online documentation to see what attributes and methods are available.

- Try evaluating the posterior log probability density function (logpdf) and pdf at some points, say `x_true` and `x_true*1.1`.


In [None]:
# Your code here

# 1. Defining and sampling a Bayesian Inverse Problem (high-level)<a class="anchor" id="BIP"></a>

Solving a Bayesian inverse problem amounts to characterizing the posterior distribution.

The posterior describes the probability distribution of the parameters we are interested in. This is achieved by combining prior knowledge of the parameters and observed data. In its most general form, the posterior is given by Bayes' theorem:

\begin{align*}
p(\boldsymbol{\theta} \mid \mathbf{y}) = \frac{p(\mathbf{y} \mid \boldsymbol{\theta})p(\boldsymbol{\theta})}{p(\mathbf{y})} \propto p(\mathbf{y} \mid \boldsymbol{\theta})p(\boldsymbol{\theta}),
\end{align*}

where $\boldsymbol{\theta}$ is the parameter vector of *all* the parameters we are interested in inferring and $\mathbf{y}$ is the observable data. In the simplest case one could have a single parameter vector of interest, say $\boldsymbol{\theta}=[\mathbf{x}]$.

The probability density function $p(\boldsymbol{\theta})$ is the prior distribution of the parameters.

Given fixed observed data $\mathbf{y}_\mathrm{data}$, the term $p(\mathbf{y} \mid \boldsymbol{\theta})$ considered as a function of $\boldsymbol{\theta}$ is known as the
likelihood function or just *likelihood*, also denoted $L(\boldsymbol{\theta} \mid \mathbf{y} = \mathbf{y}_\mathrm{data})$. Note that the likelihood is not a probability density in $\boldsymbol{\theta}$: it need not integrate to one over $\boldsymbol{\theta}$.

When $\mathbf{y}$ is not fixed, $p(\mathbf{y} \mid \boldsymbol{\theta})$ is a probability density function of the data $\mathbf{y}$ given the value of the parameters $\boldsymbol{\theta}$. In CUQIpy we refer to this distribution as the *data distribution*.

The denominator $p(\mathbf{y})$ is the *evidence* and is a normalization constant (that we typically ignore because it does not affect the MCMC sampling) that ensures that the posterior integrates to 1.
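
The role of each term in Bayes' theorem can be checked numerically for a scalar toy problem, $y = \theta + e$ with $e \sim \mathcal{N}(0, \sigma_y^2)$ and $\theta \sim \mathcal{N}(0, \sigma_\theta^2)$, by evaluating the posterior on a grid. This is a sketch for illustration only; the function `grid_posterior` and its arguments are hypothetical names, and a grid is only feasible in very low dimensions.

```python
import math

def normal_pdf(z, mean, std):
    """Density of N(mean, std^2) evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def grid_posterior(y_obs, sigma_y, prior_std, n_grid=2001, half_width=10.0):
    """Posterior of theta on a uniform grid over [-half_width, half_width]."""
    step = 2.0 * half_width / (n_grid - 1)
    thetas = [-half_width + i * step for i in range(n_grid)]
    # Likelihood times prior: the unnormalized posterior.
    unnormalized = [normal_pdf(y_obs, t, sigma_y) * normal_pdf(t, 0.0, prior_std)
                    for t in thetas]
    evidence = step * sum(unnormalized)  # Riemann-sum approximation of p(y)
    posterior = [u / evidence for u in unnormalized]
    return thetas, posterior, evidence

thetas, posterior, evidence = grid_posterior(y_obs=1.0, sigma_y=1.0, prior_std=1.0)
step = thetas[1] - thetas[0]
print("evidence:", evidence)
print("posterior mean:", step * sum(t * p for t, p in zip(thetas, posterior)))
```

Dividing by the evidence only rescales the curve, which is why MCMC samplers can work with the unnormalized product of likelihood and prior.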


### Note on Bayesian inverse problems with CUQIpy

CUQIpy uses a general approach to Bayesian modeling that aims to define not only the posterior distribution but also the joint distribution of all the parameters. This more general approach is useful because it allows one to carry out more tasks related to uncertainty quantification of inverse problems such as prior-predictive checks, model checking, posterior-predictive checks and more. For more details on some of these topics see the overview in [1].

In this notebook, we initially focus on how to define and sample a Bayesian inverse problem using the high-level interface in CUQIpy. We then later show a more low-level approach to defining the posterior distribution, which is useful for users who want to have more control over the type of sampler used for sampling the posterior distribution.

## 1.1 Deterministic forward model and observed data
Consider a Bayesian inverse problem
$$
\mathbf{y}=\mathbf{A}\mathbf{x} + \mathbf{e},
$$

where $\mathbf{A}: \mathbb{R}^n \to \mathbb{R}^m$ is the (deterministic) forward model of the inverse problem and $\mathbf{y}$ and $\mathbf{x}$ are random variables representing the observed data and parameter of interest respectively. Here $\mathbf{e}$ is a random variable representing additive noise in the data.

For this example let us consider the `Deconvolution1D` testproblem and extract a CUQIpy forward model and some synthetic data denoted $\mathbf{y}_\mathrm{data}$ (a realization of $\mathbf{y}$). 

Note that this is a linear inverse problem, but the same approach can be used for nonlinear inverse problems.

In [None]:
# Load forward model, data and problem information
from cuqi.testproblem import Deconvolution1D
A, y_data, probInfo = Deconvolution1D(phantom="sinc").get_components()

# For convenience, we define the dimension of the domain of A
n = A.domain_dim

# For convenience, we extract the exact solution as x_exact
x_exact = probInfo.exactSolution


Before going further let us briefly visualize the data and compare with the exact solution to the problem.

Here we should expect to see that the data is a convolved version of the exact solution with some added noise. We can also inspect the `probInfo` variable to get further information about the problem.

In [None]:
# Plot the data
plt.subplot(121); x_exact.plot(); plt.title('Exact Solution')
plt.subplot(122); y_data.plot(); plt.title('Data')

# Print information about the problem
print(probInfo)


### Note on notation

It is common (for convenience in terms of notation) not to explicitly write the dependence of each random variable when specifying a complete Bayesian problem. For example, for the case above one would often write
\begin{align*}
\mathbf{x} &\sim \mathrm{GMRF}(\mathbf{0}, d)\\
\mathbf{y} &\sim \mathrm{Gaussian}(\mathbf{A}\mathbf{x}, s_\mathbf{y}^2 I),
\end{align*}

where the dependence of $\mathbf{y}$ on $\mathbf{x}$ is implicit.

This compact notation completely specifies the Bayesian problem for the so-called *data generating process*, making clear all the assumptions about the data and parameters.

In CUQIpy - when all deterministic parameters and forward models are defined - the Bayesian problem is written in code using almost exactly the same syntax as the mathematical notation above.

In [None]:
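# Bayesian problem (repeated in case previous cells were modified)
from cuqi.distribution import GMRF
s_y = 0.01  # noise standard deviation (assumed here; matches the 0.01**2 used in later cells)
x = GMRF(np.zeros(n), 50)
y = Gaussian(A@x, s_y**2)
posterior = JointDistribution(y, x)(y=y_data)  # posterior of x given the observed data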


# 2.3 Sampling the posterior

Now that we have defined the posterior distribution for our parameter of interest $\mathbf{x}$ given $\mathbf{y}_\mathrm{data}$, we can characterize the parameter and its uncertainty by samples from the posterior distribution. However, in general the posterior is not a simple distribution that we can easily sample from. Instead, we need to rely on Markov Chain Monte Carlo (MCMC) methods to sample from the posterior.

In CUQIpy, a number of MCMC samplers are provided in the sampler module that can be used to sample probability distributions. All samplers have the same signature, namely `Sampler(target, ...)`, where `target` is the target CUQIpy distribution and `...` indicates any (optional) arguments.

In the case of the posterior above, which is defined from a linear model and Gaussian likelihood and prior, the Linear Randomize-then-Optimize [(LinearRTO)](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.sampler/cuqi.sampler.LinearRTO.html#cuqi.sampler.LinearRTO) sampler is a good choice to efficiently generate samples. This is also the sampler chosen by the BayesianProblem class for this problem.

 Like any of the other samplers, we set up the sampler by simply providing the target distribution.

In [None]:
from cuqi.sampler import LinearRTO
sampler = LinearRTO(posterior)


After the sampler is defined we can compute samples via the `sample` method.

In [None]:
samples = sampler.sample(500)


Similar to directly sampling distributions in CUQIpy, the returned object is a `cuqi.samples.Samples` object.

This object has a number of methods available. In this case, we are interested in evaluating if the sampling went well. To do this we can have a look at the chain history for 2 different parameters.

In [None]:
samples.plot_chain([30, 45]);


In both cases the chains look very good with no discernible difference between the start and end of the chain. This is a good indication that the chain has converged and there is little need for removing samples that are part of a "burn-in" period. In practice, the samples should be inspected with more rigor to ensure that the MCMC chain has converged, but this is outside the scope of this notebook.

The good sampling is in large part due to the LinearRTO sampler, which is built specifically for the type of problem of this example. For the sake of presentation let us remove the first 100 samples using the `burnthin` method (see [samples.burnthin](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.samples/cuqi.samples.Samples.burnthin.html#cuqi.samples.Samples.burnthin)) and store the "burnthinned" samples in a new variable.
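
Conceptually, removing a burn-in and thinning a chain amounts to slicing it. The sketch below operates on a plain Python list and only illustrates the idea; it assumes, rather than reproduces, the behavior of the `burnthin` method, and the helper name and argument names are chosen here.

```python
def burnthin(chain, n_burn, n_thin=1):
    """Drop the first n_burn samples, then keep every n_thin-th sample."""
    return chain[n_burn::n_thin]

chain = list(range(10))          # stand-in for 10 consecutive samples
print(burnthin(chain, 4))        # [4, 5, 6, 7, 8, 9]
print(burnthin(chain, 4, 2))     # [4, 6, 8]
```

Removing 100 of 500 samples in this way leaves 400 samples for the subsequent analysis.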

In [None]:
samples_final = samples.burnthin(Nb=100)


Finally, we can plot a credibility interval of the samples and compare to the exact solution (from `probInfo`).

This is what the `UQ` method of the BayesianProblem class did under the hood for us earlier.
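
A pointwise credible interval can be computed from the samples by taking percentiles for each parameter separately. The following is a minimal standard-library sketch of an equal-tailed interval; the helper names `percentile` and `credible_interval` are hypothetical, and whether `plot_ci` uses exactly this interpolation rule is an assumption.

```python
def percentile(sorted_vals, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac

def credible_interval(samples, level=95):
    """Pointwise equal-tailed credible interval.

    `samples` is a list of samples, each a list with one value per parameter.
    Returns (lower, upper), two lists with one entry per parameter.
    """
    tail = (100.0 - level) / 2.0
    lower, upper = [], []
    for j in range(len(samples[0])):
        column = sorted(s[j] for s in samples)  # all samples of parameter j
        lower.append(percentile(column, tail))
        upper.append(percentile(column, 100.0 - tail))
    return lower, upper

# Two parameters, 101 samples each.
samples = [[float(i), -float(i)] for i in range(101)]
print(credible_interval(samples, level=95))
```

Each parameter is treated independently, so the resulting band is a collection of marginal intervals rather than a joint credible region.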

In [None]:
samples_final.plot_ci(95, exact=probInfo.exactSolution)


### Trying out other samplers

The LinearRTO sampler can only sample Gaussian posteriors that also have an underlying linear model.

It is possible to try out other CUQIpy samplers (which also work for a broader range of problems). For example:

* **[pCN](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.sampler/cuqi.sampler.pCN.html#cuqi.sampler.pCN)** - preconditioned Crank-Nicolson sampler.
* **[CWMH](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.sampler/cuqi.sampler.CWMH.html)** - Component-wise Metropolis-Hastings sampler.
* **[ULA](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.sampler/cuqi.sampler.ULA.html)** - Unadjusted Langevin Algorithm.
* **[MALA](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.sampler/cuqi.sampler.MALA.html)** - Metropolis Adjusted Langevin Algorithm.
* **[NUTS](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.sampler/cuqi.sampler.NUTS.html)** - No U-Turn Sampler: A variant of the Hamiltonian Monte Carlo sampler well-established in literature.

Note in particular that ULA, MALA and NUTS all require the gradient of the logpdf. This is handled automatically in CUQIpy for linear models.
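
To see how gradient information enters, here is a minimal standard-library sketch of a Langevin proposal for a standard normal target, with and without the Metropolis correction (MALA and ULA, respectively). It is an illustration under these assumptions and not CUQIpy's implementation; the function name `langevin_chain` and the `step` parameterization are chosen here.

```python
import math
import random

def langevin_chain(n_samples, step, metropolize, seed=0):
    """Langevin sampling of a standard normal target, log p(x) = -x^2 / 2.

    metropolize=False gives ULA, metropolize=True gives MALA.
    """
    def logp(x):
        return -0.5 * x * x

    def grad_logp(x):
        return -x

    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n_samples):
        # Drift along the gradient, then add Gaussian noise.
        mean_fwd = x + step * grad_logp(x)
        proposal = mean_fwd + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        if metropolize:
            # Metropolis-Hastings correction with the asymmetric proposal density.
            mean_bwd = proposal + step * grad_logp(proposal)
            log_q_fwd = -((proposal - mean_fwd) ** 2) / (4.0 * step)
            log_q_bwd = -((x - mean_bwd) ** 2) / (4.0 * step)
            log_alpha = logp(proposal) - logp(x) + log_q_bwd - log_q_fwd
            if math.log(1.0 - rng.random()) >= log_alpha:
                proposal = x  # reject: stay at the current state
        x = proposal
        chain.append(x)
    return chain

def variance(chain):
    mean = sum(chain) / len(chain)
    return sum((v - mean) ** 2 for v in chain) / len(chain)

print("ULA variance: ", variance(langevin_chain(20000, 0.5, metropolize=False)))
print("MALA variance:", variance(langevin_chain(20000, 0.5, metropolize=True)))
```

For this target, the ULA recursion is $x_{k+1} = (1-h)x_k + \sqrt{2h}\,\xi_k$ with step $h$, whose stationary variance is $1/(1 - h/2)$ instead of 1. This is the discretization bias that the accept/reject step in MALA removes.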

#### ★ Try yourself (optional):  

Try sampling the posterior above using one of the suggested samplers (click the links to look at the online documentation for the sampler to get more info on it).

Compare results (chain, credibility interval etc.) to the results from LinearRTO.

All the suggested samplers (except NUTS) will likely require more than 5000 samples to give reasonable results, and some also need tuning of the step size (`scale`). This is because they are not as efficient as LinearRTO or NUTS. For some samplers, the method [sample_adapt](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.sampler/cuqi.sampler.NUTS.sample_adapt.html#cuqi.sampler.NUTS.sample_adapt) automatically tunes the step size according to some criterion, e.g. to reach an approximately optimal acceptance rate. In that case, a burn-in should be specified to set how many samples are used for the adaptation.



In [None]:
# Your code here





# 3. Exploring different prior choices<a class="anchor" id="ModifyPriors"></a>

In the above example, we used a GMRF prior for the parameter $\mathbf{x}$. However, it is not always obvious what prior to use for a given problem. In such cases, it is often useful to try out different priors to see how they affect the posterior distribution.

In CUQIpy, it is easy to modify the prior and re-sample the posterior distribution. This is most easily done by using the BayesianProblem class.

####  Try yourself (optional):  

Please carry out the following exercise to see how the prior affects the posterior distribution. 

Note that here we use the [sample_posterior](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.problem/cuqi.problem.BayesianProblem.sample_posterior.html#cuqi.problem.BayesianProblem.sample_posterior) method of the BayesianProblem class to sample the posterior distribution and store the samples without plotting. We then manually plot the samples using the `plot_ci` method of the `Samples` object.

- Try another prior such as [LMRF](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.distribution/cuqi.distribution.LMRF.html#cuqi.distribution.LMRF) or [CMRF](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.distribution/cuqi.distribution.CMRF.html#cuqi.distribution.CMRF) for the 1D case (look up appropriate arguments in the documentation) using [`BayesianProblem`](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.problem/cuqi.problem.BayesianProblem.html#cuqi.problem.BayesianProblem).
- Try switching the testproblem from Deconvolution1D to [Deconvolution2D](https://cuqi-dtu.github.io/CUQIpy/api/_autosummary/cuqi.testproblem/cuqi.testproblem.Deconvolution2D.html#cuqi.testproblem.Deconvolution2D) (look up appropriate arguments in the documentation).
- You may also try defining your own Bayesian inverse problem using this interface. ★

In [None]:
# Your code here
# You can modify this code or write your own from scratch

# 1. Forward model and data
A, y_data, probInfo = Deconvolution1D(phantom="sinc").get_components()

# 2. Distributions
x = GMRF(np.zeros(A.domain_dim), 50) # Try e.g. LMRF or CMRF (also update scale parameters!)
y = Gaussian(A@x, 0.01**2)

# 3. Bayesian problem
BP = BayesianProblem(y, x).set_data(y=y_data)

# 4. Sample posterior
samples = BP.sample_posterior(500)

# 5. Analyze posterior
samples.plot_ci(exact=probInfo.exactSolution)


You may have noticed that finding suitable parameters for the prior could be a challenge. To see how to automatically find suitable parameters for the prior, see the [Gibbs notebook](Exercise06_Gibbs.ipynb).

# 4. Computing point estimates of the posterior ★ <a class="anchor" id="pointestimates"></a>

In addition to sampling the posterior, we can also compute point estimates of the posterior. A common point estimate to consider is the Maximum A Posteriori (MAP) estimate, which is the value of the Bayesian parameter that maximizes the posterior density. That is,

\begin{align*}
\mathbf{x}_\mathrm{MAP} = \arg\max_\mathbf{x} p(\mathbf{x} \mid \mathbf{y}_\mathrm{data}).
\end{align*}
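
For a scalar linear-Gaussian problem, $y = a x + e$ with $e \sim \mathcal{N}(0, \sigma_y^2)$ and $x \sim \mathcal{N}(0, \sigma_x^2)$, the MAP estimate has the closed form $x_\mathrm{MAP} = a\, y\, \sigma_x^2 / (a^2 \sigma_x^2 + \sigma_y^2)$. The sketch below compares it with a numerical maximization of the log-posterior. It is a toy illustration; the function names and the use of golden-section search are choices made here and say nothing about the solver CUQIpy selects.

```python
import math

def log_posterior(x, y_obs, a, sigma_y, sigma_x):
    """Unnormalized log-posterior: Gaussian log-likelihood plus Gaussian log-prior."""
    return -0.5 * ((y_obs - a * x) / sigma_y) ** 2 - 0.5 * (x / sigma_x) ** 2

def map_closed_form(y_obs, a, sigma_y, sigma_x):
    """Maximizer obtained by setting the derivative of the log-posterior to zero."""
    return a * y_obs * sigma_x**2 / (a**2 * sigma_x**2 + sigma_y**2)

def map_golden_section(y_obs, a, sigma_y, sigma_x, lo=-100.0, hi=100.0, tol=1e-10):
    """Numerical MAP by golden-section search on the concave log-posterior."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if log_posterior(c, y_obs, a, sigma_y, sigma_x) > log_posterior(d, y_obs, a, sigma_y, sigma_x):
            hi = d  # the maximum lies in [lo, d]
        else:
            lo = c  # the maximum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

print(map_closed_form(2.0, 3.0, 0.5, 2.0))
print(map_golden_section(2.0, 3.0, 0.5, 2.0))
```

The two values agree up to the tolerance of the search, which illustrates that computing the MAP estimate is an optimization problem rather than a sampling problem.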

The easiest way to compute the MAP estimate is to use the `MAP` method of the `BayesianProblem` class as follows:

In [None]:
# We redefine in case something was changed

# Deterministic forward model
A, y_data, probInfo = Deconvolution1D(phantom="sinc").get_components()

# Distributions for each parameter
x = GMRF(np.zeros(A.domain_dim), 50)
y = Gaussian(mean=A@x, cov=0.01**2)

# Define Bayesian problem
BP = BayesianProblem(y, x).set_data(y=y_data)


In [None]:
x_map = BP.MAP() # Maximum a posteriori estimate


The automatic solver selection is also still work-in-progress.

After we have computed the MAP estimate, we can compare it to the exact solution (from `probInfo`):

In [None]:
x_map.plot()
plt.title('MAP estimate')
plt.show()

probInfo.exactSolution.plot()
plt.title('Exact solution')
plt.show()


#### ★ Try yourself (optional):  

- Try switching to the Deconvolution2D testproblem. You may have to play with the prior standard deviation to get a good MAP estimate.
- Try switching the prior to a CMRF distribution for the 1D case. Does the MAP estimate change?

## Story 1: MH adaptive scaling, donuts (it pays off to adapt)

## Sample a target

In [None]:
from cuqi.distribution import DistributionGallery
import numpy as np
import matplotlib.pyplot as plt
from cuqi.sampler import MH, CWMH
import time

target = DistributionGallery("donut")


print(target)






# MH
# try scale = 0.1, 1, 10
Ns = 30000
Nb = 5000
scale = 0.05
MH_sampler = MH(target, scale=scale, x0=np.array([0,0]))
ti = time.time()
MH_fixed_samples = MH_sampler.sample(Ns, Nb)
print('Elapsed time MH (fixed scale):', time.time() - ti, '\n')
MH_fixed_samples.plot_pair(ax=plt.gca())


# try sample_adapt (restart the timer so that only the adaptive run is timed)
ti = time.time()
MH_adapted_samples = MH_sampler.sample_adapt(Ns, Nb)
print('Elapsed time MH (adapted scale):', time.time() - ti, '\n')

# plot the adapted samples in red to distinguish them from the fixed-scale samples
MH_adapted_samples.plot_pair(ax=plt.gca(), scatter_kwargs={'c':'r'})


#CWMH_sampler = CWMH(target, scale=scale, x0=np.array([0,0]))
#CWMH_fixed_samples = CWMH_sampler.sample(Ns, Nb)
#print('Elapsed time CWMH:', time.time() - ti, '\n')
#CWMH_fixed_samples.plot_pair(ax=plt.gca(), scatter_kwargs={'c':'g'})
#
#
#CWMH_adapted_samples = CWMH_sampler.sample_adapt(Ns, Nb)
#print('Elapsed time CWMH:', time.time() - ti, '\n')
#CWMH_adapted_samples.plot_pair(ax=plt.gca(), scatter_kwargs={'c':'y'})



- What do you notice about the acceptance rate?
- What do you notice about the scale?
- Plot diagnostics and compute the ESS.
- Try 60000 samples with a burn-in of 10000.
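
The effect of adapting the scale can be sketched with a small random-walk Metropolis sampler in plain Python. This is a simplified illustration and not the adaptation rule used by CUQIpy's `MH.sample_adapt`; the donut parameters, the target acceptance rate of 0.234, the batch size and the decaying gain are all assumptions made here. A production sampler would typically stop adapting after the burn-in.

```python
import math
import random

def donut_logd(x, radius=3.0, sigma=0.5):
    """Unnormalized donut log-density; radius and sigma are illustrative values."""
    return -((math.hypot(x[0], x[1]) - radius) ** 2) / sigma**2

def mh_sample(logd, x0, n_samples, scale, adapt=False, target_rate=0.234,
              batch=100, seed=0):
    """Random-walk Metropolis with optional batch-wise adaptation of the scale.

    Returns the chain, the final scale and the overall acceptance rate.
    """
    rng = random.Random(seed)
    x, logd_x = list(x0), logd(x0)
    chain, n_accept, n_accept_batch = [], 0, 0
    for i in range(1, n_samples + 1):
        proposal = [xi + scale * rng.gauss(0.0, 1.0) for xi in x]
        logd_prop = logd(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if math.log(1.0 - rng.random()) < logd_prop - logd_x:
            x, logd_x = proposal, logd_prop
            n_accept += 1
            n_accept_batch += 1
        chain.append(list(x))
        if i % batch == 0:
            if adapt:
                # Increase the scale if we accept too often, decrease it otherwise.
                rate = n_accept_batch / batch
                gain = 1.0 / math.sqrt(i // batch)  # decaying adaptation gain
                scale *= math.exp(gain * (rate - target_rate))
            n_accept_batch = 0
    return chain, scale, n_accept / n_samples

_, scale_fixed, rate_fixed = mh_sample(donut_logd, [0.0, 0.0], 5000, scale=0.05)
_, scale_adapt, rate_adapt = mh_sample(donut_logd, [0.0, 0.0], 5000, scale=0.05, adapt=True)
print("fixed scale:  ", scale_fixed, "acceptance rate:", rate_fixed)
print("adapted scale:", scale_adapt, "acceptance rate:", rate_adapt)
```

With a very small fixed scale almost every proposal is accepted, but the chain moves around the ring slowly. The adapted sampler accepts fewer proposals while taking much larger steps.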


In [None]:
print(MH_fixed_samples.compute_ess())
print(MH_adapted_samples.compute_ess())
MH_fixed_samples.plot_trace()
MH_adapted_samples.plot_trace()
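
The effective sample size (ESS) measures how many independent samples a correlated chain is worth. The sketch below uses a simple autocorrelation-based estimator in plain Python, truncating the sum at the first non-positive autocorrelation. It is for illustration only; `compute_ess` may use a different estimator, and the helper names `effective_sample_size` and `ar1_chain` are hypothetical.

```python
import random

def effective_sample_size(chain, max_lag=1000):
    """ESS = N / (1 + 2 * sum of autocorrelations) for a non-constant scalar chain.

    The sum is truncated at the first non-positive autocorrelation estimate.
    """
    n = len(chain)
    mean = sum(chain) / n
    centered = [v - mean for v in chain]
    var = sum(c * c for c in centered) / n
    tau = 1.0  # integrated autocorrelation time
    for lag in range(1, min(max_lag, n - 1) + 1):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau

def ar1_chain(n, rho, seed=0):
    """Synthetic chain x_k = rho * x_{k-1} + noise, used as a stand-in for MCMC output."""
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain

print("ESS, independent chain:", effective_sample_size(ar1_chain(5000, 0.0)))
print("ESS, correlated chain: ", effective_sample_size(ar1_chain(5000, 0.9)))
```

A strongly correlated chain has a much smaller ESS than its length suggests, which is what a poorly scaled sampler produces.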


## Story 2: Poisson, CW vs MH (not all unknowns are created equal)

In [None]:
# Poisson inverse problem
from cuqi.testproblem import Poisson1D
from cuqi.distribution import GMRF, Gaussian, CMRF
from cuqi.problem import BayesianProblem
from cuqi.sampler import MH, CWMH, pCN

# Load forward model, data and problem information
n = 20
Ns = 3000
Nb = 2000
cov = 4.62905925e-05
A, y_data, probInfo = Poisson1D(dim=n).get_components()

x = GMRF(np.zeros(A.domain_dim), 8, geometry=A.domain_geometry,  order=1)
y = Gaussian(A(x), cov)

BP = BayesianProblem(x, y).set_data(y=y_data)
posterior = BP.posterior  # `posterior` is accessed as a property of the BayesianProblem

scale = 0.06
MH_sampler = MH(posterior, scale = scale, x0=np.ones(n))
MH_samples = MH_sampler.sample_adapt(Ns, Nb)


CWMH_sampler = CWMH(posterior, scale = scale, x0=np.ones(n))
CWMH_samples = CWMH_sampler.sample_adapt(Ns, Nb)
int(0.1*Ns)

In [None]:
MH_samples.burnthin(int(0.1*(Ns))).plot_ci(95, exact=probInfo.exactSolution)
plt.figure()
CWMH_samples.burnthin(int(0.1*Ns)).plot_ci(95, exact=probInfo.exactSolution)



In [None]:
CWMH_sampler.scale


- Time both samplers.
- Inspect the scale for both.
- Use a KL expansion and compare both samplers.

In [None]:
print(MH_samples.compute_ess())
print(CWMH_samples.compute_ess())

In [None]:
test = Poisson1D(dim=n)

In [None]:
test.likelihood.distribution.cov

## Story 3: Gradient info, ULA (univariate, bias) (Wandering randomly or be Guided by the force / let the force guide you)

## Story 4: Gradient info, MALA (is it ever possible to fix the bias)

## Story 5: NUTS (it is time to go nuts with nuts)

inspect sampler / its functions

In [None]:

##### <font color=#8B4513> Exercise </font> <a name="r-forward-model"></a>
- Compute the credibility interval in the function space and plot it. Hint: use the property `funvals` of the samples: `BP_poisson_samples.funvals` which returns another samp