# Monte Carlo methods for Bayesian Inversion

The estimation of parameters in mathematical models from observational data has driven many mathematical and computational advances. The Bayesian approach to the inverse problem allows us to incorporate different kinds of uncertainty in the estimation of these parameters, thus enlarging the class of problems that can be posed and solved in a well-defined way.

In this notebook, we present a simple and illustrative inversion problem arising from estimating the parameter of a differential equation from noisy measurements of an experiment. This example will help us analyse and compare different numerical solutions to the Bayesian formulation of its inversion problem.

### The experiment

The experiment presented in this notebook was designed to estimate the Earth's gravitational acceleration $g$ from measurements of the motion of a pendulum. The motion is described by an ordinary differential equation that relates the angular acceleration of the pendulum angle $\theta_g$ to $g$:

$$
    \frac{d^2\theta_g}{dt^2} + \frac{g}{l} \sin \theta_g = 0
$$

where $l$ is a known parameter representing the length of the pendulum and $g$ is the parameter we would like to estimate.

The data used to estimate the parameter $g$ was obtained by letting children run an experiment with the pendulum: after releasing the pendulum from a $5^{\circ}$ angle at time $t_0 = 0$, they were asked to measure the times $(t_1, ..., t_n)$ at which the pendulum passed through the zero angle, so that $\theta_g\left(t_i\right) = 0$.
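To make the experiment concrete, the following minimal sketch integrates the pendulum equation and records the times at which the angle passes through zero. The function name `pendulum_zero_crossings`, the classical Runge-Kutta (RK4) scheme, the step size `dt`, the pendulum length of 1 m and the safety bound `t_max` are all illustrative assumptions, not part of the original experiment.

```python
import math


def pendulum_zero_crossings(g, length=1.0, theta0_deg=5.0,
                            n_crossings=4, dt=1e-3, t_max=60.0):
    """Return the first zero-crossing times of the pendulum angle.

    Integrates theta'' + (g / length) * sin(theta) = 0 with RK4, starting
    at rest from theta0_deg degrees. All defaults are illustrative.
    """
    def rhs(theta, omega):
        # First-order form: theta' = omega, omega' = -(g / l) * sin(theta)
        return omega, -(g / length) * math.sin(theta)

    theta, omega, t = math.radians(theta0_deg), 0.0, 0.0
    crossings = []
    while len(crossings) < n_crossings and t < t_max:
        # One classical RK4 step of size dt
        k1t, k1o = rhs(theta, omega)
        k2t, k2o = rhs(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1o)
        k3t, k3o = rhs(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2o)
        k4t, k4o = rhs(theta + dt * k3t, omega + dt * k3o)
        new_theta = theta + dt / 6.0 * (k1t + 2 * k2t + 2 * k3t + k4t)
        new_omega = omega + dt / 6.0 * (k1o + 2 * k2o + 2 * k3o + k4o)

        # A sign change between two steps means the angle passed through zero;
        # locate the crossing by linear interpolation inside the step.
        if theta != 0.0 and theta * new_theta <= 0.0:
            fraction = theta / (theta - new_theta)
            crossings.append(t + fraction * dt)

        theta, omega, t = new_theta, new_omega, t + dt
    return crossings


times = pendulum_zero_crossings(g=9.81, length=1.0)
print(times)
```

For a $5^{\circ}$ release angle the first crossing should lie close to a quarter of the small-angle period $2\pi\sqrt{l/g}$, which is roughly 0.5 s for the assumed values, and consecutive crossings should be about half a period apart.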

### Bayesian inverse problem

In a more general setting, we are trying to find an estimate of the true parameter $\theta^\star \in X \subset \mathbb{R}^n$ of a model $\mathcal{G} : X \rightarrow Y$, with $Y \subset \mathbb{R}^m$ and $n, m \in \mathbb{N}$, from observations $y \in Y$.

To model the uncertainties present in the observations due to factors such as measurement or model error, we treat $y$ as a realization of the model output perturbed by a mean-zero Gaussian error term $\eta \sim \mathcal{N}\left(0, \Gamma\right)$, giving

$$
    y = \mathcal{G}\left(\theta^\star\right) + \eta
$$

The problem of inverting $\mathcal{G}$ to reconstruct $\theta^\star$ from the noisy data $y$ is ill-posed: in general, there is no $\theta \in X$ satisfying $\mathcal{G}\left(\theta\right) = y$. This leads us to reformulate the problem: we no longer try to find $\theta^\star$ itself, but rather to estimate a probability measure for the parameter $\theta$ given the realization vector $y$.
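The lack of an exact solution can be illustrated numerically. The sketch below is a toy setup under stated assumptions: it replaces the full pendulum model with its small-angle approximation, in which the $k$-th zero crossing occurs at $(2k-1)\frac{\pi}{2}\sqrt{l/g}$, and it generates synthetic data with an assumed noise level `sigma` and an assumed "true" value `g_true`. None of these values come from the actual experiment.

```python
import math
import random


def crossing_times(g, length=1.0, n_obs=4):
    """Small-angle forward model: k-th zero crossing at (2k - 1) * T / 4."""
    quarter_period = 0.5 * math.pi * math.sqrt(length / g)
    return [(2 * k - 1) * quarter_period for k in range(1, n_obs + 1)]


# Synthetic observations y = G(g_true) + eta (assumed values, for illustration)
rng = random.Random(1)
g_true, sigma = 9.81, 0.02
y = [t + rng.gauss(0.0, sigma) for t in crossing_times(g_true)]


def misfit(g):
    """Euclidean distance between the model prediction and the data."""
    return math.sqrt(sum((yi - ti) ** 2 for yi, ti in zip(y, crossing_times(g))))


# Scan a grid of candidate values of g and keep the best fit
grid = [8.0 + 0.001 * i for i in range(4001)]
best_g = min(grid, key=misfit)
best_misfit = misfit(best_g)
print(f"best g on grid: {best_g:.3f}, remaining misfit: {best_misfit:.4f}")
```

With four noisy observations and a single parameter, even the best candidate on the grid leaves a strictly positive misfit, which suggests that no value of $g$ reproduces the data exactly.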

This is done by assuming $\theta \sim \mu_0$, where $\mu_0$ is a probability measure called the _prior measure_, which encodes existing knowledge about the distribution of the sought parameter. How to define $\mu_0$ is outside the scope of this work, but much effort can go into designing $\mu_0$ in order to constrain, define and refine the explorable parameter space; interested readers are invited to refer to [...]. By further assuming independence of $\theta$ and $\eta$, we can define the _posterior measure_ $\mu^y$, the solution of the Bayesian inverse problem:

$$
    \mu^y := \mathbb{P}\left(\theta \in \cdot \mid \mathcal{G}\left(\theta\right) + \eta = y\right)
$$

We are now interested in evaluating expressions of the form $\mathbb{E}_{\mu^y}(\varphi)$ for any $\mu^y$-integrable function $\varphi$.

Because $\mu^y$ usually has no closed-form expression, we now explore its structure further in order to understand how to approximate it numerically. Applying Bayes' rule formally, with $\mathbb{P}\left(\eta = \cdot\right)$ read as the density of $\eta$, gives

$$
\begin{split}
\mu^y(A) 
&= \mathbb{P}\left(\theta \in A \mid \mathcal{G}\left(\theta\right) + \eta = y\right) \\
&= \mathbb{P}\left(\theta \in A \mid \eta = y - \mathcal{G}\left(\theta\right)\right) \\
&= \frac{\mathbb{P}\left(\theta \in A, \eta = y - \mathcal{G}\left(\theta\right)\right)}{\mathbb{P}\left(\eta = y - \mathcal{G}\left(\theta\right)\right)} \\
&= \frac{\mathbb{P}\left(\eta = y - \mathcal{G}\left(\theta\right) \mid \theta \in A\right)\mathbb{P}\left(\theta \in A \right)}{\mathbb{P}\left(\eta = y - \mathcal{G}\left(\theta\right)\right)} \\
&= \frac{ \int_A \mathbb{P}\left(\eta = y - \mathcal{G}\left(\theta\right)\right) \text{d}\mu_0(\theta)}{\int_X \mathbb{P}\left(\eta = y - \mathcal{G}\left(\theta\right)\right)\ \text{d}\mu_0(\theta)}
\end{split}
$$

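Further, from the distribution of $\eta$ we define the _negative log-likelihood_ (or _potential_) $\Phi$ as
$$\Phi(\theta) := \frac{1}{2}\lVert \Gamma^{-\frac{1}{2}}(y - \mathcal{G}(\theta))\rVert^2
$$

so that the Gaussian density of $\eta$ evaluated at $y - \mathcal{G}(\theta)$, written $\mathbb{P}\left(\eta = y - \mathcal{G}\left(\theta\right)\right)$ above, is proportional to $\exp\left(-\Phi(\theta)\right)$. The proportionality constant does not depend on $\theta$ and cancels between numerator and denominator, allowing us to rewrite $\mu^y(A)$ more compactly: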

$$
\mu^y(A) = \frac{1}{Z_y} \int_A \exp\left(-\Phi(\theta)\right)\ \text{d}\mu_0(\theta)
$$

where $Z_y$ is the _normalizing constant_ defined as

$$
Z_y := \int_X \exp(-\Phi(\theta))\ \text{d}\mu_0(\theta)
$$
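The formula for $\mu^y$ suggests a first, simple Monte Carlo estimator: draw samples $\theta_i \sim \mu_0$ from the prior, weight each by $\exp(-\Phi(\theta_i))$, and normalize by the sum of the weights, which plays the role of $Z_y$. The sketch below illustrates this self-normalized importance sampling idea under several assumptions: a diagonal covariance $\Gamma$ (passed as the list `noise_var`), a toy scalar model $\mathcal{G}(\theta) = \theta$, a standard normal prior, and hypothetical helper names `potential` and `posterior_expectation`. It is one possible estimator, not necessarily the method developed in the rest of this notebook.

```python
import math
import random


def potential(theta, y, forward, noise_var):
    """Phi(theta) = 0.5 * ||Gamma^{-1/2} (y - G(theta))||^2, diagonal Gamma."""
    return 0.5 * sum((yi - gi) ** 2 / vi
                     for yi, gi, vi in zip(y, forward(theta), noise_var))


def posterior_expectation(phi, y, forward, noise_var, sample_prior,
                          n_samples=20000):
    """Estimate E_{mu^y}[phi] from prior samples weighted by exp(-Phi)."""
    thetas = [sample_prior() for _ in range(n_samples)]
    log_weights = [-potential(th, y, forward, noise_var) for th in thetas]
    # Subtract the largest log-weight before exponentiating to avoid underflow;
    # the shift cancels in the ratio below.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    # The normalizing constant Z_y cancels in this ratio of weighted sums.
    return sum(w * phi(th) for w, th in zip(weights, thetas)) / sum(weights)


# Toy check: G(theta) = theta, prior N(0, 1), noise variance 1, observation y = 1.
rng = random.Random(0)
estimate = posterior_expectation(
    phi=lambda theta: theta,
    y=[1.0],
    forward=lambda theta: [theta],
    noise_var=[1.0],
    sample_prior=lambda: rng.gauss(0.0, 1.0),
)
print(f"estimated posterior mean: {estimate:.3f}")
```

For this conjugate Gaussian toy problem the posterior mean is known in closed form, $y / (1 + \sigma^2) = 0.5$, so the printed estimate should land close to that value. Weighting prior samples degrades quickly when the posterior is much more concentrated than the prior, because only a few samples then carry most of the weight.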

