# Markov chain Monte Carlo

A particularly popular tool in Bayesian modelling is {term}`Markov chain Monte Carlo` (MCMC). 
In MCMC, a random walk along a [Markov chain](https://en.wikipedia.org/wiki/Markov_chain) is used to draw samples from a given probability distribution.
In the context of Bayesian modelling, the probability distribution that is sampled is the {term}`posterior distribution`, $P(A|B)$, which is given by Bayes' theorem:

```{math}
:label: bayes
P(A|B) = \frac{P(B|A)P(A)}{P(B)} \propto P(B|A)P(A),
```

where $P(B|A)$ is the {term}`likelihood`, $P(A)$ is the probability distribution that describes our {term}`prior knowledge`, and $P(B)$ is the evidence term, which can be ignored in MCMC because the data are fixed, so it acts only as a normalising constant.
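To make the role of the proportionality concrete, the following is a minimal sketch of a random-walk Metropolis sampler, one of the simplest MCMC algorithms. It is an illustration only: it is not the algorithm used by `emcee` (which is an affine-invariant ensemble sampler), and the names `log_unnormalised_posterior` and `metropolis` are hypothetical. Because only the ratio of posterior values at two points enters the acceptance step, the evidence $P(B)$ cancels and never needs to be computed.

```python
import math
import random


def log_unnormalised_posterior(theta):
    """Toy target: a standard normal, known only up to a constant."""
    return -0.5 * theta ** 2


def metropolis(log_prob, start, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampling of a one-dimensional distribution."""
    rng = random.Random(seed)
    chain = [start]
    current_lp = log_prob(start)
    for _ in range(n_steps):
        # Propose a move that depends only on the current state (Markov property)
        proposal = chain[-1] + rng.gauss(0.0, step_size)
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, ratio); any normalising constant cancels
        accept_probability = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_probability:
            chain.append(proposal)
            current_lp = proposal_lp
        else:
            chain.append(chain[-1])
    return chain


chain = metropolis(log_unnormalised_posterior, start=0.0, n_steps=20000)
mean = sum(chain) / len(chain)
var = sum((t - mean) ** 2 for t in chain) / len(chain)
print(f"sample mean = {mean:.2f}, sample variance = {var:.2f}")
```

The sample mean and variance should be close to 0 and 1, the values for the toy target, even though the target was never normalised.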

Here, we will show how `easyCore` can be combined with the popular MCMC library `emcee`.
Before performing MCMC sampling, we need a good estimate of the maximum likelihood (MLE) or maximum a posteriori parameters, as these are used as the starting point for the sampling.
Therefore, below we repeat the bounded analysis discussed previously.

In [None]:
from easyCore import np
from easyCore.Objects.Variable import Parameter
from easyCore.Objects.ObjectClasses import BaseObj
from easyCore.Fitting.Fitting import Fitter

np.random.seed(123)

a_true = -0.9594
b_true = 7.294
c_true = 3.102

N = 50
x = np.sort(10 * np.random.rand(N))
yerr = 0.1 + 3 * np.random.rand(N)
y = a_true * x ** 2 + b_true * x + c_true
y += np.abs(y) * 0.2 * np.random.randn(N)

a = Parameter(name='a', value=a_true, fixed=False, min=-5.0, max=0.5)
b = Parameter(name='b', value=b_true, fixed=False, min=0, max=10)
c = Parameter(name='c', value=c_true, fixed=False, min=-20, max=50)

def math_model(x, *args, **kwargs):
    return a.raw_value * x ** 2 + b.raw_value * x + c.raw_value

quad = BaseObj(name='quad', a=a, b=b, c=c)
f = Fitter(quad, math_model)

res = f.fit(x=x, y=y, weights=1/yerr)

a, b, c

First, we will define a Python object that describes our data as a multi-dimensional (one dimension per data point) normal distribution. 

In [None]:
from scipy.stats import multivariate_normal

# The covariance matrix holds variances, so the uncertainties are squared
mv = multivariate_normal(mean=y, cov=np.diag(yerr ** 2))

We can then use this object to define a log-likelihood function (working with the logarithm avoids numerical underflow when many small probabilities are multiplied).

In [None]:
def log_likelihood(theta, x):
    """
    The log-likelihood function for the data given a model. 

    :param theta: the model parameters (a, b, c).
    :param x: the values over which the model is computed.

    :return: log-likelihood for the given parameters.
    """
    # Update the easyCore parameters in place so that the model uses theta
    a.value, b.value, c.value = theta
    model = f.evaluate(x)
    logl = mv.logpdf(model)
    return logl
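
For reference, when the covariance matrix is diagonal, the multivariate normal log-density evaluated above reduces to a sum over independent data points. Writing $m_i$ for the model value at $x_i$ and $\sigma_i^2$ for the $i$-th diagonal element of the covariance matrix (symbols introduced here for illustration), this is

```{math}
\ln P(B|A) = -\frac{1}{2} \sum_{i=1}^{N} \left[ \frac{(y_i - m_i)^2}{\sigma_i^2} + \ln\left(2\pi\sigma_i^2\right) \right],
```

so parameter values whose model passes close to the data, relative to the uncertainties, have a higher log-likelihood.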

Having defined the log-likelihood, we now do the same for the log-prior probability. 

In [None]:
from scipy.stats import uniform

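# One uniform prior per parameter, spanning its [min, max] bounds.
# This assumes get_parameters() returns the parameters in the same order as theta.
priors = [uniform(loc=p.min, scale=p.max - p.min)
          for p in f.fit_object.get_parameters()]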

def log_prior(theta):
    return sum([p.logpdf(theta[i]) for i, p in enumerate(priors)])
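
Written out, each uniform prior contributes a constant inside the parameter bounds and $-\infty$ outside them. Denoting the bounds of the $i$-th parameter by $\theta_i^{\min}$ and $\theta_i^{\max}$ (notation introduced here for illustration), the log-prior is

```{math}
\ln P(A) = \sum_i \ln P(\theta_i), \qquad
\ln P(\theta_i) =
\begin{cases}
-\ln\left(\theta_i^{\max} - \theta_i^{\min}\right) & \text{if } \theta_i^{\min} \leq \theta_i \leq \theta_i^{\max}, \\
-\infty & \text{otherwise.}
\end{cases}
```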

Then these come together to give the log-posterior function, which is the object that we sample. 

In [None]:
def log_posterior(theta, x):
    return log_prior(theta) + log_likelihood(theta, x)
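
A common variation, sketched below with toy one-parameter stand-ins for the prior and likelihood (the function names and the call counter are hypothetical and not part of `easyCore` or `emcee`), is to return early when the log-prior is not finite. This avoids evaluating the model at parameter values that the prior already excludes.

```python
import math

likelihood_calls = 0  # counter used only to illustrate the early return


def toy_log_prior(theta, low=-5.0, high=0.5):
    """Uniform log-prior on [low, high] for a single parameter."""
    if low <= theta <= high:
        return -math.log(high - low)
    return -math.inf


def toy_log_likelihood(theta):
    """Stand-in Gaussian log-likelihood centred on -1 (unnormalised)."""
    global likelihood_calls
    likelihood_calls += 1
    return -0.5 * (theta + 1.0) ** 2


def guarded_log_posterior(theta):
    """Log-posterior that skips the likelihood outside the prior bounds."""
    lp = toy_log_prior(theta)
    if not math.isfinite(lp):
        return -math.inf
    return lp + toy_log_likelihood(theta)


inside = guarded_log_posterior(-1.0)   # within bounds: likelihood is evaluated
outside = guarded_log_posterior(2.0)   # outside bounds: likelihood is skipped
```

Here `outside` is `-inf` and the toy likelihood is evaluated only once, for the in-bounds call.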

The `emcee` package can then be used to perform the MCMC sampling. 
Below, we run 32 {term}`walkers`, each of which takes 500 steps through the distribution.
Note that here we set `progress=False` for the benefit of the web rendering; it may be useful to set this to `True` in your own notebook.

In [None]:
import emcee

# Start each walker in a small Gaussian ball around the best-fit parameters
pos = np.array(list(res.p.values())) + 1e-4 * np.random.randn(32, 3)
nwalkers, ndim = pos.shape

sampler = emcee.EnsembleSampler(nwalkers, ndim, log_posterior, args=[x])
sampler.run_mcmc(pos, 500, progress=False);
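
The starting positions above rely on NumPy broadcasting: a length-3 vector of best-fit values is added to a `(32, 3)` array of small random offsets, giving one slightly perturbed starting point per walker. A minimal sketch with placeholder best-fit values (the numbers and the name `best_fit` are illustrative, not results from the fit above):

```python
import numpy as np

rng = np.random.default_rng(123)

best_fit = np.array([-0.96, 7.3, 3.1])  # placeholder values, not fit results
nwalkers, ndim = 32, best_fit.size

# Broadcasting adds the (ndim,) vector to every row of the (nwalkers, ndim) noise
pos = best_fit + 1e-4 * rng.standard_normal((nwalkers, ndim))

print(pos.shape)  # (32, 3)
```

Starting the walkers close together but not at identical positions lets the ensemble spread out from the best-fit region as sampling proceeds.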

We can visualise the parameter values explored during the sampling process, as shown below.

In [None]:
import matplotlib.pyplot as plt

fig, axes = plt.subplots(3, figsize=(10, 4), sharex=True)
samples = sampler.get_chain()
labels = ["a", "b", "c"]
for i in range(ndim):
    ax = axes[i]
    ax.plot(samples[:, :, i], "k", alpha=0.3)
    ax.set_xlim(0, len(samples))
    ax.set_ylabel(labels[i])
    ax.yaxis.set_label_coords(-0.1, 0.5)

axes[-1].set_xlabel("step number");

It is clear from the above plot that there is an initial lag (or burn-in) period before the walkers sample the distribution evenly.
This means that we should discard approximately the first 100 steps.
We can get flat sets of samples, i.e., with all the {term}`walkers` combined, using the function below.
Additionally, we perform {term}`thinning` so that only every 10th sample is used, which reduces the correlation between successive samples.

In [None]:
flat_samples = sampler.get_chain(discard=100, thin=10, flat=True)
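
To see what these options do to the array shapes, the sketch below applies roughly equivalent NumPy slicing to a placeholder array with the same layout as the chain, `(steps, walkers, parameters)`. This is an approximation for illustration: the exact index at which `emcee` starts thinning may differ, and `fake_chain` is a hypothetical stand-in rather than real sampler output.

```python
import numpy as np

nsteps, nwalkers, ndim = 500, 32, 3
fake_chain = np.zeros((nsteps, nwalkers, ndim))  # stand-in for the full chain

discard, thin = 100, 10
kept = fake_chain[discard::thin]   # drop the initial steps, keep every 10th
flat = kept.reshape(-1, ndim)      # merge the step and walker axes

print(kept.shape)  # (40, 32, 3)
print(flat.shape)  # (1280, 3)
```

With 500 steps, 32 walkers, a discard of 100 and thinning by 10, this leaves 1280 samples for each parameter.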

With the flat samples, we can now plot these as histograms using the `corner` library.

In [None]:
import corner

corner.corner(flat_samples, labels=labels)
plt.show()

Using this plot, we can visually inspect the marginal posterior distributions for our parameters and get a handle on the parameter uncertainties.
Additionally, we can begin to investigate the correlations present between different parameters, e.g., above we can see that `a` is negatively correlated with `b` but positively correlated with `c`. 
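
These visual impressions can also be summarised numerically. The sketch below uses synthetic, correlated samples in place of `flat_samples` (the means, covariance, and the name `fake_samples` are invented for illustration) and reports the median with the 16th and 84th percentiles for each parameter, together with the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for flat_samples, shape (n_samples, n_parameters)
mean = [0.0, 1.0]
cov = [[1.0, -0.8], [-0.8, 1.0]]
fake_samples = rng.multivariate_normal(mean, cov, size=20000)

# Median and 16th/84th percentiles of each marginal distribution
lower, median, upper = np.percentile(fake_samples, [16, 50, 84], axis=0)
for i, name in enumerate(["p0", "p1"]):
    plus = upper[i] - median[i]
    minus = median[i] - lower[i]
    print(f"{name} = {median[i]:.2f} +{plus:.2f} -{minus:.2f}")

# Pearson correlation between parameters (columns are the variables)
correlation = np.corrcoef(fake_samples, rowvar=False)
print(correlation)
```

The off-diagonal elements of `correlation` quantify what the two-dimensional panels of the corner plot show: negative values indicate that two parameters trade off against each other.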

We can also produce a plot showing the posterior distribution of models on the data with a range of {term}`credible intervals`, shown here with the blue shaded areas.

In [None]:
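# Percentile pairs bounding approximately 68 %, 95 % and 99.7 % of the posterior
credible_intervals = [[16, 84], [2.5, 97.5], [0.15, 99.85]]
alpha = [0.6, 0.4, 0.2]
# Evaluate the model for every posterior sample; shape is (len(x), n_samples)
distribution = (flat_samples[:, 0] * x[:, np.newaxis] ** 2
                + flat_samples[:, 1] * x[:, np.newaxis]
                + flat_samples[:, 2])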

plt.errorbar(x, y, yerr, marker='.', ls='', color='k')
for i, ci in enumerate(credible_intervals):
    plt.fill_between(x,
                     *np.percentile(distribution, ci, axis=1),
                     alpha=alpha[i],
                     color='#0173B2',
                     lw=0)
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.show()