# Optimized Cython MCMC implementation

The implementation in the previous section still called `numpy.random` functions, so every random draw paid the overhead of a Python-level function call.

In this notebook, we build a separate Cython package called `cython_mcmc` that uses another Cython package named `mt19937` for faster random number generation.

Moving random number generation into compiled code gives the MCMC sampler a significant speedup, which the `%timeit` cell below measures.
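
To make the cost of Python-level random draws concrete, here is a minimal timing sketch. It is not code from the previous section: the function names `scalar_draws` and `vector_draws` are made up for illustration, and the numbers it prints depend on your machine. A Metropolis loop typically draws one proposal at a time from the current state, so it pays the per-call cost shown in the scalar case.

```python
import timeit

import numpy as np

n_draws = 100_000


def scalar_draws(n):
    # One Python-level function call per random number, as in a sampler loop.
    return [np.random.normal(0.0, 1.0) for _ in range(n)]


def vector_draws(n):
    # A single call; the loop over draws runs inside NumPy's compiled code.
    return np.random.normal(0.0, 1.0, size=n)


t_scalar = timeit.timeit(lambda: scalar_draws(n_draws), number=1)
t_vector = timeit.timeit(lambda: vector_draws(n_draws), number=1)

print(f"scalar calls: {t_scalar / n_draws * 1e9:.0f} ns per draw")
print(f"vector call:  {t_vector / n_draws * 1e9:.0f} ns per draw")
```

The gap between the two figures is roughly the call overhead that a compiled random number generator can avoid.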

## First step -- compile external packages and run performance tests

In [None]:
%%bash
cd ./rng
python ./setup.py develop
cython -a ./rng/mt19937.pyx

In [None]:
%%bash
cd cython_mcmc
python ./setup.py develop
cython -a ./cython_mcmc/mcmc.pyx

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns

import numpy as np
from scipy.stats import norm

from cython_mcmc import mcmc

np.random.seed(123)
data = np.random.randn(20)

In [None]:
%timeit mcmc.log_sampler(data, samples=15000, mu_init=1.0)

In [None]:
posterior = mcmc.log_sampler(data, samples=15000, mu_init=1.0)
plt.plot(posterior);

In [None]:
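def calc_posterior_analytical(data, x, mu_0, sigma_0):
    # Conjugate normal model: known likelihood sd, Normal(mu_0, sigma_0) prior on mu.
    sigma = 1.
    n = len(data)
    precision_post = 1. / sigma_0**2 + n / sigma**2
    mu_post = (mu_0 / sigma_0**2 + data.sum() / sigma**2) / precision_post
    var_post = 1. / precision_post  # posterior variance, not standard deviation
    return norm(mu_post, np.sqrt(var_post)).pdf(x)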
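
For reference, the function above evaluates the standard conjugate result for a normal likelihood with known standard deviation $\sigma$ (fixed to 1 in the code) and a $\mathcal{N}(\mu_0, \sigma_0^2)$ prior on $\mu$. Written out with the same symbols as the code:

$$
\mu_{\text{post}} = \frac{\dfrac{\mu_0}{\sigma_0^2} + \dfrac{\sum_{i=1}^{n} x_i}{\sigma^2}}{\dfrac{1}{\sigma_0^2} + \dfrac{n}{\sigma^2}},
\qquad
\sigma_{\text{post}}^2 = \left(\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}\right)^{-1}
$$

The second expression is a variance, which is why the code takes its square root before passing it to `norm`.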

In [None]:
ax = plt.subplot()

sns.distplot(posterior[500:], ax=ax, label='estimated posterior')
x = np.linspace(-.7, .9, 500)
post = calc_posterior_analytical(data, x, 0, 1)
ax.plot(x, post, 'g', label='analytic posterior')
_ = ax.set(xlabel='mu', ylabel='belief');
ax.legend();

## Standard error in $\mu$ shrinks as more data is collected

In [None]:
%%time
data_2000 = np.random.randn(2000)
posterior_2000 = mcmc.log_sampler(data_2000, samples=150000, mu_init=1.0)

In [None]:
ax = plt.subplot()

sns.distplot(posterior_2000[500::5], ax=ax, label='estimated posterior')
x = np.linspace(-.1, .1, 500)
post = calc_posterior_analytical(data_2000, x, 0, 1)
ax.plot(x, post, 'g', label='analytic posterior')
_ = ax.set(xlabel='mu', ylabel='belief');
ax.legend();
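
The shrinkage can be checked directly from the analytic posterior variance used in `calc_posterior_analytical()`. The sketch below uses only the standard library; `posterior_sd` is a helper name introduced here for illustration, with the same $\sigma = 1$ and $\sigma_0 = 1$ as in the cells above.

```python
import math


def posterior_sd(n, sigma=1.0, sigma_0=1.0):
    # Posterior standard deviation of mu for n observations.
    return math.sqrt(1.0 / (1.0 / sigma_0**2 + n / sigma**2))


for n in (20, 2000):
    print(n, round(posterior_sd(n), 4))
# 20 0.2182
# 2000 0.0224
```

Going from 20 to 2000 observations narrows the posterior by roughly a factor of ten, which matches the much tighter `x` range used for the second plot.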

## `log_sampler()` using external library

In [None]:
%%html
./cython_mcmc/cython_mcmc/mcmc.html

### Worth noting

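* `from [...] cimport RandomState` gives compile-time access to the extension type's C-level declarations.
* `from [...] import RandomState` is the ordinary run-time Python import of the same class.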
* `sample_norm()` and `accept_p()` implementations.
* `norm_logpdf()` implementation uses raw C buffers.
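
For readers who want something runnable next to the annotated source, here is a pure-Python sketch of a log-space Metropolis sampler. It is not the code in `mcmc.pyx`: the helper names only echo those in the list above (with a `_py` suffix), and their signatures, the default arguments, and the use of `numpy.random.RandomState` in place of the `mt19937` generator are all assumptions made for illustration.

```python
import math

import numpy as np


def norm_logpdf_py(x, mu, sigma):
    """Sum of Normal(mu, sigma) log-densities over the values in x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return float(np.sum(-0.5 * math.log(2.0 * math.pi * sigma**2)
                        - (x - mu)**2 / (2.0 * sigma**2)))


def sample_norm_py(rng, mu, sigma):
    """Draw one proposal from Normal(mu, sigma)."""
    return float(rng.normal(mu, sigma))


def accept_p_py(rng, log_p_current, log_p_proposal):
    """Metropolis acceptance test carried out on log-probabilities."""
    # 1 - uniform() lies in (0, 1], so the logarithm is always finite.
    return math.log(1.0 - rng.uniform()) < log_p_proposal - log_p_current


def log_sampler_py(data, samples=1000, mu_init=0.5, proposal_width=0.5,
                   mu_prior_mu=0.0, mu_prior_sd=1.0, seed=None):
    """Sample mu for unit-variance normal data (hypothetical signature)."""
    rng = np.random.RandomState(seed)
    mu_current = float(mu_init)
    log_p_current = (norm_logpdf_py(data, mu_current, 1.0)
                     + norm_logpdf_py(mu_current, mu_prior_mu, mu_prior_sd))
    posterior = np.empty(samples + 1)
    posterior[0] = mu_current
    for i in range(samples):
        mu_proposal = sample_norm_py(rng, mu_current, proposal_width)
        # Log-likelihood of the data plus log-prior of the proposed mu.
        log_p_proposal = (norm_logpdf_py(data, mu_proposal, 1.0)
                          + norm_logpdf_py(mu_proposal, mu_prior_mu, mu_prior_sd))
        if accept_p_py(rng, log_p_current, log_p_proposal):
            mu_current, log_p_current = mu_proposal, log_p_proposal
        posterior[i + 1] = mu_current
    return posterior


example_data = np.random.RandomState(123).randn(20)
example_posterior = log_sampler_py(example_data, samples=5000, mu_init=1.0, seed=0)
```

Working with sums of log-densities instead of products of densities avoids numerical underflow as the number of data points grows, which is presumably why the sampler is named `log_sampler()`.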

## `RandomState` extension type

Cython-level compile-time interface defined in `mt19937.pxd`:

In [None]:
!cat ./rng/rng/mt19937.pxd | nl

In [None]:
%%html
./rng/rng/mt19937.html

## Putting it all together in `setup.py` with `cythonize()`

* A `setup.py` script in the `cython_mcmc` package compiles everything together.
* All of the external Cython and C dependencies of `mcmc.pyx` are declared in an `Extension()` object.
* The `Cython.Build.cythonize()` function translates the `.pyx` sources to C and hands the resulting extension modules to the build.
* Running `python setup.py develop` kicks off the build and produces the `mcmc.so` shared object file.

In [None]:
%cat ./cython_mcmc/setup.py