# Optimized Cython MCMC implementation

The implementation in the previous section still drew its random numbers through `numpy.random`, so every draw paid the overhead of a Python-level function call.

In this notebook, we'll demonstrate creating a separate Cython package called `cython_mcmc` that uses another Cython package named `mt19937` for faster random number generation.

We will see a significant speedup for the MCMC sampler as a result of our efforts.
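To get a rough feel for the per-call overhead described above, the sketch below compares drawing normal variates one at a time through `numpy.random` with drawing them in a single vectorized call. The function names and the sample count are illustrative assumptions, not part of `cython_mcmc`. Timings depend on the machine, so none are quoted here.

```python
import timeit

import numpy as np


def draw_one_at_a_time(rng, n):
    """Draw n standard normals with one Python-level call per draw."""
    return [rng.standard_normal() for _ in range(n)]


def draw_in_bulk(rng, n):
    """Draw n standard normals with a single vectorized call."""
    return rng.standard_normal(n)


if __name__ == "__main__":
    rng = np.random.RandomState(123)
    n = 15000  # illustrative; matches the number of samples timed later
    t_loop = timeit.timeit(lambda: draw_one_at_a_time(rng, n), number=10)
    t_bulk = timeit.timeit(lambda: draw_in_bulk(rng, n), number=10)
    print(f"per-draw calls: {t_loop:.4f} s, bulk call: {t_bulk:.4f} s")
```

A sampler that needs a fresh draw inside every loop iteration looks like the first function, which is the case a C-level generator is meant to help with.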

## First step -- compile external packages and run performance tests

In [30]:
%%bash
cd ./mt19937
python ./setup.py develop
cython -a ./srs/mt19937.pyx

Compiling ./srs/mt19937.pyx because it depends on srs/bounded_integers.pxi.
[1/1] Cythonizing ./srs/mt19937.pyx
running develop
running egg_info
writing srs.egg-info/PKG-INFO
writing dependency_links to srs.egg-info/dependency_links.txt
writing top-level names to srs.egg-info/top_level.txt
reading manifest file 'srs.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'srs.egg-info/SOURCES.txt'
running build_ext
building 'srs.mt19937' extension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DRS_RANDOMKIT=1 -I./srs -I/opt/conda/lib/python3.6/site-packages/numpy/core/include -I./srs/src/random-kit -I/opt/conda/include/python3.6m -c ./srs/mt19937.c -o build/temp.linux-x86_64-3.6/./srs/mt19937.o -std=c99 -msse2
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DRS_RANDOMKIT=1 -I./srs -I/opt/conda/lib/python3.6/site-packages/numpy/core/include -I./srs

In file included from /opt/conda/lib/python3.6/site-packages/numpy/core/include/numpy/ndarraytypes.h:1788:0,
                 from /opt/conda/lib/python3.6/site-packages/numpy/core/include/numpy/ndarrayobject.h:18,
                 from /opt/conda/lib/python3.6/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from ./srs/mt19937.c:465:
  ^
 static PyObject *__pyx_f_3srs_7mt19937_float_fill_from_double(aug_state *__pyx_v_state, void *__pyx_v_func, PyObject *__pyx_v_size, PyObject *__pyx_v_lock) {
                  ^
./srs/distributions.c: In function ‘gauss_zig_julia’:
         if (rabs < ki[idx])
                  ^
./srs/distributions.c: In function ‘gauss_zig_double’:
         if (rabs < ki_double[idx])
                  ^
./srs/distributions.c: In function ‘gauss_zig_float’:
         if (rabs < ki_float[idx])
                  ^


In [26]:
%%bash
cd cython_mcmc
python ./setup.py develop
cython -a ./cython_mcmc/mcmc.pyx

See http://cython.readthedocs.io/en/latest/src/userguide/sharing_declarations.html for sharing declarations among Cython files.
running develop
running egg_info
writing cython_mcmc.egg-info/PKG-INFO
writing dependency_links to cython_mcmc.egg-info/dependency_links.txt
writing top-level names to cython_mcmc.egg-info/top_level.txt
reading manifest file 'cython_mcmc.egg-info/SOURCES.txt'
writing manifest file 'cython_mcmc.egg-info/SOURCES.txt'
running build_ext
copying build/lib.linux-x86_64-3.6/cython_mcmc/mcmc.cpython-36m-x86_64-linux-gnu.so -> cython_mcmc
Creating /opt/conda/lib/python3.6/site-packages/cython-mcmc.egg-link (link to .)
cython-mcmc 0.0.0 is already the active version in easy-install.pth

Installed /home/jovyan/cython_mcmc
Processing dependencies for cython-mcmc==0.0.0
Finished processing dependencies for cython-mcmc==0.0.0


In [27]:
from cython_mcmc import mcmc
import numpy as np
np.random.seed(123)
data = np.random.randn(20)

In [28]:
%timeit mcmc.log_sampler_cy_v3(data, samples=15000, mu_init=1.0)

6.34 ms ± 352 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


## `log_sampler()` using external library

In [29]:
%%html
./cython_mcmc/cython_mcmc/mcmc.html

### Worth noting

* `from [...] cimport RandomState`
* `from [...] import RandomState`
* `sample_norm()` and `accept_p()` implementations.
* `norm_logpdf()` implementation uses raw C buffers.
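For readers without the annotated HTML at hand, here is a pure-Python sketch of how helpers with these names could fit together in a log-space Metropolis sampler. It is not the code in `mcmc.pyx`. The model (unit-variance normal likelihood, normal prior on `mu`), the proposal width, the default arguments, and the name `log_sampler_py` are all assumptions made for illustration, and the standard-library `random` module stands in for `RandomState`.

```python
import math
import random


def norm_logpdf(x, mean, sd):
    """Log-density of Normal(mean, sd) at x, using only scalar arithmetic."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def sample_norm(rng, mean, sd):
    """Draw one Normal(mean, sd) variate from the given generator."""
    return rng.gauss(mean, sd)


def accept_p(rng, log_ratio):
    """Metropolis test: accept with probability min(1, exp(log_ratio))."""
    return log_ratio >= 0.0 or rng.random() < math.exp(log_ratio)


def log_posterior(data, mu, prior_mean, prior_sd):
    """Unnormalized log-posterior of mu (assumed model, likelihood sd = 1)."""
    log_like = sum(norm_logpdf(x, mu, 1.0) for x in data)
    return log_like + norm_logpdf(mu, prior_mean, prior_sd)


def log_sampler_py(data, samples=1000, mu_init=0.5, proposal_width=0.5,
                   prior_mean=0.0, prior_sd=1.0, seed=123):
    """Hypothetical pure-Python counterpart of a log-space sampler."""
    rng = random.Random(seed)
    mu_current = mu_init
    logp_current = log_posterior(data, mu_current, prior_mean, prior_sd)
    trace = [mu_current]
    for _ in range(samples):
        # Propose a new mu centred on the current value.
        mu_proposal = sample_norm(rng, mu_current, proposal_width)
        logp_proposal = log_posterior(data, mu_proposal, prior_mean, prior_sd)
        if accept_p(rng, logp_proposal - logp_current):
            mu_current, logp_current = mu_proposal, logp_proposal
        trace.append(mu_current)
    return trace
```

Every iteration makes at least one random draw and one pass over `data`. Those are the two operations the Cython version moves out of Python.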

## `RandomState` extension type

Cython-level compile-time interface defined in `mt19937.pxd`:

In [34]:
!cat ./mt19937/srs/mt19937.pxd

from binomial cimport binomial_t
cimport numpy as np
cimport cython
from libc cimport string
from libc.stdint cimport (uint8_t, uint16_t, uint32_t, uint64_t,
                          int8_t, int16_t, int32_t, int64_t, intptr_t)
from libc.stdlib cimport malloc, free
from libc.math cimport sqrt
from cpython cimport Py_INCREF, PyComplex_FromDoubles
from cython_overrides cimport PyFloat_AsDouble, PyInt_AsLong, PyComplex_RealAsDouble, PyComplex_ImagAsDouble
from distributions cimport aug_state

cdef class RandomState:

    cdef void *rng_loc
    cdef binomial_t binomial_info
    cdef aug_state rng_state
    cdef object lock
    cdef object __seed
    cdef object __stream
    cdef object __version

    cdef double c_standard_normal(self)
    cdef double c_random_sample(self)
    cdef inline _shu

In [31]:
%%html
./mt19937/srs/mt19937.html

## Putting it all together in `setup.py` with `cythonize()`

* We use a `setup.py` script in the `cython_mcmc` package to compile everything together.
* We specify all of the external Cython and C dependencies of `mcmc.pyx` in an `Extension()` object.
* We use the `Cython.Build.cythonize()` function to translate the Cython sources to C and hand the resulting extension to `setup()` for compilation.
* Running `python setup.py develop` kicks off the build and installs the package in development mode.

In [32]:
%cat ./cython_mcmc/setup.py

from setuptools.extension import Extension
from setuptools import setup
from Cython.Build import cythonize
import numpy

ext = Extension('cython_mcmc.mcmc',
                sources=['cython_mcmc/mcmc.pyx',
                         '../mt19937/srs/mt19937.pyx',
                         '../mt19937/srs/distributions.c',
                         '../mt19937/srs/aligned_malloc.c',
                         '../mt19937/srs/src/random-kit/random-kit.c',
                         '../mt19937/srs/interface/random-kit/random-kit-shim.c'],
                include_dirs=[numpy.get_include(), '../mt19937/srs'])

setup(name='cython_mcmc', ext_modules=cythonize([ext], include_path=['../mt19937']))
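
After a build like this, a quick sanity check is to ask Python where it would load the extension from. The helper below is a small sketch using only the standard library. `find_module_file` is a hypothetical name, and the path it prints depends on the local build.

```python
import importlib.util


def find_module_file(name):
    """Return the file a module would be loaded from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the name is malformed.
        return None
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # 'cython_mcmc.mcmc' is the extension name passed to Extension() above.
    print(find_module_file("cython_mcmc.mcmc"))
```

If the build succeeded, the printed path should point at a compiled shared object rather than a `.py` file. `None` means the package is not on the import path.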
