# MCMC Sampling

The `Model` class method `sample` invokes Stan's adaptive HMC-NUTS
sampler, which uses the Hamiltonian Monte Carlo (HMC) algorithm
and its adaptive variant, the no-U-turn sampler (NUTS), to produce a set of
draws from the posterior distribution of the model parameters conditioned on the data.
It returns a `StanMCMC` object,
which provides properties to retrieve information about the sample as well as methods
to run CmdStan's summary and diagnostics tools.
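
To give a feel for what an HMC transition does, here is a minimal sketch of plain HMC targeting a standard normal distribution. It is not Stan's implementation: the step size and path length are fixed by hand, whereas NUTS chooses the path length adaptively, and the function name `hmc_standard_normal` and its parameters are made up for this illustration.

```python
import math
import random


def grad_neg_log_density(q):
    """Gradient of U(q) = q**2 / 2, the negative log density of N(0, 1)."""
    return q


def hmc_standard_normal(n_draws, step_size=0.2, n_leapfrog=10, seed=1234):
    """Draw from N(0, 1) using basic HMC with a fixed step size and path length."""
    rng = random.Random(seed)
    q = 0.0
    draws = []
    for _ in range(n_draws):
        # resample the auxiliary momentum at the start of every transition
        p = rng.gauss(0.0, 1.0)
        q_new, p_new = q, p

        # leapfrog integration: half momentum step, alternating full steps,
        # then a final half momentum step
        p_new -= 0.5 * step_size * grad_neg_log_density(q_new)
        for step in range(n_leapfrog):
            q_new += step_size * p_new
            if step < n_leapfrog - 1:
                p_new -= step_size * grad_neg_log_density(q_new)
        p_new -= 0.5 * step_size * grad_neg_log_density(q_new)

        # Metropolis accept/reject step based on the change in total energy
        h_old = 0.5 * q * q + 0.5 * p * p
        h_new = 0.5 * q_new * q_new + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        draws.append(q)
    return draws


draws = hmc_standard_normal(4000)
mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"mean={mean:.3f} variance={variance:.3f}")
```

The printed mean and variance should be close to 0 and 1, the moments of the target distribution.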

To evaluate the fit of the model to the data, it is necessary to run
several Monte Carlo chains and compare the sets of draws returned by each.
By default, the `sample` command runs 4 sampler chains, i.e.,
CmdStanPy invokes CmdStan 4 times.
CmdStanPy uses Python's `subprocess` and `multiprocessing` libraries
to run these chains in separate processes.
This processing can be done in parallel, up to the number of
processor cores available.
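
The scheduling idea described above (one process per chain, with at most as many running at once as there are cores) can be sketched with the standard library alone. This is an illustration only, not CmdStanPy's actual code: the helper `run_chains` is hypothetical, and a trivial Python one-liner stands in for the compiled CmdStan executable.

```python
import os
import subprocess
import sys


def run_chains(chains=4, cores=None):
    """Launch one child process per chain, at most `cores` at a time.

    Returns a dict mapping chain id to the process return code.
    """
    cores = max(1, min(cores or os.cpu_count() or 1, chains))
    pending = list(range(1, chains + 1))
    running = {}
    return_codes = {}
    while pending or running:
        # start new chains while a core slot is free
        while pending and len(running) < cores:
            chain_id = pending.pop(0)
            # stand-in command; a real run would invoke the model executable
            cmd = [sys.executable, "-c", f"print('chain {chain_id} done')"]
            running[chain_id] = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
        # wait for the oldest running chain, then free its slot
        chain_id, proc = next(iter(running.items()))
        return_codes[chain_id] = proc.wait()
        del running[chain_id]
    return return_codes


codes = run_chains(chains=4, cores=2)
print(codes)
```

A return code of 0 for every chain id indicates that all child processes finished normally.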

## Fitting a model to data

In this example we use the CmdStan example model
[bernoulli.stan](https://github.com/stan-dev/cmdstanpy/blob/master/test/data/bernoulli.stan)
and data file
[bernoulli.data.json](https://github.com/stan-dev/cmdstanpy/blob/master/test/data/bernoulli.data.json).

We instantiate the model, compile it, and run the sampler using the default CmdStan settings:

In [None]:
import os
from cmdstanpy.model import Model
from cmdstanpy.utils import cmdstan_path
    
bernoulli_dir = os.path.join(cmdstan_path(), 'examples', 'bernoulli')
bernoulli_path = os.path.join(bernoulli_dir, 'bernoulli.stan')
bernoulli_data = os.path.join(bernoulli_dir, 'bernoulli.data.json')

# instantiate bernoulli model, compile Stan program
bernoulli_model = Model(stan_file=bernoulli_path)
bernoulli_model.compile()

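# run the sampler with the default settings (4 chains)
bern_fit = bernoulli_model.sample(data=bernoulli_data)
# dimensions of the array of draws
bern_fit.sample.shape
# run CmdStan's summary tool on the sample
bern_fit.summary()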
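
CmdStan's summary reports, among other statistics, an `R_hat` value per parameter, which compares the draws across chains. The sketch below computes the classic between-chain versus within-chain variance ratio on made-up chains. It is meant to show the idea only; CmdStan's own computation may differ in detail, and the function name `r_hat` is hypothetical.

```python
import math
import random
import statistics


def r_hat(chains):
    """Classic potential scale reduction for a list of equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.mean(c) for c in chains]
    # W: average of the within-chain sample variances
    within = statistics.mean(statistics.variance(c) for c in chains)
    # B: between-chain variance, scaled by the chain length
    between = n * statistics.variance(chain_means)
    # pooled estimate of the posterior variance
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


rng = random.Random(42)
# four chains drawn from the same distribution: R_hat should be near 1
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# four chains centered at different values: R_hat should be well above 1
stuck = [[rng.gauss(3.0 * k, 1.0) for _ in range(1000)] for k in range(4)]
print(f"mixed: {r_hat(mixed):.3f}  stuck: {r_hat(stuck):.3f}")
```

Values close to 1 suggest the chains agree with each other; values well above 1 suggest they have not converged to the same distribution.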

## Running a data-generating model using `fixed_param=True`

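In this example we use the CmdStan example model
[bernoulli_datagen.stan](https://github.com/stan-dev/cmdstanpy/blob/master/test/data/bernoulli_datagen.stan)
to generate a simulated dataset given fixed data values.
With `fixed_param=True`, the sampler does not update parameter values;
each iteration only evaluates the program's generated quantities.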

In [None]:
datagen_stan = os.path.join('..', '..', 'test', 'data', 'bernoulli_datagen.stan')
datagen_model = Model(stan_file=datagen_stan)
datagen_model.compile()

sim_data = datagen_model.sample(fixed_param=True)

sim_data.summary()

Compute and plot a histogram of the total number of successes in `N` Bernoulli trials with chance of success `theta`:

In [None]:
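drawset_pd = sim_data.get_drawset()
drawset_pd.columns
# keep only the simulated outcomes by dropping the sampler diagnostic columns
y_sims = drawset_pd.drop(columns=['lp__', 'accept_stat__'])
# total number of successes in each draw
y_sums = y_sims.sum(axis=1)
# N is taken as the number of outcome columns (assumes one column per trial)
N = y_sims.shape[1]
y_sums.astype('int32').plot.hist(bins=range(0, N + 1))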
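
For comparison, the same kind of simulation can be sketched in plain Python without Stan. The values `n_trials=10` and `theta=0.35` are assumptions chosen for illustration, not values read from the Stan program, and the helper `simulate_success_totals` is hypothetical.

```python
import random
from collections import Counter


def simulate_success_totals(n_draws, n_trials, theta, seed=0):
    """Simulate `n_draws` datasets of Bernoulli trials and total each one."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        # one simulated dataset: n_trials outcomes, each 1 with probability theta
        y_sim = [1 if rng.random() < theta else 0 for _ in range(n_trials)]
        totals.append(sum(y_sim))
    return totals


totals = simulate_success_totals(n_draws=1000, n_trials=10, theta=0.35)
counts = Counter(totals)
# crude text histogram: one '#' per 10 draws with that total
for k in range(0, 11):
    print(f"{k:2d} {'#' * (counts[k] // 10)}")
```

The totals follow a binomial distribution, so the histogram should peak near `n_trials * theta`.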