# MCMC Sampling

## Overview

Stan's MCMC sampler implements the Hamiltonian Monte Carlo (HMC) algorithm and its adaptive variant
the no-U-turn sampler (NUTS).
It creates a set of draws from the posterior distribution of the model conditioned on the data,
allowing for exact Bayesian inference of the model parameters.
Each draw consists of the values for all parameter, transformed parameter, and
generated quantities variables, reported on the constrained scale.

The [CmdStanModel sample](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanModel.sample) method
wraps the CmdStan [sample](https://mc-stan.org/docs/cmdstan-guide/mcmc-config.html) method.
Under the hood, CmdStan writes its output as a set of per-chain Stan CSV files.
In addition to the resulting sample, reported as one row per draw,
the Stan CSV files encode information about the inference engine configuration
and the sampler state.
The NUTS-HMC adaptive sampler algorithm also outputs the per-chain
HMC tuning parameters `step_size` and `metric`.

The `sample` method returns a [CmdStanMCMC](https://mc-stan.org/cmdstanpy/api.html#cmdstanmcmc) object,
which provides access to the disparate information from the Stan CSV files.
Accessor functions allow the user
to access the sample in whatever data format is needed for further analysis,
either as tabular data (i.e., in terms of the per-chain CSV file rows and columns),
or as structured objects which correspond to the variables in the Stan model
and the individual diagnostics produced by the inference method.


- The [stan_variable](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanMCMC.stan_variable) method
returns a Python [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html#numpy.ndarray)
containing all draws from the sample, where the structure of each draw corresponds to the structure of the
Stan variable. The [stan_variables](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanMCMC.stan_variables) method
returns a Python dict of such arrays, one per Stan variable.

- The [draws](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanMCMC.draws) method returns the sample as either a 2-D or 3-D numpy.ndarray.

- The [draws_pd](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanMCMC.draws_pd) method returns the entire sample or selected variables as a pandas.DataFrame.

- The [draws_xr](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanMCMC.draws_xr) method returns a structured Xarray dataset over the Stan model variables.

- The [method_variables](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanMCMC.method_variables) method returns a
Python dict over all sampler method variables.


In addition, the `CmdStanMCMC` object has accessor methods for

- The per-chain HMC tuning parameters `step_size` and `metric` 

- The CmdStan run configuration and console outputs

- The mapping between the Stan model variables and the corresponding CSV file columns.

### Notebook prerequisites


CmdStanPy displays progress bars during sampling via use of package [tqdm](https://github.com/tqdm/tqdm).
In order for these to display properly in a Jupyter notebook, you must have the 
[ipywidgets](https://ipywidgets.readthedocs.io/en/latest/index.html) package installed,
and depending on your version of Jupyter or JupyterLab, you must enable it via command:

In [1]:
!jupyter nbextension enable --py widgetsnbextension



For more information, see the
[installation instructions](https://ipywidgets.readthedocs.io/en/latest/user_install.html#)
and [this tqdm GitHub issue](https://github.com/tqdm/tqdm/issues/394#issuecomment-384743637).
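
Before sampling in a notebook, it can help to confirm that the packages mentioned above are importable. The following is a minimal sketch using only the Python standard library; the helper name `missing_packages` is hypothetical and is not part of CmdStanPy.

```python
import importlib.util


def missing_packages(names=("tqdm", "ipywidgets")):
    """Return the names in `names` that cannot be found on the import path."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("not installed:", ", ".join(missing))
else:
    print("progress bar prerequisites found")
```

An empty result only shows that the packages are installed; whether the widget extension is enabled still depends on your Jupyter version.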


    

## Fitting the model and data

In this example we use the CmdStan example model
[bernoulli.stan](https://github.com/stan-dev/cmdstanpy/blob/master/test/data/bernoulli.stan)
and data file
[bernoulli.data.json](https://github.com/stan-dev/cmdstanpy/blob/master/test/data/bernoulli.data.json).

We instantiate a `CmdStanModel` from the Stan program file:

In [2]:
import os
from cmdstanpy import CmdStanModel

# instantiate, compile bernoulli model
model = CmdStanModel(stan_file='bernoulli.stan')



15:37:47 - cmdstanpy - INFO - compiling stan file /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli.stan to exe file /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli


15:37:56 - cmdstanpy - INFO - compiled model executable: /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli


By default, the model is compiled during instantiation.  The compiled executable is created in the same directory as the program file.  If the directory already contains an executable file with a newer timestamp, the model is not recompiled.
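
The timestamp rule described above can be illustrated with a small standalone check. This is a sketch of the rule only, not CmdStanPy's internal code; the function name `needs_compile` is hypothetical, and treating equal timestamps as up to date is an assumption.

```python
import os


def needs_compile(stan_file, exe_file):
    """Return True when the executable is missing or older than the Stan file."""
    if not os.path.exists(exe_file):
        return True
    # Recompile only if the program file was modified after the executable was built.
    return os.path.getmtime(exe_file) < os.path.getmtime(stan_file)
```

For the example above, `needs_compile('bernoulli.stan', 'bernoulli')` would return `False` right after a successful compilation.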

We run the sampler on the data using all default settings:  4 chains, each of which runs 1000 warmup iterations followed by 1000 sampling iterations.

In [3]:
# run CmdStan's sample method, returns object `CmdStanMCMC`
fit = model.sample(data='bernoulli.data.json')

15:37:56 - cmdstanpy - INFO - CmdStan start processing


chain 1 |██████████| 00:00 Sampling completed
chain 2 |██████████| 00:00 Sampling completed
chain 3 |██████████| 00:00 Sampling completed
chain 4 |██████████| 00:00 Sampling completed

15:37:56 - cmdstanpy - INFO - CmdStan done processing.





The `CmdStanMCMC` object records the command, the return code, and the paths to the sampler output csv and console files.  The sample is lazily instantiated on first access of either the draws or the HMC tuning parameters, i.e., the step size and metric.

The string representation of this object displays the CmdStan commands and the location of the output files.
Output filenames are composed of the model name, a timestamp in the form YYYYMMDDhhmmss and the chain id, plus the corresponding filetype suffix, either '.csv' for the CmdStan output or '.txt' for the console messages, e.g. bernoulli-20220617170100_1.csv.
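
The naming scheme described above can be taken apart with a regular expression. This is a minimal sketch assuming filenames follow exactly the pattern in the example; the function name `parse_output_name` is hypothetical and not part of CmdStanPy.

```python
import re
from datetime import datetime

# <model name>-<YYYYMMDDhhmmss>_<chain id><suffix>, e.g. bernoulli-20220617170100_1.csv
_OUTPUT_NAME = re.compile(
    r"^(?P<model>.+)-(?P<stamp>\d{14})_(?P<chain>\d+)(?P<suffix>[-.].+)$"
)


def parse_output_name(filename):
    """Split an output filename into model name, timestamp, chain id, and suffix."""
    match = _OUTPUT_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized output filename: {filename!r}")
    return {
        "model": match["model"],
        "timestamp": datetime.strptime(match["stamp"], "%Y%m%d%H%M%S"),
        "chain_id": int(match["chain"]),
        "suffix": match["suffix"],
    }


info = parse_output_name("bernoulli-20220617170100_1.csv")
print(info["model"], info["chain_id"], info["timestamp"].isoformat())
```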

In [4]:
fit

CmdStanMCMC: model=bernoulli chains=4['method=sample', 'algorithm=hmc', 'adapt', 'engaged=1']
 csv_files:
	/tmp/tmpq5qkow4o/bernoulli516vtcwv/bernoulli-20220822153756_1.csv
	/tmp/tmpq5qkow4o/bernoulli516vtcwv/bernoulli-20220822153756_2.csv
	/tmp/tmpq5qkow4o/bernoulli516vtcwv/bernoulli-20220822153756_3.csv
	/tmp/tmpq5qkow4o/bernoulli516vtcwv/bernoulli-20220822153756_4.csv
 output_files:
	/tmp/tmpq5qkow4o/bernoulli516vtcwv/bernoulli-20220822153756_0-stdout.txt
	/tmp/tmpq5qkow4o/bernoulli516vtcwv/bernoulli-20220822153756_1-stdout.txt
	/tmp/tmpq5qkow4o/bernoulli516vtcwv/bernoulli-20220822153756_2-stdout.txt
	/tmp/tmpq5qkow4o/bernoulli516vtcwv/bernoulli-20220822153756_3-stdout.txt

In [5]:
print(f'draws as array:  {fit.draws().shape}')
print(f'draws as structured object:\n\t{fit.stan_variables().keys()}')
print(f'sampler diagnostics:\n\t{fit.method_variables().keys()}')

draws as array:  (1000, 4, 8)
draws as structured object:
	dict_keys(['theta'])
sampler diagnostics:
	dict_keys(['lp__', 'accept_stat__', 'stepsize__', 'treedepth__', 'n_leapfrog__', 'divergent__', 'energy__'])


### Sampler Progress

Your model may take a long time to fit.  The `sample` method provides two arguments for monitoring progress:

- `show_progress=True`: display a visual progress bar
- `show_console=True`: stream CmdStan output to the console

By default, CmdStanPy displays a progress bar during sampling, as seen above.
Since the progress bars are only visible while the sampler is running and the bernoulli example model takes no time at all to fit, we run this model for 200K iterations, in order to see the progress bars in action.

In [6]:
fit = model.sample(data='bernoulli.data.json', iter_warmup=100000, iter_sampling=100000, show_progress=True)


15:37:56 - cmdstanpy - INFO - CmdStan start processing


chain 1 |████▌     | 00:00 Iteration:  90500 / 200000 [ 45%]  (Warmup)
chain 1 |█████▋    | 00:00 Iteration: 113400 / 200000 [ 56%]  (Sampling)
chain 1 |██████████| 00:03 Sampling completed
chain 2 |██████████| 00:03 Sampling completed
chain 3 |██████████| 00:03 Sampling completed
chain 4 |██████████| 00:03 Sampling completed

15:37:59 - cmdstanpy - INFO - CmdStan done processing.





To see the CmdStan console outputs instead of progress bars, specify ``show_console=True``.
This streams all CmdStan messages to the terminal while the sampler is running,
which lets you debug a Stan program using the Stan language `print` statement.

In [7]:
fit = model.sample(data='bernoulli.data.json', chains=2, parallel_chains=1, show_console=True)



15:38:02 - cmdstanpy - INFO - Chain [1] start processing


15:38:02 - cmdstanpy - INFO - Chain [1] done processing


15:38:02 - cmdstanpy - INFO - Chain [2] start processing


15:38:02 - cmdstanpy - INFO - Chain [2] done processing


Chain [1] method = sample (Default)
Chain [1] sample
Chain [1] num_samples = 1000 (Default)
Chain [1] num_warmup = 1000 (Default)
Chain [1] save_warmup = 0 (Default)
Chain [1] thin = 1 (Default)
Chain [1] adapt
Chain [1] engaged = 1 (Default)
Chain [1] gamma = 0.050000000000000003 (Default)
Chain [1] delta = 0.80000000000000004 (Default)
Chain [1] kappa = 0.75 (Default)
Chain [1] t0 = 10 (Default)
Chain [1] init_buffer = 75 (Default)
Chain [1] term_buffer = 50 (Default)
Chain [1] window = 25 (Default)
Chain [1] algorithm = hmc (Default)
Chain [1] hmc
Chain [1] engine = nuts (Default)
Chain [1] nuts
Chain [1] max_depth = 10 (Default)
Chain [1] metric = diag_e (Default)
Chain [1] metric_file =  (Default)
Chain [1] stepsize = 1 (Default)
Chain [1] stepsize_jitter = 0 (Default)
Chain [1] num_chains = 1 (Default)
Chain [1] id = 1 (Default)
Chain [1] data
Chain [1] file = bernoulli.data.json
Chain [1] init = 2 (Default)
Chain [1] random
Chain [1] seed = 13152
Chain [1] output
Chain [1] file 

## Checking the fit

The first question to ask of the `CmdStanMCMC` object is:  _is this a valid sample from the posterior?_

It is important to check whether or not the sampler was able to fit the model given the data.  Often, this is not possible, for any number of reasons.
To appreciate the sampler diagnostics, we use a hierarchical model which, given a small amount of data, encounters difficulty: the centered parameterization of the
"8-schools" model (Rubin, 1981).
The "8-schools" model is a simple hierarchical model, first developed on a dataset taken from
an experiment conducted in 8 schools, with only treatment effects and their standard errors reported.

The Stan model and the original dataset are in files `eight_schools.stan` and `eight_schools.data.json`.

**eight_schools.stan**

In [8]:
with open('eight_schools.stan', 'r') as fd:
    print(fd.read())

data {
  int<lower=0> J; // number of schools
  array[J] real y; // estimated treatment effect (school j)
  array[J] real<lower=0> sigma; // std err of effect estimate (school j)
}
parameters {
  real mu;
  array[J] real theta;
  real<lower=0> tau;
}
model {
  theta ~ normal(mu, tau);
  y ~ normal(theta, sigma);
}




**eight_schools.data.json**

In [9]:
with open('eight_schools.data.json', 'r') as fd:
    print(fd.read())

{
    "J" : 8,
    "y" : [28,8,-3,7,-1,1,18,12],
    "sigma" : [15,10,16,11,9,11,10,18],
    "tau" : 25
}



Because there is not much data, the geometry of the posterior distribution is highly curved,
so the sampler may encounter difficulty in fitting the model.
By specifying the initial seed for the pseudo-random number generator,
we ensure that the sampler will have difficulty in fitting this model.
In particular, some post-warmup iterations diverge, resulting in a biased sample.
In addition, some post-warmup iterations hit the maximum allowed treedepth before
the trajectory hits the "U-turn" condition of the NUTS algorithm,
in which case the sampler may fail to properly explore the entire posterior.

These diagnostics are checked for automatically at the end of each run; if problems are detected, a WARNING message is logged.

In [10]:
eight_schools_model = CmdStanModel(stan_file='eight_schools.stan')
eight_schools_fit = eight_schools_model.sample(data='eight_schools.data.json', seed=55157)

15:38:02 - cmdstanpy - INFO - compiling stan file /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/eight_schools.stan to exe file /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/eight_schools


15:38:13 - cmdstanpy - INFO - compiled model executable: /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/eight_schools


15:38:13 - cmdstanpy - INFO - CmdStan start processing


chain 1 |██████████| 00:00 Sampling completed
chain 2 |██████████| 00:00 Sampling completed
chain 3 |██████████| 00:00 Sampling completed
chain 4 |██████████| 00:00 Sampling completed


15:38:13 - cmdstanpy - INFO - CmdStan done processing.


	Chain 1 had 29 divergent transitions (2.9%)
	Chain 2 had 208 divergent transitions (20.8%)
	Chain 3 had 17 divergent transitions (1.7%)
	Chain 4 had 31 divergent transitions (3.1%)
	Use function "diagnose()" to see further information.





More information on how to address convergence problems can be found at https://mc-stan.org/misc/warnings

The number of post-warmup divergences and iterations which hit the maximum treedepth can be inspected directly via properties `divergences` and `max_treedepths`.

In [11]:
print(f'divergences:\n{eight_schools_fit.divergences}\niterations at max_treedepth:\n{eight_schools_fit.max_treedepths}')

divergences:
[ 29 208  17  31]
iterations at max_treedepth:
[0 0 0 0]
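
The per-chain counts reported by `divergences` can be turned into the rates shown in the warning message with a few lines of arithmetic. The sketch below hard-codes the counts printed above and assumes 1000 post-warmup draws per chain (the default); the function name `divergence_report` is hypothetical.

```python
def divergence_report(divergences, draws_per_chain):
    """Return per-chain divergence rates, the total count, and the overall rate."""
    per_chain = [count / draws_per_chain for count in divergences]
    total = sum(divergences)
    overall = total / (draws_per_chain * len(divergences))
    return per_chain, total, overall


# Counts copied from the output above; in a live session they would come from
# eight_schools_fit.divergences.
per_chain, total, overall = divergence_report([29, 208, 17, 31], draws_per_chain=1000)
for chain_id, rate in enumerate(per_chain, start=1):
    print(f"chain {chain_id}: {rate:.1%}")
print(f"total divergent transitions: {total} (overall rate {overall:.1%})")
```

Chain 2 accounts for most of the divergences, which is a hint that this chain spent time in a region of the posterior the others rarely visited.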


### Summarizing the sample

The `summary` method reports the R-hat statistic, a measure of how well the sampler chains have converged.

In [12]:
eight_schools_fit.summary()

| | Mean | MCSE | StdDev | 5% | 50% | 95% | N_Eff | N_Eff/s | R_hat |
|---|---|---|---|---|---|---|---|---|---|
| lp__ | -17.8042 | 1.2976 | 5.69607 | -26.3065 | -18.5489 | -8.18723 | 19.2695 | 121.192 | 1.18588 |
| mu | 7.98088 | 0.196141 | 5.1703 | -0.847043 | 8.46135 | 16.4354 | 694.855 | 4370.16 | 1.0083 |
| theta[1] | 11.6953 | 0.357324 | 8.65978 | -0.384251 | 10.3276 | 28.1659 | 587.339 | 3693.95 | 1.00867 |
| theta[2] | 7.76656 | 0.201329 | 6.38375 | -2.51854 | 7.25419 | 18.4439 | 1005.4 | 6323.29 | 1.00249 |
| theta[3] | 5.96852 | 0.239867 | 8.27095 | -8.75627 | 7.21487 | 18.3907 | 1188.96 | 7477.76 | 1.01065 |
| theta[4] | 7.7166 | 0.201524 | 6.85139 | -3.64471 | 8.26866 | 18.8078 | 1155.86 | 7269.54 | 1.00593 |
| theta[5] | 4.97621 | 0.447597 | 6.65889 | -6.70188 | 5.63811 | 14.8899 | 221.325 | 1391.98 | 1.03708 |
| theta[6] | 5.8804 | 0.212933 | 6.89437 | -6.31614 | 6.83414 | 16.4137 | 1048.34 | 6593.35 | 1.01177 |
| theta[7] | 10.9525 | 0.243737 | 6.90153 | 0.384918 | 9.93285 | 23.5881 | 801.764 | 5042.54 | 1.00408 |
| theta[8] | 8.47301 | 0.218012 | 8.06921 | -4.22251 | 8.14475 | 22.1972 | 1369.94 | 8615.98 | 1.00222 |
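
The columns of the summary output above are related: the Monte Carlo standard error is the posterior standard deviation divided by the square root of the effective sample size. The sketch below checks this against two rows copied from the output; the helper name `mcse` is hypothetical, and the small differences are due to rounding in the printed values.

```python
import math


def mcse(std_dev, n_eff):
    """Monte Carlo standard error of the mean: StdDev / sqrt(N_Eff)."""
    return std_dev / math.sqrt(n_eff)


# (StdDev, N_Eff, reported MCSE) copied from the summary output above.
rows = {
    "lp__": (5.69607, 19.2695, 1.2976),
    "mu": (5.1703, 694.855, 0.196141),
}
for name, (std_dev, n_eff, reported) in rows.items():
    print(f"{name}: computed {mcse(std_dev, n_eff):.4f}, reported {reported}")
```

The very low `N_Eff` for `lp__` (about 19 out of 4000 draws) is what makes its MCSE so large, and it goes along with the elevated `R_hat` value in the same row.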


### Sampler Diagnostics

The `diagnose()` method provides more information about the sample.

In [13]:
print(eight_schools_fit.diagnose())

Processing csv files: /tmp/tmpq5qkow4o/eight_schoolsj96cnpp1/eight_schools-20220822153813_1.csv, /tmp/tmpq5qkow4o/eight_schoolsj96cnpp1/eight_schools-20220822153813_2.csv, /tmp/tmpq5qkow4o/eight_schoolsj96cnpp1/eight_schools-20220822153813_3.csv, /tmp/tmpq5qkow4o/eight_schoolsj96cnpp1/eight_schools-20220822153813_4.csv

Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.

Checking sampler transitions for divergences.
285 of 4000 (7.12%) transitions ended with a divergence.
These divergent transitions indicate that HMC is not fully able to explore the posterior distribution.
Try increasing adapt delta closer to 1.
If this doesn't remove all divergences, try to reparameterize the model.

Checking E-BFMI - sampler transitions HMC potential energy.
The E-BFMI, 0.28, is below the nominal threshold of 0.30 which suggests that HMC may have trouble exploring the target distribution.
If possible, try to reparameterize the model.

Effective sample size satisfactor
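
The E-BFMI check in the report above is computed from the per-iteration `energy__` values of a chain. A minimal sketch of the commonly used definition follows: the sum of squared differences between successive energies divided by the sum of squared deviations from the mean energy. The function name `e_bfmi` is hypothetical, and CmdStan's implementation may differ in details.

```python
def e_bfmi(energies):
    """Estimate the energy Bayesian fraction of missing information for one chain."""
    if len(energies) < 2:
        raise ValueError("need at least two energy values")
    mean_energy = sum(energies) / len(energies)
    # Numerator: how much the energy changes from one iteration to the next.
    numerator = sum((b - a) ** 2 for a, b in zip(energies, energies[1:]))
    # Denominator: how much the energy varies overall.
    denominator = sum((e - mean_energy) ** 2 for e in energies)
    if denominator == 0:
        raise ValueError("energy values are constant")
    return numerator / denominator


# A slowly drifting energy trace gives a low value, below the 0.30 threshold.
print(e_bfmi([float(i) for i in range(10)]))
```

In a live session, the energies for chain 1 could be taken from `eight_schools_fit.method_variables()['energy__'][:, 0]`.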

## Accessing the sampler outputs

In [14]:
fit = model.sample(data='bernoulli.data.json')

15:38:13 - cmdstanpy - INFO - CmdStan start processing


chain 1 |██████████| 00:00 Sampling completed
chain 2 |██████████| 00:00 Sampling completed
chain 3 |██████████| 00:00 Sampling completed
chain 4 |██████████| 00:00 Sampling completed


15:38:13 - cmdstanpy - INFO - CmdStan done processing.





### Extracting the draws as structured Stan program variables

Per-variable draws can be accessed as either a numpy.ndarray object
via method `stan_variable` or as an xarray.Dataset object via `draws_xr`.

In [15]:
print(fit.stan_variable('theta'))

[0.249879 0.195015 0.513211 ... 0.229213 0.222328 0.279447]


The `stan_variables` method returns a Python `dict` over all Stan variables in the output.

In [16]:
for k, v in fit.stan_variables().items():
    print(f'name: {k}, shape: {v.shape}')

name: theta, shape: (4000,)
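
The shape `(4000,)` is the 4 chains of 1000 draws each laid end to end. The sketch below illustrates this with plain Python lists, assuming the flattened order is chain by chain (all draws of chain 1, then chain 2, and so on); that assumption is consistent with the output in this notebook, where the first values of `theta` match the first draws of chain 1. The function names are hypothetical.

```python
def flatten_chain_major(per_chain_draws):
    """Concatenate draws chain by chain: all of chain 1, then chain 2, ..."""
    flat = []
    for chain in per_chain_draws:
        flat.extend(chain)
    return flat


def flat_index(chain_idx, draw_idx, draws_per_chain):
    """Position of a (chain, draw) pair in the flattened sequence (0-based)."""
    return chain_idx * draws_per_chain + draw_idx


# Two tiny chains with three draws each stand in for the real sample.
chains = [[0.25, 0.20, 0.51], [0.31, 0.28, 0.19]]
flat = flatten_chain_major(chains)
print(len(flat), flat[flat_index(1, 0, draws_per_chain=3)])
```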


In [17]:
print(fit.draws_xr('theta'))

<xarray.Dataset>
Dimensions:  (draw: 1000, chain: 4)
Coordinates:
  * chain    (chain) int64 1 2 3 4
  * draw     (draw) int64 0 1 2 3 4 5 6 7 8 ... 992 993 994 995 996 997 998 999
Data variables:
    theta    (chain, draw) float64 0.2499 0.195 0.5132 ... 0.2292 0.2223 0.2794
Attributes:
    stan_version:        2.30.0
    model:               bernoulli_model
    num_draws_sampling:  1000


### Extracting the draws in tabular format

The sample can be accessed either as a `numpy` array or a pandas `DataFrame`:

In [18]:
print(f'sample as ndarray: {fit.draws().shape}\nfirst 2 draws, chain 1:\n{fit.draws()[:2, 0, :]}')

sample as ndarray: (1000, 4, 8)
first 2 draws, chain 1:
[[-6.74802   0.671774  1.02906   2.        3.        0.        7.98602
   0.249879]
 [-6.85642   0.970885  1.02906   1.        3.        0.        6.86586
   0.195015]]


In [19]:
fit.draws_pd().head()

| | lp__ | accept_stat__ | stepsize__ | treedepth__ | n_leapfrog__ | divergent__ | energy__ | theta |
|---|---|---|---|---|---|---|---|---|
| 0 | -6.74802 | 0.671774 | 1.02906 | 2.0 | 3.0 | 0.0 | 7.98602 | 0.249879 |
| 1 | -6.85642 | 0.970885 | 1.02906 | 1.0 | 3.0 | 0.0 | 6.86586 | 0.195015 |
| 2 | -8.48052 | 0.659398 | 1.02906 | 1.0 | 3.0 | 0.0 | 8.65449 | 0.513211 |
| 3 | -6.76115 | 1.0 | 1.02906 | 1.0 | 3.0 | 0.0 | 7.86867 | 0.230118 |
| 4 | -10.6242 | 0.414807 | 1.02906 | 1.0 | 3.0 | 0.0 | 12.1148 | 0.031935 |


### Extracting sampler method diagnostics

In [20]:
for k, v in fit.method_variables().items():
    print(f'name: {k}, shape: {v.shape}')

name: lp__, shape: (1000, 4)
name: accept_stat__, shape: (1000, 4)
name: stepsize__, shape: (1000, 4)
name: treedepth__, shape: (1000, 4)
name: n_leapfrog__, shape: (1000, 4)
name: divergent__, shape: (1000, 4)
name: energy__, shape: (1000, 4)


### Extracting the per-chain HMC tuning parameters

In [21]:
print(f'adapted step_size per chain\n{fit.step_size}\nmetric_type: {fit.metric_type}\nmetric:\n{fit.metric}')

adapted step_size per chain
[1.02906  1.00709  1.06506  0.901484]
metric_type: diag_e
metric:
[[0.598537]
 [0.545455]
 [0.435049]
 [0.557904]]


### Extracting the sample meta-data

In [22]:
print('sample method variables:\n{}\n'.format(fit.metadata.method_vars_cols.keys()))
print('stan model variables:\n{}'.format(fit.metadata.stan_vars_cols.keys()))

sample method variables:
dict_keys(['lp__', 'accept_stat__', 'stepsize__', 'treedepth__', 'n_leapfrog__', 'divergent__', 'energy__'])

stan model variables:
dict_keys(['theta'])


## Saving the sampler output files

The sampler output files are written to a temporary directory which
is deleted upon session exit unless the ``output_dir`` argument is specified.
The ``save_csvfiles`` function moves the CmdStan CSV output files
to a specified directory without having to re-run the sampler.
The console output files are not saved. These files are treated as ephemeral; if the sample is valid, all relevant information is recorded in the CSV files.
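
A minimal sketch of saving the CSV files is shown below. It assumes `fit` is a `CmdStanMCMC` object and calls its `save_csvfiles` method with the `dir` argument; the wrapper name `save_fit_csvs` and the directory name in the usage note are hypothetical.

```python
import os


def save_fit_csvs(fit, target_dir):
    """Move the Stan CSV files of `fit` into `target_dir` and list them."""
    # Create the directory up front so the call does not depend on it existing.
    os.makedirs(target_dir, exist_ok=True)
    # `fit` is expected to be a CmdStanMCMC object.
    fit.save_csvfiles(dir=target_dir)
    return sorted(name for name in os.listdir(target_dir) if name.endswith(".csv"))
```

For example, `save_fit_csvs(fit, 'bernoulli_output')` would return the names of the per-chain CSV files now stored in that directory.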

## Parallelization via multi-threaded processing

Stan's multi-threaded processing is based on the Intel Threading Building Blocks (TBB) library, which must be linked to by the C++ compiler. To take advantage of this option, you must compile (or recompile) the program with the C++ compiler option `STAN_THREADS`.
The CmdStanModel object constructor and its `compile` method both have argument `cpp_options`
which takes as its value a dictionary of compiler flags.

We compile the example model `bernoulli.stan`, this time with arguments `cpp_options` and `compile`, and use the function `exe_info()` to check that the model has been compiled for multi-threading.

In [23]:
model = CmdStanModel(stan_file='bernoulli.stan',
                     cpp_options={'STAN_THREADS': 'TRUE'},
                     compile='force')
model.exe_info()

15:38:14 - cmdstanpy - INFO - compiling stan file /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli.stan to exe file /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli


15:39:52 - cmdstanpy - INFO - compiled model executable: /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/bernoulli


{'stan_version_major': '2',
 'stan_version_minor': '30',
 'stan_version_patch': '0',
 'STAN_THREADS': 'true',
 'STAN_MPI': 'false',
 'STAN_OPENCL': 'false',
 'STAN_NO_RANGE_CHECKS': 'false',
 'STAN_CPP_OPTIMS': 'false'}

### Cross-chain multi-threading

As of CmdStan version 2.28, it is possible to run the NUTS-HMC sampler on
multiple chains from within a single executable using threads.
This has the potential to speed up sampling.  It also
reduces the overall memory footprint required for sampling, as
all chains share the same copy of the input data.
When using within-chain parallelization, all chains started
within a single executable can share all the available threads,
and once a chain finishes, its threads are reused.

The `sample` method argument ``parallel_chains`` takes an integer value which
specifies how many chains to run in parallel.
For models which have been compiled with option `STAN_THREADS` set, all chains are run from
within a single process and the value of the ``parallel_chains`` argument specifies the total number of threads.
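
One simple way to pick a value for ``parallel_chains`` is to cap the number of chains at the number of available cores. The helper below is a sketch of that heuristic and is not part of CmdStanPy; the name `choose_parallel_chains` is hypothetical.

```python
import os


def choose_parallel_chains(chains, available_cores=None):
    """Run as many chains in parallel as there are cores, but at least one."""
    if available_cores is None:
        # os.cpu_count() can return None when the count cannot be determined.
        available_cores = os.cpu_count() or 1
    return max(1, min(chains, available_cores))


print(choose_parallel_chains(4))
```

The result can then be passed on, for example as `model.sample(data='bernoulli.data.json', chains=4, parallel_chains=choose_parallel_chains(4))`.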

In [24]:
fit = model.sample(data='bernoulli.data.json', parallel_chains=4)

15:39:52 - cmdstanpy - INFO - CmdStan start processing


chain 1 |██████████| 00:00 Sampling completed
chain 2 |██████████| 00:00 Sampling completed
chain 3 |██████████| 00:00 Sampling completed
chain 4 |██████████| 00:00 Sampling completed


15:39:53 - cmdstanpy - INFO - CmdStan done processing.





### Within-chain multi-threading

The Stan language
[reduce_sum](https://mc-stan.org/docs/stan-users-guide/reduce-sum.html)
function provides within-chain parallelization.
For models which require computing the sum of a number of independent function evaluations,
e.g., when evaluating a number of conditionally independent terms in a log-likelihood,
the `reduce_sum` function is used to parallelize this computation.

To see how this works, we run the "redcard" model, used in the
[reduce_sum minimal example](https://mc-stan.org/users/documentation/case-studies/reduce_sum_tutorial.html) case study.
The Stan model and the original dataset are in files "redcard_reduce_sum.stan" and "redcard.json".

In [25]:
with open('redcard_reduce_sum.stan', 'r') as fd:
    print(fd.read())

functions {
  real partial_sum(array[] int slice_n_redcards, int start, int end,
                   array[] int n_games, vector rating, vector beta) {
    return binomial_logit_lpmf(slice_n_redcards | n_games[start : end], beta[1]
                                                                    + beta[2]
                                                                    * rating[start : end]);
  }
}
data {
  int<lower=0> N;
  array[N] int<lower=0> n_redcards;
  array[N] int<lower=0> n_games;
  vector[N] rating;
  int<lower=1> grainsize;
}
parameters {
  vector[2] beta;
}
model {
  beta[1] ~ normal(0, 10);
  beta[2] ~ normal(0, 1);
  
  target += reduce_sum(partial_sum, n_redcards, grainsize, n_games, rating,
                       beta);
}




As before, we compile the model with the `cpp_options` argument, setting `STAN_THREADS` to enable threading.

In [26]:
redcard_model = CmdStanModel(stan_file='redcard_reduce_sum.stan',
                             cpp_options={'STAN_THREADS': 'TRUE'},
                             compile='force')
redcard_model.exe_info()

15:39:53 - cmdstanpy - INFO - compiling stan file /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/redcard_reduce_sum.stan to exe file /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/redcard_reduce_sum


15:40:17 - cmdstanpy - INFO - compiled model executable: /home/runner/work/cmdstanpy/cmdstanpy/docsrc/users-guide/examples/redcard_reduce_sum


{'stan_version_major': '2',
 'stan_version_minor': '30',
 'stan_version_patch': '0',
 'STAN_THREADS': 'true',
 'STAN_MPI': 'false',
 'STAN_OPENCL': 'false',
 'STAN_NO_RANGE_CHECKS': 'false',
 'STAN_CPP_OPTIMS': 'false'}
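
Before sampling with `threads_per_chain`, it can be useful to confirm that the executable was built with threading enabled. The sketch below checks the `STAN_THREADS` entry of a dictionary shaped like the `exe_info()` output above. The helper name `threads_enabled` is hypothetical, and the code assumes the value is reported as the string `'true'` or `'false'`, as in that output.

```python
def threads_enabled(exe_info):
    """Return True if the exe_info dict reports STAN_THREADS as enabled."""
    # Assumption: the flag is a string such as 'true' or 'false'.
    return str(exe_info.get('STAN_THREADS', 'false')).lower() == 'true'


# Dictionary copied from the exe_info() output shown above.
info = {
    'stan_version_major': '2',
    'stan_version_minor': '30',
    'stan_version_patch': '0',
    'STAN_THREADS': 'true',
    'STAN_MPI': 'false',
    'STAN_OPENCL': 'false',
    'STAN_NO_RANGE_CHECKS': 'false',
    'STAN_CPP_OPTIMS': 'false',
}
print(threads_enabled(info))
```

With a compiled model, the same check would be `threads_enabled(redcard_model.exe_info())`.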

The `sample` method argument `threads_per_chain` specifies the number of threads allotted to each chain; this corresponds to CmdStan's `num_threads` argument.

In [27]:
redcard_fit = redcard_model.sample(data='redcard.json', threads_per_chain=4)

15:40:17 - cmdstanpy - INFO - CmdStan start processing


chain 1 |██████████| 03:10 Sampling completed
chain 2 |██████████| 03:10 Sampling completed
chain 3 |██████████| 03:10 Sampling completed
chain 4 |██████████| 03:10 Sampling completed

15:43:28 - cmdstanpy - INFO - CmdStan done processing.





The number of threads to use is passed to the model exe file by means of the shell environment variable `STAN_NUM_THREADS`.

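On my machine, which has 4 cores, all 4 chains are run in parallel from within a single process.
In that case, the total number of threads used by this process is `threads_per_chain` * `chains`;
more generally, it is `threads_per_chain` times the number of chains running at the same time.
To check this, we examine the shell environment variable `STAN_NUM_THREADS`.
The value recorded below, `'8'`, is consistent with 4 threads per chain and two chains running at a time, rather than all four.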

In [28]:
os.environ['STAN_NUM_THREADS']

'8'
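
The arithmetic can be sketched as follows, assuming the total is `threads_per_chain` multiplied by the number of chains running at once. The helper names `total_threads` and `oversubscribed` are hypothetical and are not part of CmdStanPy.

```python
import os


def total_threads(threads_per_chain, chains_at_once):
    """Threads requested when `chains_at_once` chains run concurrently."""
    return threads_per_chain * chains_at_once


def oversubscribed(threads_per_chain, chains_at_once, n_cores=None):
    """Return True if the requested threads exceed the available cores."""
    if n_cores is None:
        # os.cpu_count() may return None; fall back to 1 in that case.
        n_cores = os.cpu_count() or 1
    return total_threads(threads_per_chain, chains_at_once) > n_cores


print(total_threads(4, 2))              # two chains at a time, 4 threads each
print(total_threads(4, 4))              # all four chains at once
print(oversubscribed(4, 4, n_cores=4))  # more threads than cores
```

Requesting more threads than cores is allowed, but the extra threads then compete for the same cores, so it can help to choose `threads_per_chain` with the core count in mind.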