In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import pymc3 as pm
import arviz as az

## Numerical Diagnostics

We will discuss:

* Effective Sample Size (ESS)
* $\hat R$
* Monte Carlo standard error (MCSE)

In [None]:
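# Two chains cut from one increasing ramp: strongly autocorrelated, covering disjoint halves of [0, 1]
bad_chains = np.linspace(0, 1, 1000).reshape(2, -1)
# Two chains of 500 independent Uniform(0, 1) draws each
good_chains = stats.uniform.rvs(0, 1, size=(2, 500))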

## Effective Sample Size (ess)


    az.ess(trace)

* Is the sample large enough?

* A sample with autocorrelation has less information than a sample of the same size without autocorrelation.

* We can estimate the **effective sample size**, i.e. the size of a sample without autocorrelation that would carry the same amount of information.

* We recommend requiring that the rank-normalized ESS is greater than 400.
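
To make the link between autocorrelation and ESS concrete, here is a minimal single-chain sketch. It is not what `az.ess` computes (ArviZ works on multiple chains and uses rank normalization); the function name `naive_ess` and the rule of truncating the sum at the first non-positive autocorrelation are simplifying assumptions made for illustration.

```python
import numpy as np

def naive_ess(chain):
    """Simplified single-chain ESS: n / (1 + 2 * sum of leading positive autocorrelations)."""
    x = np.asarray(chain, dtype=float)
    n = x.size
    x = x - x.mean()
    # Autocovariance at lags 0..n-1, then normalize by the lag-0 value
    acov = np.correlate(x, x, mode="full")[n - 1:] / n
    rho = acov / acov[0]
    # Assumption: stop summing at the first non-positive autocorrelation
    nonpos = np.where(rho[1:] <= 0)[0]
    cutoff = nonpos[0] if nonpos.size else n - 1
    tau = 1.0 + 2.0 * rho[1:1 + cutoff].sum()
    return n / tau

rng = np.random.default_rng(0)

# Independent draws: ESS should be close to the nominal sample size
iid = rng.uniform(0, 1, size=1000)

# AR(1) chain with coefficient 0.9: strongly autocorrelated
noise = rng.normal(size=1000)
ar = np.empty(1000)
ar[0] = noise[0]
for t in range(1, 1000):
    ar[t] = 0.9 * ar[t - 1] + noise[t]

print(naive_ess(iid), naive_ess(ar))
```

Both chains contain 1000 draws, but the autocorrelated one should report a much smaller effective sample size.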

In [8]:
az.ess(bad_chains), az.ess(good_chains)

(2.284600376742084, 1032.4518803020005)

In [9]:
az.summary(good_chains)

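|   | mean | sd | hpd_3% | hpd_97% | mcse_mean | mcse_sd | ess_mean | ess_sd | ess_bulk | ess_tail | r_hat |
|---|------|----|--------|---------|-----------|---------|----------|--------|----------|----------|-------|
| x | 0.503 | 0.292 | 0.056 | 0.99 | 0.009 | 0.006 | 1032.0 | 1032.0 | 1032.0 | 986.0 | 1.0 |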


## $\hat R$ (aka R hat)

* Did the chains mix well?

* Compares the _between-chain_ variance with the _within-chain_ variance.

* If all the chains have converged, the variance within each chain should be similar to the variance of the pooled sample of all chains.

* Ideally it should be 1; values $\lessapprox 1.01$ are considered safe.

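* It can be interpreted as the overestimation of the spread of the estimate due to finite MCMC sampling, which is why it is also called the potential scale reduction factor. If you continued sampling indefinitely, the scale (standard deviation) of your estimate would be expected to shrink by roughly a factor of $\hat R$.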
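
The comparison of between-chain and within-chain variance can be sketched with the classic (non-split, non-rank-normalized) Gelman-Rubin formula. This is a simplified illustration and not the estimator ArviZ uses by default; the function name `classic_rhat` is hypothetical, and the chains are rebuilt with NumPy only so the block runs on its own.

```python
import numpy as np

def classic_rhat(chains):
    """Classic Gelman-Rubin R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    # W: average of the within-chain variances
    W = chains.var(axis=1, ddof=1).mean()
    # B: n times the variance of the chain means (between-chain variance)
    B = n * chains.mean(axis=1).var(ddof=1)
    # Pooled variance estimate, a weighted mix of W and B
    var_hat = (n - 1) / n * W + B / n
    return np.sqrt(var_hat / W)

# Same construction as bad_chains: two halves of one increasing ramp
bad = np.linspace(0, 1, 1000).reshape(2, -1)
# Stand-in for good_chains: independent Uniform(0, 1) draws
good = np.random.default_rng(0).uniform(0, 1, size=(2, 500))

rhat_bad = classic_rhat(bad)
rhat_good = classic_rhat(good)
print(rhat_bad, rhat_good)
```

The two ramp halves have very different means, so the between-chain term dominates and the value lands far above 1.01, while the independent draws give a value close to 1.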


## Monte Carlo standard error (MCSE)


* One of the quantities returned by `summary` is the Monte Carlo standard error (the `mcse_mean` and `mcse_sd` columns).

* This is an estimate of the error introduced by the sampling method.

* The estimate takes into account that the samples are not truly independent of each other.

* This error should be below the precision we want in our results.
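
As a rough check of how autocorrelation enters this error, the MCSE of the mean can be approximated as the posterior standard deviation divided by the square root of the effective sample size. The sketch below plugs in the `sd` and `ess_mean` values from the `az.summary(good_chains)` output above; the helper name `mcse_of_mean` is hypothetical, and the relation is stated as an approximation rather than as ArviZ's exact internals.

```python
import math

def mcse_of_mean(sd, ess):
    """Approximate Monte Carlo standard error of the mean: sd / sqrt(ESS)."""
    return sd / math.sqrt(ess)

# Values taken from the summary table for good_chains
sd, ess_mean = 0.292, 1032.0
print(round(mcse_of_mean(sd, ess_mean), 3))  # 0.009
```

The result agrees with the `mcse_mean` column of the summary. Since the error scales with $1/\sqrt{\text{ESS}}$, halving it takes roughly four times as many effective samples.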