# InferenceData Cookbook

Here we present a collection of common manipulations when working with InferenceData.

In [2]:
import arviz as az

In [3]:
idata = az.load_arviz_data("centered_eight")
idata

In [4]:
idata.posterior

## Combine chains and draws

In [5]:
stacked = idata.posterior.stack(draws=("chain", "draw"))
stacked
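To make the effect of `stack` easier to see, here is a minimal sketch on a synthetic dataset. The dataset is an assumption that only mimics the layout of `idata.posterior` (4 chains, 500 draws, 8 schools); the values are random and are not the real `centered_eight` samples.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for idata.posterior (assumed layout, random values).
rng = np.random.default_rng(0)
posterior = xr.Dataset(
    {
        "mu": (("chain", "draw"), rng.normal(size=(4, 500))),
        "theta": (("chain", "draw", "school"), rng.normal(size=(4, 500, 8))),
    },
    coords={"chain": np.arange(4), "draw": np.arange(500), "school": np.arange(8)},
)

# Merge the chain and draw dimensions into a single new dimension, "draws".
stacked = posterior.stack(draws=("chain", "draw"))

print(stacked["mu"].shape)     # (2000,)
print(stacked["theta"].shape)  # (8, 2000): the stacked dimension goes last

# The chains are laid end to end: chain 0 first, then chain 1, and so on.
same_order = np.array_equal(
    stacked["mu"].values, posterior["mu"].values.reshape(-1)
)
print(same_order)  # True
```

The new dimension is appended after any remaining dimensions, which is why `theta` ends up with shape `(school, draws)` rather than `(draws, school)`.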

## Obtain a numpy array for a given parameter

Let's say we want to get the values for `mu` as a NumPy array.

In [6]:
stacked.mu.values

array([-3.47698606, -2.45587061, -2.82625433, ...,  4.59705819,
        5.89850592,  0.16138927])

## Get the number of variables

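How many groups (schools) are in our hierarchical model? Here "groups" means the levels of the `school` dimension, not InferenceData groups such as `posterior`.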

In [7]:
len(idata.observed_data.school)

8

## Getting the variable names

What are the names of the groups (schools) in our hierarchical model?

In [8]:
idata.observed_data.school.values

array(['Choate', 'Deerfield', 'Phillips Andover', 'Phillips Exeter',
       'Hotchkiss', 'Lawrenceville', "St. Paul's", 'Mt. Hermon'],
      dtype=object)
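The two lookups above can also be written with the dataset's `sizes` mapping and its coordinate values. The sketch below uses a small hand-built dataset as a stand-in for `idata.observed_data`: the school names are copied from the output above, while the variable name `obs` and its zero values are placeholders, not the real observations.

```python
import numpy as np
import xarray as xr

# School names as shown in the output above.
schools = [
    "Choate", "Deerfield", "Phillips Andover", "Phillips Exeter",
    "Hotchkiss", "Lawrenceville", "St. Paul's", "Mt. Hermon",
]

# Stand-in for idata.observed_data; "obs" and its values are placeholders.
observed = xr.Dataset(
    {"obs": (("school",), np.zeros(len(schools)))},
    coords={"school": schools},
)

# Length of a dimension, looked up by name.
n_schools = observed.sizes["school"]
print(n_schools)  # 8

# Coordinate labels as a plain Python list.
school_names = observed["school"].values.tolist()
print(school_names[0])  # Choate
```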

## Get a subset of chains

Let's say we want to evaluate only chains 0 and 2.

In [9]:
idata.sel(chain=[0, 2]).posterior
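The same kind of selection can be tried on a plain xarray dataset. This is a sketch under the assumption that chains are labelled `0..3`; the data are synthetic, not the real posterior.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a posterior group with 4 chains and 500 draws.
rng = np.random.default_rng(0)
posterior = xr.Dataset(
    {"mu": (("chain", "draw"), rng.normal(size=(4, 500)))},
    coords={"chain": np.arange(4), "draw": np.arange(500)},
)

# sel() matches coordinate labels, so this keeps the chains labelled 0 and 2.
subset = posterior.sel(chain=[0, 2])

print(subset.sizes["chain"])            # 2
print(subset["chain"].values.tolist())  # [0, 2]
```

Because `sel` works on labels, the selected chains keep their original labels (`0` and `2`) instead of being renumbered.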

## Remove the first n draws (burn-in)

Let's say that we have reasons to remove the first 100 samples from all chains and from every InferenceData group that has a `draw` dimension.

In [10]:
burnin = idata.sel(draw=slice(100, None))

If you check the `burnin` object you will see that the groups `posterior`, `posterior_predictive`, `prior` and `sample_stats` have 400 draws compared to `idata`, which has 500. The group `observed_data` has not been affected because it does not have the `draw` dimension. Alternatively, you can specify which group or groups you want to change.

In [11]:
burnin_posterior = idata.sel(draw=slice(100, None), groups="posterior")
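The slice in these calls is label-based: it selects on the values of the `draw` coordinate, not on positions. The sketch below illustrates this on a synthetic dataset, assuming draws are labelled `0..499`; under that assumption label-based and positional slicing give the same result.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a posterior group with 4 chains and 500 draws.
rng = np.random.default_rng(0)
posterior = xr.Dataset(
    {"mu": (("chain", "draw"), rng.normal(size=(4, 500)))},
    coords={"chain": np.arange(4), "draw": np.arange(500)},
)

# Label-based: keep every draw whose label is 100 or greater.
by_label = posterior.sel(draw=slice(100, None))

# Position-based: drop the first 100 entries along the draw dimension.
by_position = posterior.isel(draw=slice(100, None))

print(by_label.sizes["draw"])            # 400
print(int(by_label["draw"].values[0]))   # 100
print(by_label.equals(by_position))      # True
```

If the draw labels did not start at 0, for example after an earlier slice, the two calls would differ, so `isel` is the safer choice when the intent is "drop the first n entries".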

## Compute posterior mean values along draw and chains dimensions

Let's say you want to compute the mean value of the posterior samples. You can simply do

In [12]:
idata.posterior.mean()

This will effectively compute the mean along all dimensions. This is probably what you want for `mu` and `tau`, which have two dimensions (`chain` and `draw`), but maybe not what you expected for `theta`, which has one more dimension, `school`. You can specify along which dimensions you want to compute the mean (or other functions).

In [13]:
idata.posterior.mean(dim=['chain', 'draw'])
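The difference between averaging over every dimension and averaging over `chain` and `draw` only shows up in the shapes of the results. Here is a minimal sketch on a synthetic dataset that mimics the layout of the posterior (an assumption; the values are random):

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for idata.posterior (assumed layout, random values).
rng = np.random.default_rng(0)
posterior = xr.Dataset(
    {
        "mu": (("chain", "draw"), rng.normal(size=(4, 500))),
        "theta": (("chain", "draw", "school"), rng.normal(size=(4, 500, 8))),
    },
    coords={"chain": np.arange(4), "draw": np.arange(500), "school": np.arange(8)},
)

# No arguments: every dimension is averaged away, including school.
overall = posterior.mean()
print(overall["theta"].shape)  # ()

# Only chain and draw are averaged: theta keeps one value per school.
per_school = posterior.mean(dim=["chain", "draw"])
print(per_school["mu"].shape)     # ()
print(per_school["theta"].shape)  # (8,)
```

For `mu` both calls give the same scalar, while for `theta` only the second keeps a separate mean for each school.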