In [None]:
import arviz as az
import math
import matplotlib.pyplot as plt
import numpy as np
import pymc as pm
import scipy.stats as stats
import pandas as pd
import xarray as xr
from causalgraphicalmodels import CausalGraphicalModel

RANDOM_SEED = 42
rng = np.random.default_rng(RANDOM_SEED)
hdi_fill_args = { 'color': 'gray', 'alpha': 0.2 }

__Probabilistic Programming. Wasowski. Pardo. IT University of Copenhagen__

This file contains the list of exercises for the week, as well as any related code.

## Exercises

The exercises for this week are: __all exercises__ from Chapter 5, McElreath, with exceptions noted below. Some remarks:
* __5M4__ is the first programming exercise. All the questions before it are math on paper or discussion (this remark is here to make sure you don't get into large open-ended coding projects).
* Exercise __5M4__ requires finding a dataset with the percentage of each state's population that are members of the Church of Jesus Christ of Latter-day Saints (LDS). Below we provide this data and add it to the loaded `WaffleDivorce.csv` data set, to get you started more efficiently.

In [None]:
data = pd.read_csv("WaffleDivorce.csv", delimiter=";")
data["pct_LDS"] = np.asarray([0.75, 4.53, 6.18, 1, 2.01, 2.82, 0.43, 0.55, 0.38, 0.75, 0.82, 5.18, 26.35, 0.44, 0.66, 0.87, 1.25, 0.77, 0.64, 0.81,
                              0.72, 0.39, 0.44, 0.58, 0.72, 1.14, 4.78, 1.29, 0.61, 0.37, 3.34, 0.41, 0.82, 1.48, 0.52, 1.2, 3.85, 0.4, 0.37, 0.83, 1.27, 0.75,
                              1.21, 67.97, 0.74, 1.13, 3.99, 0.92, 0.44, 11.5])
data.head()

* __5M5__ and __5H1__ are no-code exercises again. (Well, the `causalgraphicalmodels` package can help with __5H1__.)

* I found the combination of __5H1__ and __5H2__ particularly useful for learning about counterfactuals, because the causal model we are working with is quite different from the one in class. It makes it much clearer that counterfactual predictions should be made from a causal model, and that the Bayesian hierarchical model we are using should follow the same structure as the causal model. As a hint, this is the model I used:

![image.png](attachment:d3015cd8-ff2e-4157-ac01-04b90731ca60.png)

* To estimate the effect of doubling/halving one variable (M in both 5H2 and 5H3), we need to choose a particular value of M. The mean is a good choice. Then we do a normal prior prediction, but with just two points: the mean and double that, or the mean and half that (for the corresponding exercises). Remember to standardize the two points.

* Similarly, this is the PyMC model structure I arrived at for Ex. __5H3__:

![image.png](attachment:15f31732-ba3d-445d-8177-c1df42e2efad.png)

* Skip the last exercise, __5H5__, unless you are excited about it.
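
The two-point trick for the doubling/halving exercises (5H2 and 5H3) can be sketched as follows. This is only a minimal illustration, not a solution: the `marriage` array holds made-up values standing in for `data["Marriage"]`, and `standardize` is a hypothetical helper. It assumes the model was fitted on M standardized with the sample mean and the sample standard deviation (`ddof=1`); use whatever convention you applied to the training data.

```python
import numpy as np

# Made-up stand-in for the marriage-rate column; in the notebook use data["Marriage"].
marriage = np.array([20.2, 26.0, 20.3, 26.4, 19.1, 23.5, 17.1, 23.1])


def standardize(values, reference):
    """Standardize `values` with the mean and sample sd of `reference` (hypothetical helper)."""
    reference = np.asarray(reference, dtype=float)
    return (np.asarray(values, dtype=float) - reference.mean()) / reference.std(ddof=1)


m_mean = marriage.mean()

# Two points on the original scale: the mean and half the mean.
# For the doubling exercise, replace m_mean / 2 with 2 * m_mean.
m_points = np.array([m_mean, m_mean / 2])

# Standardize with the statistics of the observed data, not of the two points themselves.
m_points_std = standardize(m_points, marriage)
print(m_points_std)
```

The first standardized point is zero by construction, since it is the mean itself. These two values are what you would pass to the model as the new M data when generating predictions; the difference between the two predicted outcomes is then the estimated effect.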