In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("lab04.ipynb")

# Lab 4:  Rejection Sampling and More PyMC Practice

Welcome to the 4th Data 102 lab! 

The goal of this Lab is to get you familiar with rejection sampling, as well as give you additional practice utilizing Bayesian analysis methods with PyMC.

##### Please read the introduction and the instructions to each problem carefully.

## Collaboration Policy
Data science is a collaborative activity. While you may talk with others about the labs, we ask that you **write your solutions individually**. If you do discuss the assignments with others, please **include their names** in the cell below.

**Collaborators**: 

## Submission
See the [Gradescope Submission Guidelines](https://edstem.org/us/courses/42657/discussion/3350112) for details on how to submit your lab. Unlike the last few weeks, as part of the submission process, you'll need to **generate the PDFs and upload them to a separate Gradescope assignment on your own; the autograder will no longer do that for you!**

Again, since this lab involves sampling, **tests may take a while to run. Please submit as early as possible, as last-minute submissions may overwhelm Datahub, preventing you and others from submitting on time.**

**For full credit, this assignment should be completed and submitted before Wednesday, September 27th, 2023 at 11:59 PM PST.**

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from scipy.stats import multivariate_normal, norm, uniform
from ipywidgets import interact, interactive

from mpl_toolkits.mplot3d import axes3d
from matplotlib import cm

import pymc as pm

import hashlib

sns.set(style="dark")
plt.style.use("ggplot")

def get_hash(num, significance = 4):
    """Helper function for assessing correctness"""
    num = round(num, significance)
    return hashlib.md5(str(num).encode()).hexdigest()


## Setup

In this lab you are given a two-dimensional unnormalized density function $q(x,y)$ represented by `target_density` below. The goal of Question 1 of this lab is to build up a sampler that can output samples from the distribution proportional to $q(x,y)$. 

In **Question 1** we will compute samples via *Rejection Sampling*. In part **1.a** we will build a sampler for a 1-dimensional projection of the density. In part **1.b** we will extend the approach to two dimensions.

*Throughout Question 1, we will assume that our computers have access only to normal and uniform random variables.*

In [None]:
# This is the target unnormalized density from which we would like to sample
# Run this to define the function
# No TODOs here
@np.vectorize # <- decorator, lets the function accept arrays by looping over elements (a convenience, not a speedup)
def target_density(x, y):
    mean1 = [1, 1.7]
    mean2 = [2, 1.3]
    mean3 = [1.5, 1.5]
    mean4 = [2, 2.1]
    mean5 = [1, 1.2]
    cov1=0.2*np.array([[0.2, -0.05], [-0.05, 0.1]])
    cov2 = 0.3*np.array([[0.1, 0.07], [0.07, 0.2]])
    cov3= np.array([[0.1, 0], [0, 0.1]])
    cov4 = 0.1*np.array([[0.3, 0.04], [0.04, 0.2]])
    cov5 = 0.1*np.array([[0.4, -0.04], [-0.04, 0.2]])
    return(multivariate_normal.pdf([x, y], mean=mean1, cov=cov1) + 
           multivariate_normal.pdf([x, y], mean=mean2, cov=cov2) +
           2*multivariate_normal.pdf([x, y], mean=mean3, cov=cov3) +
           0.5*multivariate_normal.pdf([x, y], mean=mean4, cov=cov4)+
           0.5*multivariate_normal.pdf([x, y], mean=mean5, cov=cov5))

#### Let's visualize this density. 

Run the cell below to see a 3D plot of the function along with a contour plot.

In [None]:
# No TODOs here, just run the cell to make plots
# Create a meshgrid of coordinates
coords = np.arange(0.5, 2.5, 0.02)
X, Y = np.meshgrid(coords, coords)

# Compute the value of the target density at all pairs of (x,y) values
Z = target_density(X,Y)

In [None]:
# Display the 3D plot of the target density
fig = plt.figure(figsize=(15,6))

ax0 = fig.add_subplot(121, projection='3d')
ax1 = fig.add_subplot(122)

surf = ax0.plot_surface(X,Y,Z, cmap=cm.plasma, linewidth=0, antialiased=False,alpha = 0.9,)

# Customize the z axis.
ax0.set_zlim(0, 7)
ax0.set_xlabel("X")
ax0.set_ylabel("Y")
ax0.set_zlabel("Z")
ax0.set_title("3D plot of the target density")

# Rotate the axes: you can change these numbers in order to see the distribution from other angles
ax0.view_init(50, 25)

# Plot the contour plot of the density
cont = ax1.contour(X,Y,Z, levels = 20, cmap=cm.plasma, linewidths=1)
ax1.set_xlabel("X")
ax1.set_ylabel("Y")
ax1.set_title("Contour plot of the target density")

# Add a color bar which maps values to colors.
fig.colorbar(surf, shrink=0.5, aspect=5, ax=ax1)
plt.tight_layout()
plt.show()

Take a moment to examine the plots. Make sure you can see the correspondence between each peak in the 3D plot on the left and the "high-altitude" regions in the contour plot on the right.

Next we will plot 1-dimensional slices of the target density along the $X$ and $Y$ axes. These correspond to (unnormalized) conditional target densities of the form $q(x, y=y')$ and $q(x=x', y)$.

In [None]:
# Do not modify
# Run the cell below to define the plotting functions

COORDINATES = np.arange(0, 3, 0.02)
def plot_x_cond(y_val):
    fig, axs = plt.subplots(1, 2)
    fig.set_figheight(5)
    fig.set_figwidth(12)
    axs[0].contour(X,Y,Z, levels = 20, cmap=cm.plasma, alpha = 0.8, linewidths=0.8)
    axs[0].axhline(y_val,  ls="--", color = 'olive', lw = 2)
    axs[0].set_xlabel("X")
    axs[0].set_ylabel("Y")
    axs[0].set_title("Contour plot of the target density")
    
    axs[1].plot(COORDINATES, target_density(COORDINATES, y_val), color = 'olive')
    axs[1].set_ylim(0,10)
    axs[1].set_xlim(0,3)
    axs[1].set_xlabel("X")
    axs[1].set_title("Conditional target density: q(x | y={:.1f})".format(y_val))
    plt.show()
    
def plot_y_cond(x_val):
    fig, axs = plt.subplots(1, 2)
    fig.set_figheight(5)
    fig.set_figwidth(12)
    axs[0].contour(X,Y,Z, levels = 20, cmap=cm.plasma, alpha = 0.8, linewidths=0.8)
    axs[0].axvline(x_val,  ls="--", color = 'olive', lw = 2)
    axs[0].set_xlabel("X")
    axs[0].set_ylabel("Y")
    axs[0].set_title("Contour plot of the target density")
    
    axs[1].plot(COORDINATES, target_density(x_val, COORDINATES), color = 'olive')
    axs[1].set_ylim(0,10)
    axs[1].set_xlim(0,3)
    axs[1].set_xlabel("Y")
    axs[1].set_title("Conditional target density: q(y | x={:.1f})".format(x_val))
    plt.show()

In [None]:
# Display interactive plot
interactive_plot = interactive(plot_x_cond, y_val=(0, 3, 0.1))
interactive_plot

Set different values of `y_val` and observe the changes in the conditional target density.

In [None]:
# Display interactive plot
interactive_plot = interactive(plot_y_cond, x_val=(0, 3, 0.1))
interactive_plot

Set different values of `x_val` and observe the changes in the conditional target density.

<!-- BEGIN QUESTION -->

### A Quick Understanding Check:

We said that $q$ is an unnormalized density function. What does this mean? How could we test whether or not the function is normalized?

_Type your answer here, replacing this text._

<!-- END QUESTION -->

## Question 1: Rejection Sampling

In this question, we will build a rejection sampler. First, let's review the basics. 

Assume we want to sample from an unnormalized target density $q(x)$, using a proposal distribution $F$, with density $f(x)$. The proposal distribution is chosen such that we have access to samples from it. 

#### Rejection sampling proceeds as follows:

- Find a constant $c$ such that $c\,q(x)\leq f(x)$ on the support
- At each iteration:
    - Sample $x_i \sim F$
    - Compute the ratio $r = \frac{c\,q(x_i)}{f(x_i)} \leq 1$
    - Sample $\gamma_i \sim Uniform(0,1)$:
        - `accept` the sample if $\gamma_i \leq r$: Add $x_i$ to the list of samples.
        - `reject` the sample otherwise: do nothing
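
The steps above can be sketched on a small toy problem. This is only an illustration under stated assumptions, not the lab's solution: the target `toy_target` ($q(x) = x(1-x)$ on $[0,1]$), the helper `toy_rejection_sampler`, and the choice $c = 4$ are all hypothetical and unrelated to `target_density`. The proposal is Uniform(0,1), so $f(x) = 1$ on the support.

```python
import numpy as np

def toy_target(x):
    """Unnormalized toy density q(x) = x(1 - x) on [0, 1] (hypothetical example)."""
    return x * (1.0 - x)

def toy_rejection_sampler(n_proposals, c, rng):
    """Rejection sampling with a Uniform(0, 1) proposal, so f(x) = 1 on [0, 1]."""
    proposals = rng.uniform(0.0, 1.0, size=n_proposals)  # x_i ~ F
    ratios = c * toy_target(proposals) / 1.0              # r = c q(x_i) / f(x_i)
    gammas = rng.uniform(0.0, 1.0, size=n_proposals)      # gamma_i ~ Uniform(0, 1)
    return proposals[gammas <= ratios]                    # keep accepted proposals only

rng = np.random.default_rng(0)
# q peaks at q(0.5) = 0.25, so c = 4 gives c * q(x) <= 1 = f(x) everywhere.
toy_samples = toy_rejection_sampler(20_000, c=4.0, rng=rng)
toy_acceptance_rate = len(toy_samples) / 20_000
```

Here `toy_samples` holds only the accepted proposals, so its length is random; `toy_acceptance_rate` records the fraction of proposals that survived.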
        
### 1.a Sample from the one-dimensional density $q(x, y=1.2)$
Throughout part 1.a, we will restrict our attention to the range $[0,3]$ for simplicity. That way we can use Uniform(0,3) as our proposal distribution, meaning that $f(x) = \frac{1}{3} \ \forall x\in[0,3]$.

In [None]:
# Create the target 1D density q(x, y = 1.2)
def target_1D_density(x):
    return(target_density(x, 1.2))

Finish implementing the steps of the rejection sampling algorithm by filling in the following code.

*Hint: both scipy and numpy provide methods for drawing from a uniform distribution.*

In [None]:
def sample_1D_proposed_distribution(N):
    """ 
    Produces N samples from the Uniform(0,3) proposal distribution
    
    Inputs:
        N : int, desired number of samples
        
    Outputs:
        proposed_samples : a 1d-array of size N which contains N independent samples from the proposal
    """
    ...

@np.vectorize
def compute_ratio_1D(proposed_sample, c):
    """
    Computes the ratio between the scaled target density and proposal density evaluated at the 
    proposed sample point
    
    Inputs:
        proposed_sample : float, proposed sample
        c : float, constant scaling factor that ensures that the proposal density is above the target density
        
    Outputs:
        ratio : float
    """
    ...

@np.vectorize
def accept_proposal(ratio):
    """ 
    Accepts or rejects a proposal with probability equal to ratio
    
    Inputs: 
        ratio: float, probability of acceptance
    
    Outputs:
        accept: True/False, if True, accept the proposal
    """
    ...

You can use the following cell to test your functions to convince yourself that they work. 

In [None]:
# WRITE YOUR TEST CASES HERE

Now we have all the ingredients for making a sampler:

In [None]:
def get_1D_samples(N, c): 
    """ 
    Produces samples from target_1D_density
    
    Inputs:
        N : int, number of proposed_samples
        c : float, constant scaling factor that ensures that the proposal density is above the target density
        
    Outputs:
        rejection_samples : a 1d-array which contains independent samples from the target
    """
    ...

In [None]:
grader.check("q1a_ii")

From the interactive plots we made earlier, we can see that $q(x, y=1.2)$ is always smaller than 5. Hence, to keep $c\,q(x, y=1.2)$ below $f(x) = 1/3$, we need to scale the target density by a factor $c \leq \frac{1}{3}\cdot\frac{1}{5} = 1/15$. 
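
The same reasoning can be checked numerically. The sketch below is a hedged illustration on a hypothetical target, `toy_mixture`, which is not the lab's `target_1D_density`. It evaluates the density on a fine grid, rounds the observed maximum up to get a safe upper bound, and divides the proposal height by that bound. A grid maximum can slightly underestimate the true maximum, which is why the bound is rounded up rather than used directly.

```python
import numpy as np
from scipy.stats import norm

def toy_mixture(x):
    """Hypothetical unnormalized 1D target: a weighted sum of two normal pdfs."""
    return 2.0 * norm.pdf(x, loc=1.0, scale=0.25) + norm.pdf(x, loc=2.0, scale=0.3)

grid = np.linspace(0.0, 3.0, 3001)
q_max_on_grid = toy_mixture(grid).max()
proposal_height = 1.0 / 3.0              # f(x) for a Uniform(0, 3) proposal

# Round the observed maximum up to leave a safety margin,
# in the same spirit as bounding q(x, y=1.2) by 5 in the text.
q_upper_bound = np.ceil(q_max_on_grid)
c_toy = proposal_height / q_upper_bound

# Sanity check: the scaled target never exceeds the proposal density on the grid.
scaled_target_is_below = np.all(c_toy * toy_mixture(grid) <= proposal_height)
```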

#### Let's use $c=1/15$, compute target samples and plot their histogram

In [None]:
# No TODOs here
# Just run it once you passed the tests above

fig = plt.figure(figsize = (6, 4))
c = 1/15
target_samples = get_1D_samples(1000, c)
density_values =  target_1D_density(COORDINATES)*c
plt.plot(COORDINATES, density_values, label='Target')
plt.axhline(1/3, ls = '--', label = 'Proposal')
n, bins, rects = plt.hist(target_samples, density = True, label="Accepted Samples")
max_height = np.max([r.get_height() for r in rects])
for r in rects:
    r.set_height(r.get_height()*np.max(density_values)/max_height)
plt.legend()
plt.xlim(0,3)
plt.ylim(0,0.45)
plt.show()

#### Computing the acceptance ratio for varying scaling constants c

In [None]:
# No TODOs here
# Just run it and comment in the section below

N = 1000
c_values = [0.06, 0.05, 0.04, 0.03, 0.02, 0.01]
for c in c_values:
    # compute target samples
    target_samples = get_1D_samples(N, c)
    acceptance_percentage = 100*len(target_samples)/N
    print("For c = {:.2f}, the acceptance percentage is {:.1f}%".format(c, acceptance_percentage))

<!-- BEGIN QUESTION -->

#### In the cell below explain why the accepted percentage decreases as $c$ decreases:

_Type your answer here, replacing this text._

<!-- END QUESTION -->

### 1.b Sample from the two-dimensional density $q(x, y)$

In two dimensions, rejection sampling is nearly identical to the 1-dimensional case:

- Find a constant $c$ such that $c\,q(x, y)\leq f(x, y)$ on the support
- At each iteration:
    - Sample $(x_i, y_i) \sim F$
    - Compute the ratio $r = \frac{c\,q(x_i, y_i)}{f(x_i, y_i)} \leq 1$
    - Sample $\gamma_i \sim Uniform(0,1)$:
        - `accept` the sample if $\gamma_i \leq r$: add $(x_i, y_i)$ to the list of samples.
        - `reject` the sample otherwise: do nothing

Throughout part 1.b we will use $(x, y)\sim Uniform(0,3)\times Uniform(0,3)$ as our proposal distribution, meaning that $f(x, y) = \frac{1}{9}\ \forall x, y\in[0,3]$.
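
As a rough illustration of the two-dimensional procedure, here is a sketch on a hypothetical target, `toy_target_2d`: a single Gaussian bump with peak height 1, not the lab's `target_density`. Because the bump never exceeds 1, the assumed constant $c = 1/9$ keeps $c\,q(x, y)$ below the proposal density $f(x, y) = 1/9$.

```python
import numpy as np

def toy_target_2d(x, y):
    """Hypothetical unnormalized 2D target: one Gaussian bump centered at (1.5, 1.5)."""
    return np.exp(-((x - 1.5) ** 2 + (y - 1.5) ** 2) / (2 * 0.1))

rng = np.random.default_rng(1)
n_proposals = 50_000
xs = rng.uniform(0.0, 3.0, size=n_proposals)   # x_i ~ Uniform(0, 3)
ys = rng.uniform(0.0, 3.0, size=n_proposals)   # y_i ~ Uniform(0, 3)

proposal_density = 1.0 / 9.0   # f(x, y) on [0, 3] x [0, 3]
c_2d = 1.0 / 9.0               # the bump peaks at 1, so c * q <= 1/9 = f
ratios = c_2d * toy_target_2d(xs, ys) / proposal_density
accepted = rng.uniform(0.0, 1.0, size=n_proposals) <= ratios

# Stack the accepted coordinates into an (n_accepted, 2) array.
toy_samples_2d = np.column_stack((xs[accepted], ys[accepted]))
toy_acceptance_2d = accepted.mean()
```

Everything is vectorized with NumPy, so no `np.vectorize` wrapper is needed for this toy target.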

Fill in the 2-d ratio calculation:

In [None]:
@np.vectorize
def compute_ratio_2D(proposed_sample_x, proposed_sample_y, c):
    """
    Computes the ratio between the scaled target density and proposal density evaluated at the 
    proposed sample point
    
    Inputs:
        proposed_sample_x : float, x components of the proposed sample point
        proposed_sample_y : float, y components of the proposed sample point
        c : float, constant scaling factor that ensures that the proposal density is above the target density
        
    Outputs:
        ratio : float
    """
    ratio = ...
    assert(ratio <= 1)
    return(ratio)

Use the following cell to convince yourself that your code is correct:

In [None]:
# WRITE YOUR TEST CASES HERE

Now we have all the ingredients for making the 2-d sampler.

In [None]:
# No TODOs here, just run the 2D version of the functions we built in 1.a
def get_2D_samples(N, c): 
    """ 
    Produces samples from target_density
    
    Inputs:
        N : int, number of proposed_samples
        c : float, constant scaling factor that ensures that the proposal density is above the target density
        
    Outputs:
        rejection_samples : ndarray of shape (number of accepted samples, 2) which contains independent samples from the target
    """
    proposed_samples_x = sample_1D_proposed_distribution(N)
    proposed_samples_y = sample_1D_proposed_distribution(N)
    ratios = compute_ratio_2D(proposed_samples_x, proposed_samples_y, c)
    accept_array = accept_proposal(ratios)
    proposed_samples = np.concatenate((proposed_samples_x.reshape(N,1), proposed_samples_y.reshape(N,1)), axis = 1)
    rejection_samples = proposed_samples[accept_array]
    return(rejection_samples)

In [None]:
grader.check("q1b_i")

From the contour plot we made previously, we can see that $q(x, y)$ is always smaller than 7.4. Hence, to keep $c\,q(x, y)$ below $f(x, y) = 1/9$, we need to scale the target density by a factor $c \leq \frac{1}{7.4}\cdot\frac{1}{9} \approx 0.015$. 

#### Let's use $c=0.015$, compute target samples and plot them on top of the contour lines

In [None]:
fig = plt.figure(figsize=(6,4))

# Plot the contour plot of the density
cont = plt.contour(X,Y,Z, levels = 20, cmap=cm.plasma, linewidths=1, alpha = 0.8)
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Scatterplot of samples obtained via Rejection Sampling")

# Add sample points obtained via rejection sampling
c = 0.015
target_samples = get_2D_samples(3000, c)
plt.scatter(target_samples[:,0], target_samples[:,1], c='b', alpha = 1, s = 10, label = 'Samples')

plt.legend()
plt.tight_layout()
plt.show()

In [None]:
# No need to modify this
# just run it and comment in the section below

N = 3000
c_values = [0.015, 0.01, 0.005, 0.001]
for c in c_values:
    # compute target samples
    target_samples = get_2D_samples(N, c)
    acceptance_percentage = 100*len(target_samples)/N
    print("For c = {:.3f}, the acceptance percentage is {:.1f}%".format(c, acceptance_percentage))

<!-- BEGIN QUESTION -->

#### In the cell below explain why the acceptance percentage when sampling from the 2D distribution is so much smaller than when sampling from the 1D version in 1.a.

_Type your answer here, replacing this text._

<!-- END QUESTION -->

## Question 2: Bayesian A/B Testing

Now, we will pivot back to PyMC so you can get more practice working with PyMC. In particular, we'll be taking a look at Bayesian approaches to hypothesis testing.

Recall from Data 8 that you can perform [hypothesis testing using the permutation test](https://inferentialthinking.com/chapters/12/1/AB_Testing.html) (Chapter 12.1). Before continuing, we highly encourage you to read the Data 8 chapter linked above; this question is a direct continuation of the example from that chapter!

In a particular medical study, a sample of newborn babies was obtained from a large hospital system.  We will treat the data as if it were a simple random sample, though the sampling was done in multiple stages. Deborah Nolan and Terry Speed discuss the larger dataset in [Stat Labs](https://www.stat.berkeley.edu/~statlabs/).

One of the aims of the study was to see whether maternal smoking was associated with birth weight. Following the standard hypothesis testing procedure, they proposed the following two hypotheses:

> **Null hypothesis ($H_0$)**: In the population, the distribution of birth weights of babies is the same for mothers who don’t smoke as for mothers who do. The (observed) difference in the sample is due to chance. In other words, let $\mu_0$ be the population mean of the birth weights of babies of non-smoking mothers, and $\mu_1$ be the population mean of the birth weights of babies of smoking mothers, we claim that $\mu_0 = \mu_1$.

> **Alternative hypothesis ($H_1$)**: In the population, the babies of the mothers who smoke have a <i>**different**</i> birth weight, on average, than the babies of the non-smokers. In other words, we claim $\mu_0 \neq \mu_1$.

In recent years, however, the validity of hypothesis testing has been called into question, with the [ASA going so far as to acknowledge the limitations of $p$-values](https://amstat.tandfonline.com/doi/pdf/10.1080/00031305.2016.1154108). As a result, Bayesian alternatives to traditional hypothesis testing have begun to rise in popularity. In this lab, we'll now be revisiting this A/B Testing example from a **Bayesian approach**, with the help of `PyMC`.

<!-- BEGIN QUESTION -->

### 2(a) Limitations of the $p$-value

In this study, after conducting a permutation test, researchers were able to find enough evidence to reject the null hypothesis (i.e., a $p$-value less than 0.05). Would it be equivalent to say that they found evidence that the alternative hypothesis is true? Why or why not?

_Type your answer here, replacing this text._

<!-- END QUESTION -->

In class, you learned that one of the advantages to Bayesian statistics is its ability to handle **data sparsity**. In the kidney cancer example, we were able to utilize priors to get around a lack of data, improving the performance of detection metrics in counties with low population counts.

Here, we'll see yet another reason why these approaches are becoming widely adopted: their **ease of interpretability**. As opposed to frequentist hypothesis testing and its arcane interpretation of the $p$-value, Bayesians can directly assign probabilities to the hypotheses in question. In particular, we can find out just how probable our alternative hypothesis is.

To see how, let's start by opening up our data, and isolating our variables of interest.

In [None]:
baby = pd.read_csv("baby.csv")
baby.head()

This is the same dataset you've seen in Data 8/100. We will focus on the `'Birth Weight'` and `'Maternal Smoker'` columns for this lab.

In [None]:
baby = baby[['Birth Weight', 'Maternal Smoker']]
baby.head()

To formalize what we're looking for, let's revisit and define some notation.

- Let $z_i$ denote the smoker status of record $i$: if $z_i = 1$, then the $i$th baby's mother is a smoker; if $z_i = 0$, she is a non-smoker.  
- $\mu_0$ and $\mu_1$ are the population means of the weights of babies born to non-smokers (0) and smokers (1), respectively.

To gain information about these population parameters, we'll also need to incorporate the data we have in some way. We'll call these data $X_i$, where $X_i$ represents the $i$th baby's weight. We then define the mixture model as follows:

- $X_i \mid z_i = 0, \mu_0, \mu_1 \sim \mathcal{N}(\mu_0, \sigma_0^2)$. In words, the birth weights of babies born to non-smokers follow the normal distribution with mean $\mu_0$ and variance $\sigma_0^2$.
- $X_i \mid z_i = 1, \mu_0, \mu_1 \sim \mathcal{N}(\mu_1, \sigma_1^2)$. In words, the birth weights of babies born to smokers follow the normal distribution with mean $\mu_1$ and variance $\sigma_1^2$.

For simplicity of analysis, we will assume $\sigma_0^2$ and $\sigma_1^2$ are known in advance.

The difference in these population parameters ($\mu_{0} - \mu_{1}$) serves as our quantity of interest, or in the case of frequentist testing, our **test statistic**.

The goal, then, is to find the distribution $p(\mu_0 - \mu_1 \mid X_1,...,X_n)$ by utilizing our posterior distribution $p(\mu_0, \mu_1 \mid X_1,...,X_n)$.

To do that, we'll need a prior!

### 2(b) Finding a prior via Empirical Bayes

First, we'll find a prior on $\mu_0$ and $\mu_1$. As in lecture, we'll do this by utilizing the data that we already have in an Empirical Bayes approach.

In frequentist settings, researchers often use the sample mean as a way of estimating the true population mean. With bootstrap resamples of our data, we can simulate new draws from the population, and get the distribution of sample means.
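
To make the bootstrap idea concrete, here is a minimal sketch on synthetic data. The array `observed`, the helper `bootstrap_means`, and the seed are all hypothetical stand-ins; this does not use `baby.csv` and is not the structure the autograder expects for `get_bootstrap_mean`. Each bootstrap resample draws as many points as the original sample, with replacement, and records the mean.

```python
import numpy as np

rng = np.random.default_rng(102)
# Synthetic stand-in for an observed sample (not the baby data).
observed = rng.normal(loc=120.0, scale=18.0, size=1000)

def bootstrap_means(data, n_resamples, rng):
    """Return the mean of each of n_resamples bootstrap resamples of data."""
    n = len(data)
    # Each row of idx is one resample: n indices drawn with replacement.
    idx = rng.integers(0, n, size=(n_resamples, n))
    return data[idx].mean(axis=1)

boot_means = bootstrap_means(observed, 2000, rng)
boot_center, boot_spread = boot_means.mean(), boot_means.std()
```

In this sketch, `boot_center` should sit close to the sample mean of `observed`, and `boot_spread` close to the sample standard deviation divided by $\sqrt{n}$.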

For this question, we'd like to use this distribution of sample means as a prior on $\mu_0$ and $\mu_1$.

<!-- BEGIN QUESTION -->

#### 2b (i)

Fill in the blanks in the following statement:

With the assumptions that sample size is large and the sample is drawn i.i.d. from the population, the sample mean follows the ____ A ____ distribution. This is a result of the _____ B _____. 

Fill in blank A with the name of a known distribution, blank B with the name of a famous theorem.

_Type your answer here, replacing this text._

<!-- END QUESTION -->

<!-- BEGIN QUESTION -->

#### 2b (ii)

Since $\mu_0$ and $\mu_1$ represent independent draws from this prior distribution, calculate $p(\mu_0 > \mu_1)$. What does this say about our belief regarding the average birthweights of children of smoking and non-smoking mothers?

_Type your answer here, replacing this text._

<!-- END QUESTION -->

#### 2b (iii)
Now that we have some intuition about the shape of our prior, and the beliefs it encodes, let's go ahead and find the distribution of our sample means, and fit our prior over it!

Fill in the code cell below to get bootstrapped estimates of the population mean.

In [None]:
np.random.seed(42) #Do not change this line!

def get_bootstrap_mean():
    ...

means = np.array([])

for i in np.arange(10000):
    means = ...

means

In [None]:
grader.check("q2b_iii")

#### 2b (iv)

With our bootstrapped sample means in hand, we can now fit our prior distribution over it!

Fill in the code cell below to define and visualize our prior distribution. 

**Remember**: since we're using an empirical Bayes approach here, your visualized prior should fit the data almost exactly.

In [None]:
support = np.linspace(116, 122, 100) # Defines points to evaluate our pdf over

prior_pdf = ...

# Plot the histogram of bootstrapped means
sns.histplot(means, stat='density');

# Plot the prior pdf
plt.plot(support, prior_pdf);
plt.title(r"Prior Distribution for $\mu_0$ and $\mu_1$");

In [None]:
grader.check("q2b_iv")

## Question 3: Defining our Model

Now, with the pieces you've defined above, craft your final A/B Testing model in the code cell below. As a reminder, our model currently consists of the following likelihoods and priors:

$$
\begin{aligned}
\mu_0, \mu_1 &\sim \text{distribution we discovered in 2(b)}\\
X_i \mid z_i = 0, \mu_0, \mu_1 &\sim \mathcal{N}(\mu_0, \sigma_0^2)\\
X_i \mid z_i = 1, \mu_0, \mu_1 &\sim \mathcal{N}(\mu_1, \sigma_1^2)
\end{aligned}
$$

To simplify your calculations, we utilized a similar bootstrapping strategy to define $\sigma_0^2$ and $\sigma_1^2$. Without proof, you can use $\sigma_0^2 = \sigma_1^2 = 18.3^2$ for your model.

**Note:** To pass the test, make sure the name parameter you pass to each `Distribution` object matches the variable name it's assigned to. Your answers should use the following format: 

```
with pm.Model() as model:

    mu = pm.Some_Distribution('mu', ...)
    X = pm.Some_Distribution('X', ...)
    ...
```

**Hint 1**: Since we're interested in calculating a test statistic based on our posterior distribution of $\mu_0$ and $\mu_1$, we can utilize `pm.Deterministic` ([documentation](https://www.pymc.io/projects/docs/en/v5.6.0/api/generated/pymc.Deterministic.html)) to calculate it for every simulated sample generated by PyMC.

**Hint 2**: When defining `X`, it might be helpful to use `NumPy`'s [fancy indexing](https://jakevdp.github.io/PythonDataScienceHandbook/02.07-fancy-indexing.html) to indicate smoker status of mothers.
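
As a small NumPy-only illustration of the fancy indexing mentioned in Hint 2: indexing a length-2 array of group means with an integer array of 0/1 labels produces one mean per record. The values and array names below are hypothetical, and whether the status column needs converting from booleans first depends on how it is stored.

```python
import numpy as np

group_means = np.array([123.0, 114.0])       # hypothetical values for [mu_0, mu_1]
smoker_status = np.array([0, 1, 1, 0, 1])    # z_i for five hypothetical records

# Indexing with an integer array picks group_means[z_i] for every record i.
per_record_mean = group_means[smoker_status]

# If the status is stored as True/False, convert it to 1/0 before indexing.
status_from_bool = np.array([False, True, True, False, True]).astype(int)
```

Converting to integers matters: indexing directly with a boolean array would be treated as a mask rather than as a list of positions.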

In [None]:
with pm.Model() as model:
    mu = ...
    X = ...
    test_stat = ...

    # Do not modify trace; settings required to pass autograder
    trace = pm.sample(500, chains=4, tune=1000, target_accept=0.95, return_inferencedata=False, progressbar=False)

In [None]:
grader.check("q3")

## Question 4: Interpreting our Results
Now that we have our posterior samples, let's take a look at the posterior distribution of $\mu_0 - \mu_1$. 

In [None]:
test_statistics = trace['test_stat']

# Draw the credible interval
plt.hlines(0, np.percentile(test_statistics, 2.5), np.percentile(test_statistics, 97.5), colors='blue', linewidth=10)

sns.histplot(test_statistics, stat = "density");
plt.title(r"Posterior Distribution of $\mu_0 - \mu_1$");
plt.xlabel(r"$\mu_0 - \mu_1$");

Notice the blue interval that we've placed on our posterior: this is called the **credible interval** (not to be confused with the frequentist *confidence interval*). In particular, the interval shown above is the **95% credible interval**. This means that, given our prior and the observed data, $\mu_0 - \mu_1$ lies between 1.737 and 4.154 with 95% posterior probability.

**Note**: your exact bounds may differ slightly due to randomness in the sampling process.
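
For reference, an equal-tailed credible interval is just a pair of percentiles of the posterior draws. The sketch below uses synthetic draws, `fake_draws`, as a stand-in for `trace['test_stat']`; the location 3.0 and scale 0.6 are made-up values, not results from the model.

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic stand-in for posterior draws of mu_0 - mu_1 (not real PyMC output).
fake_draws = rng.normal(loc=3.0, scale=0.6, size=20_000)

# Equal-tailed 95% interval: cut off 2.5% of the draws in each tail.
lower, upper = np.percentile(fake_draws, [2.5, 97.5])

# Fraction of draws in which the difference is positive.
prob_positive = np.mean(fake_draws > 0)
```

The same two lines can be applied to any array of posterior samples, which is what makes statements such as "the posterior probability that the difference is positive" directly computable.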

<!-- BEGIN QUESTION -->

### 4(a) Credible vs. Confidence Intervals
In Data 8, you've seen a frequentist interval called the confidence interval. For your convenience, we list their definitions side-by-side:

> **95% Confidence Interval**: With a large number of repeated samples, 95% of such calculated confidence intervals would include the true value of the parameter. We say we are 95% confident that the true parameter value lies within the interval.

> **95% Credible Interval**: With our prior belief of the parameter and the data we observe, there's 95% probability that the true parameter would lie within this interval.

How are the interpretations of these intervals different? Answer the following questions:
1. What is random in the frequentist setting?
2. What is random in the Bayesian setting?
3. Which one is more intuitive to interpret?

Answer each question with **1-2 sentences**.

_Type your answer here, replacing this text._

<!-- END QUESTION -->

<!-- BEGIN QUESTION -->

### 4(b) Interpreting the Credible Interval

Notice that our 95% credible interval lies between 1.737 and 4.154. Does this prove or disprove our alternative hypothesis? If we're using the 95% credible interval as our rejection criterion, how sure are we that the alternative hypothesis is true?

Answer the question with **2 sentences** or less.

_Type your answer here, replacing this text._

<!-- END QUESTION -->

## Congratulations! You have finished Lab 4! ##

Below, you will see two cells. Running the first cell will automatically generate a PDF of all questions that need to be manually graded, and running the second cell will automatically generate a zip with your autograded answers. **You are responsible for submitting both the coding portion (the zip from Lab 4) and the written portion (the PDF of written responses from Lab 4) to their respective Gradescope portals.** The coding portion should be submitted to the `Lab 4` assignment as a single zip file, and the written portion should be submitted to the `Lab 4 PDF` assignment as a single PDF file. When submitting the written portion, please ensure you select pages appropriately.

If there are issues with automatically generating the PDF in the first cell, you can try downloading the notebook as a PDF by clicking on `File -> Save and Export Notebook As... -> PDF`. If that doesn't work either, you can manually take screenshots of your answers to the manually graded questions and submit those. Either way, **you are responsible for ensuring your submission follows our requirements; we will NOT be granting regrade requests for submissions that don't follow instructions.**

In [None]:
import matplotlib.image as mpimg
from otter.export import export_notebook
from os import path
from IPython.display import display, HTML
export_notebook("lab04.ipynb", filtering=True, pagebreaks=True)
if(path.exists('lab04.pdf')):
    img = mpimg.imread('baby_seal.png')
    imgplot = plt.imshow(img)
    imgplot.axes.get_xaxis().set_visible(False)
    imgplot.axes.get_yaxis().set_visible(False)
    plt.show()
    display(HTML("Download your PDF <a href='lab04.pdf' download>here</a>."))
else:
    print("\n PDF generation failed; please try the other methods described above")

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(run_tests=True)