# Assignment 5

In this assignment, we'll finally work with some nonconjugate models. I will also introduce you to reparameterization techniques.

## Instructions

Please complete this Jupyter notebook and **don't** convert it to a `.py` file. Upload this notebook, along with any `.stan` files and any data sets, as a single `zip` file to Gradescope.

Your work will be manually graded by our TA. There is no autograder for this assignment. For free response questions, feel free to add a markdown cell and type in there. Keep the preexisting structure as much as possible, stay organized, and label which cells correspond to which questions.



### Problem 1: Poisson Data

In the last assignment, we modeled a vector of counts $y = (y_1, \ldots, y_n)$ using a multinomial distribution. 

Unlike last time, all of these counts will now be assumed to be independent. Further, we can't reasonably put a bound on what each count could be. So, in this problem, we'll use a **Poisson likelihood**:

$$
L(y \mid \theta) = \prod_{i=1}^n L(y_i \mid \theta) \propto \prod_{i=1}^n e^{-\theta}\theta^{y_i} = e^{-n\theta}\theta^{\sum_i y_i}
$$

With this likelihood, $\theta > 0$ is interpreted as a rate or average.
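
As a quick sanity check on the formula above, here is a minimal sketch that evaluates the log of the likelihood (up to a constant that does not depend on $\theta$) and compares a few values of $\theta$. The function name `poisson_log_likelihood` and the toy counts are made up for illustration; they are not the `DriversKilled` data.

```python
import math


def poisson_log_likelihood(theta, counts):
    """Log of exp(-n*theta) * theta^(sum of counts), dropping theta-free constants."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    n = len(counts)
    return -n * theta + sum(counts) * math.log(theta)


# Hypothetical counts, used only to illustrate the formula.
toy_counts = [3, 7, 4, 6, 5]
sample_mean = sum(toy_counts) / len(toy_counts)

# Compare the log-likelihood at the sample mean with nearby values of theta.
for theta in (sample_mean - 0.5, sample_mean, sample_mean + 0.5):
    value = poisson_log_likelihood(theta, toy_counts)
    print(f"theta = {theta:.2f}, log-likelihood = {value:.4f}")
```

The middle value should come out largest, which is consistent with the sample mean being the maximum likelihood estimate of the rate.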

The data can be found in `Road_Casualties_in_Great_Britain_1969___84_434_19.csv`. Use the `DriversKilled` column only.

1.

Name a conjugate prior for this likelihood! Write your single-word answer in Gradescope.

2.

Suppose that the previous answer does not suit your needs, and that you want to use a lognormal prior! Pick a specific prior distribution (i.e. specify the hyperparameters), and explain why you chose them.
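
When weighing candidate hyperparameters, it can help to see what a given lognormal says about $\theta$ on its original scale. The sketch below is one possible helper (the name `lognormal_summary` is invented here), assuming the usual parameterization in which $\log \theta \sim \text{Normal}(\mu, \sigma)$ with $\sigma$ a standard deviation. The values `mu=0.0` and `sigma=1.0` are placeholders, not a recommendation.

```python
import math
from statistics import NormalDist


def lognormal_summary(mu, sigma):
    """Median, mean, and central 90% interval implied by a lognormal(mu, sigma)."""
    z95 = NormalDist().inv_cdf(0.95)  # 95th percentile of a standard normal
    return {
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma**2 / 2),
        "q05": math.exp(mu - z95 * sigma),
        "q95": math.exp(mu + z95 * sigma),
    }


# Placeholder hyperparameters, chosen only to show the output format.
for name, value in lognormal_summary(mu=0.0, sigma=1.0).items():
    print(f"{name}: {value:.3f}")
```

If the implied interval puts most of its mass on rates you consider implausible, that is a sign to adjust the hyperparameters.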



3.

Use `stan` to estimate your model for the "DriversKilled" column. Please be sure to 

 - report an $\hat{R}$ diagnostic and comment on whether it is close to $1$
 - display trace plots of your samples obtained and comment on whether they look like "fuzzy caterpillars."
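
To build some intuition for what $\hat{R}$ measures, here is a rough sketch of the classic (non-split) version computed by hand. Stan reports a more refined variant, so use Stan's number in your write-up; the function name `basic_rhat` and the simulated chains below are assumptions made purely for illustration.

```python
import random
from statistics import mean, variance


def basic_rhat(chains):
    """Classic potential scale reduction factor for one parameter.

    `chains` is a list of equal-length lists of draws, one list per chain.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: average within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    pooled = (n - 1) / n * within + between / n    # estimate of the posterior variance
    return (pooled / within) ** 0.5


rng = random.Random(1)
# Four hypothetical chains that all explore the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Four hypothetical chains stuck around different values.
stuck = [[rng.gauss(center, 1.0) for _ in range(500)] for center in (0, 3, 6, 9)]

print("well-mixed chains:", round(basic_rhat(mixed), 3))
print("stuck chains:     ", round(basic_rhat(stuck), 3))
```

Chains that agree with each other give a value near $1$, while chains that disagree about where the posterior sits give a value well above $1$.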

Then, after checking diagnostics...

 - display a histogram of the posterior for $\theta$
 - report estimates of the mean, 5th and 95th percentiles of this posterior
 - comment on whether your posterior mean is close to the frequentist estimator of $\theta$ (which is the sample mean of your data)
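
The numerical summaries requested above can be computed directly from a list of posterior draws. The following is a minimal sketch using only the standard library; `summarize_draws` is a made-up helper name, and the draws here are simulated stand-ins because how you extract samples depends on which Stan interface you use.

```python
import random
from statistics import mean, quantiles


def summarize_draws(draws):
    """Posterior mean plus the 5th and 95th percentiles of a list of draws."""
    cuts = quantiles(draws, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    return {"mean": mean(draws), "q05": cuts[0], "q95": cuts[-1]}


rng = random.Random(0)
# Stand-in draws; in your notebook these would come from your fitted model.
fake_draws = [rng.gauss(2.0, 0.1) for _ in range(4000)]

print(summarize_draws(fake_draws))
```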


4.

Now use `stan` to estimate a slightly reparameterized model. Suppose you want to use a normal prior on an unconstrained parameter. Notice that if something is positive, then the (natural) log of it is unconstrained. Similarly, if something is unconstrained, the exponential of it is positive.

Therefore, use the following model, where $a$ and $b$ are hyperparameters that you choose:


$$
\theta \sim \text{Normal}(a,b)
$$
and
$$
y_i \mid \theta \sim \text{Poisson}(e^{\theta})
$$
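
In this parameterization $\theta$ lives on the log scale, so summaries of $\theta$ and summaries of the rate $e^{\theta}$ are different things. The sketch below illustrates the point with simulated stand-in draws (the center `1.0` and spread `0.3` are arbitrary assumptions, not values from the data): each draw is transformed individually, and the mean of the transformed draws is not the same as the transformed mean.

```python
import math
import random
from statistics import mean

random.seed(0)

# Stand-in draws for the unconstrained theta; real draws would come from Stan.
theta_draws = [random.gauss(1.0, 0.3) for _ in range(4000)]

# Transform each draw to get draws of the Poisson rate exp(theta).
rate_draws = [math.exp(t) for t in theta_draws]

mean_theta = mean(theta_draws)
mean_rate = mean(rate_draws)

print(f"mean of theta:        {mean_theta:.4f}")
print(f"exp(mean of theta):   {math.exp(mean_theta):.4f}")
print(f"mean of exp(theta):   {mean_rate:.4f}")
```

Because the exponential function is convex, the mean of $e^{\theta}$ is never smaller than $e$ raised to the mean of $\theta$, and the gap grows as the posterior gets wider.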

    
Use `stan` to estimate your model for the "DriversKilled" column. Please be sure to 

 - report an $\hat{R}$ diagnostic and comment on whether it is close to $1$
 - display trace plots of your samples obtained and comment on whether they look like "fuzzy caterpillars."

Then, after checking diagnostics...

 - display a histogram of the posterior for $\theta$
 - display a histogram of the posterior for the transformed parameter $e^{\theta}$, too.
 - report estimates of the mean, 5th and 95th percentiles of the posterior of the unconstrained $\theta$
 - comment on whether the posterior mean of the transformed parameter $e^{\theta}$ is close to the frequentist estimator of the rate (which is the sample mean of your data)


### Problem 2: Binomial Data (again!)

Suppose that you have $m > 1$ count data points $y_1, \ldots, y_m$, each having a $\text{Binomial}(n,\eta)$ distribution. Assume further that they're all independent.

Here $n$ is the maximum value each data point can take, and $m$ is the number of data points.

In our second homework we used the beta prior for the parameter that was bounded between $0$ and $1$. 

Now, you must use a normal prior for an unconstrained parameter. 

If $0 < \eta < 1$, then the *logit* transformation $\theta = \text{logit}(\eta)$ is a way to get $-\infty < \theta < \infty$ (unconstrained). Conversely, if you have a $\theta$ that's unconstrained, then `inv_logit` will squash the value to lie between $0$ and $1$.


`stan` conveniently has a `logit()` and an `inv_logit()` function already made for you.
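
If you want to check transformed values on the Python side of your notebook, the two functions are easy to write by hand. This is a small sketch that mirrors the math, $\text{logit}(\eta) = \log\{\eta / (1 - \eta)\}$ and its inverse; these are plain Python helpers written for illustration, not Stan's implementations.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to the whole real line."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map any real number to (0, 1), written to avoid overflow in exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# The two functions undo each other.
for p in (0.1, 0.5, 0.9):
    theta = logit(p)
    print(f"p = {p}, logit(p) = {theta:.4f}, inv_logit(logit(p)) = {inv_logit(theta):.4f}")
```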



Use `stan` to estimate your model on any fictitious data you would like. Be sure to

 - report an $\hat{R}$ diagnostic and comment on whether it is close to $1$
 - display trace plots of your samples obtained and comment on whether they look like "fuzzy caterpillars."

Then, after checking diagnostics...

 - display a histogram of the posterior for $\theta$
 - display a histogram of the posterior for the transformed parameter $\eta = \text{inv\_logit}(\theta)$, too.
 - report estimates of the mean, 5th and 95th percentiles of the posterior of the unconstrained $\theta$
 - comment on whether the posterior mean of the transformed parameter $\eta$ is close to the frequentist estimator of $\eta$ (which is the sample mean of your data divided by $n$).
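
One possible way to produce fictitious data, and the frequentist estimate to compare against, is sketched below. The helper name `simulate_binomial` and the settings `m = 50`, `n = 20`, and `true_eta = 0.3` are arbitrary choices made for illustration; pick whatever values you like.

```python
import random


def simulate_binomial(m, n, eta, seed=0):
    """Draw m independent Binomial(n, eta) counts by summing Bernoulli trials."""
    rng = random.Random(seed)
    return [sum(rng.random() < eta for _ in range(n)) for _ in range(m)]


# Arbitrary illustrative settings.
m, n, true_eta = 50, 20, 0.3
y = simulate_binomial(m, n, true_eta)

# Frequentist estimate of eta: total successes over total trials,
# which equals the sample mean of the counts divided by n.
eta_hat = sum(y) / (m * n)

print("first ten counts:", y[:10])
print(f"estimate of eta: {eta_hat:.4f}")
```

Since you know the value of $\eta$ used to generate the data, you can also check whether your posterior concentrates near it.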
