In [47]:
import numpy as np
import pandas as pd
import bokeh
from bokeh.plotting import figure, output_notebook, show
from scipy.stats import norm, gamma
from scipy.optimize import minimize, leastsq
output_notebook()

David Diaz  
SEFS 590F (Bayesian Models) - Winter 2016-17

## Homework 3

### 1. You have collected temperature data on the December solstice for the last 30 years in Pittsburgh. Assume the data are normally distributed with known standard deviation 12. In a previous study researchers found mean temperature to be 20 with a standard deviation of 7. The solstice temperature measurements are stored in the data frame “temps.csv”.

#### a. Identify the appropriate conjugate distribution and write out the full posterior for the quantity you are modeling. 

The normal prior is conjugate here, so the posterior is also normal.  
Likelihood: $y_i\sim \text{normal}(\mu,\sigma^2)$, with $\sigma^2$ known  
Prior: $\mu\sim \text{normal}(\mu_0,\sigma_0^2)$  
Posterior: $\mu \mid y\sim \text{normal}\left(\frac{\left(\frac{\mu_0}{\sigma_0^2}+\frac{\sum_{i=1}^n y_i}{\sigma^2}\right)}{\left(\frac{1}{\sigma_0^2}+\frac{n}{\sigma^2}\right)},\left(\frac{1}{\sigma_0^2}+\frac{n}{\sigma^2}\right)^{-1}\right)$
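
The update rule above can be sketched as a small helper. This is a minimal illustration rather than part of the original notebook: the function name `normal_posterior` is hypothetical, and the summary statistics passed to it (40 observations averaging 23.3) are assumed values used only to exercise the formula. Note that the second parameter of the posterior is a variance, so the helper takes its square root before returning a standard deviation.

```python
import math

def normal_posterior(y_sum, n, sigma, mu_0, sigma_0):
    """Posterior mean and SD of mu for a normal likelihood with known sigma
    and a normal(mu_0, sigma_0^2) prior."""
    precision = 1 / sigma_0**2 + n / sigma**2      # posterior precision
    mean = (mu_0 / sigma_0**2 + y_sum / sigma**2) / precision
    sd = math.sqrt(1 / precision)                  # SD, not variance
    return mean, sd

# Hypothetical summary statistics: 40 observations averaging 23.3
post_mean, post_sd = normal_posterior(y_sum=40 * 23.3, n=40, sigma=12,
                                      mu_0=20, sigma_0=7)
print(round(post_mean, 2), round(post_sd, 2))
```

Returning the standard deviation makes the result directly usable as the `scale` argument of `scipy.stats.norm`.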

#### b. Plot both the posterior and the prior distributions for the modeled quantity and interpret.

In [32]:
temps = pd.read_csv("temps.csv")

sigma, n = 12, len(temps) # set parameters for the data
mu_0, sigma_0 = 20, 7 # set parameters for the prior
# parameters for the posterior (scipy's norm takes a standard deviation as `scale`)
loc = ((mu_0/sigma_0**2)+(temps.mintemp.sum()/sigma**2))/(1/sigma_0**2+n/sigma**2)
scale = (1/sigma_0**2+n/sigma**2)**(-0.5) # posterior SD = sqrt of the posterior variance

xs = np.arange(0, 50, 0.01)
priors = norm.pdf(xs, mu_0, sigma_0)
posteriors = norm.pdf(xs, loc, scale)
likelihood = norm.pdf(xs, temps.mintemp.mean(), temps.mintemp.std())

p = figure(title="Distributions of Mean Min. Temp for Winter Solstice", 
            x_axis_label='Mu', y_axis_label='Density')
# `legend_label` replaces the deprecated `legend` keyword in recent Bokeh releases
p.line(xs, priors, line_width=2, color='red', legend_label="Prior")
p.line(xs, posteriors, line_width=2, color='blue', legend_label="Posterior")
p.line(xs, likelihood, line_width=2, color="orange", legend_label="Likelihood")
show(p)

Interpretation:  
After collecting 40 new observations of minimum winter solstice temperature (mean 23.3, SD 11.68), our estimate of the mean minimum temperature moved up from the prior mean of 20 to about 23, and the posterior distribution of the mean is considerably narrower than the prior.

#### c. Using the posterior distribution, determine the 95% credible interval for the parameter.

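In [46]:
# norm.ppf expects a standard deviation, so derive the posterior SD explicitly
post_sd = (1/sigma_0**2 + n/sigma**2)**(-0.5)
print("({:.2f}, {:.2f})".format(norm.ppf(0.025, loc=loc, scale=post_sd),
                                norm.ppf(0.975, loc=loc, scale=post_sd)))

(19.48, 26.66)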


#### d. Comment on the informative nature of the prior in terms of its relationship to the posterior.

The prior was moderately informative. It narrowed the posterior distribution somewhat, but it did not pull the estimate of mean minimum temperature far from the sample mean.
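
One way to quantify how informative the prior is: the posterior mean is a precision-weighted average of the prior mean and the sample mean, so the prior's share of the total precision measures its influence. The sketch below is an illustration under assumptions, not part of the original analysis; `prior_weight` is a hypothetical helper, and n = 40 is taken from the interpretation in part b.

```python
def prior_weight(n, sigma, sigma_0):
    """Fraction of the posterior precision contributed by the prior."""
    prior_precision = 1 / sigma_0**2
    data_precision = n / sigma**2
    return prior_precision / (prior_precision + data_precision)

# Assumes n = 40 observations, known sigma = 12, prior SD = 7
w = prior_weight(n=40, sigma=12, sigma_0=7)
print(f"prior weight: {w:.3f}")
```

A weight well below one half means the data dominate the posterior mean, which fits the description of the prior as only moderately informative.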

### 2. You are conducting research on White-crowned Sparrows and are interested in their relative abundance near Mt. Rainier. You select 30 sites and conduct 5-minute fixed-radius surveys (i.e., at a given site you count all the White-crowned Sparrows you observe within a 50 m radius during a 5-minute period). Use these data (“sparrows.csv”) to answer the following questions.

#### a. What is the likelihood distribution you would select for these data and why?

In [40]:
sparrows = pd.read_csv('sparrows.csv')
print("mean: ", sparrows['count'].mean())
print("var: ", sparrows['count'].var())

mean:  4.6
var:  4.041379310344827


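I would model these sparrow counts using the Poisson distribution. The data are non-negative integer counts, and the sample variance (4.04) is close to the sample mean (4.6), which is consistent with the Poisson assumption that the mean and variance are equal.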

#### b. What parameter will you need a prior distribution for and what distribution is appropriate?

I would need to define a prior distribution for the parameter $\lambda$, which represents the mean number of occurrences per observation. I would select the conjugate prior, which is the gamma distribution.
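
Conjugacy makes the update simple: a Gamma($\alpha$, $\beta$) prior with rate $\beta$, combined with $n$ Poisson counts, gives a Gamma($\alpha + \sum y_i$, $\beta + n$) posterior. The sketch below illustrates this under assumptions; `gamma_poisson_posterior` is a hypothetical helper and the counts are made up, not the sparrow data.

```python
def gamma_poisson_posterior(alpha, beta, counts):
    """Gamma(alpha, beta) prior (beta is a rate) + Poisson counts
    -> Gamma(alpha + sum(counts), beta + len(counts)) posterior."""
    return alpha + sum(counts), beta + len(counts)

counts = [3, 5, 4, 6, 2]          # hypothetical counts, not the sparrow data
a_post, b_post = gamma_poisson_posterior(2.0, 0.5, counts)
print(a_post, b_post, a_post / b_post)   # shape, rate, posterior mean
```

When passing these parameters to `scipy.stats.gamma`, the rate has to be converted with `scale=1/b_post`, as the cells in this notebook do.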

#### c. Take the prior distribution that you select and examine 4-5 different values for the parameters. Create a plot with these different parameter values (see below for an idea of what the graph should look like).

In [73]:
xs = np.arange(0.05, 15, 0.001)
uninformative = gamma.pdf(xs, a=0.001, scale=1/0.001)
inform_0 = gamma.pdf(xs, a=0.1, scale=1/0.1)
inform_1 = gamma.pdf(xs, a=1, scale=1/1)
inform_2 = gamma.pdf(xs, a=2, scale=1/2)
inform_3 = gamma.pdf(xs, a=2, scale=1/0.5)

p2 = figure(title="Distributions of lambda", 
            x_axis_label='lambda', y_axis_label='Density')
# `legend_label` replaces the deprecated `legend` keyword in recent Bokeh releases
p2.line(xs, uninformative, line_width=2, color='red', legend_label="a=0.001, b=0.001")
p2.line(xs, inform_0, line_width=2, color='blue', legend_label="a=0.1, b=0.1")
p2.line(xs, inform_1, line_width=2, color="orange", legend_label="a=1, b=1")
p2.line(xs, inform_2, line_width=2, color="green", legend_label="a=2, b=2")
p2.line(xs, inform_3, line_width=2, color="aqua", legend_label="a=2, b=0.5")

show(p2)

#### d. From the values you explored, select a vague prior and an informative prior. Calculate the posterior distribution under these two different priors. Plot both the posterior and the prior distributions for the modeled quantity and interpret. (example graph below, shows a prior distribution with the dashed line and the posterior as a solid line. The red lines are one prior and posterior, and the black is the other pair).

In [91]:
alpha_0, beta_0 = 0.001, 0.001 # vague prior  
alpha_1, beta_1 = 8, 5 # informative prior

post_vague = alpha_0 + sparrows['count'].sum(), beta_0 + len(sparrows)
post_inform = alpha_1 + sparrows['count'].sum(), beta_1 + len(sparrows)

p3 = figure(title="Distributions of lambda", 
            x_axis_label='lambda', y_axis_label='Density')
# priors are drawn solid, posteriors dashed; `legend_label` replaces the deprecated `legend`
p3.line(xs, gamma.pdf(xs, a=alpha_0, scale=1/beta_0),
        line_width=3, color='red', legend_label="Vague Prior", alpha=0.5)
p3.line(xs, gamma.pdf(xs, a=alpha_1, scale=1/beta_1),
        line_width=3, color='blue', legend_label="Informative Prior", alpha=0.5)
p3.line(xs, gamma.pdf(xs, post_vague[0], scale=1/post_vague[1]), line_dash='dashed',
        line_width=3, color='red', legend_label="Vague Posterior", alpha=0.5)
p3.line(xs, gamma.pdf(xs, post_inform[0], scale=1/post_inform[1]), line_dash='dashed',
        line_width=3, color='blue', legend_label="Informative Posterior", alpha=0.5)
show(p3)

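The rate parameter beta of the gamma prior acts like a prior sample size, so increasing it strengthens the prior's pull on the posterior. The vague prior leaves the posterior essentially determined by the data, while the informative prior (alpha = 8, beta = 5, prior mean 1.6) shifts the posterior toward lower values of lambda.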

#### e. Using the posterior distributions from your ‘vague’ and informative priors, determine the posterior mean and the 95% credible interval for the parameter. Comment on the effect of the parameter estimate and the credible interval under the two different priors. 

In [112]:
print("Vague", gamma.stats(post_vague[0], scale=1/post_vague[1], moments='m'),
      "({:3.3f}, {:3.3f})".format(gamma.ppf(0.025, post_vague[0], scale=1/post_vague[1]), 
                 gamma.ppf(0.975, post_vague[0], scale=1/post_vague[1])))
print("Informative", gamma.stats(post_inform[0], scale=1/post_inform[1], moments='m'),
      "({:3.3f}, {:3.3f})".format(gamma.ppf(0.025, post_inform[0], scale=1/post_inform[1]), 
                 gamma.ppf(0.975, post_inform[0], scale=1/post_inform[1])))

Vague 4.599880003999867 (3.864, 5.398)
Informative 4.171428571428572 (3.522, 4.875)


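The higher the beta parameter, the narrower the posterior credible interval (width about 1.35 under the informative prior versus 1.53 under the vague prior) and the further the posterior mean is pulled toward the prior mean (4.17 versus 4.60).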
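
The role of beta can be made explicit: the posterior mean $(\alpha + \sum y_i)/(\beta + n)$ is a weighted average of the prior mean $\alpha/\beta$ and the sample mean, with weight $\beta/(\beta + n)$ on the prior. The sketch below illustrates this; `shrinkage` is a hypothetical helper, and the total of 138 assumes 30 sites with the sample mean of 4.6 reported in part a.

```python
def shrinkage(alpha, beta, total, n):
    """Return (weight on prior mean, posterior mean) for a gamma-Poisson model."""
    w = beta / (beta + n)                       # prior acts like beta pseudo-sites
    post_mean = w * (alpha / beta) + (1 - w) * (total / n)
    return w, post_mean

# total = 138 assumes 30 sites with sample mean 4.6, as reported in part a
for name, (a, b) in {"vague": (0.001, 0.001), "informative": (8, 5)}.items():
    w, m = shrinkage(a, b, total=138, n=30)
    print(f"{name}: prior weight {w:.4f}, posterior mean {m:.3f}")
```

Under these assumptions the informative prior carries about one seventh of the weight, which is why its posterior mean sits visibly below the sample mean.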

#### f. Fit the sparrow data using a generalized linear model using a likelihood approach (e.g., using the glm command in R). Determine the mean parameter estimate and calculate the 95% confidence interval for the parameter.
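
A possible starting point for this part, offered as a sketch under assumptions rather than a finished answer: for an intercept-only Poisson model with a log link, the maximum-likelihood estimate of $\lambda$ is the sample mean, and the standard error of $\log\hat\lambda$ is $1/\sqrt{\sum y_i}$. The helper `poisson_intercept_ci` is hypothetical, it builds a Wald interval on the log scale, and the total of 138 assumes 30 sites with mean 4.6. A profile-likelihood interval (which R's `confint` may use for a glm) can differ slightly.

```python
import math
from statistics import NormalDist

def poisson_intercept_ci(total, n, level=0.95):
    """MLE and Wald interval for lambda in an intercept-only Poisson
    model with a log link (interval built on the log scale)."""
    lam_hat = total / n                       # MLE of lambda is the sample mean
    se_log = 1 / math.sqrt(total)             # SE of log(lambda_hat)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lo = math.exp(math.log(lam_hat) - z * se_log)
    hi = math.exp(math.log(lam_hat) + z * se_log)
    return lam_hat, lo, hi

# Assumed summary: 138 birds counted over 30 sites
lam_hat, lo, hi = poisson_intercept_ci(total=138, n=30)
print(f"{lam_hat:.2f} ({lo:.2f}, {hi:.2f})")
```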

#### g. Describe the difference between the 95% confidence interval you found in part f with the credible interval you found in part e (I’m specifically asking you to interpret each one, not just to compare the values you found).

### 3. For the analysis of the sparrow data, you decide to use a log link in your model. Thus you have for your model:
$$y_i\sim \text{Pois}(\lambda)$$
$$\log(\lambda)=\alpha$$

#### a. Use the vague prior you selected in 2d as the prior for $\alpha$. Will the prior still be vague for $\lambda$? Demonstrate this mathematically or in a plot.
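
One way to approach this, sketched under assumptions: if $\alpha \sim \text{Gamma}(a, b)$ and $\lambda = e^{\alpha}$, a change of variables gives $p(\lambda) = p_\alpha(\log\lambda)/\lambda$. Because a gamma-distributed $\alpha$ is positive, the implied prior puts no mass on $\lambda < 1$. The helper names below are hypothetical, and the gamma density is written out with the standard library instead of `scipy.stats.gamma`.

```python
import math

def gamma_pdf(x, a, b):
    """Gamma density with shape a and rate b; zero for x <= 0."""
    if x <= 0:
        return 0.0
    return math.exp(a * math.log(b) + (a - 1) * math.log(x) - b * x - math.lgamma(a))

def implied_lambda_pdf(lam, a, b):
    """Density of lambda = exp(alpha) when alpha ~ Gamma(a, b)."""
    return gamma_pdf(math.log(lam), a, b) / lam   # Jacobian |d alpha / d lambda| = 1/lambda

# Evaluate the implied prior at a few values of lambda
for lam in (0.5, 2, 5, 50):
    print(lam, implied_lambda_pdf(lam, 0.001, 0.001))
```

Evaluating `implied_lambda_pdf` over a grid and plotting it next to the original gamma prior would show the comparison the question asks for.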

#### b. Select a prior for $\alpha$ that is also vague for $\lambda$. Show the relationship in a plot.
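
One candidate, offered as a sketch and not as the only valid choice: pick the prior on $\alpha$ that induces exactly the vague gamma prior on $\lambda$. If $\lambda \sim \text{Gamma}(a, b)$ and $\alpha = \log\lambda$, the change of variables gives $p(\alpha) = \frac{b^a}{\Gamma(a)}\exp\left(a\alpha - b e^{\alpha}\right)$. The function name `alpha_pdf` is hypothetical.

```python
import math

def alpha_pdf(alpha, a, b):
    """Density of alpha = log(lambda) when lambda ~ Gamma(a, b) (b is a rate):
    p(alpha) = b^a / Gamma(a) * exp(a*alpha - b*exp(alpha))."""
    return math.exp(a * math.log(b) - math.lgamma(a) + a * alpha - b * math.exp(alpha))

# Evaluate the prior on alpha implied by the vague Gamma(0.001, 0.001) prior on lambda
for alpha in (-2, 0, 1, 2):
    print(alpha, alpha_pdf(alpha, 0.001, 0.001))
```

Plotting `alpha_pdf` over a grid of $\alpha$ values alongside the gamma density over $\lambda = e^{\alpha}$ would show the relationship between the two scales.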