# Problem 2: Bayesian vs. Frequentist

_Here, you will explore the difference between a frequentist and a Bayesian approach._

You have a measurement station on the roof of the CSU Atmospheric Science building that measures wind
direction and the concentration of Nitrogen (C) in the air. You know from many years of measurements
that winds from the east (E) flow over the Greeley feedlots and the air that arrives at your instrument has
anomalous Nitrogen concentrations that follow a normal distribution with mean μE = 0.2 and σE = 1. On
the other hand, winds from the west (W) bring relatively pristine air from the mountains and that air has
anomalous Nitrogen concentrations that follow a normal distribution with mean μW = 0.0 and σW = 1.

But wait! You have a problem. You just went to check on your instrument, and you found that it is no
longer recording the hourly wind direction or the concentrations! Due to a glitch in the software, all you know
is that the average Nitrogen concentration of the air over the past 100 samples was C100 = 0.13. **Do you
think the wind over the last 100 samples was from the east or the west?**

(a) Using hypothesis testing, can you reject the null hypothesis that the air is actually pristine air from the west
(H0 : μ = 0)? Use a two-tailed test at the 95% confidence level.

Setting up the problem: 

1. Significance level: alpha = 0.05 

2. State null hypothesis and the alternative
    
        H0: the air is from the west (mu = mu_W = 0)

        H1: the air is not from the west (mu =/= 0)

3. Statistic to be used: N = 100 > 30 and the standard deviation is known, therefore the z-statistic can be used. 

4. Critical region: |z| > z_c, where z_c = 1.96 is the two-tailed critical value for alpha = 0.05 (alpha/2 = 0.025 in each tail)

5. Evaluate statistic and state the conclusion 

In [1]:
#.............................................
# IMPORT STATEMENTS
#.............................................
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
import matplotlib as mpl
import scipy.stats as stats
import random

# set figure defaults
mpl.rcParams['figure.dpi'] = 150
plt.rcParams['figure.figsize'] = (10.0/2, 7.0/2)

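The z-statistic (general two-sample form): 

z = ((x_bar1 - x_bar2) - Delta_1,2)/(sqrt(sigma_1^2/N_1 + sigma_2^2/N_2))

Delta_1,2 is the hypothesized difference (usually 0)

Here a single sample mean is compared against a population with known mean and standard deviation, so the statistic reduces to the one-sample form:

z = (C_bar100 - mu)/(sigma/sqrt(N))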

In [8]:
#Constants

alpha = 0.05
z_c = stats.norm.ppf(1 - alpha/2) #Two-tailed critical z-value (about 1.96 for alpha = 0.05)

std_E = 1 #Standard deviation of concentrations in air from the East 
std_W = 1 #Standard deviation of concentrations in air from the West 

mu_E = 0.2 #Mean of the normal distribution of concentrations in air from the East 
mu_W = 0.0 #Mean of the normal distribution of concentrations in air from the West 

C_bar100 = 0.13 #Average Nitrogen concentration of the air over the past 100 samples (were these samples from the East or from the West?)
N = 100

z_E = (C_bar100 - mu_E)/(std_E/(np.sqrt(N)))
z_W = (C_bar100 - mu_W)/(std_W/(np.sqrt(N)))

print(f"z_c = {z_c:.2f}, z_E = {z_E:.2f}, z_W = {z_W:.2f}")
if abs(z_W) > z_c:
    print("Since |z_W| > z_c, we can reject the null hypothesis that the wind came from the West.")
else:
    print("Since |z_W| < z_c, we cannot reject the null hypothesis that the wind came from the West.")

z_c = 1.96, z_E = -0.70, z_W = 1.30
Since |z_W| < z_c, we cannot reject the null hypothesis that the wind came from the West.
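
As a cross-check on part (a), the same test can be phrased in terms of a two-sided p-value. The sketch below is illustrative only: the names `p_value` and `reject_H0` are not part of the original notebook, and it uses the standard-library `statistics.NormalDist` so that it runs on its own (`scipy.stats.norm` should give the same values).

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement
C_bar100 = 0.13   # mean concentration over the last 100 samples
mu_W = 0.0        # mean under H0 (air from the west)
std_W = 1.0       # known standard deviation for westerly air
N = 100
alpha = 0.05

# One-sample z-statistic for the sample mean under H0
z_W = (C_bar100 - mu_W) / (std_W / sqrt(N))

# Two-sided p-value: probability of a |z| at least this large if H0 is true
p_value = 2 * (1 - NormalDist().cdf(abs(z_W)))

# Hypothetical flag for the decision at significance level alpha
reject_H0 = p_value < alpha

print(f"z_W = {z_W:.2f}, p = {p_value:.3f}, reject H0: {reject_H0}")
```

With z_W = 1.3 the p-value is roughly 0.19, well above alpha = 0.05, so the null hypothesis of westerly air is not rejected at the 95% level.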


(b) You start discussing your issue with a graduate student in the department. They don’t know which way
the wind was recently blowing, but they do point you toward the Iowa Environmental Mesonet website
which provides information about the annual-mean wind directions (see Figure 1). Analyzing the wind rose
information, you see that the winds blow out of the west (between SW and NW) 40% of the time and that
the winds blow out of the east (between NE and SE) 60% of the time.

Let γ denote the fraction of time the air comes from the west (so, γ = 0.4). Use Bayes’ Theorem to compute
the probability that the air came from the west over your 100-sample period.

**Hint 1:**
If the Bayes formulation says the probability that the winds were out of the west is greater than 50%, assume that
the winds were out of the west. If the probability is less than 50%, assume the winds were out of the east.

**Hint 2:**
Although the problem is formulated as Pr(West|C_bar100 = 0.13), we can’t actually test for C_bar100 = 0.13 since the
probability of getting any one value is always zero. Thus, we will instead re-formulate the question as follows: what is the probability that the winds were out of the west given a mean Nitrogen concentration of C_bar100 within δ of 0.13? That is, Pr(West|0.13 − δ <= C_bar100 <= 0.13 + δ). For this homework, let δ = 0.01. 
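
The window probability in Hint 2 can be evaluated directly from the sampling distribution of the mean, which for N = 100 samples is normal with standard deviation sigma/sqrt(N) = 0.1. The following is a minimal sketch under that assumption; the helper `window_prob` and the names `p_D_given_W` and `p_D_given_E` are illustrative, and `statistics.NormalDist` stands in for `scipy.stats.norm` so the block is self-contained.

```python
from math import sqrt
from statistics import NormalDist

N = 100
delta = 0.01
C_bar100 = 0.13
lower, upper = C_bar100 - delta, C_bar100 + delta

def window_prob(mu, sigma):
    """Pr(lower <= C_bar100 <= upper) when single samples are Normal(mu, sigma).

    The mean of N independent samples is Normal(mu, sigma / sqrt(N)).
    """
    mean_dist = NormalDist(mu, sigma / sqrt(N))
    return mean_dist.cdf(upper) - mean_dist.cdf(lower)

# Likelihood of the observed window under each wind direction
p_D_given_W = window_prob(mu=0.0, sigma=1.0)   # west: pristine air
p_D_given_E = window_prob(mu=0.2, sigma=1.0)   # east: feedlot air

print(f"Pr(D|W) = {p_D_given_W:.4f}, Pr(D|E) = {p_D_given_E:.4f}")
```

Under these assumptions the window is roughly twice as likely for easterly air (about 0.062) as for westerly air (about 0.034).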


To apply Bayes' theorem: 

1. Define your variables

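    Pr(W) = γ = 40% = 0.40 #Prior Pr that air came from the West
    
    Pr(E) = Pr(~W) = 1 - γ = 60% = 0.60 #Prior Pr that air did not come from the West 
    
    D: 0.12 <= C_bar100 <= 0.14 #The evidence: the 100-sample mean Nitrogen concentration lies within δ = 0.01 of 0.13
    
    Pr(D|W), Pr(D|E) #Likelihood of D for air from the West or East, from the sampling distribution of the mean (normal with mean mu_W or mu_E and standard deviation sigma/sqrt(N) = 0.1)
    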
    
    
2. Clearly state what you want to know
3. List all of the information the problem gives you 
4. Check the assumptions for your method of solving 
5. Then, solve for what you listed in Step 2
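
The steps above can be sketched in a few lines. This is one possible way to carry out the calculation, not necessarily the intended solution: it assumes the evidence is the window 0.13 ± δ from Hint 2, that the 100 samples all come from a single wind direction, and that the sample mean is normal with standard deviation sigma/sqrt(N). The names `likelihood`, `posterior_W`, and `posterior_E` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Priors from the wind rose
gamma = 0.4          # Pr(W): fraction of time the air comes from the west
prior_W = gamma
prior_E = 1 - gamma  # Pr(E)

# Evidence D: the 100-sample mean lies within delta of 0.13
N, delta, C_bar100 = 100, 0.01, 0.13
lower, upper = C_bar100 - delta, C_bar100 + delta

def likelihood(mu, sigma):
    """Pr(D | direction) from the sampling distribution of the mean."""
    mean_dist = NormalDist(mu, sigma / sqrt(N))
    return mean_dist.cdf(upper) - mean_dist.cdf(lower)

like_W = likelihood(mu=0.0, sigma=1.0)
like_E = likelihood(mu=0.2, sigma=1.0)

# Bayes' theorem: Pr(W|D) = Pr(D|W) Pr(W) / [Pr(D|W) Pr(W) + Pr(D|E) Pr(E)]
evidence = like_W * prior_W + like_E * prior_E
posterior_W = like_W * prior_W / evidence
posterior_E = like_E * prior_E / evidence

print(f"Pr(W|D) = {posterior_W:.3f}, Pr(E|D) = {posterior_E:.3f}")
```

Under these assumptions Pr(W|D) comes out near 0.27, which is below 50%, so by Hint 1 the winds would be taken to be from the east. The frequentist test in part (a) only asks whether the data are inconsistent with westerly air, while the Bayesian calculation also weighs the easterly alternative and the prior from the wind rose.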

