# Alarm or no alarm?

Suppose that _y_ is binomial(n, $\theta$) and that the action space is {alarm, no alarm}. The loss function is as follows:

\begin{align*}
 L(\theta, \text{no alarm}) &= 
 \begin{cases} 
  5000 & \text{if}\; \theta > 0.15 \\
  0 & \text{if}\; \theta < 0.15
  \end{cases} \\
 L(\theta, \text{alarm}) &= 
  \begin{cases} 
  0 & \text{if}\; \theta > 0.15 \\
  1000 & \text{if}\; \theta < 0.15
  \end{cases}
\end{align*}

When is pushing "alarm" the right decision? Find the posterior distribution given a prior $p(\theta)$, and then, for n=50, determine the values of _y_ for which one should decide on "alarm".

## (a) 

Given a uniform prior, find the posterior and the values of _y_ for which to push alarm.

We can use $\theta \sim Beta(1,1)$ as a uniform prior, which is a special case of the general Beta prior:

$$\theta \sim Beta(a, b)$$

and our likelihood function is:

$$y | \theta \sim Bin(n, \theta), \quad n=50$$

Calculation of the posterior:

\begin{align*}
\pi(\theta|y) &= \frac{1}{f_{marg}(y)} p(\theta) f(y|\theta) \\
&= \frac{1}{f_{marg}(y)} \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \theta^{a-1} (1-\theta)^{b-1} \binom{n}{y} \theta^y (1-\theta)^{n-y} \\
&\propto \theta^{a+y-1} (1-\theta)^{b+n-y-1}
\end{align*}

which will yield

$$\theta | y \sim Beta(a+y, b+n-y)$$
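
As a quick sanity check of the conjugacy result, the sketch below normalises prior times likelihood numerically on a grid and compares it with the closed-form Beta density. The value y = 5 and the grid size are illustrative assumptions chosen only for this check; they are not part of the exercise.

```python
import numpy as np
from scipy import stats

a, b = 1, 1     # uniform prior, Beta(1, 1)
n, y = 50, 5    # y = 5 is an assumed, illustrative observation

# Evaluate prior * likelihood on a grid and normalise with a Riemann sum.
theta = np.linspace(0.0, 1.0, 20001)
unnormalised = stats.beta.pdf(theta, a, b) * stats.binom.pmf(y, n, theta)
dtheta = theta[1] - theta[0]
grid_posterior = unnormalised / (unnormalised.sum() * dtheta)

# Closed-form conjugate posterior Beta(a + y, b + n - y).
exact_posterior = stats.beta.pdf(theta, a + y, b + n - y)

max_abs_diff = np.max(np.abs(grid_posterior - exact_posterior))
print(max_abs_diff)
```

If the derivation holds, the printed difference should be very small compared with the height of the density.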

We must also take into consideration the risk (the posterior expected loss) of "alarm" and "no alarm". First, the risk of pushing alarm:

\begin{align*}
 R_{\text{alarm}} &= E\left(L(\theta, \text{ alarm})|data\right) \\
 &= 1000 \cdot Pr(\theta < 0.15 | data) \\
 &= 1000 G, \quad \text{where } G = Pr(\theta<0.15|data)
\end{align*}

And the risk of not pushing alarm:

\begin{align*}
 R_{\text{no alarm}} &= E\left(L(\theta, \text{ no alarm})|data\right) \\
 &= 5000 \cdot Pr(\theta > 0.15 | data) \\
 &= 5000(1 - G) 
\end{align*}

We want to push alarm when $R_{\text{alarm}} < R_{\text{no alarm}}$, that is, when the expected loss of not pushing alarm is the greater of the two:

\begin{align*}
 1000 \cdot G &< 5000 \cdot (1 - G) \\
 G &< 5 - 5G \\
 6G &< 5 \\
 G &< \frac56 \approx 0.8333
\end{align*}
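
The decision rule can be sketched in a few lines. The constant names and the helper `expected_losses` are hypothetical, introduced here only for illustration, and the two values of G in the loop are made-up examples on either side of the break-even point.

```python
from fractions import Fraction

LOSS_FALSE_ALARM = 1000    # alarm pushed although theta < 0.15
LOSS_MISSED_ALARM = 5000   # no alarm although theta > 0.15

def expected_losses(G):
    """Return (R_alarm, R_no_alarm) for G = Pr(theta < 0.15 | data)."""
    return LOSS_FALSE_ALARM * G, LOSS_MISSED_ALARM * (1 - G)

# Break-even point: 1000 * G = 5000 * (1 - G)
threshold = Fraction(LOSS_MISSED_ALARM, LOSS_FALSE_ALARM + LOSS_MISSED_ALARM)
print(threshold)  # 5/6

# Illustrative values of G, one above and one below the threshold.
for G in (0.90, 0.80):
    r_alarm, r_no_alarm = expected_losses(G)
    decision = "alarm" if r_alarm < r_no_alarm else "no alarm"
    print(G, decision)
```

Writing the threshold as missed-alarm loss over the sum of both losses makes it easy to see how the cutoff would move if the costs changed.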

For a=b=1, this gives:

In [1]:
import numpy as np
import scipy.stats as stats
n = 50
y_val = np.arange(0, n+1, 1)
limit = 5/6
a = 1
b = 1

# G1[y] = Pr(theta < 0.15 | y) under the Beta(a+y, b+n-y) posterior
G1 = stats.beta.cdf(0.15, a+y_val, b+n-y_val)


for i, prob in enumerate(G1):
    print(i, prob)
    if prob < limit:
        print(f"We need a y value of at least {i} to push alarm")
        break

0 0.9997486000358442
1 0.997486000358442
2 0.9875039429581385
3 0.9587321304513812
4 0.8978035863194245
5 0.7967338837005309
We need a y value of at least 5 to push alarm


## (b)

Now do the same, but with the prior $\theta \sim Beta(2,8)$.

In [3]:
import numpy as np
import scipy.stats as stats
n = 50
y_val = np.arange(0, n+1, 1)
limit = 5/6
a = 2
b = 8

# G1[y] = Pr(theta < 0.15 | y) under the Beta(a+y, b+n-y) posterior
G1 = stats.beta.cdf(0.15, a+y_val, b+n-y_val)


for i, prob in enumerate(G1):
    print(i, prob)
    if prob < limit:
        print(f"We need a y value of at least {i} to push alarm")
        break

0 0.9992182472355202
1 0.9955680923476412
2 0.9833293377235763
3 0.9530924145347104
4 0.8943972106975
5 0.8011754163678123
We need a y value of at least 5 to push alarm
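
Since (a) and (b) differ only in the prior parameters, the search can be wrapped in a small function. This is a minimal sketch; the name `alarm_threshold` and its default arguments are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

def alarm_threshold(a, b, n=50, theta0=0.15, limit=5/6):
    """Smallest y with Pr(theta < theta0 | y) < limit under a Beta(a, b) prior.

    Returns None if no y in 0..n triggers the alarm.
    """
    y_val = np.arange(n + 1)
    # Posterior is Beta(a + y, b + n - y); G[y] = Pr(theta < theta0 | y).
    G = stats.beta.cdf(theta0, a + y_val, b + n - y_val)
    hits = np.flatnonzero(G < limit)
    return int(hits[0]) if hits.size else None

print(alarm_threshold(1, 1))  # uniform prior from (a)
print(alarm_threshold(2, 8))  # Beta(2, 8) prior from (b)
```

Both calls should reproduce the cutoff of 5 found in the outputs above.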


## (c) 

Now for a more unusual situation: the prior is a mixture of two Beta distributions:

$$\theta \sim \frac12 Beta(2,8) + \frac12 Beta(8,2)$$

This makes the math a bit more complicated, because the posterior mixture weights depend on the marginal distribution of _y_ under each component. Given the prior above and a binomial likelihood, we get the following posterior:

$$\theta | y \sim w_1(y) Beta(2+y, 8+n-y) + w_2(y) Beta(8+y, 2+n-y)$$

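where $w_1(y) = \frac{f_1(y)}{f_1(y) + f_2(y)}$ and $w_2(y) = \frac{f_2(y)}{f_1(y) + f_2(y)}$ (the equal prior weights of $\frac12$ cancel).

Here $f_i(y)$, for $i=1,2$, is the marginal distribution of _y_ under mixture component $i$, whose prior is $Beta(a_i, b_i)$: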

\begin{equation*}
f_i(y) = \frac{n!}{y!(n-y)!} \cdot \frac{\Gamma(a_i+b_i)}{\Gamma(a_i)\Gamma(b_i)} \cdot \frac{\Gamma(a_i+y) \Gamma(b_i + n - y)}{\Gamma(a_i+b_i+n)}
\end{equation*}

In this exercise we'll use the log-gamma function, so that the large factorials and gamma values do not overflow floating-point limits. First, let's calculate the log marginals:

In [4]:
from scipy.special import loggamma as lg
a1, b1 = 2, 8
a2, b2 = 8, 2
# f1[y] and f2[y] hold log f_1(y) and log f_2(y); n and y_val come from the cells above
f1 = [lg(n+1) - lg(y+1)-lg(n-y+1) + lg(a1+y) + lg(b1+n-y) - lg(n+a1+b1) + lg(a1+b1) - lg(a1)-lg(b1) for y in y_val]
f2 = [lg(n+1) - lg(y+1)-lg(n-y+1) + lg(a2+y) + lg(b2+n-y) - lg(n+a2+b2) + lg(a2+b2) - lg(a2)-lg(b2) for y in y_val]

In [5]:
# note that exp(f1) and exp(f2) are pmfs over y = 0..n, so each should sum to 1
print(sum(np.exp(f1)), sum(np.exp(f2)))

1.0000000000000049 1.000000000000007
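
The marginal $f_i(y)$ has the form of a beta-binomial pmf, so the hand-written log-gamma expression can be cross-checked against `scipy.stats.betabinom`. The sketch below is self-contained and uses `gammaln` (the real-valued log-gamma) in place of `loggamma`; the helper name `log_marginal` is an assumption for illustration.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln

n = 50
y_val = np.arange(n + 1)

def log_marginal(y, n, a, b):
    """Log of f_i(y) from the formula above, built from log-gamma terms."""
    return (gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)
            + gammaln(a + b) - gammaln(a) - gammaln(b)
            + gammaln(a + y) + gammaln(b + n - y) - gammaln(a + b + n))

for a, b in [(2, 8), (8, 2)]:
    manual = log_marginal(y_val, n, a, b)
    reference = stats.betabinom.logpmf(y_val, n, a, b)
    # Largest absolute disagreement between the two computations.
    print(a, b, np.max(np.abs(manual - reference)))
```

If the two agree to within floating-point error, `betabinom.logpmf` could be used directly instead of spelling out the gamma terms.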


Now that we have the marginals, we can calculate the "weights" and find when we should push alarm:

In [6]:
w1 = np.exp(f1) / (np.exp(f1)+np.exp(f2))
w2 = np.exp(f2) / (np.exp(f1)+np.exp(f2))

G3 = w1*stats.beta.cdf(0.15, a1+y_val, b1+n-y_val) + \
w2*stats.beta.cdf(0.15, a2+y_val, b2+n-y_val)


for i, prob in enumerate(G3):
    print(i, prob)
    if prob < limit:
        print(f"We need a y value of at least {i} to push alarm")
        break

0 0.9992181850060875
1 0.9955676940824276
2 0.9833276210296632
3 0.9530868181347092
4 0.8943825466878529
5 0.8011434012397657
We need a y value of at least 5 to push alarm


In [14]:
# also note, w1+w2 should be equal to 1 for all y values
w1+w2

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
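
For n=50, exponentiating the log marginals works fine, but for much larger n the exponentials could underflow to zero. One possible alternative, sketched here under the assumption that the marginals are beta-binomial pmfs, is to compute the weights directly from the difference of the log marginals with the logistic function `scipy.special.expit`.

```python
import numpy as np
from scipy import stats
from scipy.special import expit

n = 50
y_val = np.arange(n + 1)

# Log marginals of y under each mixture component (beta-binomial log-pmf).
log_f1 = stats.betabinom.logpmf(y_val, n, 2, 8)
log_f2 = stats.betabinom.logpmf(y_val, n, 8, 2)

# w1 = f1 / (f1 + f2) = 1 / (1 + exp(log_f2 - log_f1)) = expit(log_f1 - log_f2)
w1 = expit(log_f1 - log_f2)
w2 = expit(log_f2 - log_f1)

# Mixture posterior probability that theta < 0.15, for each y.
G3 = (w1 * stats.beta.cdf(0.15, 2 + y_val, 8 + n - y_val)
      + w2 * stats.beta.cdf(0.15, 8 + y_val, 2 + n - y_val))
print(w1[:6])
print(G3[:6])
```

Because the two components mirror each other, the weights should both be one half at y = 25, which gives a simple check on the computation.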