Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your collaborators below:

In [None]:
COLLABORATORS = ""

---

In [None]:
import numpy as np

# Bayes' rule for discrete hypotheses

$$ P(h|d)=\frac{P(d|h)\cdot P(h)}{P(d)}$$

Posterior = Prior * Likelihood / Evidence

## Example: Medical diagnosis.

![](images/graphmod1.png)

### Data:
The patient coughed ($C=1$).

### Hypotheses ($H$):
1. Healthy ($h_0$)
1. Cold ($h_1$)
1. Lung cancer ($h_2$)

### Prior probabilities ($P(H)$):
1. $P(h_0)=0.90$
1. $P(h_1)=0.09$
1. $P(h_2)=0.01$

### Likelihood function ($P(D|H)$):
1. $P(C=1|h_0)=0.01$
1. $P(C=1|h_1)=0.5$
1. $P(C=1|h_2)=0.99$

### Posterior ($P(H|d)$)

According to Bayes' rule, the posterior probability of having lung cancer ($h_2$) is
$$P(H=h_2|C=1)=\frac{P(h_2)\cdot P(C=1|h_2)}{P(C=1)}.$$

The marginal probability of the data is
$$P(C=1)=\sum_h P(h)\cdot P(C=1|h).$$
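
As a hand check, the priors and likelihoods listed above can be plugged into these two equations. The arithmetic below uses only those numbers, with the final value rounded:

$$P(C=1)=0.90\cdot 0.01+0.09\cdot 0.5+0.01\cdot 0.99=0.009+0.045+0.0099=0.0639$$

$$P(H=h_2|C=1)=\frac{0.01\cdot 0.99}{0.0639}=\frac{0.0099}{0.0639}\approx 0.155$$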

In [None]:
#hypotheses (H)
hypotheses=np.array([0,1,2])

#data (D)
C=1

#prior (P(h))
p_h=np.array([0.9, 0.09, 0.01])

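#likelihood (P(d|h)), stored as a table indexed as p_d_given_h[c,h]
#row 0: no cough (C=0), row 1: cough (C=1); columns follow the order of the hypotheses
p_cough_given_h=np.array([0.01,0.5,0.99])
p_not_cough_given_h=1-p_cough_given_h
p_d_given_h = np.array([p_not_cough_given_h,p_cough_given_h])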

#marginal probability of the data
p_d=np.sum(p_h * p_d_given_h[C,:])

#Bayes rule
p_healthy_given_cough = p_h[0] * p_d_given_h[C,0] / p_d
p_cold_given_cough = p_h[1] * p_d_given_h[C,1] / p_d
p_lung_cancer_given_cough = p_h[2] * p_d_given_h[C,2] / p_d

print("P(Healthy=1|Cough=1)={}".format(p_healthy_given_cough))
print("P(Cold=1|Cough=1)={}".format(p_cold_given_cough))
print("P(LungCancer=1|Cough=1)={}".format(p_lung_cancer_given_cough))

In [None]:
#Getting the entire posterior at once by vectorized computation
p_h_given_d = p_h * p_d_given_h[C,:] / np.sum(p_h * p_d_given_h[C])
p_h_given_d 

In [None]:
#Posterior = Joint / Marginal probability of the data (aka. evidence)
joint=p_h * p_d_given_h[C,:]
p_of_d= np.sum(p_h * p_d_given_h[C])

posterior = joint/p_of_d
posterior

In [None]:
#What if the patient hadn't coughed?
c0=0
p_h_given_d = p_h * p_d_given_h[c0,:] / np.sum(p_h * p_d_given_h[c0])
p_h_given_d 

## Example 2: Biased Coins and Magic Powers

![](images/graphmod2.png)

### Hypotheses:
1. $h_{0,0}$: unbiased coin and no magic powers ($B=0 \wedge M=0$)
1. $h_{0,1}$: unbiased coin and magic powers ($B=0 \wedge M=1$) 
1. $h_{1,0}$: biased coin and no magic powers ($B=1 \wedge M=0$)
1. $h_{1,1}$: biased coin and magic powers ($B=1 \wedge M=1$)


In [None]:
hypotheses=np.array([[0,0],[0,1],[1,0],[1,1]])

### Priors:
1. $P(M=1)=0.001$
1. $P(B=1)=0.01$
1. $P(M=m,B=b)=P(M=m)\cdot P(B=b)$

In [None]:
#priors
def prior(m,b):
    p_m=np.array([0.999,0.001])
    p_b=np.array([0.99,0.01])
    return p_m[m]*p_b[b]

### Likelihoods:
1. $P(C=1|M=0,B=0)=0.5$
1. $P(C=1|M=0,B=1)=0.6$
1. $P(C=1|M=1,B=0)=1$
1. $P(C=1|M=1,B=1)=1$

In [None]:
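def likelihood(c,m,b):
    #returns P(C=c|M=m,B=b), where c=1 means the coin came up heads
    if m==0 and b==0:
        #unbiased coin, no magic powers: heads and tails are equally likely
        return 0.5
    elif m==0 and b==1:
        #biased coin, no magic powers
        if c==1:
            return 0.6
        else:
            return 0.4
    elif m==1:
        #magic powers: the coin always comes up heads
        if c==1:
            return 1
        else:
            return 0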

### Posterior
$ P(M=m,B=b|C)=\frac{P(C|M=m,B=b)\cdot P(M=m,B=b)}{\sum_{m'} \sum_{b'} P(M=m', B=b') \cdot P(C|m', b')}$
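
The double sum in the denominator is simply the sum of every entry of the joint table, so the whole posterior can also be computed at once with arrays. The following is a minimal sketch rather than part of the original notebook: the names `p_m`, `p_b`, `prior_table`, `lik_heads`, `joint_table` and `posterior_table` are chosen here for illustration, the numbers are copied from the priors and likelihoods listed above, and only the observation $C=1$ is covered.

```python
import numpy as np

# Priors from the text; index 0 means "no", index 1 means "yes".
p_m = np.array([0.999, 0.001])  # P(M=0), P(M=1)
p_b = np.array([0.99, 0.01])    # P(B=0), P(B=1)

# M and B are independent a priori: prior_table[m, b] = P(M=m) * P(B=b)
prior_table = np.outer(p_m, p_b)

# Likelihood of heads: lik_heads[m, b] = P(C=1 | M=m, B=b)
lik_heads = np.array([[0.5, 0.6],
                      [1.0, 1.0]])

# Numerator of Bayes' rule for every hypothesis at once.
joint_table = prior_table * lik_heads

# Dividing by the sum of all entries applies the double sum in the denominator.
posterior_table = joint_table / joint_table.sum()

print(posterior_table)
```

Summing `posterior_table` over the `b` axis (`posterior_table.sum(axis=1)`) gives the marginal posterior over $M$, and summing over the `m` axis (`posterior_table.sum(axis=0)`) gives the one over $B$.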


In [None]:
C=1 # we observed one head

def posterior(m,b,d):
    
    #compute the numerator: the joint probability
    joint=prior(m,b)*likelihood(d,m,b)
    
    if joint==0:
        return 0
    
    #compute the denominator: the marginal probability of the data     
    p_of_d=0
    for b_prime in [0,1]:
        for m_prime in [0,1]:
            p_of_d+=prior(m_prime,b_prime)*likelihood(d,m_prime,b_prime)
    
    return joint/p_of_d

for m in [0, 1]:
    for b in [0, 1]:
        print("P(M={}, B={} | C={}) = {}".format(m, b, C, posterior(m, b, C)))
#[[posterior(0,0,C),posterior(0,1,C)],[posterior(1,0,C),posterior(1,1,C)]]

In [None]:
#How strongly should we believe your friend has magic powers after the coin came up heads?
posterior(1,0,C)+posterior(1,1,C)

In [None]:
#How strongly should we believe that the coin is biased given that it came up heads?
posterior(0,1,C)+posterior(1,1,C)

## Bayes' Rule

From the definition of conditional probabilities, we know that $P(H=h_2|C=1)=\frac{P(h_2,C=1)}{P(C=1)}$, but we know neither the numerator nor the denominator. :(

But we can compute them! :) 

### Step 1: Find the numerator
Let's start with the numerator, the joint probability $P(h_2,C=1)$. From the definition of conditional probabilities, we know that $P(C=1|h_2)=\frac{P(h_2,C=1)}{P(h_2)}$. The great thing is that we know both the prior $P(h_2)$ and the likelihood $P(C=1|h_2)$. Thus, if we multiply both sides by $P(h_2)$, we get $P(h_2,C=1)=P(h_2)\cdot P(C=1|h_2)$, and now we can compute the joint probability: it is $0.01 \cdot 0.99 = 0.0099$. This is an example of the product rule.

Now that we know the numerator, we have $P(H=h_2|C=1)=\frac{P(h_2) \cdot P(C=1|h_2)}{P(C=1)}$.

### Step 2: Find the denominator
Now let's figure out the denominator: we still have to compute the marginal probability of coughing, $P(C=1)$.

$P(C=1)=\sum_h P(h,C=1) = P(h_0,C=1) + P(h_1,C=1) + P(h_2,C=1)$

This requires the joint distribution $P(H,C)$, which the product rule gives us:

$P(H,C)=P(H)\cdot P(C|H)$. Hence, $P(h_0,C=1)= P(h_0)\cdot P(C=1|h_0)$, and likewise for the other hypotheses. In general, we can write $P(h,C)=P(h)\cdot P(C|h)$.

If we plug this into the equation for the marginal probability of the data, then we get

$P(C=1)=\sum_h P(h)\cdot P(C=1|h) = P(h_0)\cdot P(C=1|h_0) + P(h_1)\cdot P(C=1|h_1)  + P(h_2)\cdot P(C=1|h_2)$

Alright, now we have got everything we need to compute the posterior distribution. If we plug the equation for the marginal probability of the data into the equation for the posterior, then we get

$P(h_2|C=1)=\frac{P(h_2)\cdot P(C=1|h_2)}{\sum_h P(h)\cdot P(C=1|h)}$.

### Step 3: Generalize the result

Nothing in this derivation depended on the value of $H$ being $h_2$ or the data $d$ being $C=1$. Hence, our result holds for all hypotheses and all data sets $d$:

$P(h|d)=\frac{P(h)\cdot P(d|h)}{\sum_{h'} P(h')\cdot P(d|h')}$

This is Bayes' rule! Congratulations, if you followed along, you have just derived it!
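
Because the result holds for any discrete set of hypotheses, it can be wrapped in a small helper. This is a sketch under the assumption that `prior` and `likelihood` are sequences aligned by hypothesis and that `likelihood` holds $P(d|h)$ for the one data set $d$ that was observed. The function name `bayes_posterior` is hypothetical and is not defined elsewhere in this notebook.

```python
import numpy as np

def bayes_posterior(prior, likelihood):
    """Return P(h|d) for every hypothesis h, given P(h) and P(d|h) for the observed d."""
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)

    joint = prior * likelihood   # numerator: P(h) * P(d|h) for each h
    evidence = joint.sum()       # denominator: sum over h' of P(h') * P(d|h')

    if evidence == 0:
        # The posterior is undefined if no hypothesis can produce the data.
        raise ValueError("the data has zero probability under every hypothesis")

    return joint / evidence

# Usage with the medical diagnosis numbers (order: healthy, cold, lung cancer).
example_posterior = bayes_posterior([0.9, 0.09, 0.01], [0.01, 0.5, 0.99])
print(example_posterior)
```

The explicit check on `evidence` is a design choice for this sketch: without it, NumPy would return `nan` values with a warning instead of raising an error.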

In [None]:
p_h_given_d = p_h * p_d_given_h[C] / np.sum(p_h * p_d_given_h[C])

p_h_given_d


# Posterior Odds

Posterior odds = Prior odds × Likelihood ratio

$\frac{P(h_1|d)}{P(h_2|d)}=\frac{P(h_1)}{P(h_2)}\cdot \frac{P(d|h_1)}{P(d|h_2)}$
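
The evidence $P(d)$ appears in both posteriors, so it cancels in the ratio, which is why the odds form needs no normalization. The check below is a minimal sketch that confirms this numerically for the medical diagnosis example with $C=1$. The variable names `prior_probs`, `cough_likelihood` and `post` are introduced here for illustration only, and the numbers are copied from that example.

```python
import numpy as np

# Numbers copied from the medical diagnosis example (order: healthy, cold, lung cancer).
prior_probs = np.array([0.9, 0.09, 0.01])        # P(h)
cough_likelihood = np.array([0.01, 0.5, 0.99])   # P(C=1|h)

# Full posterior via Bayes' rule, including the normalization by P(C=1).
post = prior_probs * cough_likelihood / np.sum(prior_probs * cough_likelihood)

# Odds of cold (h_1) against lung cancer (h_2).
posterior_odds = post[1] / post[2]
prior_odds = prior_probs[1] / prior_probs[2]
likelihood_ratio = cough_likelihood[1] / cough_likelihood[2]

# The evidence cancels, so both values agree up to floating-point error.
print(posterior_odds, prior_odds * likelihood_ratio)
```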

In [None]:
#posterior odds of cold (h_1) vs. lung cancer (h_2)
p_h_given_d[1]/p_h_given_d[2]

In [None]:
#prior odds of cold (h_1) vs. lung cancer (h_2)
p_h[1]/p_h[2]

In [None]:
#likelihood ratio of the observed data under cold (h_1) vs. lung cancer (h_2)
p_d_given_h[C,1]/p_d_given_h[C,2]

---

Before turning this problem in, remember to do the following steps:

1. **Restart the kernel** (Kernel$\rightarrow$Restart)
2. **Run all cells** (Cell$\rightarrow$Run All)
3. **Save** (File$\rightarrow$Save and Checkpoint)

<div class="alert alert-danger">After you have completed these three steps, ensure that the following cell has printed "No errors". If it has <b>not</b> printed "No errors", then your code has a bug in it and has thrown an error! Make sure you fix this error before turning in your problem set.</div>

In [None]:
print("No errors!")