# Bayes's Theorem: Week 4 Tutorial COS20083


Suppose there are two bowls of cookies. Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies. Bowl 2 contains 20 of each.

![title](cookies.jpg)

Now suppose you choose one of the bowls at random and, without looking, select a cookie at random. The cookie is vanilla. What is the probability that it came from Bowl 1? 

This is a conditional probability; we want $p(Bowl_1| vanilla)$, but it is not obvious how to compute it. If I asked a different question—the probability of a vanilla cookie given Bowl 1—it would be easy:
$p(vanilla| Bowl_1) = 3/4$

Sadly, $p(A|B)$ is not the same as $p(B|A)$, but Bayes’s theorem gives a way to get from one to the other:

$$ 
 p(A|B)=\frac{p(A)p(B|A)}{p(B)}
$$

For example, we can use it to solve the cookie problem. I’ll write $B_1$ for the
hypothesis that the cookie came from Bowl 1 and $V$ for the vanilla cookie.
Plugging these into Bayes’s theorem, we get

$$
 p(B_1|V)=\frac{p(B_1)p(V|B_1)}{p(V)}
$$

The term on the left is what we want: the probability of Bowl 1, given that we chose a vanilla cookie. The terms on the right are:
1. $p(B_1)$ : This is the probability that we chose Bowl 1, unconditioned by what kind of cookie we got. Since the problem says we chose a bowl at random, we can assume $p(B_1) = 1/2$.
1. $p(V|B_1)$ : This is the probability of getting a vanilla cookie from Bowl 1, which is 3/4.
1. $p(V)$ : This is the probability of drawing a vanilla cookie from either bowl. Since we had an equal chance of choosing either bowl and the bowls contain the same number of cookies, we had the same chance of choosing any cookie. Between the two bowls there are 50 vanilla and 30 chocolate cookies, so $p(V) = 5/8$.

Putting it together, we have

$$
 p(B_1|V)=\frac{(1/2)(3/4)}{5/8} = 3/5
$$

So the vanilla cookie is evidence in favor of the hypothesis that we chose Bowl 1, because vanilla cookies are more likely to come from Bowl 1. 
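
The arithmetic above can be checked with exact fractions. The following is a minimal sketch, not part of the original tutorial code; the names `bowls`, `prior`, `likelihood`, `p_vanilla` and `posterior_bowl1` are illustrative. It computes $p(V)$ by summing over both bowls rather than by counting cookies, which gives the same value here.

```python
from fractions import Fraction

# Cookie counts taken from the problem statement.
bowls = {
    "Bowl 1": {"vanilla": 30, "chocolate": 10},
    "Bowl 2": {"vanilla": 20, "chocolate": 20},
}

# Prior: each bowl is equally likely to be chosen.
prior = {name: Fraction(1, len(bowls)) for name in bowls}

# Likelihood p(V | bowl): the fraction of vanilla cookies in each bowl.
likelihood = {
    name: Fraction(counts["vanilla"], sum(counts.values()))
    for name, counts in bowls.items()
}

# p(V): add up prior * likelihood over both bowls.
p_vanilla = sum(prior[name] * likelihood[name] for name in bowls)

# Bayes's theorem for Bowl 1.
posterior_bowl1 = prior["Bowl 1"] * likelihood["Bowl 1"] / p_vanilla

print(p_vanilla)        # 5/8
print(posterior_bowl1)  # 3/5
```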

This example demonstrates one use of Bayes’s theorem: it provides a strategy to get from $p(B|A)$ to $p(A|B)$. This strategy is useful in cases, like the cookie problem, where it is easier to compute the terms on the right side of Bayes’s theorem than the term on the left.

Rewriting Bayes’s theorem with $H$ for a hypothesis and $D$ for the observed data yields:
$$
p(H|D) = \frac{p(H) p(D|H)}{p(D)}
$$

In this interpretation, each term has a name:
- p(H) is the probability of the hypothesis before we see the data, called the prior probability, or just prior.
- p(H|D) is what we want to compute, the probability of the hypothesis after we see the data, called the posterior.
- p(D|H) is the probability of the data under the hypothesis, called the likelihood.
- p(D) is the probability of the data under any hypothesis, called the normalizing constant.
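
When the hypotheses are mutually exclusive and cover every possibility, $p(D)$ is the sum of prior times likelihood over all hypotheses. The sketch below shows the four terms under that assumption; `bayes_update` is a hypothetical helper written for illustration, not a function from the tutorial or from any library.

```python
def bayes_update(priors, likelihoods):
    """Return (posteriors, normalizing_constant) for a finite set of hypotheses.

    Assumes the hypotheses are mutually exclusive and collectively exhaustive,
    and that both dicts use the same hypothesis names as keys.
    """
    # Prior times likelihood for each hypothesis (unnormalized posterior).
    unnorm = {h: priors[h] * likelihoods[h] for h in priors}

    # Normalizing constant p(D): total probability of the data.
    p_data = sum(unnorm.values())
    if p_data == 0:
        raise ValueError("the data has zero probability under every hypothesis")

    # Posterior p(H|D) for each hypothesis.
    posteriors = {h: u / p_data for h, u in unnorm.items()}
    return posteriors, p_data


# Cookie problem: equal priors, likelihood of vanilla in each bowl.
posteriors, p_data = bayes_update(
    {"Bowl 1": 0.5, "Bowl 2": 0.5},
    {"Bowl 1": 0.75, "Bowl 2": 0.5},
)
print(posteriors)
print(p_data)
```

For the cookie problem this gives a normalizing constant of 0.625 (that is, 5/8) and a posterior of about 0.6 for Bowl 1, matching the hand calculation.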

Sometimes we can compute the prior based on background information. For example, the cookie problem specifies that we choose a bowl at random with equal probability.

source: Think Bayes, Allen B. Downey

In [1]:
# Configure Jupyter so figures appear in the notebook
%matplotlib inline

# Configure Jupyter to display the assigned value after an assignment
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'


import numpy as np
import pandas as pd

In [2]:
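class BayesTable(pd.DataFrame):
    """Table with one row per hypothesis and one column per step of a Bayesian update."""

    def __init__(self, hypo, prior):
        columns = ['hypo', 'prior', 'likelihood', 'posterior', 'norm_posterior']
        super().__init__(columns=columns)
        self.hypo = hypo
        self.prior = prior

    def mult(self):
        # Unnormalized posterior: prior times likelihood, row by row
        self.posterior = self.prior * self.likelihood

    def norm(self):
        # Divide by the sum of the unnormalized posteriors so the result sums to 1
        nc = np.sum(self.posterior)
        self.norm_posterior = self.posterior / nc
        return nc

    def update(self):
        # Run both steps and return the normalizing constant
        self.mult()
        return self.norm()

    def reset(self):
        # Start a new table whose priors are the current normalized posteriors
        return BayesTable(self.hypo, self.norm_posterior)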

In [3]:
# Set up the prior probability of each bowl. Both bowls are equally likely, so each prior is 0.5.
table = BayesTable(['Bowl 1', 'Bowl 2'], [0.5, 0.5])

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,Bowl 1,0.5,,,
1,Bowl 2,0.5,,,


In [4]:
# Likelihood of drawing a vanilla cookie from each bowl
table.likelihood = [3/4, 1/2]
table

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,Bowl 1,0.5,0.75,,
1,Bowl 2,0.5,0.5,,


In [5]:
# The next step is to multiply the priors by the likelihoods, which yields the unnormalized posteriors.
table.mult()
table

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,Bowl 1,0.5,0.75,0.375,
1,Bowl 2,0.5,0.5,0.25,


In [6]:
# Now we can compute the normalized posteriors; `norm` returns the normalization constant.
table.norm()
table

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,Bowl 1,0.5,0.75,0.375,0.6
1,Bowl 2,0.5,0.5,0.25,0.4
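
The same three steps can also be written with an ordinary `DataFrame`, which makes each column operation explicit. This is a sketch of an alternative, assuming only standard pandas column arithmetic; it is not the `BayesTable` class used in this tutorial, and `norm_const` is an illustrative name.

```python
import pandas as pd

# One row per hypothesis, with the prior and likelihood filled in up front.
table = pd.DataFrame({
    "hypo": ["Bowl 1", "Bowl 2"],
    "prior": [0.5, 0.5],
    "likelihood": [3 / 4, 1 / 2],
})

# Step 1: unnormalized posterior = prior * likelihood.
table["posterior"] = table["prior"] * table["likelihood"]

# Step 2: the normalizing constant is the sum of that column.
norm_const = table["posterior"].sum()

# Step 3: divide by the normalizing constant so the posteriors sum to 1.
table["norm_posterior"] = table["posterior"] / norm_const

print(table)
print(norm_const)
```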


# Task 4 

### Problem 1 (1.5 marks)

M&M’s are small candy-coated chocolates that come in a variety of colors. Mars, Inc., which makes M&M’s, changes the mixture of colors from time to time. 

In 1995, they introduced blue M&M’s. Before then, the color mix in a bag of plain M&M’s was 30% Brown, 20% Yellow, 20% Red, 10% Green, 10% Orange, 10% Tan. Afterward it was 24% Blue, 20% Green, 16% Orange, 14% Yellow, 13% Red, 13% Brown.

Suppose a friend of mine has two bags of M&M’s, and he tells me that one is from 1994 and one from 1996. He won’t tell me which is which, but he gives me one M&M from each bag. One is yellow and one is green. What is
the probability that the yellow one came from the 1994 bag?

This problem is similar to the cookie problem, with the twist that I draw one sample from each bowl/bag. This problem also gives me a chance to demonstrate the table method, which is useful for solving problems like this on paper. In the next chapter we will solve them computationally. The first step is to enumerate the hypotheses. The bag the yellow M&M came from I’ll call Bag 1; I’ll call the other Bag 2. So the hypotheses are:

- A: Bag 1 is from 1994, which implies that Bag 2 is from 1996.
- B: Bag 1 is from 1996 and Bag 2 from 1994.

Now we construct a table with a row for each hypothesis and a column for
each term in Bayes’s theorem: 
<table border="1"><tr><th>Case</th><th>Prior p(H)</th><th>Likelihood p(D|H)</th><th>p(H) p(D|H)</th><th>Posterior p(H|D)</th></tr>
    <tr><td>A</td><td>? </td><td>?</td><td>?</td><td>?</td></tr>
    <tr><td>B</td><td>? </td><td>?</td><td>?</td><td>?</td></tr>
</table>

The first column has the priors. Based on the statement of the problem, it is reasonable to choose p(A) = p(B) = 1/2.

The second column has the likelihoods, which follow from the information in the problem. For example, if A is true, the yellow M&M came from the 1994 bag with probability 20%, and the green came from the 1996 bag with probability 20%. If B is true, the yellow M&M came from the 1996 bag with probability 14%, and the green came from the 1994 bag with probability 10%. Because the selections are independent, we get the joint probability, which is the likelihood, by multiplying the two.

The third column is just the product of the previous two. The sum of this column is the normalizing constant. To get the last column, which contains the posteriors, we divide the third column by the normalizing constant.

Based on the explanation above, fill in the table.



In [23]:
# Set up the prior probability of each case. Both cases are equally likely, so each prior is 0.5.
table = BayesTable(['Case A', 'Case B'], [0.5, 0.5])

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,Case A,0.5,,,
1,Case B,0.5,,,


In [27]:
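# Likelihood of the data (one yellow, one green) under each case:
# Case A: yellow from the 1994 bag (20%) and green from the 1996 bag (20%)
# Case B: green from the 1994 bag (10%) and yellow from the 1996 bag (14%)
table.likelihood = [(0.20)*(0.20), (0.10)*(0.14)]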
table

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,Case A,0.5,0.04,,
1,Case B,0.5,0.014,,


In [28]:
# The next step is to multiply the priors by the likelihoods, which yields the unnormalized posteriors.
table.mult()
table

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,Case A,0.5,0.04,0.02,
1,Case B,0.5,0.014,0.007,


In [29]:
# Now we can compute the normalized posteriors; `norm` returns the normalization constant.
table.norm()
table

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,Case A,0.5,0.04,0.02,0.740741
1,Case B,0.5,0.014,0.007,0.259259
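
As a cross-check on the final table, the same calculation can be done with exact fractions. This is a minimal sketch using the color mixes from the problem statement; the helper `p` and the variable names are illustrative and not part of the tutorial code.

```python
from fractions import Fraction

# Color mixes in percent, as given in the problem statement.
mix_1994 = {"brown": 30, "yellow": 20, "red": 20, "green": 10, "orange": 10, "tan": 10}
mix_1996 = {"blue": 24, "green": 20, "orange": 16, "yellow": 14, "red": 13, "brown": 13}


def p(mix, color):
    """Probability of drawing `color` from a bag with the given percent mix."""
    return Fraction(mix[color], 100)


prior = Fraction(1, 2)  # Case A and Case B are equally likely.

# Case A: yellow from 1994, green from 1996. Case B: the other way round.
like_a = p(mix_1994, "yellow") * p(mix_1996, "green")
like_b = p(mix_1996, "yellow") * p(mix_1994, "green")

# Unnormalized posteriors and the normalizing constant.
unnorm_a = prior * like_a
unnorm_b = prior * like_b
norm_const = unnorm_a + unnorm_b

post_a = unnorm_a / norm_const
post_b = unnorm_b / norm_const

print(post_a, float(post_a))  # 20/27, about 0.740741
print(post_b, float(post_b))  # 7/27, about 0.259259
```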


###  Problem 2 (2.5 marks)


Suppose we have a 4-faced die and a 5-faced die. You pick one of them, roll it, and get a 3. What is the probability that you rolled the 4-faced die, given that the result was 3? Assume each die is equally likely to be picked.

Write a short piece of code to populate the Bayesian table with the prior, likelihood, posterior, and normalized posterior probabilities.

In [16]:
# Set up the prior probability of each die. Both dice are equally likely to be picked, so each prior is 0.5.
table = BayesTable(['4 Faced Die', '5 Faced Die'], [0.5, 0.5])

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,4 Faced Die,0.5,,,
1,5 Faced Die,0.5,,,


In [17]:
# Likelihood of rolling a 3 with the 4-faced die and with the 5-faced die
table.likelihood = [1/4, 1/5]
table

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,4 Faced Die,0.5,0.25,,
1,5 Faced Die,0.5,0.2,,


In [18]:
# The next step is to multiply the priors by the likelihoods, which yields the unnormalized posteriors.
table.mult()
table

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,4 Faced Die,0.5,0.25,0.125,
1,5 Faced Die,0.5,0.2,0.1,


In [19]:
# Now we can compute the normalized posteriors; `norm` returns the normalization constant.
table.norm()
table

Unnamed: 0,hypo,prior,likelihood,posterior,norm_posterior
0,4 Faced Die,0.5,0.25,0.125,0.555556
1,5 Faced Die,0.5,0.2,0.1,0.444444
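
The posterior for the 4-faced die can also be estimated by simulation: pick a die at random many times, roll it, keep only the trials that show a 3, and count how many of those came from the 4-faced die. The sketch below assumes fair dice and a fixed seed for repeatability; `simulate` is a hypothetical helper, and the estimate should land near the exact value 0.125 / 0.225 = 5/9 rather than match it exactly.

```python
import random


def simulate(n_trials=100_000, seed=0):
    """Estimate p(4-faced die | rolled a 3) by repeated random trials."""
    rng = random.Random(seed)
    threes = 0          # trials where the roll was 3
    threes_from_d4 = 0  # of those, trials where the die had 4 faces

    for _ in range(n_trials):
        sides = rng.choice([4, 5])    # pick a die with equal probability
        roll = rng.randint(1, sides)  # roll it: uniform over 1..sides
        if roll == 3:
            threes += 1
            if sides == 4:
                threes_from_d4 += 1

    return threes_from_d4 / threes


estimate = simulate()
print(round(estimate, 3))
```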
