# ***2. Bayes's Theorem***

> ## **2.1 Diachronic Bayes**
***
There is another way to think of Bayes's theorem: it gives us a way to update the probability of a hypothesis, $H$, given some body of data, $D$.

- Diachronic means "related to change over time"; in this case, the probability of the hypotheses changes as we see new data.

Rewriting Bayes's theorem with $H$ and $D$ yields:

$$P(H|D) = \frac{P(H)~P(D|H)}{P(D)}$$

In this interpretation, each term has a name:

-  $P(H)$ is the probability of the hypothesis before we see the data, called the **prior**.

-  $P(D|H)$ is the probability of the data under the hypothesis, called the **likelihood**.

-  $P(D)$ is the probability of the data under any hypothesis, called the **total probability of the data**.

-  $P(H|D)$ is the probability of the hypothesis after we see the data, called the **posterior**.

Sometimes we can compute the prior from background information. In other cases the prior is subjective: reasonable people might disagree about it, either because they use different background information or because they interpret the same information differently.

Computing the total probability of the data can be tricky. It is supposed to be the probability of seeing the data under any hypothesis at all, but it can be hard to nail down what that means. Most often we simplify things by specifying a set of hypotheses that is mutually exclusive and collectively exhaustive (MECE), and then compute $P(D)$ using the law of total probability.
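
Written out for a mutually exclusive and collectively exhaustive set of hypotheses, the law of total probability takes the form below. The indexed names $H_1, \dots, H_n$ are notation introduced here for illustration:

$$P(D) = \sum_{i=1}^{n} P(H_i)~P(D|H_i)$$

With only two hypotheses this reduces to $P(D) = P(H_1)~P(D|H_1) + P(H_2)~P(D|H_2)$.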

> ## **2.2 Bayes Tables**
***

>_Suppose there are two bowls of cookies._
>
> _Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies._
>
> _Bowl 2 contains 20 vanilla cookies and 20 chocolate cookies._
>
> _Now suppose you choose one of the bowls at random and, without looking, choose a cookie at random. If it's a vanilla cookie, what is the probability that it came from Bowl 1?_

A convenient tool for approaching this problem is a Bayes table.

In [1]:
import pandas as pd

# Empty DataFrame with one row for each bowl (hypothesis)
table = pd.DataFrame(index=['Bowl 1', 'Bowl 2'])
# Add a column to represent the priors
table['prior'] = 1/2, 1/2
# Add a column to represent likelihood (chance of getting a vanilla cookie in each bowl)
table['likelihood'] = 3/4, 1/2
table

| | prior | likelihood |
|---|---|---|
| Bowl 1 | 0.5 | 0.75 |
| Bowl 2 | 0.5 | 0.5 |


- Notice that the likelihoods don’t add up to 1; each of them is a probability conditioned on a different hypothesis. There’s no reason they should add up to 1 and no problem if they don’t.

The next step is to multiply each prior by its likelihood, which gives the numerator of Bayes's theorem, for example $P(H_1)~P(D|H_1)$ for Bowl 1. Adding these products gives the total probability of the data, $P(D) = P(H_1)~P(D|H_1) + P(H_2)~P(D|H_2)$, which is the denominator. Dividing each product by $P(D)$ yields the posteriors.

In [21]:
# Unnormalized posteriors (numerator)
table['unnorm'] = table['prior'] * table['likelihood']
# Normalizing constant (denominator)
prob_data = table['unnorm'].sum()
print("Total Probability:", prob_data)
# Bayes theorem
table['posterior'] = table['unnorm'] / prob_data
table

Total Probability: 0.625


| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| Bowl 1 | 0.5 | 0.75 | 0.375 | 0.6 |
| Bowl 2 | 0.5 | 0.5 | 0.25 | 0.4 |


Given that the cookie is vanilla, the posterior probability that it came from Bowl 1 is 0.6.

- When we add up the unnormalized posteriors and divide through, we force the posteriors to add up to 1. This process is called “normalization”, which is why the total probability of the data is also called the “normalizing constant”.
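
To make the normalization step concrete, here is a minimal sketch of the cookie problem in plain Python with exact fractions. It assumes nothing beyond the cookie counts in the problem statement. The dictionary names (`bowls`, `prior`, `likelihood`, `unnorm`, `posterior`) are chosen here for illustration and are not part of the notebook's pandas code.

```python
from fractions import Fraction

# Cookie counts from the problem statement: (vanilla, chocolate) per bowl
bowls = {'Bowl 1': (30, 10), 'Bowl 2': (20, 20)}

# Each bowl is equally likely to be chosen
prior = {name: Fraction(1, len(bowls)) for name in bowls}

# Likelihood of drawing a vanilla cookie from each bowl
likelihood = {name: Fraction(v, v + c) for name, (v, c) in bowls.items()}

# Unnormalized posteriors and the normalizing constant
unnorm = {name: prior[name] * likelihood[name] for name in bowls}
prob_data = sum(unnorm.values())

# Dividing by the normalizing constant forces the posteriors to sum to 1
posterior = {name: unnorm[name] / prob_data for name in bowls}

print(prob_data)                                  # 5/8
print(posterior['Bowl 1'], posterior['Bowl 2'])   # 3/5 2/5
print(sum(posterior.values()))                    # 1
```

The exact values 5/8, 3/5 and 2/5 match the decimals 0.625, 0.6 and 0.4 in the Bayes table.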

> _Suppose I have a box with a 6-sided die, an 8-sided die, and a 12-sided die. I choose one of the dice at random, roll it, and report
> that the outcome is a 1. What is the probability that I chose the 6-sided die?_

Once you have established the priors and likelihoods, the remaining steps are always the same, so we can wrap them in a function:

In [43]:
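from fractions import Fraction

# One row for each die (hypothesis), indexed by its number of sides
table2 = pd.DataFrame(index=[6, 8, 12])
# Each die is equally likely to be chosen
table2['prior'] = Fraction(1, 3)
# Chance of rolling a 1 with each die
table2['likelihood'] = Fraction(1, 6), Fraction(1, 8), Fraction(1, 12)
table2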

| | prior | likelihood |
|---|---|---|
| 6 | 1/3 | 1/6 |
| 8 | 1/3 | 1/8 |
| 12 | 1/3 | 1/12 |


In [32]:
def update(table):
    """Compute the posterior probabilities."""
    table['unnorm'] = table['prior'] * table['likelihood']
    prob_data = table['unnorm'].sum()
    print("Total Probability:", prob_data)
    table['posterior'] = table['unnorm'] / prob_data
    return table

<div class="alert alert-block alert-info">
<b>Reminder:</b> pandas uses square brackets to select a column from a DataFrame, as in <code>table['prior']</code>.
</div>

In [23]:
# update() adds the columns in place and returns the same table
update(table2)

Total Probability: 1/8


| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| 6 | 1/3 | 1/6 | 1/18 | 4/9 |
| 8 | 1/3 | 1/8 | 1/24 | 1/3 |
| 12 | 1/3 | 1/12 | 1/36 | 2/9 |
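
The arithmetic behind the dice result can be checked by hand. Here $D$ stands for "the outcome is a 1", and the label "6-sided" for the hypothesis is notation introduced here:

$$P(D) = \frac{1}{3}\cdot\frac{1}{6} + \frac{1}{3}\cdot\frac{1}{8} + \frac{1}{3}\cdot\frac{1}{12} = \frac{4 + 3 + 2}{72} = \frac{1}{8}$$

$$P(\text{6-sided}|D) = \frac{1/18}{1/8} = \frac{4}{9}$$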


> ## **2.3 The Monty Hall Problem**
***

> _The Monty Hall problem is based on a game show called *Let's Make a Deal*. If you are a contestant on the show, here's how the game works:_
>
>* _The host, Monty Hall, shows you three closed doors -- numbered 1, 2, and 3 -- and tells you that there is a prize behind each door._
>
>* _One prize is valuable (traditionally a car), the other two are less valuable (traditionally goats)._
>
>* _The object of the game is to guess which door has the car. If you guess right, you get to keep the car._
>
> _Suppose you pick Door 1. Before opening the door you chose, Monty opens Door 3 and reveals a goat. Then Monty offers you the option to stick with your original choice or switch to the remaining unopened door._
>
> _To maximize your chance of winning the car, should you stick with Door 1 or switch to Door 2?_

The data is that Monty opened Door 3 and revealed a goat. So let's consider the probability of the data under each hypothesis:

* If the car is behind Door 1, Monty chooses Door 2 or 3 at random, so the probability he opens Door 3 is $1/2$.

* If the car is behind Door 2, Monty has to open Door 3, so the probability of the data under this hypothesis is 1.

* If the car is behind Door 3, Monty does not open it, so the probability of the data under this hypothesis is 0.
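
The three likelihoods above can be checked empirically. The following is a rough simulation sketch, not part of the original notebook. It assumes the contestant always picks Door 1, that Monty never opens the contestant's door or the car's door, and that he chooses uniformly at random when two doors are available. The function name `simulate_monty`, the trial count, and the seed are arbitrary choices made for illustration.

```python
import random

def simulate_monty(trials=100_000, seed=0):
    """Estimate P(Monty opens Door 3 | car location) when the contestant picks Door 1."""
    rng = random.Random(seed)
    games = {1: 0, 2: 0, 3: 0}     # games played, by car location
    opened3 = {1: 0, 2: 0, 3: 0}   # games in which Monty opened Door 3
    for _ in range(trials):
        car = rng.choice([1, 2, 3])
        # Monty can only open Door 2 or Door 3, and never the door hiding the car
        options = [door for door in (2, 3) if door != car]
        opened = rng.choice(options)
        games[car] += 1
        if opened == 3:
            opened3[car] += 1
    # Relative frequency of "Monty opens Door 3" for each car location
    return {car: opened3[car] / games[car] for car in games}

estimates = simulate_monty()
for car, freq in estimates.items():
    print(f"Car behind Door {car}: Monty opens Door 3 in {freq:.3f} of games")
```

With enough trials the estimated frequencies should come out close to 1/2, 1, and 0, in line with the reasoning above.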

In [42]:
table3 = pd.DataFrame(index=['Door 1', 'Door 2', 'Door 3'])
table3['prior'] = Fraction(1,3)
table3['likelihood'] = Fraction(1,2), Fraction(1), Fraction(0)
# update() adds the columns in place and returns the same table
update(table3)

Total Probability: 1/2


| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| Door 1 | 1/3 | 1/2 | 1/6 | 1/3 |
| Door 2 | 1/3 | 1 | 1/3 | 2/3 |
| Door 3 | 1/3 | 0 | 0 | 0 |


After Monty opens Door 3, the posterior probability of Door 1 is $1/3$ and the posterior probability of Door 2 is $2/3$, so you are better off switching from Door 1 to Door 2.

For problems like this one, Bayes's theorem helps by providing a divide-and-conquer strategy:

1.  First, write down the hypotheses (**the competing answers to the question, such as "the car is behind Door 1", "Door 2", or "Door 3", or "both children are girls"**) and the data (**what was observed, such as "Monty opens Door 3 and reveals a goat" or "one of the children is a girl"**).

2.  Next, figure out the prior probabilities (**the probability of each hypothesis before seeing the data**).

3.  Finally, compute the likelihood of the data under each hypothesis (**assuming the hypothesis is true, how probable is the data we observed?**).

The Bayes table does the rest.
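
As a minimal sketch of how the three steps fit together, the version below uses plain dictionaries instead of a DataFrame. The helper `bayes_update` and the variable names are hypothetical, introduced here for illustration. The numbers are the Monty Hall priors and likelihoods from this section.

```python
from fractions import Fraction

def bayes_update(priors, likelihoods):
    """Return (posteriors, prob_data) for hypotheses keyed by name."""
    unnorm = {h: priors[h] * likelihoods[h] for h in priors}
    prob_data = sum(unnorm.values())  # total probability of the data
    posteriors = {h: u / prob_data for h, u in unnorm.items()}
    return posteriors, prob_data

# Step 1: hypotheses (where the car is); the data is "Monty opens Door 3"
hypotheses = ['Door 1', 'Door 2', 'Door 3']

# Step 2: prior probability of each hypothesis
priors = {h: Fraction(1, 3) for h in hypotheses}

# Step 3: likelihood of the data under each hypothesis
likelihoods = dict(zip(hypotheses, [Fraction(1, 2), Fraction(1), Fraction(0)]))

posteriors, prob_data = bayes_update(priors, likelihoods)
print(prob_data)  # 1/2
for h in hypotheses:
    print(h, posteriors[h])
```

Only the three inputs change from problem to problem; the update itself is the same multiply, sum, and divide each time.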

> **Exercise:** _M&M's are small candy-coated chocolates that come in a variety of colors. Mars changes the mixture of colors from time to time._
>
>* _In 1994, the color mix in a bag of plain M&M's was 30\% Brown, 20\% Yellow, 20\% Red, 10\% Green, 10\% Orange, 10\% Tan._  
>
>* _In 1996, it was 24\% Blue, 20\% Green, 16\% Orange, 14\% Yellow, 13\% Red, 13\% Brown._
>
> _Suppose a friend of mine has two bags of M&M's, and he tells me
that one is from 1994 and one from 1996.  He won't tell me which is
which, but he gives me one M&M from each bag.  One is yellow and
one is green.  What is the probability that the yellow one came
from the 1994 bag?_


In [48]:
# A = yellow from 1994, green from 1996
# B = yellow from 1996, green from 1994

table4 = pd.DataFrame(index=['A', 'B'])
# Before seeing the colors, both assignments are equally likely
table4['prior'] = Fraction(1, 2)
# Each M&M is drawn independently, so the joint probability under each
# hypothesis is the product of the individual color probabilities
table4['likelihood'] = 0.2*0.2, 0.14*0.1
# update() adds the columns in place and returns the same table
update(table4)

Total Probability: 0.027000000000000003


| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| A | 1/2 | 0.04 | 0.02 | 0.740741 |
| B | 1/2 | 0.014 | 0.007 | 0.259259 |


There is about a 74% probability that the yellow M&M came from the 1994 bag.
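
The floating-point noise in the total probability can be avoided by keeping every quantity as an exact fraction. This is a small sketch that uses the color percentages from the exercise. The names `mix_1994`, `mix_1996`, `like_a`, `like_b`, and `posterior_a` are illustrative and do not appear in the notebook.

```python
from fractions import Fraction

# Color shares taken from the exercise statement (only the colors we need)
mix_1994 = {'yellow': Fraction(20, 100), 'green': Fraction(10, 100)}
mix_1996 = {'yellow': Fraction(14, 100), 'green': Fraction(20, 100)}

prior = Fraction(1, 2)  # both assignments are equally likely beforehand

# A = yellow from 1994, green from 1996
like_a = mix_1994['yellow'] * mix_1996['green']
# B = yellow from 1996, green from 1994
like_b = mix_1996['yellow'] * mix_1994['green']

prob_data = prior * like_a + prior * like_b
posterior_a = prior * like_a / prob_data

print(prob_data)     # 27/1000
print(posterior_a)   # 20/27
```

The exact posterior 20/27 is approximately 0.7407, which agrees with the table.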