In [2]:
import pandas as pd

$$
P(H|D) = \frac{P(H) P(D|H)}{P(D)}
$$

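- $P(H|D)$ is called the posterior: the probability of the hypothesis after seeing the data
- $P(H)$ is called the prior: the probability of the hypothesis before seeing the data
- $P(D|H)$ is called the likelihood: the probability of the data under the hypothesis
- $P(D)$ is the total probability of the data, summed over all hypotheses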

Suppose there are two bowls of cookies. 

- Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies. 
- Bowl 2 contains 20 vanilla cookies and 20 chocolate cookies.

Now suppose you choose one of the bowls at random and, without looking, choose a cookie at random. If the cookie is vanilla, what is the probability that it came from Bowl 1?

In [13]:
table = pd.DataFrame(index=['Bowl 1', 'Bowl 2'])
table['prior'] = [0.5, 0.5]             # each bowl is equally likely
table['likelihood'] = [30/40, 20/40]    # P(vanilla | bowl)
table['unnorm'] = table['prior'] * table['likelihood']
probdata = table['unnorm'].sum()        # P(vanilla), the normalizing constant
table['posterior'] = table['unnorm']/probdata
table

Unnamed: 0,prior,likelihood,unnorm,posterior
Bowl 1,0.5,0.75,0.375,0.6
Bowl 2,0.5,0.5,0.25,0.4
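
As a cross-check, the same answer can be obtained by applying the formula directly with exact fractions instead of a table. This is only a sketch; the dictionary names `prior` and `likelihood` and the variables `prob_data` and `posterior_bowl1` are introduced here for illustration and are not part of the notebook.

```python
from fractions import Fraction

# Priors: each bowl is equally likely to be chosen.
prior = {"Bowl 1": Fraction(1, 2), "Bowl 2": Fraction(1, 2)}
# Likelihoods: probability of drawing a vanilla cookie from each bowl.
likelihood = {"Bowl 1": Fraction(30, 40), "Bowl 2": Fraction(20, 40)}

# P(D): total probability of drawing a vanilla cookie.
prob_data = sum(prior[h] * likelihood[h] for h in prior)

# P(H|D) for Bowl 1, straight from the formula.
posterior_bowl1 = prior["Bowl 1"] * likelihood["Bowl 1"] / prob_data

print(prob_data)        # 5/8
print(posterior_bowl1)  # 3/5
```

The exact value 3/5 agrees with the 0.6 in the posterior column.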


## The Dice Problem
Suppose I have a box with a 6-sided die, an 8-sided die, and a 12-sided die. I choose one of the dice at random, roll it, and report that the outcome is a 1. What is the probability that I chose the 6-sided die?

In [14]:
table2 = pd.DataFrame(index=[6, 8, 12])
table2['prior'] = [1/3, 1/3, 1/3]
table2['likelihood'] = [1/6, 1/8, 1/12]
table2['unnorm'] = table2['prior'] * table2['likelihood']
probdata2 = table2['unnorm'].sum()
table2['posterior'] = table2['unnorm']/probdata2
table2

Unnamed: 0,prior,likelihood,unnorm,posterior
6,0.333333,0.166667,0.055556,0.444444
8,0.333333,0.125,0.041667,0.333333
12,0.333333,0.083333,0.027778,0.222222
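
One way to sanity-check these posteriors is a small simulation: pick a die at random many times, roll it, and look only at the trials where the outcome is 1. The sketch below assumes fair dice and uses Python's standard `random` module; the function name `simulate_dice` and its `trials` and `seed` parameters are hypothetical.

```python
import random

def simulate_dice(trials=100_000, seed=0):
    """Estimate P(die | outcome is 1) for each die by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = {6: 0, 8: 0, 12: 0}
    for _ in range(trials):
        sides = rng.choice([6, 8, 12])   # choose a die at random
        outcome = rng.randint(1, sides)  # roll it
        if outcome == 1:                 # keep only rolls that match the data
            counts[sides] += 1
    total = sum(counts.values())
    return {sides: n / total for sides, n in counts.items()}

estimate = simulate_dice()
print(estimate)  # values should be close to 0.444, 0.333, 0.222
```

The estimates are approximate and vary with the seed and the number of trials, but they should land near the posterior column above.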


## The Monty Hall Problem

Monty Hall hosts a game show with three closed doors: one hides a car and the other two hide goats. Suppose you pick Door 1. Before opening the door you chose, Monty opens Door 3 and reveals a goat. Then Monty offers you the option to stick with your original choice or switch to the remaining unopened door. To maximize your chance of winning the car, should you stick with Door 1 or switch to Door 2?

In [15]:
table3 = pd.DataFrame(index=['Door 1', 'Door 2', 'Door 3'])
table3['prior'] = [1/3, 1/3, 1/3]
table3['likelihood'] = [1/2, 1, 0]
table3['unnorm'] = table3['prior'] * table3['likelihood']
probdata3 = table3['unnorm'].sum()
table3['posterior'] = table3['unnorm']/probdata3
table3

Unnamed: 0,prior,likelihood,unnorm,posterior
Door 1,0.333333,0.5,0.166667,0.333333
Door 2,0.333333,1.0,0.333333,0.666667
Door 3,0.333333,0.0,0.0,0.0
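
The likelihoods 1/2, 1, and 0 follow from how Monty behaves. The sketch below derives them under the usual assumptions: Monty never opens the door you picked, never opens the door hiding the car, and chooses at random when two doors are available. The helper `likelihood_opened` is a name made up for this illustration.

```python
from fractions import Fraction

doors = [1, 2, 3]
pick = 1     # the door you chose
opened = 3   # the door Monty opened

def likelihood_opened(car, pick, opened):
    """P(Monty opens `opened` | car is behind `car`), assuming Monty avoids
    your door and the car's door and otherwise chooses uniformly."""
    allowed = [d for d in doors if d != pick and d != car]
    if opened not in allowed:
        return Fraction(0)
    return Fraction(1, len(allowed))

prior = {d: Fraction(1, 3) for d in doors}
likelihood = {d: likelihood_opened(d, pick, opened) for d in doors}
unnorm = {d: prior[d] * likelihood[d] for d in doors}
prob_data = sum(unnorm.values())
posterior = {d: unnorm[d] / prob_data for d in doors}

print(likelihood[1], likelihood[2], likelihood[3])  # 1/2 1 0
print(posterior[1], posterior[2], posterior[3])     # 1/3 2/3 0
```

Under these assumptions switching to Door 2 wins with probability 2/3, matching the table.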


In [17]:
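def update(table):
    """Compute the posterior probabilities of a Bayes table in place.

    Expects 'prior' and 'likelihood' columns; adds 'unnorm' and 'posterior'.
    """
    table['unnorm'] = table['prior'] * table['likelihood']
    probdata = table['unnorm'].sum()    # total probability of the data
    table['posterior'] = table['unnorm']/probdata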

## Exercises

### Exercise 2-1. 
Suppose you have two coins in a box. One is a normal coin with heads on one side and tails on the other, and one is a trick coin with heads on both sides. You choose a coin at random and see that one of the sides is heads. What is the probability that you chose the trick coin?

In [18]:
table4 = pd.DataFrame(index=['normal', 'fake'])
table4['prior'] = [1/2, 1/2]
table4['likelihood'] = [1/2, 1]
update(table4)
table4

Unnamed: 0,prior,likelihood,unnorm,posterior
normal,0.5,0.5,0.25,0.333333
fake,0.5,1.0,0.5,0.666667


### Exercise 2-2. 
Suppose you meet someone and learn that they have two children. You ask if either child is a girl and they say yes. What is the probability that both children are girls? Hint: Start with four equally likely hypotheses.

In [19]:
table5 = pd.DataFrame(index=['GG', 'GB', 'BG', 'BB'])
table5['prior'] = [1/4, 1/4, 1/4, 1/4]
table5['likelihood'] = [1, 1, 1, 0]
update(table5)
table5

Unnamed: 0,prior,likelihood,unnorm,posterior
GG,0.25,1,0.25,0.333333
GB,0.25,1,0.25,0.333333
BG,0.25,1,0.25,0.333333
BB,0.25,0,0.0,0.0
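
The same result can be checked by brute-force enumeration: list the four equally likely families, keep those consistent with the answer "yes", and count. This is a sketch using only the standard library; the variable names are chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely (first child, second child) combinations: GG, GB, BG, BB.
families = ["".join(p) for p in product("GB", repeat=2)]

# Condition on the answer "yes, at least one child is a girl".
at_least_one_girl = [f for f in families if "G" in f]

# Among the remaining families, count those with two girls.
prob_both_girls = Fraction(at_least_one_girl.count("GG"), len(at_least_one_girl))

print(at_least_one_girl)  # ['GG', 'GB', 'BG']
print(prob_both_girls)    # 1/3
```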


### Exercise 2-3.
There are many variations of the Monty Hall Problem. For example, suppose Monty always chooses Door 2 if he can, and only chooses Door 3 if he has to (because the car is behind Door 2). 

If you choose Door 1 and Monty opens Door 2, what is the probability the car is behind Door 3?

In [20]:
table6 = pd.DataFrame(index=['Door 1', 'Door 2', 'Door 3'])
table6['prior'] = [1/3, 1/3, 1/3]
table6['likelihood'] = [1, 0, 1]
update(table6)
table6

Unnamed: 0,prior,likelihood,unnorm,posterior
Door 1,0.333333,1,0.333333,0.5
Door 2,0.333333,0,0.0,0.0
Door 3,0.333333,1,0.333333,0.5


If you choose Door 1 and Monty opens Door 3, what is the probability the car is behind Door 2?

In [21]:
table7 = pd.DataFrame(index=['Door 1', 'Door 2', 'Door 3'])
table7['prior'] = [1/3, 1/3, 1/3]
table7['likelihood'] = [0, 1, 0]
update(table7)
table7

Unnamed: 0,prior,likelihood,unnorm,posterior
Door 1,0.333333,0,0.0,0.0
Door 2,0.333333,1,0.333333,1.0
Door 3,0.333333,0,0.0,0.0
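
Because this version of Monty is deterministic once the car's location is fixed, both answers can be checked by enumeration. The sketch below assumes you picked Door 1 and that the car is equally likely to be behind each door; `monty_opens` and `posterior_given` are hypothetical helper names. The shortcut of splitting probability evenly among consistent locations works only because the prior is uniform and every likelihood is 0 or 1.

```python
from fractions import Fraction

def monty_opens(car):
    """Biased Monty, assuming you picked Door 1: he opens Door 2
    unless the car is behind it, in which case he opens Door 3."""
    return 3 if car == 2 else 2

def posterior_given(opened):
    """Posterior over car locations given which door Monty opened."""
    # Keep the equally likely car locations consistent with what Monty did.
    consistent = [car for car in (1, 2, 3) if monty_opens(car) == opened]
    return {car: Fraction(1, len(consistent)) for car in consistent}

print(posterior_given(2))  # car behind Door 1 or Door 3, each with probability 1/2
print(posterior_given(3))  # car behind Door 2 with probability 1
```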


### Exercise 2-4. 

M&M’s are small candy-coated chocolates that come in a variety of colors. Mars, Inc., which makes M&M’s, changes the mixture of colors from time to time. In 1995, they introduced blue M&M’s.

- In 1994, the color mix in a bag of plain M&M’s was 30% Brown, 20% Yellow, 20% Red, 10% Green, 10% Orange, 10% Tan. 
- In 1996, it was 24% Blue, 20% Green, 16% Orange, 14% Yellow, 13% Red, 13% Brown.

Suppose a friend of mine has two bags of M&M’s, and he tells me that one is from 1994 and one from 1996. He won’t tell me which is which, but he gives me one M&M from each bag. One is yellow and one is green. What is the probability that the yellow one came from the 1994 bag?

- H1: yellow from 1994, green from 1996
- H2: yellow from 1996, green from 1994

In [24]:
table8 = pd.DataFrame(index=['H1', 'H2'])
table8['prior'] = [0.5, 0.5]
table8['likelihood'] = [0.2*0.2, 0.14*0.1]
update(table8)
table8

Unnamed: 0,prior,likelihood,unnorm,posterior
H1,0.5,0.04,0.02,0.740741
H2,0.5,0.014,0.007,0.259259
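
Each likelihood here is a product of two independent draws, one from each bag. A sketch of the same calculation with exact fractions makes that product explicit; the dictionaries `mix1994` and `mix1996` are names introduced for illustration and hold only the two colors that matter.

```python
from fractions import Fraction

# Share of yellow and green in each year's color mix.
mix1994 = {"yellow": Fraction(20, 100), "green": Fraction(10, 100)}
mix1996 = {"yellow": Fraction(14, 100), "green": Fraction(20, 100)}

prior = {"H1": Fraction(1, 2), "H2": Fraction(1, 2)}
likelihood = {
    "H1": mix1994["yellow"] * mix1996["green"],  # yellow from 1994, green from 1996
    "H2": mix1996["yellow"] * mix1994["green"],  # yellow from 1996, green from 1994
}

unnorm = {h: prior[h] * likelihood[h] for h in prior}
prob_data = sum(unnorm.values())
posterior = {h: unnorm[h] / prob_data for h in prior}

print(posterior["H1"])  # 20/27
print(posterior["H2"])  # 7/27
```

The exact value 20/27 is approximately 0.7407, in agreement with the table.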
