# Chapter 2 - Bayes's Theorem
http://allendowney.github.io/ThinkBayes2/

## Reading

### Bayes tables

Bayes's theorem:
$$P(H|D) = \frac{P(H)~P(D|H)}{P(D)}$$

The total probability of the data, summed over all hypotheses $H_i$:
$$P(D) = \sum_i P(H_i)~P(D|H_i)$$

So,
$$P(H|D) = \frac{P(H)~P(D|H)}{\sum_i P(H_i)~P(D|H_i)}$$
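As a minimal sketch of the formula above, the function below multiplies each prior by its likelihood, sums the products to get $P(D)$, and divides. The name `posterior` and its argument names are illustrative assumptions, not part of the book's code; the input numbers are the priors and likelihoods used in the two-bowl table in this section.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return (posteriors, prob_data) for parallel lists of priors and likelihoods.

    Hypothetical helper for illustration; not from the book.
    """
    # Numerator of Bayes's theorem for each hypothesis: P(H_i) * P(D|H_i)
    unnorm = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: the total probability of the data, P(D)
    prob_data = sum(unnorm)
    return [u / prob_data for u in unnorm], prob_data

priors = [Fraction(1, 2), Fraction(1, 2)]
likelihoods = [Fraction(3, 4), Fraction(1, 2)]
post, prob_data = posterior(priors, likelihoods)
print(post, prob_data)
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to compare the result with a hand calculation.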

In [1]:
import pandas as pd

In [2]:
table = pd.DataFrame(index=['bowl 1', 'bowl 2'])

In [3]:
table['prior'] = 1/2, 1/2
table

| | prior |
|---|---|
| bowl 1 | 0.5 |
| bowl 2 | 0.5 |


In [4]:
table['likelihood'] = 3/4, 1/2
table

| | prior | likelihood |
|---|---|---|
| bowl 1 | 0.5 | 0.75 |
| bowl 2 | 0.5 | 0.5 |


In [5]:
table['unnorm'] = table['prior'] * table['likelihood']
table

| | prior | likelihood | unnorm |
|---|---|---|---|
| bowl 1 | 0.5 | 0.75 | 0.375 |
| bowl 2 | 0.5 | 0.5 | 0.25 |


In [6]:
prob_data = table['unnorm'].sum()
prob_data

0.625

In [7]:
table['posterior'] = table['unnorm'] / prob_data
table

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| bowl 1 | 0.5 | 0.75 | 0.375 | 0.6 |
| bowl 2 | 0.5 | 0.5 | 0.25 | 0.4 |


In [8]:
table2 = pd.DataFrame(index=[6, 8, 12])

In [9]:
from fractions import Fraction

table2['prior'] = Fraction(1, 3)
table2['likelihood'] = Fraction(1, 6), Fraction(1, 8), Fraction(1, 12)
table2

| | prior | likelihood |
|---|---|---|
| 6 | 1/3 | 1/6 |
| 8 | 1/3 | 1/8 |
| 12 | 1/3 | 1/12 |


In [10]:
def update(table):
    """Compute the posterior probabilities."""
    table['unnorm'] = table['prior'] * table['likelihood']
    prob_data = table['unnorm'].sum()
    table['posterior'] = table['unnorm'] / prob_data
    return prob_data

In [11]:
prob_data = update(table2)

In [12]:
table2

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| 6 | 1/3 | 1/6 | 1/18 | 4/9 |
| 8 | 1/3 | 1/8 | 1/24 | 1/3 |
| 12 | 1/3 | 1/12 | 1/36 | 2/9 |


#### Monty Hall
You pick door 1 and Monty opens door 3.

In [13]:
table3 = pd.DataFrame(index=['Door 1', 'Door 2', 'Door 3'])
table3['prior'] = Fraction(1, 3)
table3

| | prior |
|---|---|
| Door 1 | 1/3 |
| Door 2 | 1/3 |
| Door 3 | 1/3 |


In [14]:
# P(Monty opens door 3 | car behind door 1, 2, 3), given that you picked door 1
table3['likelihood'] = Fraction(1, 2), 1, 0
table3

| | prior | likelihood |
|---|---|---|
| Door 1 | 1/3 | 1/2 |
| Door 2 | 1/3 | 1 |
| Door 3 | 1/3 | 0 |


In [15]:
update(table3)
table3

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| Door 1 | 1/3 | 1/2 | 1/6 | 1/3 |
| Door 2 | 1/3 | 1 | 1/3 | 2/3 |
| Door 3 | 1/3 | 0 | 0 | 0 |
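The 2/3 posterior for Door 2 can be sanity-checked by simulation. The sketch below assumes the standard rules: the car is placed uniformly at random, you always pick door 1, and Monty opens a door that is neither your pick nor the car, choosing at random when he has two options. The function name `simulate_monty`, the trial count, and the seed are illustrative choices.

```python
import random

def simulate_monty(trials=100_000, seed=0):
    """Estimate P(car behind door 2 | you pick door 1, Monty opens door 3)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    opened_3 = 0  # trials in which Monty opened door 3
    car_2 = 0     # of those, trials in which the car was behind door 2
    for _ in range(trials):
        car = rng.choice([1, 2, 3])
        if car == 1:
            # Monty can open either remaining door and picks at random
            monty = rng.choice([2, 3])
        elif car == 2:
            monty = 3  # door 3 is the only door he can open
        else:
            monty = 2  # door 2 is the only door he can open
        if monty == 3:
            opened_3 += 1
            if car == 2:
                car_2 += 1
    return car_2 / opened_3

estimate = simulate_monty()
print(estimate)
```

The estimate should land close to 2/3, in line with the posterior column of the table.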


## Exercises

**Exercise:** Suppose you have two coins in a box.
One is a normal coin with heads on one side and tails on the other, and one is a trick coin with heads on both sides.  You choose a coin at random and see that one of the sides is heads.
What is the probability that you chose the trick coin?

In [16]:
table = pd.DataFrame(index=['trick coin', 'normal coin'])
table['prior'] = 1/2
table['likelihood'] = 1, 1/2
prob_data = update(table)
table

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| trick coin | 0.5 | 1.0 | 0.5 | 0.666667 |
| normal coin | 0.5 | 0.5 | 0.25 | 0.333333 |


So the probability that you chose the trick coin is 2/3.

In [18]:
# Apply Bayes's theorem directly: P(trick) * P(heads | trick) / P(heads)
p = (1/2 * 1) / (3/4)
p

0.6666666666666666
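The 3/4 in the denominator is the total probability of seeing heads. Writing $T$ for the trick coin, $N$ for the normal coin, and $D$ for the observed heads (symbols introduced here for illustration), the calculation works out as:

$$P(D) = P(T)~P(D|T) + P(N)~P(D|N) = \frac{1}{2} \cdot 1 + \frac{1}{2} \cdot \frac{1}{2} = \frac{3}{4}$$

$$P(T|D) = \frac{P(T)~P(D|T)}{P(D)} = \frac{1/2 \cdot 1}{3/4} = \frac{2}{3}$$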

**Exercise:** Suppose you meet someone and learn that they have two children.
You ask if either child is a girl and they say yes.
What is the probability that both children are girls?

Hint: Start with four equally likely hypotheses.

In [19]:
table = pd.DataFrame(index=['GG', 'GB', 'BG', 'BB'])
table['prior'] = 1/4
table['likelihood'] = 1, 1, 1, 0
prob_data = update(table)
table

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| GG | 0.25 | 1 | 0.25 | 0.333333 |
| GB | 0.25 | 1 | 0.25 | 0.333333 |
| BG | 0.25 | 1 | 0.25 | 0.333333 |
| BB | 0.25 | 0 | 0.0 | 0.0 |


There is a 1/3 chance that both children are girls.
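Because the four hypotheses are equally likely, the same answer can be found by counting. This sketch enumerates the families, keeps those with at least one girl, and counts how many of them are two girls. The variable names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# All four equally likely (older, younger) combinations: GG, GB, BG, BB
families = list(product("GB", repeat=2))

# Condition on the data: at least one child is a girl
at_least_one_girl = [f for f in families if "G" in f]

# Among those families, count the ones with two girls
both_girls = [f for f in at_least_one_girl if f == ("G", "G")]

p_both = Fraction(len(both_girls), len(at_least_one_girl))
print(p_both)
```

Counting only works here because the prior is uniform; with unequal priors, the table-based update is the safer route.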

**Exercise:** There are many variations of the [Monty Hall problem](https://en.wikipedia.org/wiki/Monty_Hall_problem).  
For example, suppose Monty always chooses Door 2 if he can, and
only chooses Door 3 if he has to (because the car is behind Door 2).

If you choose Door 1 and Monty opens Door 2, what is the probability the car is behind Door 3?

If you choose Door 1 and Monty opens Door 3, what is the probability the car is behind Door 2?

In [20]:
# You choose door 1 and Monty opens door 2
table = pd.DataFrame(index=['door 1', 'door 2', 'door 3'])
table['prior'] = 1/3
# P(Monty opens door 2 | car behind door 1, 2, 3): he opens door 2 whenever the car is not there
table['likelihood'] = 1, 0, 1
prob_data = update(table)
table

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| door 1 | 0.333333 | 1 | 0.333333 | 0.5 |
| door 2 | 0.333333 | 0 | 0.0 | 0.0 |
| door 3 | 0.333333 | 1 | 0.333333 | 0.5 |


There is a 1/2 probability that the car is behind door 3.

In [21]:
# You choose door 1 and Monty opens door 3
table = pd.DataFrame(index=['door 1', 'door 2', 'door 3'])
table['prior'] = 1/3
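# P(Monty opens door 3 | car behind door 1, 2, 3): he opens door 3 only when the car is behind door 2
table['likelihood'] = 0, 1, 0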
prob_data = update(table)
table

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| door 1 | 0.333333 | 0 | 0.0 | 0.0 |
| door 2 | 0.333333 | 1 | 0.333333 | 1.0 |
| door 3 | 0.333333 | 0 | 0.0 | 0.0 |


There is a probability of 1 that the car is behind door 2.
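Both answers for this variant can be checked with a short simulation. The sketch assumes the car is placed uniformly at random, you always pick Door 1, and Monty opens Door 2 unless the car is behind it, in which case he opens Door 3. The function name `simulate_variant`, the trial count, and the seed are illustrative choices.

```python
import random

def simulate_variant(trials=100_000, seed=0):
    """Estimate both posteriors for the 'Monty prefers Door 2' variant."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # counts[opened][car] = number of trials with that combination
    counts = {2: {1: 0, 2: 0, 3: 0}, 3: {1: 0, 2: 0, 3: 0}}
    for _ in range(trials):
        car = rng.choice([1, 2, 3])
        # You always pick Door 1. Monty opens Door 2 unless the car is there.
        opened = 3 if car == 2 else 2
        counts[opened][car] += 1
    p_car3_given_open2 = counts[2][3] / sum(counts[2].values())
    p_car2_given_open3 = counts[3][2] / sum(counts[3].values())
    return p_car3_given_open2, p_car2_given_open3

p_car3_given_open2, p_car2_given_open3 = simulate_variant()
print(p_car3_given_open2, p_car2_given_open3)
```

The first estimate should be close to 1/2. The second is exactly 1, since under this rule Monty opens Door 3 only when the car is behind Door 2.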

**Exercise:** M&M's are small candy-coated chocolates that come in a variety of colors.  
Mars, Inc., which makes M&M's, changes the mixture of colors from time to time.
In 1995, they introduced blue M&M's.  

* In 1994, the color mix in a bag of plain M&M's was 30\% Brown, 20\% Yellow, 20\% Red, 10\% Green, 10\% Orange, 10\% Tan.  

* In 1996, it was 24\% Blue, 20\% Green, 16\% Orange, 14\% Yellow, 13\% Red, 13\% Brown.

Suppose a friend of mine has two bags of M&M's, and he tells me
that one is from 1994 and one from 1996.  He won't tell me which is
which, but he gives me one M&M from each bag.  One is yellow and
one is green.  What is the probability that the yellow one came
from the 1994 bag?

Hint: The trick to this question is to define the hypotheses and the data carefully.

---
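**Hypotheses:**

* `YG`: the yellow M&M is from the 1994 bag, so the green one is from the 1996 bag
* `GY`: the green M&M is from the 1994 bag, so the yellow one is from the 1996 bag

**Data:** one M&M is yellow and one is green.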

In [22]:
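# 'YG': yellow from the 1994 bag, green from the 1996 bag; 'GY': the reverse
table = pd.DataFrame(index=['YG', 'GY'])
table['prior'] = 1/2
# P(yellow | 1994) * P(green | 1996), P(green | 1994) * P(yellow | 1996)
table['likelihood'] = 0.2 * 0.2, 0.1 * 0.14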
prob_data = update(table)
table

| | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|
| YG | 0.5 | 0.04 | 0.02 | 0.740741 |
| GY | 0.5 | 0.014 | 0.007 | 0.259259 |
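The posterior for `YG`, about 0.74, is the probability that the yellow M&M came from the 1994 bag. As a cross-check, the same update can be done with exact fractions, using only the yellow and green percentages from the exercise. The dictionary and variable names are illustrative assumptions.

```python
from fractions import Fraction

# Color shares needed for this problem, taken from the exercise statement
mix_1994 = {"yellow": Fraction(20, 100), "green": Fraction(10, 100)}
mix_1996 = {"yellow": Fraction(14, 100), "green": Fraction(20, 100)}

prior = Fraction(1, 2)  # each assignment of bags is equally likely a priori

# YG: yellow from 1994 and green from 1996
unnorm_yg = prior * mix_1994["yellow"] * mix_1996["green"]
# GY: green from 1994 and yellow from 1996
unnorm_gy = prior * mix_1994["green"] * mix_1996["yellow"]

posterior_yg = unnorm_yg / (unnorm_yg + unnorm_gy)
print(posterior_yg, float(posterior_yg))
```

The exact result is 20/27, which matches the 0.740741 in the table.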
