# Think Bayes

Second Edition

Copyright 2020 Allen B. Downey

License: [Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/)

In [1]:
# If we're running on Colab, install empiricaldist
# https://pypi.org/project/empiricaldist/

import sys
IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    !pip install empiricaldist

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from empiricaldist import Pmf

## The Pmf class

I'll start by making a Pmf that represents the outcome of a six-sided die.  Initially there are 6 values with equal probability.

In [3]:
pmf = Pmf()
for x in [1,2,3,4,5,6]:
    pmf[x] = 1
    
pmf

Unnamed: 0,probs
1,1
2,1
3,1
4,1
5,1
6,1


To be true probabilities, they have to add up to 1.  So we can normalize the Pmf:

In [4]:
pmf.normalize()

6

The return value from `normalize` is the sum of the probabilities before normalizing.

In [5]:
pmf

Unnamed: 0,probs
1,0.166667
2,0.166667
3,0.166667
4,0.166667
5,0.166667
6,0.166667
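
To make the normalization step concrete, here is a minimal sketch using a plain `dict` rather than `Pmf`. The function below is a hypothetical stand-in written for illustration; it is not the `empiricaldist` implementation, only a model of the behavior described above.

```python
def normalize(probs):
    """Scale the values of `probs` in place so they sum to 1.

    Hypothetical helper for illustration (not empiricaldist code).
    Returns the total before normalizing.
    """
    total = sum(probs.values())
    for value in probs:
        probs[value] /= total
    return total

die = {x: 1 for x in range(1, 7)}   # unnormalized weights, one per face
total = normalize(die)

print(total)    # 6
print(die[1])   # 0.16666666666666666
```

Returning the pre-normalization total is convenient because, in a Bayesian update, that total is the overall probability of the data.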


A faster way to make a Pmf is to provide a sequence of values.  The constructor adds the values to the Pmf and then normalizes:

In [6]:
pmf = Pmf.from_seq([1,2,3,4,5,6])
pmf

Unnamed: 0,probs
1,0.166667
2,0.166667
3,0.166667
4,0.166667
5,0.166667
6,0.166667


To look up the probability of a value, you can use the bracket operator.

In [7]:
pmf[1]

0.16666666666666666

Or you can call the Pmf like a function, using parentheses.

In [8]:
pmf(1)

0.16666666666666666

The two differ when you ask for the probability of something that's not in the Pmf: the bracket operator raises a `KeyError`, while calling the Pmf returns 0.

In [9]:
try:
    pmf[7]
except KeyError as e:
    print(e)

7


In [10]:
pmf(7)

0

In [11]:
pmf([1,4,7])

array([0.16666667, 0.16666667, 0.        ])
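
The difference between the two lookup styles in the cells above can be mimicked with a plain `dict`. This is only a sketch of the observed behavior; the helper name `prob` is hypothetical and is not claimed to be how `Pmf` is implemented.

```python
die = {x: 1 / 6 for x in range(1, 7)}

def prob(pmf, value):
    """Hypothetical helper: probability of `value`, or 0 if it is absent."""
    return pmf.get(value, 0)

# Bracket lookup fails for a value that is not present...
try:
    die[7]
except KeyError as e:
    print("KeyError:", e)   # KeyError: 7

# ...while the function-style lookup falls back to 0.
print(prob(die, 7))         # 0

# Looking up several values at once, including a missing one.
several = [prob(die, x) for x in [1, 4, 7]]
print(several)
```

The function-style lookup is the safer choice when a value might legitimately have zero probability.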

## The cookie problem

Here's a Pmf that represents the prior distribution.

In [14]:
prior = Pmf.from_seq(['Bowl 1', 'Bowl 2'])
prior

Unnamed: 0,probs
Bowl 1,0.5
Bowl 2,0.5


And we can update it by multiplying the prior by the likelihood of the data under each hypothesis, then normalizing.

In [15]:
likelihood = [0.75, 0.5]
posterior = prior * likelihood
posterior.normalize()
posterior

Unnamed: 0,probs
Bowl 1,0.6
Bowl 2,0.4
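
The same update can be worked by hand with exact fractions, which shows where 0.6 and 0.4 come from. This sketch uses plain dicts and the standard-library `fractions` module instead of `Pmf`; the variable names are chosen for illustration.

```python
from fractions import Fraction

prior = {'Bowl 1': Fraction(1, 2), 'Bowl 2': Fraction(1, 2)}
likelihood = {'Bowl 1': Fraction(3, 4), 'Bowl 2': Fraction(1, 2)}

# Unnormalized posterior: prior times likelihood for each hypothesis.
unnorm = {h: prior[h] * likelihood[h] for h in prior}

# The normalizing constant is the total probability of the data.
total = sum(unnorm.values())
posterior = {h: p / total for h, p in unnorm.items()}

print(total)                                      # 5/8
print(posterior['Bowl 1'], posterior['Bowl 2'])   # 3/5 2/5
```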


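## The Bayesian framework

To generalize, we can store the likelihoods for each kind of data in a dictionary and wrap the update in a function.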


In [16]:
likelihood_dict = dict(
    vanilla = [0.75, 0.5], 
    chocolate = [0.25, 0.5]
)

def update_cookie(pmf, data):
    pmf *= likelihood_dict[data]
    pmf.normalize()

We can confirm that we get the same result.

In [17]:
cookie = Pmf.from_seq(['Bowl 1', 'Bowl 2'])
update_cookie(cookie, 'vanilla')
cookie

Unnamed: 0,probs
Bowl 1,0.6
Bowl 2,0.4


But this implementation is more general; it can handle any sequence of data.

In [18]:
cookie = Pmf.from_seq(['Bowl 1', 'Bowl 2'])

dataset = ['vanilla', 'chocolate', 'vanilla']
for data in dataset:
    update_cookie(cookie, data)
    
cookie

Unnamed: 0,probs
Bowl 1,0.529412
Bowl 2,0.470588
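
Because each update multiplies by a likelihood and then renormalizes, processing observations one at a time should give the same posterior as multiplying by all the likelihoods at once, in any order. Here is a minimal sketch of that property with plain dicts and exact fractions; the helpers `update`, `sequential`, and `batch` are hypothetical names, not part of `empiricaldist`.

```python
from fractions import Fraction

likelihoods = {
    'vanilla':   {'Bowl 1': Fraction(3, 4), 'Bowl 2': Fraction(1, 2)},
    'chocolate': {'Bowl 1': Fraction(1, 4), 'Bowl 2': Fraction(1, 2)},
}

def normalized(unnorm):
    """Return a copy of `unnorm` scaled to sum to 1."""
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}

def update(pmf, data):
    """One Bayesian update: multiply by the likelihood, then normalize."""
    return normalized({h: p * likelihoods[data][h] for h, p in pmf.items()})

def sequential(prior, dataset):
    """Update once per observation."""
    pmf = prior
    for data in dataset:
        pmf = update(pmf, data)
    return pmf

def batch(prior, dataset):
    """Multiply all likelihoods first, then normalize once."""
    unnorm = dict(prior)
    for data in dataset:
        for h in unnorm:
            unnorm[h] *= likelihoods[data][h]
    return normalized(unnorm)

prior = {'Bowl 1': Fraction(1, 2), 'Bowl 2': Fraction(1, 2)}
dataset = ['vanilla', 'chocolate', 'vanilla']

one_at_a_time = sequential(prior, dataset)
all_at_once = batch(prior, dataset)
reordered = sequential(prior, ['chocolate', 'vanilla', 'vanilla'])

print(one_at_a_time == all_at_once == reordered)   # True
```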


## The Monty Hall problem

The Monty Hall problem might be the most contentious question in
the history of probability.  The scenario is simple, but the correct
answer is so counterintuitive that many people just can't accept
it.

Here's the statement of the problem, from [Wikipedia](https://en.wikipedia.org/wiki/Monty_Hall_problem):

> Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

To avoid ambiguities, we have to make some assumptions about the behavior of the host:

1. The host never opens the door you picked.

2. The host never opens the door with the car.

3. If you choose the door with the car, the host chooses one of the other doors at random.

4. The host always offers you the option to switch.

Under these assumptions, are you better off sticking or switching?

The correct answer is that you are better off switching.  If you stick, you win 1/3 of the time.  If you switch, you win 2/3 of the time.
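
These fractions can be checked by enumerating every case under the four assumptions above. The sketch below assumes you always pick door 1 and that the car is equally likely to be behind any door; it uses only the standard library, and the variable names are illustrative.

```python
from fractions import Fraction

doors = [1, 2, 3]
pick = 1
win_stick = Fraction(0)
win_switch = Fraction(0)

for car in doors:
    p_car = Fraction(1, 3)   # the car is equally likely to be anywhere
    # The host never opens your door or the door with the car.
    options = [d for d in doors if d != pick and d != car]
    for opened in options:
        # When the host has a choice, he chooses at random.
        p = p_car / len(options)
        # Switching means taking the one door that is neither picked nor opened.
        switched = next(d for d in doors if d not in (pick, opened))
        win_stick += p * (pick == car)
        win_switch += p * (switched == car)

print(win_stick, win_switch)   # 1/3 2/3
```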

Here's one of many arguments that might persuade you.

> If you always stick, you win if you initially choose the door with the car, so the probability is 1/3.
>
> If you always switch, you win if you did _not_ choose the door with the car, so the probability is 2/3.

However, many people do not find any verbal arguments persuasive. 
Maybe Bayes's Theorem can help.

In [19]:
hypos = [1, 2, 3]
prior = Pmf(1, hypos)
prior

Unnamed: 0,probs
1,1
2,1
3,1


In [20]:
likelihood = [0.5, 1, 0]

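These are the likelihoods of the data (you picked door 1 and the host opened door 3, revealing a goat) under each hypothesis about where the car is.  If the car is behind door 1, the host chooses door 2 or 3 at random, so the likelihood is 0.5.  If it is behind door 2, he has to open door 3, so the likelihood is 1.  If it is behind door 3, he would not have opened it, so the likelihood is 0.

And here's how we use it.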

In [21]:
posterior = prior * likelihood
posterior.normalize()
posterior

Unnamed: 0,probs
1,0.333333
2,0.666667
3,0.0
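
The likelihoods `[0.5, 1, 0]` do not have to be written by hand; they follow from the assumptions about the host. The sketch below derives them for the case where you pick door 1 and the host opens door 3, then performs the update with exact fractions. The function `likelihood` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction

doors = [1, 2, 3]
pick, opened = 1, 3

def likelihood(car):
    """P(host opens `opened` | car is behind `car`, you picked `pick`)."""
    # Legal doors for the host: not your pick and not the car.
    options = [d for d in doors if d != pick and d != car]
    if opened not in options:
        return Fraction(0)
    return Fraction(1, len(options))

likes = [likelihood(car) for car in doors]

# With a uniform prior, the prior cancels when we normalize.
total = sum(likes)
posterior = [like / total for like in likes]

print([str(x) for x in likes])       # ['1/2', '1', '0']
print([str(x) for x in posterior])   # ['1/3', '2/3', '0']
```

Changing `pick` or `opened` gives the likelihoods for other versions of the scenario without rewriting the list.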


## The M&M problem

M&Ms are small candy-coated chocolates that come in a variety of
colors.  Mars, Inc., which makes M&Ms, changes the mixture of
colors from time to time.

In 1995, they introduced blue M&Ms.  Before then, the color mix in
a bag of plain M&Ms was 30% Brown, 20% Yellow, 20% Red, 10%
Green, 10% Orange, 10% Tan.  Afterward it was 24% Blue, 20%
Green, 16% Orange, 14% Yellow, 13% Red, 13% Brown.

Suppose a friend of mine has two bags of M&Ms, and he tells me
that one is from 1994 and one from 1996.  He won't tell me which is
which, but he gives me one M&M from each bag.  One is yellow and
one is green.  What is the probability that the yellow one came
from the 1994 bag?

Here's a solution:

In [22]:
mix94 = dict(brown=30,
             yellow=20,
             red=20,
             green=10,
             orange=10,
             tan=10,
             blue=0)

mix96 = dict(blue=24,
             green=20,
             orange=16,
             yellow=14,
             red=13,
             brown=13,
             tan=0)

hypoA = dict(bag1=mix94, bag2=mix96)
hypoB = dict(bag1=mix96, bag2=mix94)

hypotheses = dict(A=hypoA, B=hypoB)

def likelihood_mm(data, hypo):
    """Computes the likelihood of the data under the hypothesis.

    data: tuple of string bag, string color
    hypo: string hypothesis (A or B)

    returns: percentage of that color in the mix the hypothesis
             assigns to that bag (an unnormalized likelihood)
    """
    bag, color = data
    like = hypotheses[hypo][bag][color]
    return like

And here's an update:

In [25]:
data = ('bag1', 'yellow')
hypos = ['A', 'B']

likelihood1 = [likelihood_mm(data, hypo) for hypo in hypos]
likelihood1

[20, 14]

In [26]:
def update_mm(pmf, data):
    pmf *= [likelihood_mm(data, hypo) 
            for hypo in hypos]
    pmf.normalize()

In [27]:
pmf = Pmf(1, hypos)
pmf

Unnamed: 0,probs
A,1
B,1


In [28]:
data = ('bag1', 'yellow')
update_mm(pmf, data)
pmf

Unnamed: 0,probs
A,0.588235
B,0.411765


In [29]:
data = ('bag2', 'green')
update_mm(pmf, data)
pmf

Unnamed: 0,probs
A,0.740741
B,0.259259
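
The posterior after both observations can be checked with a few lines of arithmetic. Under hypothesis A the yellow M&M comes from the 1994 mix and the green one from the 1996 mix; under B it is the other way around. This sketch uses the percentages from the mixes above and exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

like_A = 20 * 20   # yellow from the 1994 mix (20), green from the 1996 mix (20)
like_B = 14 * 10   # yellow from the 1996 mix (14), green from the 1994 mix (10)

# The priors are equal, so they cancel when we normalize.
total = like_A + like_B
post_A = Fraction(like_A, total)
post_B = Fraction(like_B, total)

print(post_A, post_B)             # 20/27 7/27
print(round(float(post_A), 6))    # 0.740741
```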


**Exercise:**  Suppose you draw another M&M from `bag1` and it's blue.  What can you conclude?  Run the update to confirm your intuition.

In [30]:
data = ('bag1', 'blue')
update_mm(pmf, data)
pmf

Unnamed: 0,probs
A,0.0
B,1.0


**Exercise:**  Now suppose you draw an M&M from `bag2` and it's blue.  Run the update to see what happens.  What does that mean?  

In [31]:
# Solution

data = ('bag2', 'blue')
update_mm(pmf, data)
pmf

Unnamed: 0,probs
A,NaN
B,NaN


In [32]:
# The unnormalized posterior for both hypotheses is 0, so when
# we try to normalize, the total probability of the data is 0.

# This means that the data have ruled out all of our hypotheses.
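
One way to make this failure explicit is to check the total before dividing. The sketch below uses a plain `dict` and a hypothetical helper, `safe_normalize`; it is an illustration of the idea and not a feature of `empiricaldist`.

```python
def safe_normalize(probs):
    """Return a normalized copy of `probs`.

    Hypothetical helper: raises ValueError if the total is zero,
    which means the data have ruled out every hypothesis.
    """
    total = sum(probs.values())
    if total == 0:
        raise ValueError("total probability is zero; "
                         "the data rule out every hypothesis")
    return {h: p / total for h, p in probs.items()}

# Unnormalized posterior after drawing blue from bag2: both are 0.
unnorm = {'A': 0.0, 'B': 0.0}

message = None
try:
    safe_normalize(unnorm)
except ValueError as e:
    message = str(e)
    print(message)
```

Raising an error points directly at the problem, whereas a silent division by zero leaves missing values that may only be noticed later.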