In [1]:
from thinkbayes2 import Pmf

# Think Bayes

# Chapter 2

In [2]:
pmf = Pmf()
for x in [1,2,3,4,5,6]:
    pmf.Set(x, 1/6.0)
    
print(pmf)

Pmf({1: 0.16666666666666666, 2: 0.16666666666666666, 3: 0.16666666666666666, 4: 0.16666666666666666, 5: 0.16666666666666666, 6: 0.16666666666666666})


## The cookie problem

The prior distribution assigns equal probability to each bowl:

In [3]:
pmf = Pmf()
pmf.Set('Bowl 1', 0.5)
pmf.Set('Bowl 2', 0.5)

print(pmf)

Pmf({'Bowl 1': 0.5, 'Bowl 2': 0.5})


To update the distribution based on new data (the vanilla cookie), we multiply each prior by the corresponding likelihood. The likelihood of drawing a vanilla cookie from Bowl 1 is 3/4. The likelihood for Bowl 2 is 1/2.

In [4]:
pmf.Mult('Bowl 1', 0.75)
pmf.Mult('Bowl 2', 0.5)
print(pmf)

Pmf({'Bowl 1': 0.375, 'Bowl 2': 0.25})


Mult does what you would expect. It gets the probability for the given hypothesis and multiplies by the given likelihood.

After this update, the distribution is no longer normalized, but because these hypotheses are mutually exclusive and collectively exhaustive, we can renormalize:

In [5]:
pmf.Normalize()
print(pmf)

Pmf({'Bowl 1': 0.6000000000000001, 'Bowl 2': 0.4})
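
Written out, the multiply-then-normalize steps amount to Bayes's theorem. As a sketch, with $B_1$ and $B_2$ standing for the two bowls and $V$ for the vanilla cookie (notation introduced here, not taken from the notebook):

```latex
P(B_1 \mid V)
  = \frac{P(B_1)\,P(V \mid B_1)}{P(B_1)\,P(V \mid B_1) + P(B_2)\,P(V \mid B_2)}
  = \frac{0.5 \times 0.75}{0.5 \times 0.75 + 0.5 \times 0.5}
  = \frac{0.375}{0.625}
  = 0.6
```

The printed value 0.6000000000000001 is this same 0.6, up to floating-point round-off.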


The result is a distribution that contains the posterior probability for each hypothesis, which is called (wait for it) the posterior distribution. Finally, we can read off the posterior probability for Bowl 1, which is about 0.6.
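
With the Pmf above, that lookup would be `pmf.Prob('Bowl 1')`. The same computation can also be sketched with plain dictionaries, which makes the two steps explicit and does not depend on thinkbayes2. The variable names here (`prior`, `likelihood`, `posterior`) are illustrative assumptions, not part of the library:

```python
# Plain-dict sketch of the cookie update; a stand-in for Pmf, not the library itself.
prior = {'Bowl 1': 0.5, 'Bowl 2': 0.5}
likelihood = {'Bowl 1': 0.75, 'Bowl 2': 0.5}  # P(vanilla | bowl)

# Multiply each prior by its likelihood (the role Mult plays per hypothesis).
unnormalized = {hypo: prior[hypo] * likelihood[hypo] for hypo in prior}

# Divide by the total so the probabilities sum to 1 (the role Normalize plays).
total = sum(unnormalized.values())
posterior = {hypo: prob / total for hypo, prob in unnormalized.items()}

# Read off the posterior probability for Bowl 1 (approximately 0.6).
print(posterior['Bowl 1'])
```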

## The Bayesian Framework

In [17]:
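class Cookie(Pmf):
    """A Pmf over bowls that can be updated with observed cookies."""

    def __init__(self, hypos):
        # Start from a uniform prior over the hypotheses.
        Pmf.__init__(self)
        for hypo in hypos:
            self.Set(hypo, 1)
        self.Normalize()

    def Update(self, data):
        # Multiply each prior by the likelihood of the data, then renormalize.
        for hypo in self.Values():
            like = self.Likelihood(data, hypo)
            self.Mult(hypo, like)
        self.Normalize()

    def Likelihood(self, data, hypo):
        # Probability of drawing this flavor (data) from the given bowl (hypo).
        mix = self.mixes[hypo]
        like = mix[data]
        return like

    # Fraction of each flavor in each bowl: P(flavor | bowl).
    mixes = {
        'Bowl 1': dict(vanilla=0.75, chocolate=0.25),
        'Bowl 2': dict(vanilla=0.5, chocolate=0.5),
    }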

    

In [18]:
hypos = ['Bowl 1', 'Bowl 2']
pmf = Cookie(hypos)

print(pmf)

Cookie({'Bowl 1': 0.5, 'Bowl 2': 0.5})


Cookie provides an Update method that takes data as a parameter and updates the probabilities:


In [15]:
pmf.Update('vanilla')

In [20]:
print(pmf)

Cookie({'Bowl 1': 0.6000000000000001, 'Bowl 2': 0.4})


In [22]:
for hypo, prob in pmf.Items():
    print(hypo, prob)


Bowl 1 0.6000000000000001
Bowl 2 0.4


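This code is more complicated than what we saw in the previous section. One advantage is that it generalizes to the case where we draw more than one cookie from the same bowl (with replacement). Because pmf already reflects the earlier vanilla update, the result below combines four draws in total: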


In [23]:
dataset = ['vanilla', 'chocolate', 'vanilla']
for data in dataset:
    pmf.Update(data)
    
print(pmf)

Cookie({'Bowl 1': 0.627906976744186, 'Bowl 2': 0.37209302325581395})
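
To check where 0.6279 comes from, the four updates can be replayed from the uniform prior. This is a minimal sketch using plain dictionaries rather than the Cookie class; `MIXES` and `update` are hypothetical names introduced for illustration:

```python
# Hypothetical stand-in for Cookie.Update, written with plain dicts.
MIXES = {
    'Bowl 1': {'vanilla': 0.75, 'chocolate': 0.25},
    'Bowl 2': {'vanilla': 0.5, 'chocolate': 0.5},
}


def update(pmf, data):
    """Return a new normalized dict after observing one cookie."""
    unnormalized = {hypo: prob * MIXES[hypo][data] for hypo, prob in pmf.items()}
    total = sum(unnormalized.values())
    return {hypo: prob / total for hypo, prob in unnormalized.items()}


posterior = {'Bowl 1': 0.5, 'Bowl 2': 0.5}

# One vanilla from the earlier cell, then the three draws in `dataset`.
for cookie in ['vanilla', 'vanilla', 'chocolate', 'vanilla']:
    posterior = update(posterior, cookie)

print(round(posterior['Bowl 1'], 4))  # 0.6279
```

Starting from a fresh `Cookie(hypos)` before the loop would instead give the posterior for the three draws in `dataset` alone.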


## 2.4 The Monty Hall Problem

## 2.5 Encapsulating the Framework

## 2.6 The M&M problem

# Chapter 7: Prediction

## 7.1 The Boston Bruins Problem

## 7.2 Poisson Processes

## 7.3 The Posteriors

## 7.4 The Distribution of Goals

## 7.5 The Probability of Winning

## 7.6 Sudden Death