## Bayesian updates using the table method

This notebook demonstrates a way of doing simple Bayesian updates using the table method, with a Pandas DataFrame as the table.

Copyright 2018 Allen Downey

MIT License: https://opensource.org/licenses/MIT


In [11]:
# Configure Jupyter so figures appear in the notebook
%matplotlib inline

# Configure Jupyter to display the assigned value after an assignment
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'

import numpy as np
import pandas as pd

import sympy

As an example, I'll use the "cookie problem", which is a version of a classic probability "urn problem".

Suppose there are two bowls of cookies.

* Bowl #1 has 10 chocolate and 30 vanilla.

* Bowl #2 has 20 of each.

You choose a bowl at random, and then pick a cookie at random.  The cookie turns out to be vanilla.  What is the probability that the bowl you picked from is Bowl #1?
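
Before building the table, the answer can be worked out directly from Bayes's theorem. The following is a minimal sketch, not part of the original notebook; the variable names are illustrative, and it assumes each bowl is equally likely to be chosen, as the problem states.

```python
from fractions import Fraction

# Priors: each bowl is equally likely to be chosen.
prior_bowl1 = Fraction(1, 2)
prior_bowl2 = Fraction(1, 2)

# Likelihood of drawing a vanilla cookie from each bowl.
like_bowl1 = Fraction(30, 40)  # Bowl 1: 30 vanilla out of 40 cookies
like_bowl2 = Fraction(20, 40)  # Bowl 2: 20 vanilla out of 40 cookies

# Total probability of drawing a vanilla cookie (the normalizing constant).
p_vanilla = prior_bowl1 * like_bowl1 + prior_bowl2 * like_bowl2

# Bayes's theorem: P(Bowl 1 | vanilla)
posterior_bowl1 = prior_bowl1 * like_bowl1 / p_vanilla
print(posterior_bowl1)  # 3/5
```

The table method organizes the same arithmetic with one row per hypothesis.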

### The BayesTable class

Here's the class that represents a Bayesian table.

In [3]:
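class BayesTable(pd.DataFrame):
    """A DataFrame with one row per hypothesis, used for Bayesian updates."""

    def __init__(self, hypo, prior=1):
        columns = ['hypo', 'prior', 'likelihood', 'unnorm', 'posterior']
        super().__init__(columns=columns)
        self.hypo = hypo
        self.prior = prior

    def mult(self):
        """Multiply priors by likelihoods to get unnormalized posteriors."""
        self.unnorm = self.prior * self.likelihood

    def norm(self):
        """Normalize the posteriors and return the normalizing constant."""
        nc = np.sum(self.unnorm)
        self.posterior = self.unnorm / nc
        return nc

    def update(self):
        """Run mult, then norm, and return the normalizing constant."""
        self.mult()
        return self.norm()

    def reset(self):
        """Return a new table whose priors are this table's posteriors."""
        return BayesTable(self.hypo, self.posterior)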

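Here's an instance of the table. The code cells in this notebook use a set of dice rather than the two bowls: there is one hypothesis per die (4, 6, 8, and 12 sides), and each gets the same prior.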

In [12]:
dice = [4,6,8,12]
table = BayesTable(dice, prior=sympy.Integer(1))

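|   | hypo | prior | likelihood | unnorm | posterior |
|---|------|-------|------------|--------|-----------|
| 0 | 4    | 1     |            |        |           |
| 1 | 6    | 1     |            |        |           |
| 2 | 8    | 1     |            |        |           |
| 3 | 12   | 1     |            |        |           |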


We passed the same prior, `sympy.Integer(1)`, for every hypothesis, which matches the default of equal priors. The priors don't have to add up to 1, because `update` normalizes the posteriors.

Now we can specify the likelihoods:

* The likelihood of getting a vanilla cookie from Bowl 1 is 3/4.

* The likelihood of getting a vanilla cookie from Bowl 2 is 1/2.
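
For the cookie problem, the table method with these likelihoods can be sketched as follows. This is an illustration rather than a cell from the original notebook: it uses a plain `pd.DataFrame` instead of `BayesTable`, and the name `cookie` is made up for the example.

```python
import pandas as pd

# One row per hypothesis; equal (unnormalized) priors of 1.
cookie = pd.DataFrame({
    'hypo': ['Bowl 1', 'Bowl 2'],
    'prior': [1.0, 1.0],
    'likelihood': [3/4, 1/2],  # P(vanilla | bowl)
})

# Same steps as BayesTable.mult and BayesTable.norm.
cookie['unnorm'] = cookie['prior'] * cookie['likelihood']
total = cookie['unnorm'].sum()
cookie['posterior'] = cookie['unnorm'] / total

print(cookie)
```

The posterior column comes out to 0.6 for Bowl 1 and 0.4 for Bowl 2.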

Here's how we plug likelihoods into `table`, which holds the dice hypotheses. The likelihood for an n-sided die is 1/n:

In [13]:
table.likelihood = [1/n for n in dice]
table

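|   | hypo | prior | likelihood | unnorm | posterior |
|---|------|-------|------------|--------|-----------|
| 0 | 4    | 1     | 0.25       |        |           |
| 1 | 6    | 1     | 0.166667   |        |           |
| 2 | 8    | 1     | 0.125      |        |           |
| 3 | 12   | 1     | 0.083333   |        |           |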


In [14]:
table.update()
table

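|   | hypo | prior | likelihood | unnorm             | posterior         |
|---|------|-------|------------|--------------------|-------------------|
| 0 | 4    | 1     | 0.25       | 0.25               | 0.4               |
| 1 | 6    | 1     | 0.166667   | 0.166666666666667  | 0.266666666666667 |
| 2 | 8    | 1     | 0.125      | 0.125              | 0.2               |
| 3 | 12   | 1     | 0.083333   | 0.0833333333333333 | 0.133333333333333 |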


In [18]:
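# Weight each die's likelihood (1/n) by its posterior and sum, giving the
# posterior predictive probability of that same outcome on another roll
sum(np.array(table.posterior.values) * [1/n for n in dice])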

0.180555555555556
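
As a cross-check on the floating-point values above, the same update can be done in exact arithmetic. This is a minimal sketch using Python's `fractions` module rather than `BayesTable`; it assumes equal priors, as in the table, and the variable names are illustrative.

```python
from fractions import Fraction

dice = [4, 6, 8, 12]

# Likelihood 1/n for an n-sided die, as in the table.
likelihood = [Fraction(1, n) for n in dice]

# With equal priors, the posterior is the normalized likelihood.
total = sum(likelihood)
posterior = [like / total for like in likelihood]

# Posterior-weighted average of the likelihoods.
predictive = sum(p * like for p, like in zip(posterior, likelihood))

print([str(p) for p in posterior])  # ['2/5', '4/15', '1/5', '2/15']
print(predictive)                   # 13/72
```

The fractions 2/5, 4/15, 1/5, and 2/15 agree with the posterior column, and 13/72 is approximately 0.18056, which matches the last result.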