## Bayesian updates using the table method

This notebook demonstrates a way of doing simple Bayesian updates using the table method, with a Pandas DataFrame as the table.

Copyright 2018 Allen Downey

MIT License: https://opensource.org/licenses/MIT


In [19]:
# Configure Jupyter so figures appear in the notebook
%matplotlib inline

# Configure Jupyter to display the assigned value after an assignment
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'

import numpy as np
import pandas as pd

from sympy import S, nsimplify

As an example, I'll use the "cookie problem", which is a version of a classic probability "urn problem".

Suppose there are two bowls of cookies.

* Bowl #1 has 10 chocolate and 30 vanilla.

* Bowl #2 has 20 of each.

You choose a bowl at random, and then pick a cookie at random.  The cookie turns out to be vanilla.  What is the probability that the bowl you picked from is Bowl #1?
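For reference, the same question can be worked directly with Bayes's theorem. The symbols here are notation introduced for this sketch, not taken from the notebook: $B_1$ and $B_2$ are the hypotheses that you chose Bowl #1 or Bowl #2, and $V$ is the event that the cookie is vanilla. It assumes each bowl is equally likely to be chosen, so both priors are $1/2$.

$$
P(B_1 \mid V)
= \frac{P(B_1)\,P(V \mid B_1)}{P(B_1)\,P(V \mid B_1) + P(B_2)\,P(V \mid B_2)}
= \frac{\frac{1}{2} \cdot \frac{3}{4}}{\frac{1}{2} \cdot \frac{3}{4} + \frac{1}{2} \cdot \frac{1}{2}}
= \frac{3/8}{5/8}
= \frac{3}{5}
$$

The numerator is what the table method calls the unnormalized posterior for Bowl #1, and the denominator is the normalizing constant.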

### The BayesTable class

Here's the class that represents a Bayesian table.

In [34]:
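class BayesTable(pd.DataFrame):
    """A DataFrame with one row per hypothesis and one column per update step."""

    def __init__(self, hypo, prior=1, sym=False):
        columns = ['hypo', 'prior', 'likelihood', 'unnorm', 'posterior']
        super().__init__(columns=columns)
        # Accept a whitespace-separated string of hypothesis names
        if isinstance(hypo, str):
            hypo = hypo.split()
        self.hypo = hypo
        # Optionally convert the prior to an exact SymPy value
        if sym:
            prior = nsimplify(prior)
        self.prior = prior
        self.sym = sym

    def mult(self):
        """Multiply priors by likelihoods to get unnormalized posteriors."""
        self.unnorm = self.prior * self.likelihood

    def norm(self):
        """Normalize the posteriors and return the normalizing constant."""
        nc = np.sum(self.unnorm)
        self.posterior = self.unnorm / nc
        return nc

    def update(self):
        """Run mult and norm; return the normalizing constant."""
        self.mult()
        return self.norm()

    def reset(self):
        """Return a new table that uses the current posteriors as priors."""
        return BayesTable(self.hypo, self.posterior)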
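
To make the column steps concrete, here is a minimal sketch that applies the same multiply-then-normalize procedure to the cookie problem with a plain DataFrame instead of the subclass. The helper name `table_update` is hypothetical, and the equal priors of 0.5 are an assumption based on choosing a bowl at random.

```python
import pandas as pd

def table_update(hypos, priors, likelihoods):
    """Hypothetical helper: build a Bayes table and run one update."""
    table = pd.DataFrame({
        'hypo': hypos,
        'prior': priors,
        'likelihood': likelihoods,
    })
    # Unnormalized posterior: prior times likelihood, row by row
    table['unnorm'] = table['prior'] * table['likelihood']
    # Normalizing constant: total probability of the data
    total = table['unnorm'].sum()
    table['posterior'] = table['unnorm'] / total
    return table, total

# Likelihood of a vanilla cookie: 30 of 40 in Bowl 1, 20 of 40 in Bowl 2
cookie_table, total = table_update(
    ['Bowl 1', 'Bowl 2'], [0.5, 0.5], [30 / 40, 20 / 40]
)
print(cookie_table)
print(total)
```

With these inputs the posterior for Bowl 1 works out to 0.6 and the normalizing constant to 0.625.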

Here's an instance that represents two hypotheses. In the cookie problem these would be Bowl 1 and Bowl 2; the cells below label the hypotheses `Avery=Mon` and `Blake=Mon`:

In [35]:
nsimplify(1/2)

1/2

In [36]:
table = BayesTable('Avery=Mon Blake=Mon', sym=False)

|   | hypo | prior | likelihood | unnorm | posterior |
|---|------|-------|------------|--------|-----------|
| 0 | Avery=Mon | 1 |  |  |  |
| 1 | Blake=Mon | 1 |  |  |  |


Updated based on Monday:

In [37]:
table.likelihood = [S(40)/100, S(70)/100]
table.update()
table

|   | hypo | prior | likelihood | unnorm | posterior |
|---|------|-------|------------|--------|-----------|
| 0 | Avery=Mon | 1 | 2/5 | 2/5 | 4/11 |
| 1 | Blake=Mon | 1 | 7/10 | 7/10 | 7/11 |


Updated based on Tuesday:

In [38]:
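# Note: reset() returns a new table; the result is not assigned here,
# so `table` keeps its priors of 1, as the output shows
table.reset()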
table.likelihood = [S(30)/100, S(60)/100]
table.update()
table

|   | hypo | prior | likelihood | unnorm | posterior |
|---|------|-------|------------|--------|-----------|
| 0 | Avery=Mon | 1 | 3/10 | 3/10 | 1/3 |
| 1 | Blake=Mon | 1 | 3/5 | 3/5 | 2/3 |
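
In the Tuesday table the priors are still 1, so that update is computed independently of Monday's result. If the intent is instead to carry Monday's posteriors forward as Tuesday's priors (an assumption; the notebook does not say so), the chained arithmetic can be sketched with exact fractions from the standard library. The helper name `update` is hypothetical and stands in for the `mult` and `norm` steps.

```python
from fractions import Fraction

def update(priors, likelihoods):
    """Hypothetical helper: return (posteriors, normalizing constant)."""
    unnorm = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnorm)
    return [u / total for u in unnorm], total

# Monday: equal unnormalized priors of 1, likelihoods from the Monday cell
monday, nc_monday = update(
    [Fraction(1), Fraction(1)],
    [Fraction(40, 100), Fraction(70, 100)],
)

# Tuesday: Monday's posteriors become the priors (the assumed intent)
tuesday, nc_tuesday = update(
    monday,
    [Fraction(30, 100), Fraction(60, 100)],
)

print(monday)   # [Fraction(4, 11), Fraction(7, 11)]
print(tuesday)  # [Fraction(2, 9), Fraction(7, 9)]
```

The Monday values match the posteriors in the Monday table; the chained Tuesday values (2/9 and 7/9) differ from the 1/3 and 2/3 shown above because they start from Monday's posteriors rather than from priors of 1.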


In [40]:
nsimplify(0.3)

3/10