# Think Bayes

This notebook presents example code and exercise solutions for Think Bayes.

Copyright 2018 Allen B. Downey

MIT License: https://opensource.org/licenses/MIT

In [1]:
# Configure Jupyter so figures appear in the notebook
%matplotlib inline

# Configure Jupyter to display the assigned value after an assignment
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'

# import classes from thinkbayes2
from thinkbayes2 import Hist, Pmf, Suite

## The cookie problem

Here's the original statement of the cookie problem:

> Suppose there are two bowls of cookies. Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies. Bowl 2 contains 20 of each.

> Now suppose you choose one of the bowls at random and, without looking, select a cookie at random. The cookie is vanilla. What is the probability that it came from Bowl 1?
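
For a single draw, the answer follows directly from Bayes's theorem. The sketch below works it out with the Python standard library instead of `thinkbayes2`; the names `priors` and `likelihoods` are illustrative and do not come from the book.

```python
from fractions import Fraction

# Prior: each bowl is equally likely to be chosen.
priors = {'Bowl 1': Fraction(1, 2), 'Bowl 2': Fraction(1, 2)}

# Likelihood of drawing a vanilla cookie from each bowl.
likelihoods = {'Bowl 1': Fraction(30, 40), 'Bowl 2': Fraction(20, 40)}

# Unnormalized posterior = prior * likelihood, then normalize.
unnorm = {bowl: priors[bowl] * likelihoods[bowl] for bowl in priors}
total = sum(unnorm.values())
posterior = {bowl: p / total for bowl, p in unnorm.items()}

print(posterior['Bowl 1'])  # 3/5
```

The result, 3/5, is the 0.6 that the `Cookie` class reports after the first vanilla update.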

If we only draw one cookie, this problem is simple, but if we draw more than one cookie, there is a complication: do we replace the cookie after each draw, or not?

If we replace the cookie, the proportion of vanilla and chocolate cookies stays the same, and we can perform multiple updates with the same likelihood function.

If we *don't* replace the cookie, the proportions change and we have to keep track of the number of cookies in each bowl.
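
To make the difference concrete, here is a minimal sketch that compares the two cases for two vanilla draws in a row. It uses plain dictionaries rather than `thinkbayes2`, the helper name `posterior` and the constant `BOWLS` are made up for this illustration, and it assumes the sequence of draws is feasible (no bowl is emptied).

```python
from fractions import Fraction

# Initial composition of each bowl (from the problem statement).
BOWLS = {
    'Bowl 1': {'vanilla': 30, 'chocolate': 10},
    'Bowl 2': {'vanilla': 20, 'chocolate': 20},
}

def posterior(draws, replace):
    """Return the posterior over bowls after a sequence of draws."""
    unnorm = {}
    for bowl, initial in BOWLS.items():
        counts = dict(initial)   # private copy for this hypothesis
        prob = Fraction(1, 2)    # uniform prior over the two bowls
        for flavor in draws:
            total = sum(counts.values())
            prob *= Fraction(counts[flavor], total)
            if not replace and counts[flavor] > 0:
                counts[flavor] -= 1   # the cookie leaves the bowl
        unnorm[bowl] = prob
    norm = sum(unnorm.values())
    return {bowl: p / norm for bowl, p in unnorm.items()}

draws = ['vanilla', 'vanilla']
with_repl = posterior(draws, replace=True)
without_repl = posterior(draws, replace=False)

print(with_repl['Bowl 1'])     # 9/13
print(without_repl['Bowl 1'])  # 87/125
```

The two answers differ only slightly here (about 0.692 versus 0.696), but the gap grows as the bowls are depleted.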

**Exercise:**

Modify the solution from the book to handle selection without replacement.

Hint: Add instance variables to the `Cookie` class to represent the hypothetical state of the bowls, and modify the `Likelihood` function accordingly.

To represent the state of a Bowl, you might want to use the `Hist` class from `thinkbayes2`.
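
If `thinkbayes2` is not at hand, the same bookkeeping can be sketched with `collections.Counter` from the standard library. This is a stand-in for illustration only, not the `Hist` API itself.

```python
from collections import Counter

# State of Bowl 1 before any draw.
bowl = Counter(vanilla=30, chocolate=10)

# Remove one vanilla cookie in place.
bowl.subtract({'vanilla': 1})

print(bowl['vanilla'], bowl['chocolate'])  # 29 10
```

Like the solution below, `Counter.subtract` does not stop at zero, so counts can go negative if an impossible draw is recorded.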

**Solution:**

Replace the bowls in `mixes` with `Hist` objects and use `Hist.Subtract()` to update the composition after each draw.

In [2]:
### copied from code/chap02.ipynb and modified
class Cookie(Pmf):
    """A map from string bowl ID to probability."""
    
    def __init__(self, hypos):
        """Initialize self.

        hypos: sequence of string bowl IDs
        """
        Pmf.__init__(self)
        for hypo in hypos:
            self.Set(hypo, 1)
        self.Normalize()

    def Update(self, data):
        """Updates the PMF with new data.

        data: string cookie type
        """
        for hypo in self.Values():
            self[hypo] *= self.Likelihood(data, hypo)
        self.Normalize()

    
    mixes = {
        'Bowl 1': Hist(dict(vanilla=30, chocolate=10)),
        'Bowl 2': Hist(dict(vanilla=20, chocolate=20))
    }

    def Likelihood(self, data, hypo):
        """The likelihood of the data under the hypothesis.

        data: string cookie type
        hypo: string bowl ID
        """
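        mix = self.mixes[hypo]
        # Use the raw count as the likelihood: both bowls always hold the
        # same total number of cookies, so the denominator cancels in Normalize().
        like = mix[data]

        # Remove the drawn cookie from this hypothesis's bowl.
        # Note: `mixes` is a class attribute, so this state is shared by all instances.
        mix.Subtract(Hist({data: 1}))

        return like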

In [3]:
pmf = Cookie(['Bowl 1', 'Bowl 2'])
print(pmf.mixes)

{'Bowl 1': Hist({'vanilla': 30, 'chocolate': 10}), 'Bowl 2': Hist({'vanilla': 20, 'chocolate': 20})}


In [4]:
pmf.Update('vanilla')
pmf.Print()
print(pmf.mixes)

Bowl 1 0.6
Bowl 2 0.4
{'Bowl 1': Hist({'vanilla': 29, 'chocolate': 10}), 'Bowl 2': Hist({'vanilla': 19, 'chocolate': 20})}


In [5]:
pmf.Update('chocolate')
pmf.Print()
print(pmf.mixes)

Bowl 1 0.42857142857142855
Bowl 2 0.5714285714285714
{'Bowl 1': Hist({'vanilla': 29, 'chocolate': 9}), 'Bowl 2': Hist({'vanilla': 19, 'chocolate': 19})}


In [6]:
for _ in range(10):
    pmf.Update('chocolate')
    pmf.Print()
    print(pmf.mixes['Bowl 1'])

Bowl 1 0.2621359223300971
Bowl 2 0.7378640776699029
Hist({'vanilla': 29, 'chocolate': 8})
Bowl 1 0.13636363636363635
Bowl 2 0.8636363636363636
Hist({'vanilla': 29, 'chocolate': 7})
Bowl 1 0.061046511627906974
Bowl 2 0.938953488372093
Hist({'vanilla': 29, 'chocolate': 6})
Bowl 1 0.023800528900642236
Bowl 2 0.9761994710993578
Hist({'vanilla': 29, 'chocolate': 5})
Bowl 1 0.008061420345489444
Bowl 2 0.9919385796545106
Hist({'vanilla': 29, 'chocolate': 4})
Bowl 1 0.0023166023166023165
Bowl 2 0.9976833976833976
Hist({'vanilla': 29, 'chocolate': 3})
Bowl 1 0.0005355548943766736
Bowl 2 0.9994644451056233
Hist({'vanilla': 29, 'chocolate': 2})
Bowl 1 8.929900282780175e-05
Bowl 2 0.9999107009971722
Hist({'vanilla': 29, 'chocolate': 1})
Bowl 1 8.118750253710948e-06
Bowl 2 0.9999918812497465
Hist({'vanilla': 29, 'chocolate': 0})
Bowl 1 0.0
Bowl 2 1.0
Hist({'vanilla': 29, 'chocolate': -1})


**Comments:**  
This is weird: the composition of both bowls is updated in the same way, i.e. if `data` is "chocolate", a chocolate cookie is removed from *both* bowls. I think it should be done as follows:

* we start with B1 = {30V, 10C} and B2 = {20V, 20C}
* we get a vanilla cookie

At this point, under the hypothesis that we are drawing from B1, we should have B1 = {29V, 10C} **and** B2 = {20V, 20C}, while under the hypothesis that we are drawing from B2, we should have B1 = {30V, 10C} **and** B2 = {19V, 20C}.

**Answer (after some thinking):** That is not necessary, because the bowl we draw from is always the same. Each hypothesis only needs the state of its own bowl, so removing the cookie from both `Hist`s is just the bookkeeping for both hypotheses at once. The treatment outlined above would be needed if a bowl were selected at random before each draw. In that case the compositions would still change, but the priors would "reset" each time.
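
As a sanity check on this reasoning, the sketch below keeps one private copy of each bowl per hypothesis and replays the first three draws from the cells above (vanilla, chocolate, chocolate). It uses only the standard library; the names `state`, `belief`, and `history` are illustrative.

```python
from fractions import Fraction

# Hypothesis "Bowl 1" only ever needs the state of Bowl 1, and likewise
# for "Bowl 2", because the bowl we draw from never changes.
state = {
    'Bowl 1': {'vanilla': 30, 'chocolate': 10},
    'Bowl 2': {'vanilla': 20, 'chocolate': 20},
}
belief = {'Bowl 1': Fraction(1, 2), 'Bowl 2': Fraction(1, 2)}

history = []  # posterior probability of Bowl 1 after each draw
for flavor in ['vanilla', 'chocolate', 'chocolate']:
    for bowl, counts in state.items():
        # Likelihood of this draw under the hypothesis, then remove the cookie.
        belief[bowl] *= Fraction(counts[flavor], sum(counts.values()))
        counts[flavor] -= 1
    norm = sum(belief.values())
    belief = {bowl: p / norm for bowl, p in belief.items()}
    history.append(belief['Bowl 1'])

print([float(p) for p in history])
```

The exact values are 3/5, 3/7, and 27/103, which agree with the 0.6, 0.4286, and 0.2621 printed by `pmf.Print()` above.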