In *Think Bayes*, Downey introduces a framework for Bayesian statistics. Basically, it consists of

* your hypotheses (called the *suite* in the book),
* the probability that each hypothesis is true (the *pmf*),
* methods for updating those probabilities.

The approach in the book is nice because it provides a consistent structure that lets you focus on each part of the modeling (semi-)separately, while still allowing each component to be represented in a way that is natural for the problem. That said, I won't replicate the whole framework. Rather, I'll re-implement each piece ad hoc for each problem.

Just as a reminder, Bayes' theorem says $P\left(A | B\right) = \frac{P\left(B | A\right) P\left(A\right)}{P\left(B\right)}$.
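
For a suite of hypotheses, it can help to spell out the denominator. The following is a sketch in my own notation (the symbols $H_i$ for the hypotheses and $D$ for the observed data are not from the book), and it assumes the hypotheses are mutually exclusive and collectively exhaustive, so that the law of total probability applies:

$$P\left(H_i | D\right) = \frac{P\left(D | H_i\right) P\left(H_i\right)}{\sum_j P\left(D | H_j\right) P\left(H_j\right)}$$

Read this way, an update multiplies each prior $P\left(H_i\right)$ by its likelihood $P\left(D | H_i\right)$ and then rescales the results so that they sum to one.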

## The M&M problem

This problem appears on pages 6-7 and 16-17 of *Think Bayes*.

In [147]:
mnm.colors  <- c("brown", "yellow", "red", "green", "orange", "tan", "blue")
mix.94 <- c(30, 20, 20, 10, 10, 10, 0)
mix.96 <- c(13, 14, 13, 20, 16, 0, 24)

## a hypothesis is defined as a data frame of possibilities
## the rows are the colors, and each column gives the proportion of each color in that bag
df.hypo1  <- data.frame(row.names = mnm.colors, 
                        bag.1 = mix.94, 
                        bag.2 = mix.96)

df.hypo2  <- data.frame(row.names = mnm.colors, 
                        bag.1 = mix.96, 
                        bag.2 = mix.94)

## take a hypothesis data frame and normalize each column
NormalizeHypothesis  <- function(df.hypo, columns = NULL) {
    if (is.null(columns)) {
        columns  <- colnames(df.hypo)
    }
    for (column in columns) {
        df.hypo[, column]  <- df.hypo[, column] / sum(df.hypo[, column])
    }
    return(df.hypo)
}

df.hypo1  <- NormalizeHypothesis(df.hypo1)
df.hypo2  <- NormalizeHypothesis(df.hypo2)

## take an ordered list of (color, bag) and return its likelihood, given the hypothesis
LikelihoodMnM <- function(data, df.hypothesis) {
    like  <- df.hypothesis[data[[1]], data[[2]]]
    return(like)
}

Our suite is a list of hypotheses, accompanied by a vector holding our belief in each of those hypotheses. All we need in addition is a function that updates those beliefs based on observations. Each observation is an ordered pair giving the color and the bag it was drawn from.
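
To make that structure concrete, here is a minimal, self-contained sketch. The names `toy.hypo`, `toy.suite`, `toy.obs`, and `toy.like` are made up for illustration and are not part of the notebook; the proportions are the normalized yellow and green values for the first hypothesis.

```r
## Illustrative sketch only: all toy.* names are hypothetical.
## One hypothesis: rows are colors, columns are bags.
toy.hypo <- data.frame(row.names = c("yellow", "green"),
                       bag.1 = c(0.2, 0.1),
                       bag.2 = c(0.14, 0.2))

## A suite pairs the list of hypotheses with a vector of beliefs.
toy.suite <- list(hypos = list(toy.hypo),
                  probs = c(1))

## An observation is an ordered pair: (color, bag).
toy.obs <- list("yellow", "bag.1")

## The likelihood is a lookup by row name (color) and column name (bag).
toy.like <- toy.suite$hypos[[1]][toy.obs[[1]], toy.obs[[2]]]
toy.like
```

Because the data frame is indexed by row and column names, an observation can be passed around as plain strings rather than numeric positions.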

In [148]:
## normalize the belief we have in the hypotheses
NormalizeSuite  <- function(suite) {
    suite$probs  <- suite$probs / sum(suite$probs)
    return(suite)
}

UpdateMnM <- function(data, suite) {
    ## seq_along() is safe even if the list of hypotheses is empty
    for (ii in seq_along(suite$hypos)) {
        like  <- LikelihoodMnM(data, suite$hypos[[ii]])
        suite$probs[[ii]] <- like * suite$probs[[ii]]
    }
    suite  <- NormalizeSuite(suite)
    return(suite)
}

In [149]:
suite  <- list(hypos = list(df.hypo1, df.hypo2),
               probs = c(1, 1))

suite <- NormalizeSuite(suite)
suite

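| color | bag.1 | bag.2 |
|---|---|---|
| brown | 0.3 | 0.13 |
| yellow | 0.2 | 0.14 |
| red | 0.2 | 0.13 |
| green | 0.1 | 0.2 |
| orange | 0.1 | 0.16 |
| tan | 0.1 | 0.0 |
| blue | 0.0 | 0.24 |

| color | bag.1 | bag.2 |
|---|---|---|
| brown | 0.13 | 0.3 |
| yellow | 0.14 | 0.2 |
| red | 0.13 | 0.2 |
| green | 0.2 | 0.1 |
| orange | 0.16 | 0.1 |
| tan | 0.0 | 0.1 |
| blue | 0.24 | 0.0 |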


In [150]:
obs  <- list(list('yellow', 'bag.1'),
            list('green', 'bag.2'))

In [151]:
for (ii in seq_along(obs)) {
    suite <- UpdateMnM(obs[[ii]], suite)
}

In [152]:
suite$probs
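
As a sanity check, the same posterior can be computed by hand without the suite machinery. The sketch below is self-contained and assumes the two observations used above (a yellow M&M from bag 1 and a green one from bag 2) and equal priors; the names `prior`, `like`, and `posterior` are mine.

```r
## Hand computation of the posterior, independent of the suite code.
## Each mix sums to 100, so dividing by 100 gives proportions.
mix.94 <- c(brown = 30, yellow = 20, red = 20, green = 10,
            orange = 10, tan = 10, blue = 0) / 100
mix.96 <- c(brown = 13, yellow = 14, red = 13, green = 20,
            orange = 16, tan = 0, blue = 24) / 100

## Equal prior belief in the two hypotheses.
prior <- c(0.5, 0.5)

## Hypothesis 1: bag 1 is the 1994 mix, bag 2 is the 1996 mix.
## Hypothesis 2: the other way around.
like <- c(mix.94[["yellow"]] * mix.96[["green"]],
          mix.96[["yellow"]] * mix.94[["green"]])

## Multiply prior by likelihood, then normalize.
posterior <- prior * like / sum(prior * like)
posterior
```

The unnormalized values are 0.5 × 0.04 and 0.5 × 0.014, so the posterior works out to 20/27 (about 0.741) for the first hypothesis and 7/27 (about 0.259) for the second. The vector in `suite$probs` should agree with this.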