## M&M MODEL

Let's load libraries and observations.

In [None]:
import numpy as np
import pandas as pd

import pymc as pm
import arviz as az
#az.style.use("arviz-doc")

import altair as alt
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format='retina'

# loading the data
df = pd.read_csv('./data/Counting Candies_ M&M with Bayes.csv.zip').drop('Timestamp', axis=1)

print(df)

# expand the counts into a 0/1 (Bernoulli) sample: 1 = blue, 0 = any other color
blue = df.Blue.sum()
total = df.sum().sum()
sample = np.concatenate((np.repeat(1, blue), np.repeat(0, total - blue)))

sample

#### Prior

Let's simulate our prior belief, assuming that the color distribution is uniform, and model the proportion of blue candies. For this we draw 1000 samples from the binomial distribution, each the same size as our observed data, and calculate the proportion of blue candies in each simulated sample.

In [None]:

prior = np.random.binomial(total, 1/6, 1000) / total

data = pd.DataFrame(prior, columns=['p'])

data.plot.kde()
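As a sanity check on the simulation above, the spread of the simulated proportions should match the binomial standard error $\sqrt{p(1-p)/n}$. The sketch below is illustrative only: `n_total = 600` is a hypothetical sample size (not the real count from the data), and the seeded generator is an assumption made for reproducibility.

```python
import numpy as np

# Hypothetical sample size, for illustration only; the real value is `total` above.
n_total = 600
p_uniform = 1 / 6  # uniform color distribution, as assumed in the text

rng = np.random.default_rng(42)
sim = rng.binomial(n_total, p_uniform, size=1000) / n_total

# Theoretical standard error of a binomial proportion
se_theory = np.sqrt(p_uniform * (1 - p_uniform) / n_total)

print(f"simulated mean = {sim.mean():.4f}, expected = {p_uniform:.4f}")
print(f"simulated sd   = {sim.std(ddof=1):.4f}, theory   = {se_theory:.4f}")
```

If the two standard deviations disagree noticeably, the simulation and the assumed sample size are probably out of sync.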


Next we construct a model that treats each observed candy as a draw from a Bernoulli distribution with an unknown probability p of being blue. We will set a *flat prior* (pm.Uniform) on p, allowing it to vary from 0 to 1. By doing this we express indifference towards the actual color distribution -- we deem it equally likely that none of the candies are blue, that all of them are blue, or that every sixth candy is blue.

pm.Uniform is an uninformative prior that gives the model maximum flexibility. As a general rule we can actually do better and choose more informative priors, based on our existing knowledge -- and indeed, as we shall see, often there are many different priors that can be chosen for a particular problem. However, since we are dealing with a very simple problem these considerations do not play a major part.

In [None]:

with pm.Model() as mm_model:
    p = pm.Uniform('p', lower=0, upper=1)
    y = pm.Bernoulli('y', p=p, observed=sample)
    
pm.model_to_graphviz(mm_model)
    

Next, it is generally a good idea to run a prior predictive check to see what our model does *before* it has seen any data.

In [None]:
with mm_model:
    pp = pm.sample_prior_predictive()

print(pp.prior.p)

alt.Chart(pd.DataFrame(pp.prior.p[0], columns=['p'])).mark_tick().encode(
    x='p'
).properties(width=500)
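To see what the flat prior implies on the scale of the data, one can also push prior draws of `p` through the likelihood by hand. Under a Uniform(0, 1) prior, the number of blue candies among `n` draws is equally likely to be any value from 0 to `n`. A minimal NumPy sketch, where `n = 10` and the number of simulations are hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10              # hypothetical number of candies per simulated bag
n_sims = 100_000

p_draws = rng.uniform(0, 1, size=n_sims)   # draws from the flat prior
blue_counts = rng.binomial(n, p_draws)     # one simulated bag per prior draw

# Relative frequency of each possible count 0..n
freq = np.bincount(blue_counts, minlength=n + 1) / n_sims
print(np.round(freq, 3))   # each entry should be close to 1 / (n + 1)
```

This is the sense in which the flat prior is indifferent: before seeing data, no count of blue candies is favored over any other.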

OK, everything is as expected, so we're ready to push the button and run the model with the data.

In [None]:
with mm_model:
    trace = pm.sample()

az.plot_trace(trace)

Next we check the model run diagnostics (a lot more about that later) and plot the posterior distribution against the reference value of 1/6.

In [None]:
print(az.summary(trace))

az.plot_posterior(trace, ref_val=(1/6))
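Because a Uniform(0, 1) prior is the same as Beta(1, 1), and the Beta family is conjugate to the Bernoulli likelihood, the posterior for this model has a closed form: `Beta(1 + blue, 1 + total - blue)`. That gives an independent check on the sampler's summary. The counts below are hypothetical placeholders, not the values from the data:

```python
import numpy as np

# Hypothetical counts for illustration; substitute `blue` and `total` from above.
n_blue, n_total = 120, 600

a_post = 1 + n_blue              # Beta(1, 1) prior plus observed blue candies
b_post = 1 + n_total - n_blue    # ... plus observed non-blue candies

post_mean = a_post / (a_post + b_post)
post_sd = np.sqrt(a_post * b_post / ((a_post + b_post) ** 2 * (a_post + b_post + 1)))

# Monte Carlo draws from the exact posterior, comparable to the MCMC trace
rng = np.random.default_rng(1)
draws = rng.beta(a_post, b_post, size=4000)
lo, hi = np.percentile(draws, [3, 97])   # 94% equal-tailed interval

print(f"mean = {post_mean:.4f}, sd = {post_sd:.4f}, 94% interval = ({lo:.3f}, {hi:.3f})")
```

ArviZ reports a 94% HDI by default, so the equal-tailed interval computed here should be close to, but not identical with, the interval in the summary table.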

Let's compare the simulated prior belief (the uniform-color assumption from the Prior section) with the posterior.

In [None]:

prior_plot = alt.Chart(data).transform_density(
    'p',
    as_=['p', 'density'],
    #bandwidth = 0.05
).mark_area(opacity=0.5).encode(
    x="p:Q",
    y='density:Q',
)


post_plot = alt.Chart(pd.DataFrame(trace.posterior.p[0], columns=['p']), width=400).transform_density(
    'p',
    as_=['p', 'density'],
    #bandwidth = 0.05
).mark_area(opacity=0.5, color='pink').encode(
    x="p:Q",
    y='density:Q',
)

post_plot + prior_plot

In [None]:
import scipy.stats as stats

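def posterior_grid(grid=10, a=1, b=1, blue=5, trials=20):
    """Grid approximation of the posterior for p(blue) under a Beta(a, b) prior."""
    grid_vals = np.linspace(0, 1, grid)                    # candidate values of p
    prior = stats.beta(a, b).pdf(grid_vals)                # prior density at each grid point
    likelihood = stats.binom.pmf(blue, trials, grid_vals)  # P(blue | trials, p)
    posterior = likelihood * prior                         # unnormalized posterior
    posterior /= posterior.sum()                           # normalize to sum to 1 over the grid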
    
    data = pd.DataFrame({
        "Probability": np.tile(grid_vals, 3),
        "Density": np.concatenate([prior, likelihood, posterior]),
        "Type": np.repeat(["Prior", "Likelihood", "Posterior"], grid)
    })
    
    chart = alt.Chart(data).mark_line(point=True).encode(
        x=alt.X("Probability", title="p(Blue)"),
        y=alt.Y("Density", title="Density"),
        color=alt.Color("Type").legend(None),
        facet=alt.Facet("Type:N", columns=3, sort=["Prior", "Likelihood", "Posterior"]).title(None),
        tooltip=[
            alt.Tooltip('Probability', format='.2%'), 
            alt.Tooltip('Density', format='.2f')]
    ).properties(
        width=250,
        height=200,
        title=f"Blue = {blue}, Trials = {trials}"
    ).resolve_scale(y="independent")
    
    return chart

posterior_grid(grid=10, blue=1, trials=5)
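The grid approximation used in `posterior_grid` gets more accurate as the grid gets finer. The sketch below is not part of the original function; it recomputes the posterior mean for the same example (1 blue out of 5, flat prior) at several grid sizes and compares it with the exact mean of the conjugate `Beta(1 + 1, 1 + 4)` posterior. The helper name `grid_posterior_mean` is hypothetical.

```python
import numpy as np

def grid_posterior_mean(n_grid, n_blue=1, n_trials=5):
    """Posterior mean of p under a flat prior, via grid approximation."""
    p = np.linspace(0, 1, n_grid)
    # Binomial likelihood up to a constant; the constant cancels on normalizing
    unnorm = p**n_blue * (1 - p) ** (n_trials - n_blue)
    post = unnorm / unnorm.sum()
    return float((p * post).sum())

exact = (1 + 1) / (2 + 5)   # mean of Beta(1 + 1, 1 + 4)
for n_grid in (10, 100, 1000):
    print(n_grid, round(grid_posterior_mean(n_grid), 4), "exact:", round(exact, 4))
```

A coarse grid of 10 points is fine for building intuition, but a finer grid is needed when the numbers themselves matter.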

In [None]:

import scipy.stats as stats

dist = stats.beta
n_trials = [0, 1, 2, 3, 4, 5, 20, 50, 500]

# simulate candy draws, using the posterior mean of p as the "true" proportion of blue
p_true = float(trace.posterior.p.mean())
draws = stats.bernoulli.rvs(p_true, size=n_trials[-1])
x = np.linspace(0, 1, 100)

plt.figure(figsize=(11, 7))

for k, N in enumerate(n_trials):
    # 9 panels on a 3 x 3 grid
    sx = plt.subplot(len(n_trials) // 3, 3, k + 1)
    if k in [0, len(n_trials) - 1]:
        plt.xlabel("$p$, probability of blue candy")
    plt.setp(sx.get_yticklabels(), visible=False)
    n_blue = draws[:N].sum()
    # flat Beta(1, 1) prior updated with n_blue blue and N - n_blue non-blue candies
    y = dist.pdf(x, 1 + n_blue, 1 + N - n_blue)
    plt.plot(x, y, label="%d candies taken,\n %d of them blue" % (N, n_blue))
    plt.fill_between(x, 0, y, color="#348ABD", alpha=0.4)
    # dashed line marks the proportion used to simulate the draws
    plt.vlines(p_true, 0, 4, color="k", linestyles="--", lw=1)

    leg = plt.legend()
    leg.get_frame().set_alpha(0.4)
    plt.autoscale(tight=True)


plt.suptitle("Bayesian updating of posterior probability",
             y=1.02,
             fontsize=14)

plt.tight_layout()
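The panels above update a Beta(1, 1) prior with progressively more candies. One property worth making explicit is that updating one candy at a time gives exactly the same posterior as updating once with all the candies. A small sketch under stated assumptions: the 0/1 draws are simulated with a hypothetical blue proportion of 1/6, not taken from the data.

```python
import numpy as np

rng = np.random.default_rng(7)
draws = rng.binomial(1, 1 / 6, size=50)   # hypothetical 0/1 candy draws, 1 = blue

# Sequential: start from Beta(1, 1) and update after every candy
a, b = 1, 1
for is_blue in draws:
    a += int(is_blue)
    b += 1 - int(is_blue)

# Batch: update once with the total counts
a_batch = 1 + int(draws.sum())
b_batch = 1 + int(len(draws) - draws.sum())

print((a, b), (a_batch, b_batch))   # the two pairs are identical
```

This is why yesterday's posterior can serve as today's prior: the order and grouping of the observations do not change the result.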