# Code from chapter 2 of "Statistical Rethinking," 2nd edition.

In [None]:
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm
import scipy.stats as stats
import seaborn as sns

## Computing plausibilities (R code 2.1)

In [None]:
ways = np.array([0, 3, 8, 9, 0])
ways / sum(ways)

The plausibilities above are probabilities. They total to 1.

In [None]:
sum(ways / sum(ways))  # recompute instead of relying on the IPython Out[] cache

The design loop for simple Bayesian models has three steps.

1. Data story: Motivate the model by narrating **how** the data
   might arise.
2. Update: Educate the model by feeding it the (observed) data.
3. Evaluate: All statistical models require supervision,
   leading to model revision.

In the globe case, the data story is simply a restatement of the
sampling process.

1. The true proportion of water covering the globe is _p_.
2. A single toss of the globe has a probability, _p_, of 
   producing a water (_W_) observation. It has a probability,
   _1 - p_, of producing a land (_L_) observation.
3. Each toss of the globe is independent of the others.
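The data story above can be sketched as a small simulation. This is a minimal illustration, not code from the book: the function name `simulate_tosses`, the value `p=0.7`, and the seed are assumptions chosen only for the example.

```python
import random


def simulate_tosses(p, n_tosses, seed=None):
    """Simulate independent globe tosses.

    Each toss yields 'W' with probability p and 'L' with probability 1 - p.
    """
    rng = random.Random(seed)  # a private generator keeps the example reproducible
    return ['W' if rng.random() < p else 'L' for _ in range(n_tosses)]


# p = 0.7 is a hypothetical "true" proportion of water, used only for illustration
tosses = simulate_tosses(p=0.7, n_tosses=9, seed=0)
print(tosses)
print(tosses.count('W') / len(tosses))  # observed proportion of water
```

Because each toss is drawn independently with the same _p_, the observed proportion of "W" tends toward _p_ as the number of tosses grows.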

## Updating plausibilities in light of data (Figure 2.5)

Notice that each time a "W" is observed (a "success"), the
plausibility peak moves to the right, and each time an "L" is
observed (a "failure"), the plausibility peak moves to the left.

In [None]:
data = ['W', 'L', 'W', 'W', 'W', 'L', 'W', 'L', 'W']

In [None]:
fig, axes = plt.subplots(3, 3, figsize=(25.6, 14.4))
# Begin with an "indifferent" (flat) Beta(1, 1) prior
prior_alpha, prior_beta = 1, 1
# axes.flat walks the grid row by row, matching the order of the observations
for ax, observation in zip(axes.flat, data):
    # Dashed line: plausibility before this observation
    xs = np.linspace(stats.beta.ppf(0.01, prior_alpha, prior_beta),
                     stats.beta.ppf(0.99, prior_alpha, prior_beta))
    ax.plot(xs, stats.beta.pdf(xs, prior_alpha, prior_beta), '--', label='prior')
    # Update: a "W" increments alpha, an "L" increments beta
    if observation == 'W':
        posterior_alpha, posterior_beta = prior_alpha + 1, prior_beta
    else:
        posterior_alpha, posterior_beta = prior_alpha, prior_beta + 1
    # Solid line: plausibility after this observation
    xs = np.linspace(stats.beta.ppf(0.01, posterior_alpha, posterior_beta),
                     stats.beta.ppf(0.99, posterior_alpha, posterior_beta))
    ax.plot(xs, stats.beta.pdf(xs, posterior_alpha, posterior_beta), '-', label='posterior')
    ax.set_title(f'Observed {observation}')
    ax.legend()
    # The posterior becomes the prior for the next panel
    prior_alpha, prior_beta = posterior_alpha, posterior_beta

I could make other fixes, like consistent x and y limits, but
I'm less certain of the value of this effort right now.
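The peak movement noted at the start of this section can also be checked numerically. The sketch below assumes the same flat Beta(1, 1) prior as the plotting cell and tracks the location of the posterior peak after each observation. The names `a`, `b`, and `modes` are illustrative.

```python
data = ['W', 'L', 'W', 'W', 'W', 'L', 'W', 'L', 'W']

a, b = 1, 1  # flat Beta(1, 1) prior (assumed, matching the plotting cell)
modes = []
for obs in data:
    if obs == 'W':
        a += 1
    else:
        b += 1
    # Peak of Beta(a, b) when a, b >= 1 and a + b > 2, which holds after any update
    modes.append((a - 1) / (a + b - 2))

for obs, mode in zip(data, modes):
    print(obs, round(mode, 3))
```

With a flat prior, the peak after each update equals the fraction of "W" observations so far, so it ends at 6/9 for this data.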

## Terminology

"Variables are just symbols that can take on different values.... In the globe
tossing model there are three variables."

The first variable, _p_, is an unobserved variable (a variable whose value
is not **directly** measured as part of the experiment). Unobserved variables
like _p_ are called **parameters**. Although _p_ is unobserved, its value
(distribution) can be inferred.

The other variables, _W_ and _L_, are **observed variables**. The modeling
process uses a model and the observed variables to infer the unobserved
parameters (of the model).

For our model, we assume that every toss is independent of every other toss
and that the probability of _W_ (and of _L_) is the same on every toss. Under
these assumptions, we can calculate the probability of observing a given
number of water (_W_) and land (_L_) observations from the probability, _p_,
of water on each toss. This calculation uses the _binomial distribution_.
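As a rough check on the calculation described above, the binomial probability can be computed directly from its formula: the number of ways to arrange the observations times the probability of any one arrangement. The helper name `binomial_pmf` is an assumption for this sketch and is not part of the book or of SciPy.

```python
from math import comb


def binomial_pmf(w, n, p):
    """Probability of exactly w water observations in n independent tosses."""
    # comb(n, w) counts the orderings; p**w * (1 - p)**(n - w) is the
    # probability of any single ordering with w waters and n - w lands
    return comb(n, w) * p**w * (1 - p)**(n - w)


print(binomial_pmf(6, 9, 0.5))  # 0.1640625
```

The probabilities for `w = 0, 1, ..., n` sum to 1, which is a quick sanity check on any hand-written version.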

## Binomial distribution (R code 2.2)

In [None]:
# Parameters of the binomial distribution
n = 9  # Total number of trials
p = 0.5  # Probability of "water" in each trial

In [None]:
stats.binom.pmf(6, n, p)

In [None]:
xs = np.arange(stats.binom.ppf(0.01, n, p), stats.binom.ppf(0.99, n, p) + 1)  # + 1 so the upper quantile itself is plotted
fig, ax = plt.subplots(1, 1)
ax.plot(xs, stats.binom.pmf(xs, n, p), 'bo', ms=8, label='Binomial PMF')
ax.vlines(xs, 0, stats.binom.pmf(xs, n, p), colors='b', lw=5, alpha=0.5)
ax.legend()
plt.show()

In [None]:
rv = stats.binom(n, p)  # Returns a "frozen" distribution (fixed parameters)
rv.pmf(6)

## Learning about discrete distributions

- Create a custom discrete distribution with a mapping between `xk` and `pk`
- Use the `pmf` method to access values in the custom distribution

In [None]:
xk = np.arange(7)
pk = [0.1, 0.2, 0.3, 0.1, 0.1, 0.0, 0.2]
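# Pass the name by keyword; the first positional parameter of rv_discrete is the lower bound `a`
custm = stats.rv_discrete(name='custm', values=(xk, pk))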
custm.pmf(xk)

Querying the distribution at a value that is not in the distribution domain
returns 0.

In [None]:
custm.pmf(1.5)

Plot all the values of the custom distribution.

In [None]:
fig, ax = plt.subplots(1, 1)
ax.plot(xk, custm.pmf(xk), 'ro', ms=12, mec='g')  # ms - marker size; mec - marker edge color
ax.vlines(xk, 0, custm.pmf(xk), colors='r', lw=4)  # lw - line width
plt.show()

## Learning about Matplotlib color cycles

Note that the default color cycle was changed in Matplotlib 2.0. See
[the style-changes page in the 3.5.1 documentation](https://matplotlib.org/3.5.1/users/prev_whats_new/dflt_style_changes.html#colors-in-default-property-cycle)
for details.

In [None]:
xs = np.arange(10)
fig, ax = plt.subplots(1, 1, figsize=(8.8, 4.8))
for i, color in zip(range(1, 10 + 1), plt.rcParams['axes.prop_cycle']):
    ys = np.repeat(i, len(xs))
    ax.plot(xs, ys, label=color['color'])  # draw on the axes created above
ax.legend()

In [None]:
plt.rcParams['axes.prop_cycle']

## Plotting the binomial (discrete) distribution for different values of p

In [None]:
fig, ax = plt.subplots(2, 5, sharey=True, figsize=(25.6, 9.6))
# Ten panels, p = 0.0, 0.1, ..., 0.9; zip stops at the shorter input
# (the default color cycle has 10 colors)
for i, color in zip(range(10), plt.rcParams['axes.prop_cycle']):
    p = i / 10
    # Start at 1: at p = 0.0 all probability mass is at x = 0 (pmf = 1),
    # which stretches the shared y-axis. Stop at n, the largest possible count.
    xs = np.arange(1, n + 1)
    ax[i // 5, i % 5].plot(xs, stats.binom.pmf(xs, n, p), color=color['color'], marker='o', ms=8)
    ax[i // 5, i % 5].vlines(xs, 0, stats.binom.pmf(xs, n, p), colors=color['color'], lw=5, alpha=0.5)
    ax[i // 5, i % 5].set_title(f'p = {p}')
plt.show()