In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.print_figure_kwargs={'facecolor' : "w"}

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np

# Posterior Predictive Distributions for Stochastic Predictions  

Suppose one is interested in parameter estimation for a theoretical model $y_\text{th}$ that depends on a set of parameters $\vec{a}$.
Upon finding the posterior $\text{pr}(\vec{a} \,|\, y_\text{exp})$, it can be useful to look at the distribution of $y_\text{th}(\vec{a})$ across the posterior distribution of $\vec{a}$.
This is given by the posterior predictive distribution (PPD)
$$
    \text{pr}(y_\text{th} \,|\, y_\text{exp}) = \int d\vec{a} \,\text{pr}(y_\text{th} \,|\, \vec{a}, y_\text{exp}) \,\text{pr}(\vec{a} \,|\, y_\text{exp})
$$

If the theory is determined exactly from the parameters $\vec{a}$, then one is able to write $\text{pr}(y_\text{th} \,|\, \vec{a}, y_\text{exp}) = \text{pr}(y_\text{th} \,|\, \vec{a})$, **but this is not always the case**.
One example of a PPD that cannot be simplified in this way is the PPD for the theory plus its discrepancy:
$$
    \text{pr}(y_\text{th} + \delta y_\text{th} \,|\, y_\text{exp})
$$
where
$$
    y_\text{exp} = y_\text{th} + \delta y_\text{th} + \delta y_\text{exp}.
$$
The discrepancy $\delta y_\text{th}$ is generally unknown, and so it is given a distribution.
A common prior is to state that the discrepancy follows a Gaussian process (GP) with some kernel $\kappa$
$$
    \delta y_\text{th}(x) \sim GP[0, \kappa(x, x')],
$$
where it is assumed that $\delta y_\text{th}$ is independent of the parameters $\vec{a}$.
Because $\delta y_\text{th}$ is **not** uniquely determined by $\vec{a}$, the experimental data **do** have an impact on $\text{pr}(y_\text{th} + \delta y_\text{th} \,|\, \vec{a}, y_\text{exp})$, which is in fact the *conditional* probability density for the Gaussian process.

We can illustrate this using an example, inspired by the model proposed in [this](#MelendezGP) reference.
For the theory (and the discrepancy), we will consider a simple geometric series whose coefficients are to be determined.
These coefficients will play the role of $\vec{a}$.
The series, which constitutes $y_\text{th}$, will be terminated at some order $k$, and the subsequent terms will be placed in $\delta y_\text{th}$, i.e., the truncation error.
Note that, although the coefficients in the geometric series are treated as constants, this case generalizes nicely to when the coefficients are themselves Gaussian processes.
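
As a rough sketch of this setup, suppose the coefficients are independent draws from $N[0, \bar{c}^2]$. The truncation error is then a zero-mean Gaussian whose covariance is a sum over the omitted orders. The helper names `feature_matrix` and `truncation_covariance` below are hypothetical and only illustrate the idea; they are not the functions in `src.stats`, which may differ.

```python
import numpy as np


def feature_matrix(x, orders, y_ref=1.0):
    """Columns are y_ref * x**n for each order n (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    return y_ref * x[:, None] ** np.asarray(orders)[None, :]


def truncation_covariance(x, xp, orders, cbar=1.0, y_ref=1.0):
    """Covariance of sum_n c_n * y_ref * x**n, assuming iid c_n ~ N(0, cbar**2)."""
    return cbar**2 * feature_matrix(x, orders, y_ref) @ feature_matrix(xp, orders, y_ref).T


x = np.linspace(0.01, 0.9, 5)
k, k_max = 3, 7  # orders 0..2 go into y_th, orders 3..6 into the truncation error
Sigma_th = truncation_covariance(x, x, orders=np.arange(k, k_max))
```

Under this assumption the kernel is $\kappa(x, x') = y_\text{ref}^2 \bar{c}^2 \sum_{n=k}^{k_\text{max}-1} (x x')^n$, so the truncation uncertainty grows with $x$.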

**NOTE:** For simplicity, no interpolation is done in this notebook. If we wanted to interpolate between the experimental data points, this feature could be easily added.


## Fit Coefficients

This is a linear model and so we can use the standard linear regression equations to find the posterior distribution
$$
    \vec{a} \,|\, y_{\text{exp}} \sim N[\mu_a, \Sigma_a] \\
    \mu_a = \Sigma_a X^T (\Sigma_\text{th} + \Sigma_\text{exp})^{-1} y_\text{exp} \\
    \Sigma_a = [X^T (\Sigma_\text{th} + \Sigma_\text{exp})^{-1} X]^{-1}
$$
These equations follow from the well-known Bayesian linear model (here with a uniform prior on $\vec{a}$).
More general relations are given [here](#MelendezGP), in Eqs. (A27)-(A28).
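
A minimal NumPy sketch of these two equations is given below. The function name `linear_posterior` is an assumption made for illustration and is not the notebook's `linear_solve`; the toy data are likewise made up.

```python
import numpy as np


def linear_posterior(X, y, cov):
    """Return (mu_a, Sigma_a) for y ~ N(X a, cov) with a uniform prior on a."""
    cov_inv_X = np.linalg.solve(cov, X)   # cov^{-1} X
    precision = X.T @ cov_inv_X           # X^T cov^{-1} X
    Sigma_a = np.linalg.inv(precision)
    mu_a = Sigma_a @ (cov_inv_X.T @ y)    # Sigma_a X^T cov^{-1} y (cov is symmetric)
    return mu_a, Sigma_a


# Toy data: a quadratic in x with uncorrelated noise of standard deviation 0.2
rng = np.random.default_rng(0)
x = np.linspace(0.01, 0.9, 50)
X = x[:, None] ** np.arange(3)
a_true = np.array([1.0, -0.5, 2.0])
y = X @ a_true + rng.normal(0, 0.2, x.size)
mu_a, Sigma_a = linear_posterior(X, y, cov=0.2**2 * np.eye(x.size))
```

With a covariance proportional to the identity, `mu_a` reduces to the ordinary least-squares solution; a correlated $\Sigma_\text{th}$ reweights the data accordingly.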

## PPD Calculation

The PPD for the theory prediction can be calculated analytically. If $y_\text{th} = X \vec{a}$, then
$$
  y_\text{th} \,|\, y_\text{exp} \sim N[X\mu_a, X \Sigma_a X^T]
$$
This follows from Eq. (A35) in [this](#MelendezGP) reference.
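
In code, this amounts to propagating the parameter posterior through the linear map $X$. The sketch below uses made-up values for $\mu_a$ and $\Sigma_a$, and the name `theory_ppd` is hypothetical.

```python
import numpy as np


def theory_ppd(X_new, mu_a, Sigma_a):
    """Mean and covariance of y_th = X_new @ a when a ~ N(mu_a, Sigma_a)."""
    mean = X_new @ mu_a
    cov = X_new @ Sigma_a @ X_new.T
    return mean, cov


# Illustrative posterior (not fitted to anything)
mu_a = np.array([1.0, -0.5, 2.0])
Sigma_a = np.diag([0.04, 0.09, 0.16])

x_new = np.linspace(0.0, 1.0, 4)
X_new = x_new[:, None] ** np.arange(3)
mean, cov = theory_ppd(X_new, mu_a, Sigma_a)
stdv = np.sqrt(np.diag(cov))  # pointwise 1-sigma band
```

At $x = 0$ only the zeroth-order coefficient contributes, so the band width there is just the standard deviation of that coefficient.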

The PPD when including truncation error is slightly trickier, but still analytic.
Now the PPD is given by
$$
    \text{pr}(y_\text{th} + \delta y_\text{th} \,|\, y_\text{exp}) = \int d\vec{a} \,\text{pr}(y_\text{th} + \delta y_\text{th} \,|\, \vec{a}, y_\text{exp}) \,\text{pr}(\vec{a} \,|\, y_\text{exp})
$$
The prior of $y_\text{th} + \delta y_\text{th}$ is itself a GP
$$
    y_\text{th} + \delta y_\text{th} \,|\, \vec{a} \sim GP[y_\text{th}(\vec{a}), \kappa(x, x')].
$$
With this understanding, it is clear that
$$
    \text{pr}(y_\text{th} + \delta y_\text{th} \,|\, \vec{a}, y_\text{exp})
$$
is the conditional PDF for the GP.
Thus, we evaluate the conditional GP equations *first*, and then integrate over $\vec{a}$.
The conditional GP equations are well known; they can be found, e.g., [here](#MelendezGP) in Eqs. (10)-(15).
As a function of $\vec{a}$, the conditional GP is
$$
    y_\text{th} + \delta y_\text{th} \,|\, \vec{a}, y_\text{exp} \sim N[Z\vec{a} + \Sigma_\text{th} (\Sigma_\text{th} + \Sigma_\text{exp})^{-1} y_\text{exp},\,\, \Sigma_\text{th} - \Sigma_\text{th} (\Sigma_\text{th} + \Sigma_\text{exp})^{-1}\Sigma_\text{th}] \\
    Z = X - \Sigma_\text{th} (\Sigma_\text{th} + \Sigma_\text{exp})^{-1} X
$$
The parameters $\vec{a}$ can then be integrated out as done above (again, see Eq. (A35) [here](#MelendezGP)). In this case, this yields
$$
y_\text{th} + \delta y_\text{th} \,|\, y_\text{exp} \sim N[Z\mu_a + \Sigma_\text{th} (\Sigma_\text{th} + \Sigma_\text{exp})^{-1} y_\text{exp},\,\, \Sigma_\text{th} - \Sigma_\text{th} (\Sigma_\text{th} + \Sigma_\text{exp})^{-1}\Sigma_\text{th} + Z\Sigma_a Z^T]
$$
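
The following sketch evaluates this PPD at the data locations. It uses the standard Gaussian conditioning result, in which the conditional covariance is $\Sigma_\text{th} - \Sigma_\text{th} (\Sigma_\text{th} + \Sigma_\text{exp})^{-1}\Sigma_\text{th}$. The function name `total_ppd`, the toy data, and the choice $\bar{c} = 1$ are assumptions made for illustration.

```python
import numpy as np


def total_ppd(X, y_exp, Sigma_th, Sigma_exp, mu_a, Sigma_a):
    """Mean and covariance of y_th + dy_th given y_exp, with a integrated out."""
    K = Sigma_th + Sigma_exp
    gain = np.linalg.solve(K, Sigma_th).T  # Sigma_th K^{-1} (both symmetric)
    Z = X - gain @ X
    mean = Z @ mu_a + gain @ y_exp
    cov = Sigma_th - gain @ Sigma_th + Z @ Sigma_a @ Z.T
    return mean, cov


rng = np.random.default_rng(3)
x = np.linspace(0.01, 0.9, 20)
X = x[:, None] ** np.arange(3)            # orders 0..2 in y_th
X_high = x[:, None] ** np.arange(3, 7)    # orders 3..6 in dy_th
Sigma_th = X_high @ X_high.T              # truncation covariance with cbar = 1
Sigma_exp = 0.2**2 * np.eye(x.size)
y_exp = (
    X @ rng.normal(size=3)
    + X_high @ rng.normal(size=4)
    + rng.normal(0, 0.2, x.size)
)

# Posterior for a from the linear-model equations
K = Sigma_th + Sigma_exp
Sigma_a = np.linalg.inv(X.T @ np.linalg.solve(K, X))
mu_a = Sigma_a @ X.T @ np.linalg.solve(K, y_exp)

mean, cov = total_ppd(X, y_exp, Sigma_th, Sigma_exp, mu_a, Sigma_a)
```

Two limits are useful checks: when $\Sigma_\text{th} \to 0$ the result reduces to the PPD for $y_\text{th}$ alone, and when the truncation error dominates the experimental noise the mean is pulled toward $y_\text{exp}$.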

## Non-PPD Calculation


Although not the PPD, it is sometimes useful to look at the distribution of $y_\text{th} + \delta y_\text{th}$ sampled over our posterior values for $\vec{a}$. So rather than conditioning $y_\text{th} + \delta y_\text{th}$ on the experimental values, we instead use its unconditional distribution. This quantity can be described as
$$
    y_\text{th} + \delta y_\text{th} \, | \, I_{\vec{a} \,|\, y_\text{exp}}
$$
where $I_{\vec{a} \,|\, y_\text{exp}}$ is the information about the posterior distribution of $\vec{a}$.
Integrating over $\vec{a}$, we have
$$
    \text{pr}(y_\text{th} + \delta y_\text{th} \, | \, I_{\vec{a} \,|\, y_\text{exp}})
    = \int d\vec{a}\, \text{pr}(y_\text{th} + \delta y_\text{th} \, | \, \vec{a}) \text{pr}(\vec{a} \,|\, I_{\vec{a} \,|\, y_\text{exp}}),
$$
where the factor on the left is the unconditional GP, and the factor on the right is the posterior density for $\vec{a}$.
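
Since both factors are Gaussian and $y_\text{th} = X\vec{a}$ is linear in $\vec{a}$, this integral can be done in closed form: the result is $N[X\mu_a, \Sigma_\text{th} + X \Sigma_a X^T]$. The sketch below uses small made-up matrices, and the name `non_ppd` is hypothetical. Note that the two contributions add in variance, not in standard deviation.

```python
import numpy as np


def non_ppd(X, Sigma_th, mu_a, Sigma_a):
    """Unconditional GP for y_th + dy_th, averaged over a ~ N(mu_a, Sigma_a)."""
    mean = X @ mu_a
    cov = Sigma_th + X @ Sigma_a @ X.T
    return mean, cov


# Illustrative inputs (not fitted to anything)
X = np.array([[1.0, 0.0], [1.0, 1.0]])
Sigma_th = np.diag([0.09, 0.16])
mu_a = np.array([1.0, 2.0])
Sigma_a = np.diag([0.16, 0.09])

mean, cov = non_ppd(X, Sigma_th, mu_a, Sigma_a)
stdv = np.sqrt(np.diag(cov))
# At the first point: sqrt(0.09 + 0.16) = 0.5, not 0.3 + 0.4 = 0.7
```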

## References

* <a id="MelendezGP" /> Melendez, J. A., Furnstahl, R. J., Phillips, D. R., Pratola, M. T. & Wesolowski, S. Quantifying correlated truncation errors in effective field theory. Phys. Rev. C 100, 044001 (2019).

## Interface using ipywidgets with interactive_output

In [2]:
# Import the widgets we will use (add more as needed!)
import ipywidgets as widgets
from ipywidgets import HBox, VBox, Layout, Tab, Label, Checkbox
from ipywidgets import FloatSlider, IntSlider, Play, Dropdown, HTMLMath

from IPython.display import display
from time import sleep

In [3]:
from src.stats import (
    linear_feature_matrix,
    linear_theory_prediction,
    linear_theory_covariance,
    linear_solve,
    conditional_gp,
    linear_conditional_covariance_from_parameters,
)

In [4]:
def master_plot(
    y_th_flag=True,
    y_true_flag=True,
    y_exp_flag=True,
    y_th_ppd_flag=True,
    y_tot_ppd_flag=True,
    y_non_ppd_flag=True,
    rng_coeff_seed=3,
    rng_data_seed=3,
    N_exp=50,
    N_int=100,
    k_max=7,
    k=3,
    cbar=1.0,
    sd_exp=0.2,
    data_min=0.01,
    data_max=0.9,
    x_min=0,
    x_max=1.0,
):
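    """
    Does the calculations then generates the plot of results.

        Parameters:
        y_th_flag, y_true_flag, y_exp_flag, y_th_ppd_flag, y_tot_ppd_flag,
        y_non_ppd_flag (bool): Whether to draw each curve or band
        rng_coeff_seed (int): Seed for the series coefficients; (default: 3)
        rng_data_seed (int): Seed for the experimental noise; (default: 3)
        N_exp (int): Number of data to use in training; (default: 50)
        N_int (int): Number of prediction points; (default: 100)
        k_max (int): Total number of orders, including truncation error; (default: 7)
        k (int): Number of orders kept in y_th (orders 0..k-1); (default: 3)
        cbar (float): Standard deviation of the parameters; (default: 1)
        sd_exp (float): Experimental standard deviation; (default: 0.2)
        data_min, data_max (float): Range of x for the data; (default: 0.01, 0.9)
        x_min, x_max (float): Range of x for predictions; (default: 0, 1.0)

        Returns:
        fig (matplotlib Figure): plot of results
    """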
    # Recalculate everything

    # Parameters and data generation
    y_ref = 1  # A constant scalar that multiplies the series
    # x is the expansion ratio of the geometric series (terms go like x^n)
    x_exp = np.linspace(data_min, data_max, N_exp)  # where the data are taken
    x_int = np.linspace(x_min, x_max, N_int)
    higher_orders = np.arange(k, k_max)  # These are in the truncation error
    orders = np.arange(k)  # These are in y_th

    # Make everything repeatable but separate coefficient and data random numbers
    rng_coeff = np.random.default_rng(rng_coeff_seed)
    rng_data = np.random.default_rng(rng_data_seed)

    # The coefficients of the geometric series
    c_true_all = rng_coeff.normal(0, cbar, k_max)
    c_true_higher = c_true_all[k:]
    c_true = c_true_all[:k]

    # Set up toy predictions and experiment
    X_old = linear_feature_matrix(x=x_exp, y_ref=y_ref, orders=orders)
    X_new = linear_feature_matrix(x=x_int, y_ref=y_ref, orders=orders)
    X_old_higher = linear_feature_matrix(x=x_exp, y_ref=y_ref, orders=higher_orders)
    X_new_higher = linear_feature_matrix(x=x_int, y_ref=y_ref, orders=higher_orders)

    y_th = linear_theory_prediction(X=X_old, c=c_true)
    dy_th = linear_theory_prediction(X=X_old_higher, c=c_true_higher)
    dy_exp = rng_data.normal(0, sd_exp, N_exp)
    y_exp = y_th + dy_th + dy_exp

    y_th_int = linear_theory_prediction(X=X_new, c=c_true)
    dy_th_int = linear_theory_prediction(X=X_new_higher, c=c_true_higher)

    # Fit coefficients
    Sigma_th_oo = linear_theory_covariance(
        x=x_exp, xp=x_exp, y_ref=y_ref, cbar=cbar, orders=higher_orders
    )
    Sigma_th_no = linear_theory_covariance(
        x=x_int, xp=x_exp, y_ref=y_ref, cbar=cbar, orders=higher_orders
    )
    Sigma_th_nn = linear_theory_covariance(
        x=x_int, xp=x_int, y_ref=y_ref, cbar=cbar, orders=higher_orders
    )
    Sigma_exp = sd_exp ** 2 * np.eye(N_exp)
    c_mean, c_inv_cov = linear_solve(X=X_old, y=y_exp, cov=Sigma_th_oo + Sigma_exp)

    # PPD calculation
    y_th_ppd_mean = linear_theory_prediction(X=X_new, c=c_mean)
    y_th_ppd_mean_exp = linear_theory_prediction(X=X_old, c=c_mean)
    y_th_ppd_stdv = np.sqrt(np.diag(X_new @ np.linalg.solve(c_inv_cov, X_new.T)))

    y_tot_ppd_mean, y_tot_ppd_cov = conditional_gp(
        y=y_exp,
        mean_new=y_th_ppd_mean,
        mean_old=y_th_ppd_mean_exp,
        Sigma_nn=Sigma_th_nn,
        Sigma_no=Sigma_th_no,
        Sigma_oo=Sigma_th_oo + Sigma_exp,
    )
    y_tot_ppd_cov += linear_conditional_covariance_from_parameters(
        X_new=X_new,
        X_old=X_old,
        Sigma_no=Sigma_th_no,
        Sigma_oo=Sigma_th_oo + Sigma_exp,
        precision=c_inv_cov,
    )
    y_tot_ppd_stdv = np.sqrt(np.diag(y_tot_ppd_cov))

    # non-PPD calculation: independent Gaussian contributions add in variance
    y_th_non_ppd_stdv = np.sqrt(
        np.diag(Sigma_th_nn) + np.diag(X_new @ np.linalg.solve(c_inv_cov, X_new.T))
    )

    # make the plot
    fig, ax = plt.subplots(figsize=(5, 4.4))

    if y_th_flag:
        ax.plot(x_int, y_th_int, label="y_th (True)", c="k")

    if y_true_flag:
        ax.plot(
            x_int, y_th_int + dy_th_int, label="y_th + dy_th (True)", c="k", ls="--"
        )

    if y_exp_flag:
        ax.errorbar(
            x_exp,
            y_exp,
            sd_exp,
            lw=0,
            elinewidth=1,
            barsabove=True,
            capsize=1,
            label="y_exp",
            c="C1",
        )

    if y_th_ppd_flag:
        ax.plot(x_int, y_th_ppd_mean, c="r")
        ax.fill_between(
            x_int,
            y_th_ppd_mean + y_th_ppd_stdv,
            y_th_ppd_mean - y_th_ppd_stdv,
            label="y_th PPD",
            color="r",
            alpha=0.3,
        )

    if y_tot_ppd_flag:
        ax.plot(x_int, y_tot_ppd_mean, c="g", zorder=0)
        ax.fill_between(
            x_int,
            y_tot_ppd_mean + y_tot_ppd_stdv,
            y_tot_ppd_mean - y_tot_ppd_stdv,
            label="y_th + dy_th PPD",
            color="g",
            alpha=0.3,
            zorder=0,
        )

    if y_non_ppd_flag:
        ax.plot(x_int, y_th_ppd_mean, c="b")
        ax.fill_between(
            x_int,
            y_th_ppd_mean + y_th_non_ppd_stdv,
            y_th_ppd_mean - y_th_non_ppd_stdv,
            label="y_th + dy_th Non-PPD",
            color="b",
            alpha=0.3,
        )

    ax.legend()
    ax.set_xlabel("x")

    return fig

In [5]:
# Widgets for the various inputs.
#   For any widget, we can set continuous_update=False if we don't want the
#    plots to shift until the selection is finished (particularly relevant for
#    sliders).

# Widgets for the plot choice (plus a label out front)
plot_choice_w = Label(value="Which plots: ", layout=Layout(width="90px"))


def plot_choice_widget(on=True, plot_description=None):
    """Makes a Checkbox to select whether to show a plot."""
    return Checkbox(
        value=on,
        description=plot_description,
        disabled=False,
        indent=False,
        layout=Layout(width="150px"),
    )


y_th_flag_w = plot_choice_widget(True, r"$y_{\rm th}$ (true)")
y_true_flag_w = plot_choice_widget(True, r"$y_{\rm th} + \delta y_{\rm th}$ (true)")
y_exp_flag_w = plot_choice_widget(True, r"$y_{\rm exp}$")
y_th_ppd_flag_w = plot_choice_widget(True, r"$y_{\rm th}$ ppd")
y_tot_ppd_flag_w = plot_choice_widget(True, r"$y_{\rm th} + \delta y_{\rm th}$ ppd")
y_non_ppd_flag_w = plot_choice_widget(
    False, r"$y_{\rm th} + \delta y_{\rm th}$ non-ppd"
)

# Widgets for the model parameters (all use FloatSlider or IntSlider, so made functions)
def float_widget(value, min, max, step, description, format):
    """Makes a FloatSlider with the passed parameters and continuous_update
    set to False."""
    slider_border = Layout(border="solid 1.0px")
    return FloatSlider(
        value=value,
        min=min,
        max=max,
        step=step,
        disabled=False,
        description=description,
        continuous_update=False,
        orientation="horizontal",
        layout=slider_border,
        readout=True,
        readout_format=format,
    )


def int_widget(value, min, max, step, description):
    """Makes an IntSlider with the passed parameters and continuous_update
    set to False."""
    slider_border = Layout(border="solid 1.0px")
    return IntSlider(
        value=value,
        min=min,
        max=max,
        step=step,
        disabled=False,
        description=description,
        continuous_update=False,
        orientation="horizontal",
        layout=slider_border,
        readout=True,
        readout_format="d",
    )


rng_coeff_seed_w = int_widget(
    value=3, min=0, max=100, step=1, description=r"$\delta y_{\rm th}$ seed"
)
rng_data_seed_w = int_widget(
    value=3, min=0, max=100, step=1, description=r"$\delta y_{\rm exp}$ seed"
)

N_data_w = int_widget(value=50, min=2, max=100, step=1, description=r"$N_{\rm data}$")
N_interp_w = int_widget(value=100, min=2, max=200, step=1, description=r"$N_{\rm interp}$")

k_max_w = int_widget(value=7, min=2, max=10, step=1, description=r"$k_{\rm max}$")
k_w = int_widget(value=3, min=2, max=5, step=1, description=r"theory order")

cbar_w = float_widget(
    value=1.0, min=0.1, max=5.0, step=0.1, description=r"$\overline c$:", format=".1f"
)
sd_exp_w = float_widget(
    value=0.2,
    min=0.001,
    max=2.0,
    step=0.1,
    description=r"$\sigma_{\rm exp}$:",
    format=".2f",
)
data_min_w = float_widget(
    value=0.01,
    min=0.01,
    max=0.5,
    step=0.1,
    description=r"data $x_{\rm min}$:",
    format=".2f",
)
data_max_w = float_widget(
    value=0.9,
    min=0.3,
    max=2.0,
    step=0.1,
    description=r"data $x_{\rm max}$:",
    format=".1f",
)


# Widgets for the plotting parameters
x_min_w = float_widget(
    value=0.0,
    min=0,
    max=0.5,
    step=0.1,
    description=r"Pred $x_{\rm min}$:",
    format=".1f",
)
x_max_w = float_widget(
    value=1.0,
    min=0.5,
    max=5.0,
    step=0.1,
    description=r"Pred $x_{\rm max}$:",
    format=".1f",
)

# Widgets for the styling parameters
font_size_w = Dropdown(
    options=["12", "16", "18", "20", "24"],
    value="18",
    description="Font size:",
    disabled=False,
    continuous_update=False,
    layout=Layout(width="140px"),
)

empty_w = Label(value=" ", layout=Layout(width="300px", border="solid 1.0px"))


############## Begin: Explicit callback functions #######################

# If x_max for plotting falls below x_min, reset it to x_min + 1
def update_plot_max(*args):
    if x_max_w.value < x_min_w.value:
        x_max_w.value = x_min_w.value + 1


x_min_w.observe(update_plot_max, "value")
x_max_w.observe(update_plot_max, "value")


# If data_max falls below data_min, reset it to data_min + .1
def update_data_max(*args):
    if data_max_w.value < data_min_w.value:
        data_max_w.value = data_min_w.value + 0.1


data_max_w.observe(update_data_max, "value")
data_min_w.observe(update_data_max, "value")


############## End: Explicit callback functions #######################

# Set up the interactive_output widget
plot_out = widgets.interactive_output(
    master_plot,
    dict(
        y_th_flag=y_th_flag_w,
        y_true_flag=y_true_flag_w,
        y_exp_flag=y_exp_flag_w,
        y_th_ppd_flag=y_th_ppd_flag_w,
        y_tot_ppd_flag=y_tot_ppd_flag_w,
        y_non_ppd_flag=y_non_ppd_flag_w,
        rng_coeff_seed=rng_coeff_seed_w,
        rng_data_seed=rng_data_seed_w,
        N_exp=N_data_w,
        N_int=N_interp_w,
        k_max=k_max_w,
        k=k_w,
        cbar=cbar_w,
        sd_exp=sd_exp_w,
        data_min=data_min_w,
        data_max=data_max_w,
        x_min=x_min_w,
        x_max=x_max_w,
    ),
)

# Now do some manual layout, where we can put the plot anywhere using plot_out
hbox1 = HBox(
    [
        plot_choice_w,
        y_th_flag_w,
        y_true_flag_w,
        y_exp_flag_w,
        y_th_ppd_flag_w,
        y_tot_ppd_flag_w,
        y_non_ppd_flag_w,
    ]
)  #  choice of what plots
hbox2 = HBox([N_data_w, N_interp_w, rng_coeff_seed_w, rng_data_seed_w])  # data sizes and seeds
hbox3 = HBox([k_max_w, k_w, cbar_w, sd_exp_w])  # orders and standard deviations
hbox4 = HBox([data_min_w, data_max_w, x_min_w, x_max_w])  # data and prediction ranges
hbox5 = HBox([font_size_w])  # font size

# We'll set up Tabs to organize the controls.  The Tab contents are declared
#  as tab0, tab1, ... (probably should make this a list?) and the overall Tab
#  is called tab (so its children are tab0, tab1, ...).
tab_height = '70px'  # Fixed minimum height for all tabs. Specify another way?
tab0 = VBox([hbox2, hbox3], layout=Layout(min_height=tab_height))
tab1 = VBox([hbox1, hbox4], layout=Layout(min_height=tab_height))
tab2 = VBox([hbox5], layout=Layout(min_height=tab_height))

tab = Tab(children=[tab0, tab1, tab2])
tab.set_title(0, "Parameters & Data")
tab.set_title(1, "Plotting")
tab.set_title(2, "Styling")

param_height = "120px"
param_box = VBox([hbox1, hbox2, hbox3, hbox4], layout=Layout(min_height=param_height))

# Release the Kraken!
vbox2 = VBox([param_box, plot_out])
display(vbox2)
