In [1]:
import pathlib
import platform

import numpy as np
import pandas as pd

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

import pymc3 as pm

import arviz as ar
import matplotlib.pyplot as pl
from matplotlib import rcParams
import seaborn as sb

In [2]:
def pkg_ver(pkgs):
    """Print the Python version and the version of each module in pkgs."""
    print('Python & Package Versions')
    print('----------------')
    print(f'PYTHON: {platform.python_version()}')
    for pki in pkgs:
        print(f'{pki.__name__}: {pki.__version__}')


pkg_ver([np, pd, pm, ar, sb])

Python & Package Versions
----------------
PYTHON: 3.7.3
numpy: 1.17.2
pandas: 0.25.1
pymc3: 3.7
arviz: 0.5.1
seaborn: 0.9.0


In [3]:
ar.style.use('arviz-darkgrid')

In this and a subsequent notebook, I implement several Bayesian regression models to predict chlorophyll from satellite and ancillary data. For each model, implementation follows the sequence below.

* The model is cast in a Bayesian framework using a probabilistic programming language (PPL);
* A set of prior predictive simulations is conducted to ascertain that model priors are reasonable;
* The model is fit to the data subset from NOMAD 2008 using the No-U-Turn Sampler (NUTS) variant of Hamiltonian Monte Carlo;
* Model predictive skill and uncertainty are quantified via posterior distribution evaluation and posterior predictive simulation.
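
To make the prior predictive step concrete, here is a minimal sketch in plain NumPy rather than PyMC3, so it runs without a sampler. It assumes a simple linear regression on one standardized predictor; the function name `prior_predictive`, the specific priors, and the grid `x` are illustrative assumptions, not the models used in this notebook. The idea is to draw parameters from the priors alone, simulate outcomes, and check whether the simulated range is plausible before any data are seen.

```python
import numpy as np


def prior_predictive(x, n_draws=500, slope_sd=1.0, seed=0):
    """Simulate y ~ Normal(a + b * x, sigma) using draws from the priors only.

    Hypothetical priors (assumptions for illustration):
        a     ~ Normal(0, 1)
        b     ~ Normal(0, slope_sd)
        sigma ~ HalfNormal(1)
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(0.0, 1.0, size=n_draws)               # intercept draws
    b = rng.normal(0.0, slope_sd, size=n_draws)          # slope draws
    sigma = np.abs(rng.normal(0.0, 1.0, size=n_draws))   # half-normal noise scale
    mu = a[:, None] + b[:, None] * x[None, :]            # one regression line per draw
    return rng.normal(mu, sigma[:, None])                # shape (n_draws, len(x))


x = np.linspace(-2, 2, 50)  # hypothetical standardized predictor
for sd in (1.0, 10.0):
    y_sim = prior_predictive(x, slope_sd=sd)
    lo, hi = np.percentile(y_sim, [2.5, 97.5])
    print(f'slope_sd={sd}: 95% of simulated y in [{lo:.1f}, {hi:.1f}]')
```

If the outcome is also standardized, a simulated range spanning many tens of standard deviations suggests the slope prior is too diffuse. In PyMC3 the same check is normally done inside the model context with its prior predictive sampling function.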

Subsequently, inter-model comparisons of predictive skill are conducted using information criteria (IC), including the Watanabe-Akaike Information Criterion (WAIC) and/or Pareto-Smoothed Importance Sampling Leave-One-Out Cross-Validation (LOO). T