# What could go wrong and how to fix it

## 2. Take a step back: look at the full Bayesian workflow
<img src="janfb/figures/what-if-i-told-you-this-is-not-fully-bayesian.jpg" align="center" alt="beadexample" width="500"/>

## Outline

1) Model misspecification

2) Bayesian workflow

3) Bayesian workflow in `sbi`

4) Simulation-based calibration

5) Practical: Bayesian workflow in `sbi`

## Model misspecification

- previous session: troubleshooting with the `sbi` package with **given** simulator and prior
- what if the model and prior do not model the observed data $x_o$ accurately (enough)?

- what if the model is **misspecified**?

## Model misspecification example

- the prior and simulator do not capture the actual data generating process
- $x_o$ is not in the prior predictive distribution


- prior predictive distribution
    - all data the model can generate: $(\theta, x) \sim p(\theta, x) = p(x | \theta)p(\theta)$
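
Marginalizing this joint over $\theta$ gives the prior predictive density of the data alone. It is rarely available in closed form, but sampling from it only needs a draw from the prior pushed through the simulator (the sample index $i$ is notation introduced here):

$$
p(x) = \int p(x | \theta) \, p(\theta) \, d\theta \\
\theta_i \sim p(\theta), \quad x_i \sim p(x | \theta_i) \quad \Rightarrow \quad x_i \sim p(x)
$$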

In [None]:
from sbi.inference import SNLE, SNPE, prepare_for_sbi, simulate_for_sbi
from sbi.simulators.linear_gaussian import (
    linear_gaussian,
    samples_true_posterior_linear_gaussian_uniform_prior,
)
import torch
from sbi.utils import BoxUniform
import matplotlib.pyplot as plt
from sbi.analysis import pairplot

## Model misspecification example

$$
\text{prior:  } \theta \sim \mathcal{U}(-5, 5)^3 \\
\text{simulator:  } x \sim \mathcal{N}(\theta, 0.5^2 \, I) \\
\\
\text{but:  } x_o \sim \mathcal{N}(\theta_o, 5^2 \, I)
$$

In [None]:
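# Gaussian simulator: x ~ N(theta, scale^2 * I).
def simulator(theta, scale=0.5):
    # Scale standard normal noise, then shift by theta.
    return scale * torch.randn(theta.shape) + theta

num_dim = 3
# Misspecification: the simulator's noise scale is much smaller
# than the scale that actually generated x_o.
simulator_scale = 0.5
true_scale = 5.0

# Uniform prior on [-5, 5]^3.
prior = BoxUniform(-5 * torch.ones(num_dim), 5 * torch.ones(num_dim))
# Observation from the "true" process with the larger noise scale.
x_o = simulator(prior.sample((1,)), scale=true_scale)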

## Model misspecification: $x_o$ not in prior predictive

In [None]:
pairplot(simulator(prior.sample((1000,)), scale=simulator_scale), 
         upper="scatter", points=x_o, points_colors="k", limits=[[-10, 10]]);
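
The visual check in the pairplot can be complemented by a numerical one. The following is a minimal sketch, not part of `sbi`: it re-implements the uniform prior and Gaussian simulator from above with the standard library only, and flags every dimension in which $x_o$ falls outside the central 99% interval of the prior predictive samples. The helper names (`sample_prior`, `central_interval`, `outside_prior_predictive`) and the 99% threshold are assumptions made for illustration.

```python
import random

NUM_DIM = 3


def simulator(theta, scale, rng):
    """Gaussian simulator as above: x ~ N(theta, scale^2 * I)."""
    return [t + scale * rng.gauss(0.0, 1.0) for t in theta]


def sample_prior(rng):
    """Uniform prior on [-5, 5]^NUM_DIM."""
    return [rng.uniform(-5.0, 5.0) for _ in range(NUM_DIM)]


def central_interval(values, mass=0.99):
    """Empirical central interval containing roughly `mass` of the values."""
    ordered = sorted(values)
    tail = (1.0 - mass) / 2.0
    lo = ordered[int(tail * (len(ordered) - 1))]
    hi = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lo, hi


def outside_prior_predictive(x_obs, predictive_samples, mass=0.99):
    """Per-dimension flag: True if x_obs lies outside the central interval."""
    flags = []
    for d, value in enumerate(x_obs):
        lo, hi = central_interval([x[d] for x in predictive_samples], mass)
        flags.append(not (lo <= value <= hi))
    return flags


rng = random.Random(0)
# Prior predictive samples from the (misspecified) model, scale = 0.5.
predictive = [simulator(sample_prior(rng), 0.5, rng) for _ in range(1000)]
# Observation from the "true" process, scale = 5.0.
x_o = simulator(sample_prior(rng), 5.0, rng)
print(outside_prior_predictive(x_o, predictive))
```

This only inspects the marginals, so it can miss observations that are unusual only jointly across dimensions. A single $x_o$ can also land inside the intervals by chance even though the scale is misspecified.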

## Other examples of model misspecification


- what we have here: one hyper-parameter (scale) of the simulator is misspecified
- more severe: wrong model class
- prior misspecification
- more examples? 

## Bayesian inference vs. Bayesian workflow

- Bayesian inference: 
    - obtain posterior distribution
    - (or samples)
- Bayesian workflow:
    - model building
    - inference
    - model checking

<img src="janfb/figures/bayesian_worklow_chart.png" align="center" alt="beadexample" width="800"/> 
[Gelman et al. 2020]

## Recommended reading

- Gelman et al. 2020, "Bayesian workflow",
https://arxiv.org/abs/2011.01808

- Betancourt 2020, "Towards a principled Bayesian workflow",
https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html



## Bayesian workflow in `sbi`

Which of the steps can we do in `sbi` (so far)?

- prior predictive checks
- convergence diagnostics (training logs)
- posterior predictive checks
- simulation-based calibration
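
To make the posterior predictive check concrete, here is a minimal sketch that does not use `sbi`. It assumes a toy model with several i.i.d. observations, a hypothetical Gaussian prior $\mathcal{N}(0, 10^2)$ and a known noise scale, so that the posterior is available in closed form. In an actual `sbi` analysis the posterior samples would come from the trained posterior instead. The test statistic (sample standard deviation) and all function names are illustrative choices.

```python
import random
import statistics


def conjugate_posterior(data, noise_sd, prior_mean=0.0, prior_sd=10.0):
    """Exact posterior N(mean, sd^2) over theta for x_i ~ N(theta, noise_sd^2)."""
    precision = 1.0 / prior_sd**2 + len(data) / noise_sd**2
    mean = (prior_mean / prior_sd**2 + sum(data) / noise_sd**2) / precision
    return mean, precision**-0.5


def ppc_pvalue(data, noise_sd, num_reps=500, seed=0):
    """Fraction of replicated data sets whose std is at least the observed std."""
    rng = random.Random(seed)
    post_mean, post_sd = conjugate_posterior(data, noise_sd)
    observed_stat = statistics.stdev(data)
    exceed = 0
    for _ in range(num_reps):
        # theta ~ p(theta | x_o)
        theta = rng.gauss(post_mean, post_sd)
        # x_rep ~ p(x | theta), same size as the observed data set
        replicated = [rng.gauss(theta, noise_sd) for _ in data]
        exceed += statistics.stdev(replicated) >= observed_stat
    return exceed / num_reps


rng = random.Random(1)
theta_true = 2.0
well_specified = [rng.gauss(theta_true, 0.5) for _ in range(50)]
misspecified = [rng.gauss(theta_true, 5.0) for _ in range(50)]

# The model always assumes noise_sd = 0.5.
print("p-value, matching scale:", ppc_pvalue(well_specified, noise_sd=0.5))
print("p-value, scale 5.0 data:", ppc_pvalue(misspecified, noise_sd=0.5))
```

A p-value close to 0 or 1 means the observed statistic is atypical under the fitted model. Here the spread of the data is the natural statistic because the misspecification sits in the noise scale. Other problems call for other statistics.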

## Simulation-based calibration

TBD.
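
The core idea of simulation-based calibration can be sketched in a few lines: draw $\theta^*$ from the prior, simulate $x$, sample from the posterior given $x$, and record the rank of $\theta^*$ among those samples. For a calibrated posterior the ranks are uniformly distributed. The sketch below is not the `sbi` implementation. It assumes a hypothetical 1D model, $\theta \sim \mathcal{N}(0, 1)$ and $x \sim \mathcal{N}(\theta, 0.5^2)$, whose exact posterior is Gaussian, and it uses an illustrative `posterior_sd_factor` to mimic an overconfident inference method.

```python
import random


def sbc_ranks(num_runs=2000, num_posterior_samples=20, noise_sd=0.5,
              posterior_sd_factor=1.0, seed=0):
    """Rank of the true theta among posterior samples, one rank per SBC run.

    Assumed model: theta ~ N(0, 1), x ~ N(theta, noise_sd^2), so the exact
    posterior is N(x / (1 + noise_sd^2), noise_sd^2 / (1 + noise_sd^2)).
    posterior_sd_factor != 1 distorts the posterior width on purpose.
    """
    rng = random.Random(seed)
    shrink = 1.0 / (1.0 + noise_sd**2)
    post_sd = (noise_sd**2 * shrink) ** 0.5 * posterior_sd_factor
    ranks = []
    for _ in range(num_runs):
        theta_true = rng.gauss(0.0, 1.0)      # theta* ~ p(theta)
        x = rng.gauss(theta_true, noise_sd)   # x ~ p(x | theta*)
        post_mean = shrink * x
        samples = [rng.gauss(post_mean, post_sd)
                   for _ in range(num_posterior_samples)]
        # Rank = number of posterior samples smaller than theta*.
        ranks.append(sum(s < theta_true for s in samples))
    return ranks


def extreme_rank_fraction(ranks, num_posterior_samples=20):
    """Share of runs where theta* lies below or above all posterior samples."""
    return sum(r in (0, num_posterior_samples) for r in ranks) / len(ranks)


calibrated = sbc_ranks(posterior_sd_factor=1.0)
overconfident = sbc_ranks(posterior_sd_factor=0.3)
print("extreme ranks, exact posterior:", extreme_rank_fraction(calibrated))
print("extreme ranks, too-narrow posterior:", extreme_rank_fraction(overconfident))
```

With 20 posterior samples there are 21 possible ranks, so uniform ranks put about 2/21 of the runs on the two extreme values. A posterior that is too narrow pushes many more runs there, which shows up as a U-shaped rank histogram.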

## Practical: Bayesian workflow in `sbi`

### Tasks
1) Take one of the following inference problems (TBD) and perform as much of the Bayesian workflow as possible using `sbi`:
- prior predictive checks
- convergence diagnostics
- posterior predictive checks
- [optional] simulation-based calibration

2) Is the inference you performed valid? 