In [None]:
import pymc3 as pm
import matplotlib.pyplot as plt
import numpy as np
from utils import ECDF
from data import load_decay
import pandas as pd
import theano.tensor as tt

%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

# Introduction

Now that you've learned about Bayesian estimation, we're going to explore one more topic: Bayesian curve fitting.

By "curve fitting", we're really talking about any curve: those that are bendy, those that are straight, and those that are in between. 

In order to reinforce this point, rather than show you plain vanilla linear regression, we will work through an exponential decay curve example.

# Problem Setup

You've taken radioactive decay measurements of an unknown element in a secure facility. The measurements are noisy, though, and potentially have some bias. In the face of this, we would like to be able to characterize the decay constant of this unknown material, potentially leading to an identification of the material.

Let's load in the data.

In [None]:
np.random.seed(42)

df = load_decay()
df.head(5)

**Exercise:** Plot `activity` vs. `time`.

In [None]:
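# Scatter activity against the time column `t` (the same column the model uses),
# rather than against the DataFrame's row index.
ax = df.plot(x='t', y='activity', kind='scatter')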

**Discuss:** 

- For the scenario that we're in, what is a plausible equation that links time to activity?
- What are the key parameters that we need to worry about?
- What might be justifiable priors for them?
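
One way to ground the discussion is to simulate a candidate curve before fitting anything. The sketch below assumes an exponential decay with an amplitude `A`, a time constant `tau`, and a constant offset `C`; the helper name `decay_curve` and all numeric values are hypothetical and chosen only for illustration, not taken from the data.

```python
import numpy as np


def decay_curve(t, A, tau, C):
    """Noise-free activity under the assumed form A * exp(-t / tau) + C."""
    t = np.asarray(t, dtype=float)
    return A * np.exp(-t / tau) + C


# Hypothetical parameter values, for illustration only.
rng = np.random.default_rng(0)
t_sim = np.linspace(0, 10, 50)
clean = decay_curve(t_sim, A=40.0, tau=2.0, C=5.0)

# Add Gaussian measurement noise to mimic noisy readings.
noisy = clean + rng.normal(scale=1.0, size=t_sim.shape)
```

Plotting `noisy` against `t_sim` for a few different parameter values gives a feel for which priors are plausible: `A` and `tau` must be positive, while `C` (the bias) can take either sign.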

**Exercise:** Implement the model.

In [None]:
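with pm.Model() as model:
    # Priors: amplitude A, time constant tau, and constant offset (bias) C.
    A = pm.HalfNormal('A', sd=100)
    tau = pm.Exponential('tau', lam=1)
    C = pm.Normal('C', sd=100)
    
    # Measurement noise scale.
    sd = pm.HalfCauchy('sd', beta=1)
    
    # Expected activity; tt.exp keeps the expression inside the Theano graph.
    link = A * tt.exp(-df['t'].values / tau) + C
    
    like = pm.Normal('activity', mu=link, sd=sd, observed=df['activity'].values)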

Sample from the posterior.

In [None]:
with model:
    trace = pm.sample(2000, tune=2000)
    # Note: Sampler may pause for a while after finishing

Check that sampling has converged.

In [None]:
traces = pm.traceplot(trace)
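
A trace plot is a visual check. A numeric complement is the Gelman-Rubin statistic (R-hat), which compares between-chain and within-chain variance. The sketch below is a plain NumPy version of the classic, non-split formula, run on synthetic chains rather than on `trace`; the function name `gelman_rubin` is hypothetical, and this is not the implementation the library itself uses.

```python
import numpy as np


def gelman_rubin(chains):
    """Classic (non-split) R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n_chains, n_draws = chains.shape
    chain_means = chains.mean(axis=1)
    # Between-chain variance, scaled by the number of draws per chain.
    B = n_draws * chain_means.var(ddof=1)
    # Average within-chain variance.
    W = chains.var(axis=1, ddof=1).mean()
    # Pooled estimate of the posterior variance.
    var_hat = (n_draws - 1) / n_draws * W + B / n_draws
    return np.sqrt(var_hat / W)


# Synthetic chains for illustration: four chains sampling the same distribution ...
rng = np.random.default_rng(1)
mixed = rng.normal(size=(4, 1000))
# ... and four chains where two are stuck around a different value.
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Values close to 1 suggest the chains agree, while values well above 1 suggest they have not mixed. To apply this to the model above, you would need per-chain draws of a parameter stacked into a 2-D array; `trace.get_values('tau', combine=False)` should return one array per chain, though that depends on the installed PyMC3 version.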

# Summary

- In lieu of showing you a "straight curve" (line) fit, you've now seen an arbitrary curve fit.
- As long as you can find a way to parameterize the curve with a function, you can perform inference on the curve's parameters.
- The function that you are modelling acts as the "link function": it connects the parameters and the input data to the predicted output.
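
Inference on a parameter also gives inference on anything computed from it. For the decay example, a curve of the form $Ae^{-t/\tau}$ halves every $\tau \ln 2$ time units, so posterior draws of `tau` can be converted into draws of the half-life. The sketch below uses synthetic gamma-distributed numbers as a stand-in for posterior draws; the names `half_life`, `tau_draws` and `hl_draws` are hypothetical.

```python
import numpy as np


def half_life(tau):
    """Half-life implied by a time constant tau in A * exp(-t / tau)."""
    return np.log(2.0) * np.asarray(tau, dtype=float)


# Stand-in for posterior draws of tau (synthetic, NOT from the fitted model).
rng = np.random.default_rng(2)
tau_draws = rng.gamma(shape=50.0, scale=0.04, size=4000)

# Transform every draw, then summarise the derived quantity.
hl_draws = half_life(tau_draws)
hl_low, hl_high = np.percentile(hl_draws, [2.5, 97.5])
```

With a real trace, `tau_draws` would be replaced by the sampled values of `tau`, and the resulting interval could be compared against tabulated half-lives, in the same time units as `t`.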

More generally, if

$$y = f(x, \theta)$$

where $\theta$ is simply a set of parameters, then you can perform inference on $\theta$. To make this clear:

| curve name | functional form | parameters $\theta$ |
|------------|-----------------|---------------------|
| exponential decay | $y = Ae^{-t/\tau} + C$ | $A$, $\tau$, $C$|
| linear regression | $y = mx + c$ | $m$, $c$ |
| logistic regression | $y = L(mx + c)$ | $m$, $c$ |
| 4-parameter IC50 | $y = \frac{a - i}{1 + 10^{\beta(\log(\tau) - x)}} + i$ | $a$, $i$, $\tau$, $\beta$ |
| deep learning | $y = f(x, \theta)$ | $\theta$ |
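
To make the table concrete, each row can be written as a plain function of the input and its parameters, and the same Gaussian log-likelihood can then score any of them. This is a minimal NumPy sketch under two assumptions that the table does not state: that $L$ is the standard logistic (sigmoid) function, and that $\log$ in the IC50 row is base 10. All function names are hypothetical.

```python
import numpy as np


def exponential_decay(t, A, tau, C):
    return A * np.exp(-t / tau) + C


def linear(x, m, c):
    return m * x + c


def logistic(x, m, c):
    # Assumes L in the table is the standard logistic (sigmoid) function.
    return 1.0 / (1.0 + np.exp(-(m * x + c)))


def ic50_4param(x, a, i, tau, beta):
    # Assumes log is base 10, i.e. x is a log10 concentration.
    return (a - i) / (1.0 + 10.0 ** (beta * (np.log10(tau) - x))) + i


def gaussian_loglike(f, theta, x, y, sd):
    """Log-likelihood of y given the curve f(x, *theta) and Gaussian noise sd."""
    resid = np.asarray(y, dtype=float) - f(x, *theta)
    n = resid.size
    return -0.5 * np.sum((resid / sd) ** 2) - n * np.log(sd * np.sqrt(2.0 * np.pi))


# The same likelihood function scores any curve; only f and theta change.
x_demo = np.array([0.0, 1.0, 2.0])
y_demo = linear(x_demo, 2.0, 1.0)
ll_true = gaussian_loglike(linear, (2.0, 1.0), x_demo, y_demo, sd=1.0)
ll_wrong = gaussian_loglike(linear, (0.0, 0.0), x_demo, y_demo, sd=1.0)
```

Swapping `linear` for `exponential_decay` (with three parameters instead of two) changes nothing else in the likelihood, which is the sense in which the choice of curve is interchangeable.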