# (Simple) Stochastic Volatility Model

This notebook is heavily based on the [stochastic volatility notebook](https://docs.pymc.io/notebooks/stochastic_volatility.html) from the PyMC3 examples section.




In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc3 as pm

%matplotlib inline

## Load and manipulate the data

We use S&P 500 daily closing prices to compute daily log returns.

In [None]:
df = pd.read_csv("./spp500.csv").rename(columns={"Date": "dt"})

# Compute the daily log returns (close to the percent change for small moves)
df["change"] = np.log(df["Close"]).diff()

returns = df.set_index("dt").loc["2007-01-01":, ["Close", "change"]].dropna()
returns.head()
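
The `change` column holds log returns rather than simple percent changes. The following is a minimal sketch of how the two relate, using a handful of made-up prices (the `close` values are hypothetical and are not S&P 500 data):

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only (not S&P 500 data)
close = pd.Series([100.0, 101.0, 99.0, 99.5])

log_returns = np.log(close).diff().dropna()
simple_returns = close.pct_change().dropna()

# The two are linked by: log return = log(1 + simple return)
assert np.allclose(np.log1p(simple_returns), log_returns)

# Log returns add up over time: their sum is the log of the total price ratio
total_log_return = log_returns.sum()
```

For small daily moves the two kinds of return are nearly equal, which is presumably why the comment in the cell above calls the column a "percent change".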

## Object of interest

If we plot the daily log returns of the S&P 500, we see periods in which volatility is elevated.

We would like to build a model that lets us estimate how this volatility changes over time.

In [None]:
fig, ax = plt.subplots(figsize=(12, 6))

returns.plot(y="change", label="S&P 500", linewidth=0.75, ax=ax)

ax.set_xlabel("time")
ax.set_ylabel("returns")
ax.spines["right"].set_visible(False)
ax.spines["top"].set_visible(False)

## Our model

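The likelihood is given by: $\log(r_{t+1}) \sim \text{StudentT}(\nu, 0, \exp(-2 \log \sigma_{t+1}))$

Here $\log(r_{t+1})$ is the daily log return, $\nu$ is the degrees of freedom, the location is $0$, and the third argument is the precision $\lambda_{t+1} = \exp(-2 \log \sigma_{t+1}) = 1 / \sigma_{t+1}^2$.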

Our priors are given by

\begin{align*}
  \nu &\sim \text{Exp}(0.1) \\
  \text{stepsize} &\sim \text{Exp}(10) \\
  \log \sigma_{t+1} &= \log \sigma_{t} + \text{stepsize} \, \varepsilon_{t+1}, \quad \varepsilon_{t+1} \sim \mathcal{N}(0, 1)
\end{align*}
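
To build intuition for the generative process described above, here is a rough NumPy-only simulation of one draw from it. This is a sketch under stated assumptions rather than what PyMC3 does internally: the seed and `n_days` are arbitrary, the innovations are taken to be standard normal, and the random walk is assumed to start near zero.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
n_days = 500                    # hypothetical series length

# Draw the hyperparameters from their priors.
# NumPy parameterizes the exponential by scale = 1 / rate.
nu = rng.exponential(scale=1 / 0.1)
step_size = rng.exponential(scale=1 / 10)

# Gaussian random walk for the log-volatility (assumed to start near 0)
eps = rng.standard_normal(n_days)
log_sigma = np.cumsum(step_size * eps)

# The precision exp(-2 log sigma) corresponds to the scale sigma = exp(log sigma)
lam = np.exp(-2 * log_sigma)
sigma = np.exp(log_sigma)
assert np.allclose(lam, 1 / sigma**2)

# Student-T log returns: nu degrees of freedom, location 0, scale sigma
log_returns = sigma * rng.standard_t(df=nu, size=n_days)
```

Plotting `log_returns` for a few different seeds gives a feel for the range of series the priors allow before any data is seen.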

In [None]:
m = pm.Model()

with m:
    # Data
    data_returns = pm.Data("data_returns", returns["change"].to_numpy())
    
    # Prior on the DoF
    nu = pm.Exponential("nu", 0.1)

    # Prior on the step size of GRW
    step_size = pm.Exponential("step_size", 10)

    # Prior on the volatility
    log_sigma = pm.GaussianRandomWalk("log_sigma", sigma=step_size, shape=returns.shape[0])

    # Likelihood of returns
    obs_returns = pm.StudentT(
        "obs_returns", nu=nu, lam=pm.math.exp(-2*log_sigma), observed=data_returns
    )

### Prior predictive

In [None]:
with m:
    prior_trace = pm.sample_prior_predictive(25)

In [None]:
fig, ax = plt.subplots()

# Pass the dtype itself (np.datetime64), not an instance of it
dates = returns.index.to_numpy().astype(np.datetime64)
ax.plot(dates, prior_trace["obs_returns"].T, color="b", alpha=0.05)
ax.plot(dates, returns["change"].to_numpy(), color="k")

ax.set_ylim(-15.0, 15.0)

In [None]:
np.quantile(prior_trace["obs_returns"], [0.25, 0.75])

## Sample from the posterior

We will sample from the posterior distribution using PyMC3's default sampler, NUTS.

In [None]:
with m:
    trace = pm.sample(1500, tune=2000)

### Plotting the traces

When we sample from the posterior, `pymc3` will typically run multiple chains at once. `pm.traceplot` plots the samples from each chain, which lets us see whether the chains agree with one another.

In [None]:
pm.traceplot(trace, var_names=["step_size", "nu"]);
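
A trace plot is a visual check. A numeric complement is the Gelman-Rubin statistic (R-hat), which compares between-chain and within-chain variance. The function below is a simplified, non-split version written for illustration, applied to synthetic chains rather than to the trace sampled above. The helper name `gelman_rubin` is hypothetical, and libraries such as ArviZ (`az.rhat`) provide more robust variants.

```python
import numpy as np

def gelman_rubin(chains):
    """Basic (non-split) R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n_chains, n_draws = chains.shape
    chain_means = chains.mean(axis=1)
    # Between-chain variance B and mean within-chain variance W
    between = n_draws * chain_means.var(ddof=1)
    within = chains.var(axis=1, ddof=1).mean()
    # Pooled estimate of the posterior variance
    var_hat = (n_draws - 1) / n_draws * within + between / n_draws
    return np.sqrt(var_hat / within)

rng = np.random.default_rng(1)
# Synthetic stand-ins for sampler output, not draws from the model above
mixed = rng.normal(0.0, 1.0, size=(4, 1000))
# Shift two of the four chains so that they disagree with the others
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Values close to 1 suggest the chains are sampling the same distribution. Values well above 1, as for the shifted chains here, suggest they have not mixed.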

In [None]:
fig, ax = plt.subplots(figsize=(14, 4))

y_vals = np.exp(trace["log_sigma"])[::5].T
x_vals = np.vstack([returns.index for _ in y_vals.T]).T.astype(np.datetime64)

ax.plot(x_vals, y_vals, "k", alpha=0.002)
ax.set_xlim(x_vals.min(), x_vals.max())
ax.set_ylim(bottom=0)
ax.set(title="Estimated volatility over time", xlabel="Date", ylabel="Volatility");
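
Overlaying many faint lines shows the spread of the posterior. A pointwise summary can be easier to read. Below is a minimal sketch that computes a posterior median and a central 90% interval for each day. It uses a synthetic array in place of `trace["log_sigma"]`, which is assumed to have shape `(n_draws, n_days)`, and the names `log_sigma_draws`, `lower`, `median` and `upper` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic stand-in with the layout assumed for trace["log_sigma"]:
# one row per posterior draw, one column per day
n_draws, n_days = 200, 50
log_sigma_draws = rng.normal(-4.5, 0.2, size=(n_draws, n_days))

sigma_draws = np.exp(log_sigma_draws)
# Pointwise posterior median and central 90% interval for each day
lower, median, upper = np.quantile(sigma_draws, [0.05, 0.5, 0.95], axis=0)
```

With the real trace, the band could be drawn with `ax.fill_between(dates, lower, upper, alpha=0.3)` and the median with `ax.plot(dates, median)`, where `dates` is the date array built for the prior predictive plot.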

### Plotting the observed log returns against the posterior predictive log returns

In [None]:
with m:
    posterior_predictive = pm.sample_posterior_predictive(trace)

In [None]:
fig, ax = plt.subplots(2, 1, figsize=(14, 8))

# Dates for the x-axis, shared by both panels
x = returns.index.to_numpy().astype(np.datetime64)

# Plot returns
ax[0].plot(
    x, posterior_predictive["obs_returns"][::25].T, color="g",
    alpha=0.25, zorder=-10
)
ax[0].plot(x, returns["change"].to_numpy(), color="k", linewidth=0.5)

# Plot volatility
y_vals = np.exp(trace["log_sigma"])[::25].T

ax[1].plot(x, y_vals, "k", alpha=0.002)
ax[1].set_ylim(bottom=0)