# Section 4: Advanced Models

#### PyData London 2025 - Bayesian Time Series Analysis with PyMC

---

## State-Space Models

State-space models describe a time series in two layers: an unobserved **latent state** that evolves over time, and the **observations**, which are noisy measurements of that state. This separation allows us to:

- Model complex dynamics in the latent space
- Handle missing observations naturally
- Incorporate measurement error
- Build hierarchical time series models

### Mathematical Framework

**State equation**: $x_t = f(x_{t-1}, \theta) + \eta_t$

**Observation equation**: $y_t = g(x_t, \phi) + \epsilon_t$

Where:
- $x_t$ is the latent state
- $y_t$ is the observation
- $\eta_t, \epsilon_t$ are noise terms
- $\theta, \phi$ are parameters
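
As a rough illustration of the two equations above, the sketch below simulates a linear-Gaussian special case, taking $f(x_{t-1}, \theta) = \theta x_{t-1}$ and $g(x_t, \phi) = \phi x_t$. The function name `simulate_state_space`, the starting state, and all parameter values are assumptions made for this example; they are not part of the workshop code.

```python
import numpy as np

def simulate_state_space(n_steps, theta=0.9, phi=1.0,
                         sigma_state=0.3, sigma_obs=0.5, seed=0):
    """Simulate x_t = theta * x_{t-1} + eta_t and y_t = phi * x_t + eps_t."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_steps)
    y = np.zeros(n_steps)
    x_prev = 0.0  # assumed starting state
    for t in range(n_steps):
        # State equation: latent dynamics plus process noise
        x[t] = theta * x_prev + rng.normal(0.0, sigma_state)
        # Observation equation: noisy measurement of the latent state
        y[t] = phi * x[t] + rng.normal(0.0, sigma_obs)
        x_prev = x[t]
    return x, y

latent, observed = simulate_state_space(200)
print(f"latent std: {latent.std():.2f}, observed std: {observed.std():.2f}")
```

Only `observed` would be available in practice; inference aims to recover `latent` and the parameters from it.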

In [None]:
# Import necessary libraries for Section 4
import numpy as np
import polars as pl
import matplotlib.pyplot as plt
import pymc as pm
import pytensor.tensor as pt
import arviz as az
import warnings

# Configure plotting and suppress warnings
plt.style.use('seaborn-v0_8')
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['figure.dpi'] = 100
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)

print("🔧 Section 4 libraries loaded successfully!")
print("Ready to build advanced Bayesian time series models")

## Model 1: Local Level Model (State-Space)

The local level model is the simplest state-space model, where the latent state follows a random walk and observations are noisy measurements of this state.
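
For the local level model both equations are linear and Gaussian, $x_t = x_{t-1} + \eta_t$ and $y_t = x_t + \epsilon_t$, so when the noise scales are known the filtered state can be computed in closed form with a Kalman filter. The sketch below is a minimal NumPy version meant for intuition only. The function name, the fixed noise values, and the wide initial variance are assumptions for this example, and this is not how the PyMC model below is fitted.

```python
import numpy as np

def local_level_filter(y, sigma_level, sigma_obs, init_mean=0.0, init_var=10.0):
    """Kalman filter for x_t = x_{t-1} + eta_t, y_t = x_t + eps_t.

    NaN entries in y are treated as missing: the update step is skipped.
    init_mean and init_var describe the state before the first transition.
    """
    mean, var = init_mean, init_var
    filtered = np.empty(len(y))
    for t, y_t in enumerate(y):
        # Predict: a random walk keeps the mean and adds process variance
        var = var + sigma_level**2
        if not np.isnan(y_t):
            # Update: move toward the observation by the Kalman gain
            gain = var / (var + sigma_obs**2)
            mean = mean + gain * (y_t - mean)
            var = (1.0 - gain) * var
        filtered[t] = mean
    return filtered

# Synthetic example with a gap of missing observations
rng = np.random.default_rng(1)
true_level = np.cumsum(rng.normal(0.0, 0.1, 100))
y = true_level + rng.normal(0.0, 0.5, 100)
y[40:45] = np.nan
level_hat = local_level_filter(y, sigma_level=0.1, sigma_obs=0.5)
```

During the gap the filtered mean stays constant while its variance grows, which is the sense in which state-space models handle missing observations naturally.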

In [None]:
# Load and prepare data
births_data = pl.read_csv('../data/births.csv', null_values=['null', 'NA', '', 'NULL'])
births_data = births_data.filter(pl.col('day').is_not_null())

monthly_births = (births_data
    .group_by(['year', 'month'])
    .agg(pl.col('births').sum())
    .sort(['year', 'month'])
)

births_subset = (monthly_births
    .filter((pl.col('year') >= 1970) & (pl.col('year') <= 1990))
    .with_row_index('index')
)

original_data = births_subset['births'].to_numpy()
births_standardized = (original_data - original_data.mean()) / original_data.std()
n_obs = len(births_standardized)

print(f"📊 Data prepared: {n_obs} observations")

In [None]:
# Model 1: Local Level Model (State-Space)
with pm.Model() as local_level_model:
    # Process noise (state evolution)
    sigma_level = pm.HalfNormal('sigma_level', sigma=0.5)
    
    # Observation noise
    sigma_obs = pm.HalfNormal('sigma_obs', sigma=0.5)
    
    # Initial level
    init_level = pm.Normal('init_level', mu=0, sigma=1)
    
    # Level process (latent state - random walk)
    init_dist = pm.Normal.dist(mu=init_level, sigma=sigma_level)
    level = pm.GaussianRandomWalk('level',
                                 mu=0,
                                 sigma=sigma_level,
                                 init_dist=init_dist,
                                 steps=n_obs-1)
    
    # Observations (noisy measurements of latent state)
    obs = pm.Normal('obs', mu=level, sigma=sigma_obs, observed=births_standardized)

# Sample from local level model
with local_level_model:
    trace_local_level = pm.sample(1000, tune=1000, random_seed=RANDOM_SEED, chains=2)

print("Local Level Model Summary:")
print(az.summary(trace_local_level, var_names=['sigma_level', 'sigma_obs', 'init_level']))

## Model 2: Stochastic Volatility Model

Stochastic volatility models are widely used for financial time series, where the variance of returns changes over time. They treat volatility as a latent quantity that follows its own stochastic process.

### Model Specification

**Returns equation**: $r_t = \mu + \sigma_t \epsilon_t$

**Volatility equation**: $\log(\sigma_t^2) = \log(\sigma_{t-1}^2) + \tau \nu_t$

Where $\epsilon_t, \nu_t \sim \mathcal{N}(0,1)$ are independent and $\tau > 0$ sets the scale of the log-variance innovations.
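
One detail that is easy to miss is the scale on which the random walk lives. Writing $h_t = \log(\sigma_t^2)$, the volatility is $\sigma_t = \exp(h_t / 2)$, so a random walk in $h_t$ corresponds to a random walk in $\log \sigma_t$ with steps half as large. The short check below illustrates this with NumPy; the value of `tau` (the innovation standard deviation, named as in the code cells below) and the series length are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
tau = 0.1  # assumed innovation scale on the log-variance

# h_t = log(sigma_t^2) follows a random walk
h = np.cumsum(rng.normal(0.0, tau, 500))
sigma = np.exp(h / 2)  # sigma_t = sqrt(exp(h_t))

# The same path on the log-sigma scale moves in steps half as large
log_var_steps = np.diff(h)
log_sigma_steps = np.diff(np.log(sigma))
print(np.allclose(log_sigma_steps, log_var_steps / 2))

# Returns with time-varying volatility (zero mean assumed here)
returns = rng.normal(0.0, sigma)
```

When a model places its random walk on $\log \sigma_t$ rather than on $\log \sigma_t^2$, its innovation scale should therefore be compared with half the log-variance scale.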

In [None]:
# Generate synthetic financial returns for stochastic volatility demo
np.random.seed(42)
n_periods = 200
true_mu = 0.02
true_tau = 0.1

# Simulate log-volatility as random walk
log_vol = np.cumsum(np.random.normal(0, true_tau, n_periods))
vol = np.exp(log_vol / 2)

# Simulate returns
returns = np.random.normal(true_mu, vol)

print(f"📈 Generated {n_periods} synthetic returns")
print(f"   Mean return: {returns.mean():.4f}")
print(f"   Return volatility: {returns.std():.4f}")

In [None]:
# Model 2: Stochastic Volatility Model
with pm.Model() as stoch_vol_model:
    # Mean return
    mu = pm.Normal('mu', mu=0, sigma=0.1)
    
    # Volatility process parameters
    tau = pm.HalfNormal('tau', sigma=0.2)  # Innovation scale of the log-variance
    
    # Log-variance random walk, matching the volatility equation and the simulation
    # (the initial prior is centred on sigma = 0.1, i.e. log-variance 2 * log(0.1))
    init_dist = pm.Normal.dist(mu=2 * np.log(0.1), sigma=2)
    log_var = pm.GaussianRandomWalk('log_var',
                                   mu=0,
                                   sigma=tau,
                                   init_dist=init_dist,
                                   steps=n_periods-1)
    
    # Convert log-variance to volatility (standard deviation)
    sigma = pm.Deterministic('sigma', pm.math.exp(log_var / 2))
    
    # Likelihood
    y_pred = pm.Normal('y_pred', mu=mu, sigma=sigma, observed=returns)

# Sample from stochastic volatility model
with stoch_vol_model:
    trace_sv = pm.sample(1000, tune=2000, target_accept=0.9, random_seed=RANDOM_SEED, chains=2)

print("Stochastic Volatility Model Summary:")
print(az.summary(trace_sv, var_names=['mu', 'tau']))

## Model 3: Gaussian Process Regression

Gaussian Processes (GPs) provide a non-parametric approach to time series modeling. They're particularly useful when:
- The functional form is unknown
- You want to capture complex, non-linear patterns
- Uncertainty quantification is crucial

### Key Concepts

- **Covariance function**: Defines the relationship between points
- **Length scale**: Controls how quickly correlations decay
- **Marginal variance**: Controls the overall variability
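
To make these concepts concrete, the sketch below writes out the squared-exponential (ExpQuad) covariance, $k(x, x') = \eta^2 \exp\left(-\frac{(x - x')^2}{2\ell^2}\right)$, in plain NumPy and shows how the length scale $\ell$ changes the correlation between points five steps apart. The helper `exp_quad_kernel`, the grid, and the hyperparameter values are assumptions for illustration; the model below uses PyMC's built-in covariance instead.

```python
import numpy as np

def exp_quad_kernel(x1, x2, length_scale, eta):
    """k(x, x') = eta^2 * exp(-(x - x')^2 / (2 * length_scale^2))."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return eta**2 * np.exp(-0.5 * sq_dist / length_scale**2)

t = np.arange(50, dtype=float)

# A longer length scale keeps distant points more strongly correlated
for ls in (2.0, 10.0):
    K = exp_quad_kernel(t, t, length_scale=ls, eta=1.0)
    print(f"length_scale={ls}: correlation at lag 5 = {K[0, 5]:.3f}")

# One draw from the GP prior via a Cholesky factor
# (a small jitter on the diagonal is added for numerical stability)
rng = np.random.default_rng(0)
K = exp_quad_kernel(t, t, length_scale=10.0, eta=1.0)
chol = np.linalg.cholesky(K + 1e-6 * np.eye(len(t)))
prior_draw = chol @ rng.standard_normal(len(t))
```

The marginal standard deviation `eta` only rescales the draw vertically, while `length_scale` controls how wiggly it is.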

In [None]:
# Prepare data for GP regression
# Use a subset for computational efficiency
n_gp = 100
X_gp = np.arange(n_gp)[:, None]  # Time points (must be 2D for GP)
y_gp = births_standardized[:n_gp]  # Observations

print(f"📊 GP data: {n_gp} observations")

In [None]:
# Model 3: Gaussian Process Regression
with pm.Model() as gp_model:
    # GP hyperparameters
    length_scale = pm.HalfNormal('length_scale', sigma=2.0)
    eta = pm.HalfNormal('eta', sigma=1.0)  # marginal standard deviation
    
    # Define the covariance function (RBF/Squared Exponential)
    cov_func = eta**2 * pm.gp.cov.ExpQuad(1, ls=length_scale)
    
    # GP prior
    gp = pm.gp.Marginal(cov_func=cov_func)
    
    # Observation noise
    sigma_gp = pm.HalfNormal('sigma_gp', sigma=0.5)
    
    # Observed data
    y_pred = gp.marginal_likelihood('y_pred', X=X_gp, y=y_gp, sigma=sigma_gp)

# Sample from the model
with gp_model:
    trace_gp = pm.sample(1000, tune=1000, random_seed=RANDOM_SEED, chains=2)

print("Gaussian Process Model Summary:")
print(az.summary(trace_gp, var_names=['length_scale', 'eta', 'sigma_gp']))

## Model 4: Multivariate Time Series

Many real-world applications involve multiple related time series. PyMC provides tools for modeling these relationships through multivariate distributions.
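
A common way to parameterize the dependence between series is to split the innovation covariance into standard deviations and a correlation matrix, $\Sigma = \mathrm{diag}(s)\, R\, \mathrm{diag}(s)$. The NumPy sketch below builds $\Sigma$ this way and uses its Cholesky factor to generate correlated random-walk innovations. The numerical values are assumed purely for illustration.

```python
import numpy as np

# Assumed illustrative values (not estimates from any model)
stds = np.array([1.0, 2.0, 0.5])
corr = np.array([[1.0, 0.7, 0.3],
                 [0.7, 1.0, 0.5],
                 [0.3, 0.5, 1.0]])

# Covariance = diag(stds) @ corr @ diag(stds)
cov = np.diag(stds) @ corr @ np.diag(stds)

# The Cholesky factor turns independent noise into correlated innovations
chol = np.linalg.cholesky(cov)
rng = np.random.default_rng(0)
innovations = rng.standard_normal((5000, 3)) @ chol.T

# Cumulating the innovations gives a multivariate random walk
walk = np.cumsum(innovations, axis=0)

print(np.round(np.corrcoef(innovations.T), 2))
```

The correlations are recovered from the innovations (the first differences of the walk), not from the levels, since independent random walks can look strongly correlated in levels.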

In [None]:
# Create synthetic multivariate time series
np.random.seed(42)
n_series = 3
n_time = 100

# True correlation matrix
true_corr = np.array([[1.0, 0.7, 0.3],
                      [0.7, 1.0, 0.5],
                      [0.3, 0.5, 1.0]])

# Generate correlated innovations
innovations = np.random.multivariate_normal(np.zeros(n_series), true_corr, n_time)

# Create multivariate random walk
mv_series = np.cumsum(innovations, axis=0)

print(f"📊 Generated {n_series} correlated time series with {n_time} observations each")
print("   Empirical correlations of the innovations:")
print(np.corrcoef(innovations.T))

In [None]:
# Model 4: Multivariate Random Walk
with pm.Model() as mv_model:
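    # Innovation covariance with an LKJ prior on the correlations.
    # LKJCholeskyCov returns the Cholesky factor, the correlation matrix and
    # the standard deviations. (pm.LKJCorr returns only the flattened upper
    # triangle by default, so it cannot be used directly in a matrix product.)
    chol, corr_matrix, stds = pm.LKJCholeskyCov(
        'chol_cov', n=n_series, eta=2,
        sd_dist=pm.HalfNormal.dist(sigma=1, shape=n_series),
        compute_corr=True)
    
    # Standard deviations, registered under the name used in the summary
    sigma_mv = pm.Deterministic('sigma_mv', stds)
    
    # Covariance matrix
    cov_matrix = pm.Deterministic('cov_matrix', chol.dot(chol.T))
    
    # Multivariate random walk: steps=n_time-1 yields n_time time points,
    # so the full series is passed as observed data
    mv_walk = pm.MvGaussianRandomWalk('mv_walk',
                                     mu=np.zeros(n_series),
                                     chol=chol,
                                     steps=n_time-1,
                                     observed=mv_series)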

# Sample from multivariate model
with mv_model:
    trace_mv = pm.sample(1000, tune=1000, random_seed=RANDOM_SEED, chains=2)

print("Multivariate Model Summary:")
print(az.summary(trace_mv, var_names=['sigma_mv']))

## Model Comparison and Insights

Let's visualize what each model has learned and summarize its strengths. Formal evaluation and comparison of predictive performance is covered in Section 5.

In [None]:
# Visualize model fits (for models that can be compared)
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Plot 1: Local Level Model
level_mean = az.extract(trace_local_level)['level'].mean(dim='sample')
axes[0,0].plot(births_standardized, 'o-', alpha=0.6, label='Observed')
axes[0,0].plot(level_mean, '-', linewidth=2, label='Latent Level')
axes[0,0].set_title('Local Level Model')
axes[0,0].legend()
axes[0,0].grid(True, alpha=0.3)

# Plot 2: Stochastic Volatility
vol_mean = az.extract(trace_sv)['sigma'].mean(dim='sample')
axes[0,1].plot(returns, alpha=0.6, label='Returns')
axes[0,1].plot(vol_mean, linewidth=2, label='Estimated Volatility')
axes[0,1].set_title('Stochastic Volatility Model')
axes[0,1].legend()
axes[0,1].grid(True, alpha=0.3)

# Plot 3: Gaussian Process (prediction)
# Generate predictions for GP
with gp_model:
    X_new = np.arange(n_gp + 20)[:, None]  # Extend beyond observed data
    gp_pred = gp.conditional('gp_pred', X_new)
    pred_samples = pm.sample_posterior_predictive(trace_gp, var_names=['gp_pred'], random_seed=RANDOM_SEED)

gp_mean = pred_samples.posterior_predictive['gp_pred'].mean(dim=['chain', 'draw'])
# az.hdi returns a Dataset for xarray input, so select the variable by name
gp_hdi = az.hdi(pred_samples.posterior_predictive['gp_pred'], hdi_prob=0.9)['gp_pred']

axes[1,0].plot(y_gp, 'o', alpha=0.6, label='Observed')
axes[1,0].plot(gp_mean, '-', linewidth=2, label='GP Mean')
axes[1,0].fill_between(range(len(gp_mean)), 
                      gp_hdi.sel(hdi='lower'), 
                      gp_hdi.sel(hdi='higher'), 
                      alpha=0.3, label='90% HDI')
axes[1,0].axvline(n_gp, color='red', linestyle='--', alpha=0.7, label='Forecast Start')
axes[1,0].set_title('Gaussian Process Regression')
axes[1,0].legend()
axes[1,0].grid(True, alpha=0.3)

# Plot 4: Multivariate Series
for i in range(n_series):
    axes[1,1].plot(mv_series[:, i], label=f'Series {i+1}')
axes[1,1].set_title('Multivariate Time Series')
axes[1,1].legend()
axes[1,1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n🎯 Model Insights:")
print("   • Local Level: Separates signal from noise effectively")
print("   • Stochastic Volatility: Captures time-varying uncertainty")
print("   • Gaussian Process: Provides flexible, non-parametric fits")
print("   • Multivariate: Models correlations between series")

## Summary

In this section, we've explored advanced Bayesian time series models:

1. **State-Space Models**: Separate latent dynamics from observations
2. **Stochastic Volatility**: Model time-varying uncertainty
3. **Gaussian Processes**: Non-parametric, flexible modeling
4. **Multivariate Models**: Handle correlated time series

### When to Use Each Model

- **State-Space**: When you have latent variables or missing data
- **Stochastic Volatility**: For financial data with changing variance
- **Gaussian Process**: When functional form is unknown
- **Multivariate**: For multiple related time series

**Next**: In Section 5, we'll learn how to properly evaluate and compare these models.

---

**Key Takeaways**:
- Advanced models provide greater flexibility but require more careful specification
- State-space models are powerful for latent variable modeling
- Stochastic volatility models capture time-varying variance, as seen in financial returns
- Gaussian processes offer non-parametric flexibility
- Multivariate models capture cross-series dependencies