# 03 — Fit Gaussian Copula and Simulate Dependence

**Objective**
- Fit a **Gaussian Copula** to the PIT-transformed uniform returns.
- Compare **empirical vs. simulated** dependence between assets.
- Visualize correlations and joint scatterplots.
- Save the copula model and simulated uniforms for downstream use.

**Concept Reminder**
A copula captures **dependence structure** separate from marginals:
> Each `U_i ~ Uniform(0,1)` marginally, while the joint dependence is governed by a correlation matrix Σ (the copula's only parameter) acting on the normal scores `Z_i = Φ⁻¹(U_i)`.
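
To make the reminder concrete, here is a minimal sketch of how a Gaussian copula produces dependent uniforms: draw correlated normals, then apply the standard normal CDF to each column. The 2x2 matrix `sigma` and the sample size are illustrative assumptions, not values from this project.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative correlation matrix (an assumed value, not estimated from data)
sigma = np.array([[1.0, 0.6],
                  [0.6, 1.0]])

# Step 1: draw correlated standard normals Z ~ N(0, sigma)
z = rng.multivariate_normal(mean=np.zeros(2), cov=sigma, size=20_000)

# Step 2: push each column through the standard normal CDF to get uniforms
u = norm.cdf(z)

# Each marginal is approximately Uniform(0,1): mean near 0.5, variance near 1/12
print("means:", u.mean(axis=0))
print("variances:", u.var(axis=0))

# The dependence survives the transform: the uniforms stay positively correlated
u_corr = np.corrcoef(u, rowvar=False)[0, 1]
print("correlation of uniforms:", u_corr)
```

The marginals carry no information about `sigma`; only the joint behaviour of the columns does, which is what "separate from marginals" means here.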


In [1]:
%load_ext autoreload
%autoreload 2

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Requires the third-party `copulas` package (e.g. `pip install copulas`)
from copulas.multivariate import GaussianMultivariate

from quantcopula.plotting import plot_correlation_heatmap
from quantcopula.copula import (
    fit_gaussian_copula,
    simulate_copula
)

# Parameters
N_SAMPLES = 5000
os.makedirs("data/processed", exist_ok=True)
os.makedirs("figures", exist_ok=True)


ModuleNotFoundError: No module named 'copulas'

In [None]:
# Load uniforms generated in the previous notebook
U = pd.read_parquet("data/processed/uniforms.parquet")
U = U.dropna(how="any")

U.describe().round(4)


## Fit Gaussian Copula
Estimate the correlation matrix Σ from the uniforms.
The Gaussian copula assumes that the normal scores `Φ⁻¹(U_i)` are jointly multivariate normal with correlation matrix Σ.
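
One common way to estimate Σ is to map the uniforms to normal scores and take their sample correlation. The sketch below shows that idea on synthetic data. The helper `estimate_copula_correlation` is hypothetical and is not claimed to be what `fit_gaussian_copula` does internally.

```python
import numpy as np
import pandas as pd
from scipy.stats import norm


def estimate_copula_correlation(uniforms: pd.DataFrame, eps: float = 1e-6) -> pd.DataFrame:
    """Correlation of normal scores (hypothetical helper, for illustration only)."""
    # Clip away exact 0 and 1 so norm.ppf never returns +/- infinity
    clipped = uniforms.clip(lower=eps, upper=1 - eps)
    scores = pd.DataFrame(
        norm.ppf(clipped.to_numpy()),
        index=uniforms.index,
        columns=uniforms.columns,
    )
    return scores.corr()


# Synthetic uniforms generated from a known correlation of 0.5
rng = np.random.default_rng(1)
true_sigma = np.array([[1.0, 0.5],
                       [0.5, 1.0]])
z = rng.multivariate_normal(np.zeros(2), true_sigma, size=50_000)
demo_u = pd.DataFrame(norm.cdf(z), columns=["A", "B"])

sigma_hat = estimate_copula_correlation(demo_u)
print(sigma_hat.round(3))
```

On this synthetic sample the estimate should land close to the known value of 0.5, which is a quick sanity check to run before trusting a fit on real data.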


In [None]:
gaussian_copula = fit_gaussian_copula(U)
print("Gaussian Copula fitted successfully.")

# You can access the estimated correlation matrix:
corr_matrix = gaussian_copula.covariance
pd.DataFrame(corr_matrix, index=U.columns, columns=U.columns).round(3)


In [None]:
U_sim = simulate_copula(gaussian_copula, N_SAMPLES)
U_sim.columns = U.columns

U_sim.describe().round(4)


In [None]:
plot_correlation_heatmap(U, out_path="figures/corr_uniform_empirical.png")
plot_correlation_heatmap(U_sim, out_path="figures/corr_uniform_simulated.png")


## Compare Selected Pairs
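Overlay scatterplots of empirical and simulated uniforms (U-space) for selected pairs.
If the Gaussian copula fits well, the two clouds should have a similar shape and density; visible differences in the corners point to tail behaviour the model does not reproduce.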
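
A visual comparison can be backed by a number. The sketch below compares Spearman rank correlations of two samples and reports the largest absolute gap. The helper `max_rank_corr_gap` is hypothetical, and the two frames are synthetic stand-ins for the empirical and simulated uniforms (the column names are only labels).

```python
import numpy as np
import pandas as pd
from scipy.stats import norm


def max_rank_corr_gap(emp: pd.DataFrame, sim: pd.DataFrame) -> float:
    """Largest absolute difference between the two Spearman correlation matrices."""
    gap = (emp.corr(method="spearman") - sim.corr(method="spearman")).abs()
    return float(gap.to_numpy().max())


def sample_gaussian_copula(sigma: np.ndarray, n: int, seed: int, columns) -> pd.DataFrame:
    """Draw n rows of uniforms from a Gaussian copula with correlation sigma."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(len(sigma)), sigma, size=n)
    return pd.DataFrame(norm.cdf(z), columns=columns)


# Synthetic stand-ins: both samples come from the same assumed correlation matrix
sigma = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])
cols = ["XLK", "XLF", "XLI"]
demo_emp = sample_gaussian_copula(sigma, n=1_500, seed=10, columns=cols)
demo_sim = sample_gaussian_copula(sigma, n=5_000, seed=11, columns=cols)

gap = max_rank_corr_gap(demo_emp, demo_sim)
print(f"max |Spearman gap| = {gap:.4f}")
```

Rank correlation is used because it depends only on the copula, not on the marginals. A small gap says the average dependence matches; it says nothing about the tails.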


In [None]:
pairs = [('XLK', 'XLF'), ('XLI', 'XLV'), ('XLK', 'XLI')]

for a, b in pairs:
    plt.figure(figsize=(6, 5))
    plt.scatter(U[a], U[b], alpha=0.4, color='gray', label='Empirical')
    plt.scatter(U_sim[a], U_sim[b], alpha=0.4, color='blue', label='Simulated')
    plt.xlabel(a)
    plt.ylabel(b)
    plt.title(f"Gaussian Copula Dependence: {a} vs {b}")
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.savefig(f"figures/copula_dependence_{a}_{b}.png", dpi=150)
    plt.close()


## Save the Copula Model
Save the fitted Gaussian copula object (serialized with `joblib`) and the simulated uniforms for the downstream transformation.


In [None]:
import joblib

joblib.dump(gaussian_copula, "data/processed/gaussian_copula.pkl")
U_sim.to_parquet("data/processed/u_simulated.parquet")

print("Model and simulated uniforms saved successfully.")


## Takeaways

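- The Gaussian copula describes dependence through a single correlation matrix Σ, i.e. **linear correlation** between the normal scores of the asset uniforms.
- Simulated pairs reproduce the overall rank correlation but not tail dependence: a Gaussian copula with |ρ| < 1 has zero asymptotic tail dependence, so joint extreme moves may be understated.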
- Outputs will be used next to **map simulated uniforms back to real returns** via inverse t-CDFs.
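
The tail-dependence limitation can be illustrated with a small simulation. This sketch estimates `P(U2 > q | U1 > q)` for a Gaussian copula and, for contrast, a Student-t copula with the same correlation. The values `rho = 0.7`, `nu = 3`, and the sample size are assumptions chosen for illustration, and the t copula is not part of this notebook's pipeline.

```python
import numpy as np
from scipy.stats import norm, t


def upper_tail_coexceedance(u: np.ndarray, q: float) -> float:
    """Estimate P(U2 > q | U1 > q) from an (n, 2) array of uniforms."""
    above_first = u[:, 0] > q
    return float(np.mean(u[above_first, 1] > q))


rng = np.random.default_rng(42)
n, rho, nu = 400_000, 0.7, 3  # illustrative values
sigma = np.array([[1.0, rho],
                  [rho, 1.0]])

# Gaussian copula: correlated normals pushed through the normal CDF
z_gauss = rng.multivariate_normal(np.zeros(2), sigma, size=n)
u_gauss = norm.cdf(z_gauss)

# Student-t copula: divide correlated normals by a shared sqrt(chi2 / nu),
# then push each column through the univariate t CDF
z_t = rng.multivariate_normal(np.zeros(2), sigma, size=n)
w = rng.chisquare(nu, size=n) / nu
u_t = t.cdf(z_t / np.sqrt(w)[:, None], df=nu)

for q in (0.95, 0.99, 0.999):
    g = upper_tail_coexceedance(u_gauss, q)
    s = upper_tail_coexceedance(u_t, q)
    print(f"q={q}: Gaussian={g:.3f}  Student-t={s:.3f}")
```

As `q` moves toward 1, the Gaussian estimate keeps shrinking while the Student-t estimate levels off. That difference is the practical meaning of "no tail dependence".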

**Next:** `04_simulate_var_cvar_and_stress.ipynb`
