# Financial Utility & Stress Testing (Scenario Demo)

While the model's core performance is measured by CRPS (Continuous Ranked Probability Score), its value as a research tool lies in generating scenarios that are **structurally plausible** and **controllable for risk analysis**.

The dedicated scenario-testing notebook assesses that utility against three objectives:

### 1. Structural Integrity
We check whether the model reproduces financial "stylized facts." In particular, the **Autocorrelation Function (ACF) of generated squared returns** should show positive, slowly decaying correlation (volatility clustering), a well-documented feature of real return series and a prerequisite for credible synthetic financial data.

### 2. Tail Risk Quantification
The generated scenario paths are used to calculate the **95% Value-at-Risk (VaR)** and **Conditional VaR (CVaR)**. This shows that the model's probabilistic output translates directly into the standard extreme-loss measures used in risk management.

### 3. Counterfactual Scenario Control
Using the static conditioning input, we demonstrate the model's key capability: generating two distinct forecasts from the *same historical context*, one of which is forced into a **Stress Regime (High Vol)**. A clear divergence between the two forecast distributions indicates that the conditioning mechanism steers the output, which is what makes "What-If" analysis possible.


## Notebook Roadmap
- **Setup**: Load dependencies, configure paths, and define utility helpers.
- **Scenario Generation**: Sample baseline and stress-regime forecasts from the trained model.
- **Structural Checks**: Verify volatility clustering via the ACF of squared returns.
- **Tail Risk Metrics**: Compute VaR and CVaR from simulated PnL paths.
- **Counterfactual Insights**: Contrast baseline vs. stress-conditioned scenarios to highlight controllability.
- **Takeaways**: Summarize findings and next steps for risk stakeholders.


In [None]:
# Core imports and visual style
from pathlib import Path
import sys

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

sns.set_style("whitegrid")
plt.rcParams["figure.figsize"] = (10, 4)


## Data & Model Handles
Adjust the paths below to point to the trained model artifact and evaluation dataset. Keep the `DATA_DIR` and `MODEL_DIR` relative to the repository root to simplify reproducibility across environments.
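
If the notebook may be launched from different working directories, one alternative to a fixed parent depth is to search upward for a marker file. The sketch below is only an illustration: `find_repo_root` and the marker names are hypothetical and are not part of this project.

```python
from pathlib import Path


def find_repo_root(start: Path, markers=(".git", "pyproject.toml")) -> Path:
    """Return the nearest ancestor of `start` (inclusive) that contains a marker.

    Hypothetical helper: the marker names are assumptions, not project facts.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in markers):
            return candidate
    raise FileNotFoundError(f"No marker from {markers} found at or above {start}")
```

With such a helper, `PROJECT_ROOT = find_repo_root(Path.cwd())` would resolve to the same directory regardless of how deep in the repository the notebook is opened, provided a marker exists.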


In [None]:
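# Paths
try:
    # Running as a script: the project root is two levels above this file's directory
    PROJECT_ROOT = Path(__file__).resolve().parents[2]
except NameError:
    # Running in a notebook, where __file__ is undefined: fall back to the working directory
    PROJECT_ROOT = Path.cwd().resolve().parents[1]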

PACKAGE_ROOT = PROJECT_ROOT / "FinD_Generator"
DATA_DIR = PACKAGE_ROOT / "data"
RAW_DATA_DIR = DATA_DIR / "raw"
MODEL_DIR = DATA_DIR / "processed"

MODEL_CHECKPOINT = MODEL_DIR / "timegrad_checkpoint.pt"

if str(PACKAGE_ROOT) not in sys.path:
    sys.path.insert(0, str(PACKAGE_ROOT))


## Scenario Generation
The cell below loads the trained checkpoint and samples forecast paths; adapt the imports and hyperparameters if your inference code differs. The key is to produce two sets of forecast paths:
1. **Baseline**: Standard forecast from recent history.
2. **Stress Regime**: Same history but with static conditioning toggled to a high-volatility regime.

Both outputs should be aligned time-wise to enable clean counterfactual comparisons.
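
The counterfactual depends on changing only the static regime features while leaving the history and dynamic covariates untouched. The sketch below illustrates that idea under the assumption that regimes are encoded as one-hot columns named `<prefix>_<label>`. The `set_regime` helper and the column names are hypothetical; the notebook itself delegates this step to `ScenarioFeatureGenerator.apply_scenario`.

```python
import numpy as np


def set_regime(cond_static, columns, prefix, label):
    """Return a copy of `cond_static` with one regime group switched to `label`.

    cond_static: array of shape (batch, n_static) holding one-hot regime flags.
    columns:     column names aligned with the last axis of `cond_static`.
    prefix:      regime group to override, e.g. "vol_regime" (assumed naming).
    label:       regime to activate within that group, e.g. "high_vol".
    """
    out = np.array(cond_static, dtype=float, copy=True)
    target = f"{prefix}_{label}"
    if target not in columns:
        raise KeyError(f"Unknown regime column: {target}")
    for idx, name in enumerate(columns):
        # Only columns in the selected group are rewritten; other groups are untouched.
        if name.startswith(prefix + "_"):
            out[:, idx] = 1.0 if name == target else 0.0
    return out


# Hypothetical layout: two volatility regimes and two market regimes.
columns = ["vol_regime_low_vol", "vol_regime_high_vol", "market_regime_bull", "market_regime_bear"]
baseline = np.array([[1.0, 0.0, 1.0, 0.0]])
stress = set_regime(baseline, columns, prefix="vol_regime", label="high_vol")
```

Because the helper returns a copy, the baseline vector stays intact and both conditioning inputs can be passed to the sampler with the same history.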


In [None]:
# Load model and generate baseline vs. stress scenarios
import torch

from src.data_loader import TimeGradDataModule
from src.predictor import ConditionalTimeGradPredictionNetwork
from src.scenario_generator import ScenarioFeatureGenerator

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
context_length = 60
prediction_length = 5
num_samples = 512

data_sources = {
    "target": RAW_DATA_DIR / "target.parquet",
    "market": RAW_DATA_DIR / "market.parquet",
    "daily_macro": RAW_DATA_DIR / "daily_macro.parquet",
    "monthly_macro": RAW_DATA_DIR / "monthly_macro.parquet",
    "quarterly_macro": RAW_DATA_DIR / "quarterly_macro.parquet",
}
data_dict = {name: pd.read_parquet(path) for name, path in data_sources.items()}

dm = TimeGradDataModule(
    data_dict=data_dict,
    seq_len=context_length,
    forecast_horizon=prediction_length,
    batch_size=4,
    device=str(device),
)
dm.preprocess_and_split()
dm.build_datasets()

feature_cols = dm.get_feature_columns_by_type()
target_dim = len(feature_cols["target"])
cond_dynamic_dim = len(feature_cols["daily"]) + len(feature_cols["monthly"])
cond_static_dim = len(feature_cols["regime"])

predictor = ConditionalTimeGradPredictionNetwork(
    target_dim=target_dim,
    context_length=context_length,
    prediction_length=prediction_length,
    cond_dynamic_dim=cond_dynamic_dim,
    cond_static_dim=cond_static_dim,
    diff_steps=100,
    beta_end=0.1,
    beta_schedule="linear",
    residual_layers=6,
    residual_channels=32,
    cond_embed_dim=64,
    cond_attn_heads=4,
    cond_attn_dropout=0.1,
).to(device)

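state = torch.load(MODEL_CHECKPOINT, map_location=device)
# strict=False tolerates key mismatches, so surface them instead of failing silently
load_result = predictor.load_state_dict(state, strict=False)
if load_result.missing_keys or load_result.unexpected_keys:
    print("Checkpoint key mismatch:", load_result)
predictor.eval()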

test_batch = next(iter(dm.test_dataloader()))
x_hist = test_batch["x_hist"][:1].to(device)
cond_dynamic = test_batch["cond_dynamic"][:1].to(device)
cond_static_base = test_batch["cond_static"][:1].to(device)

regime_prefixes = {"market": "market_regime", "vol": "vol_regime", "macro": "macro_regime"}
scenario_generator = ScenarioFeatureGenerator(regime_prefixes=regime_prefixes, smoothing_window=3)

cond_static_df = pd.DataFrame(cond_static_base.cpu().numpy(), columns=feature_cols["regime"])
cond_static_stress_df = scenario_generator.apply_scenario(
    cond_df=cond_static_df,
    scenario={
        "market_regime": "bear",
        "vol_regime": "high_vol",
        "start_t": 0,
        "duration": 1,
        "transition": "hard",
    },
    horizon=1,
).reindex(columns=feature_cols["regime"], fill_value=0.0)

cond_static_baseline = torch.tensor(cond_static_df.values, device=device, dtype=cond_static_base.dtype)
cond_static_stress = torch.tensor(cond_static_stress_df.values, device=device, dtype=cond_static_base.dtype)

with torch.no_grad():
    baseline_samples = predictor.sample_autoregressive(
        x_hist=x_hist,
        cond_dynamic=cond_dynamic,
        cond_static=cond_static_baseline,
        num_samples=num_samples,
        sampling_strategy="masked_step",
    )
    stress_samples = predictor.sample_autoregressive(
        x_hist=x_hist,
        cond_dynamic=cond_dynamic,
        cond_static=cond_static_stress,
        num_samples=num_samples,
        sampling_strategy="masked_step",
    )

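# Assumed sample layout: (num_samples, batch, prediction_length, target_dim).
# Average over target dimensions, drop the singleton batch axis, and transpose
# so that rows are forecast steps and columns are scenario paths.
baseline_paths = baseline_samples.mean(dim=-1).squeeze(1).cpu().numpy().T
stress_paths = stress_samples.mean(dim=-1).squeeze(1).cpu().numpy().T

baseline_scenarios = pd.DataFrame(baseline_paths)
stress_scenarios = pd.DataFrame(stress_paths)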


## 1. Structural Integrity: Volatility Clustering via ACF
We evaluate whether generated squared returns exhibit slow-decaying autocorrelation, mirroring empirical financial series. A pronounced, long-memory ACF in squared returns signals that the model respects volatility clustering.
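
As a reference for what the diagnostic should look like, the sketch below compares the ACF of squared returns for a simulated GARCH(1,1) series against i.i.d. Gaussian noise. It is a minimal illustration: `sample_acf` and `simulate_garch` are hypothetical helpers, the GARCH parameters are illustrative, and none of this is output from the trained model.

```python
import numpy as np


def sample_acf(x, nlags):
    """Biased sample autocorrelation of a 1-D series for lags 0..nlags."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom for k in range(nlags + 1)])


def simulate_garch(n, omega=0.05, alpha=0.10, beta=0.85, seed=0):
    """Simulate GARCH(1,1) returns; the parameter values are illustrative only."""
    rng = np.random.default_rng(seed)
    returns = np.empty(n)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        returns[t] = np.sqrt(var) * rng.standard_normal()
        var = omega + alpha * returns[t] ** 2 + beta * var
    return returns


# Squared GARCH returns should stay positively correlated over several lags,
# whereas squared i.i.d. noise should hover around zero beyond lag 0.
garch_acf = sample_acf(simulate_garch(5000) ** 2, nlags=10)
iid_acf = sample_acf(np.random.default_rng(1).standard_normal(5000) ** 2, nlags=10)
```

A generated series that behaves like the i.i.d. case here would fail the structural check, while one resembling the GARCH case would pass it.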


In [None]:
from statsmodels.tsa.stattools import acf

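# Squared returns; rows are forecast steps, columns are scenario paths
sq_returns = baseline_scenarios.pow(2)

# ACF of squared returns along the horizon for each scenario, averaged across scenarios.
# The lag count is capped at the path length minus one (prediction_length is 5 here).
lags = min(20, len(sq_returns) - 1)
acf_values = sq_returns.apply(lambda col: acf(col, nlags=lags, fft=True), axis=0)
acf_mean = acf_values.mean(axis=1)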

fig, ax = plt.subplots()
ax.stem(range(len(acf_mean)), acf_mean, basefmt=" ")
ax.set(title="Average ACF of Squared Returns (Baseline)", xlabel="Lag", ylabel="ACF")
plt.show()


## 2. Tail Risk Quantification: VaR & CVaR
We derive Value-at-Risk and Conditional VaR from simulated PnL distributions. These metrics articulate tail exposure for both baseline and stress regimes, demonstrating the model's utility for downstream risk analysis.
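
Sign conventions for these metrics vary, so it helps to state one explicitly. The sketch below uses the common convention of reporting both VaR and CVaR as positive losses. `var_cvar_loss` is a hypothetical helper and the PnL sample is synthetic; the notebook's own `var_cvar` instead reports values on the PnL scale.

```python
import numpy as np


def var_cvar_loss(pnl, alpha=0.95):
    """Historical-simulation VaR and CVaR, both reported as positive losses.

    VaR is the alpha-quantile of the loss distribution (loss = -PnL);
    CVaR is the mean loss over scenarios at or beyond the VaR.
    """
    losses = -np.asarray(pnl, dtype=float)
    var = np.quantile(losses, alpha)
    cvar = losses[losses >= var].mean()
    return var, cvar


# Synthetic PnL for illustration only: 10,000 draws from a standard normal.
pnl = np.random.default_rng(0).standard_normal(10_000)
var_95, cvar_95 = var_cvar_loss(pnl, alpha=0.95)
```

Under this convention CVaR is never smaller than VaR, which makes a quick sanity check on any implementation.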


In [None]:
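def var_cvar(series, alpha=0.95):
    """Return the (1 - alpha) PnL quantile (VaR) and the mean PnL at or below it (CVaR).

    Both values are on the PnL scale, so losses appear as negative numbers.
    """
    cutoff = series.quantile(1 - alpha)
    cvar = series[series <= cutoff].mean()
    return cutoff, cvar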

# Aggregate PnL per scenario (sum over horizon)
baseline_pnl = baseline_scenarios.sum()
stress_pnl = stress_scenarios.sum()

baseline_var, baseline_cvar = var_cvar(baseline_pnl, alpha=0.95)
stress_var, stress_cvar = var_cvar(stress_pnl, alpha=0.95)

metrics = pd.DataFrame(
    {
        "VaR_95": [baseline_var, stress_var],
        "CVaR_95": [baseline_cvar, stress_cvar],
    },
    index=["Baseline", "Stress"],
)

metrics


## 3. Counterfactual Scenario Control
We juxtapose baseline and stress forecasts conditioned on the same historical window. The divergence in distributions—and the corresponding shift in tail metrics—demonstrates controllability of risk profiles via static conditioning.
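
A visual comparison can be complemented by a single number summarizing how far apart the two distributions are. The sketch below computes the two-sample Kolmogorov-Smirnov statistic with NumPy; `ks_statistic` is a hypothetical helper and the inputs are synthetic stand-ins for the baseline and stress PnL.

```python
import numpy as np


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    # Evaluate both empirical CDFs at every observed point.
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


rng = np.random.default_rng(0)
baseline = rng.normal(loc=0.0, scale=1.0, size=2000)  # synthetic baseline PnL
stress = rng.normal(loc=-0.5, scale=2.0, size=2000)   # synthetic stressed PnL
ks_same = ks_statistic(baseline, baseline)
ks_diff = ks_statistic(baseline, stress)
```

A statistic near zero would suggest the conditioning had little effect, while larger values indicate a material shift. `scipy.stats.ks_2samp` computes the same statistic along with a p-value.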


In [None]:
fig, ax = plt.subplots()
sns.kdeplot(baseline_pnl, label="Baseline", fill=True, alpha=0.4, ax=ax)
sns.kdeplot(stress_pnl, label="Stress Regime", fill=True, alpha=0.4, ax=ax)
ax.axvline(baseline_var, color="C0", linestyle="--", label="Baseline VaR")
ax.axvline(stress_var, color="C1", linestyle="--", label="Stress VaR")
ax.set(title="Counterfactual PnL Distributions", xlabel="PnL", ylabel="Density")
ax.legend()
plt.show()


## Key Takeaways
- **Structural Integrity**: ACF diagnostics test for volatility clustering in generated squared returns.
- **Tail Sensitivity**: VaR/CVaR shifts between baseline and stress scenarios quantify risk amplification.
- **Actionable Counterfactuals**: Conditioning enables targeted "What-If" analysis for governance and scenario planning.

> To apply this workflow to a different model, swap the sampling calls for that model's inference API.
