# VolSense — Model Training

In this notebook, you’ll:

1. Load the processed volatility dataset generated in Notebook 1  
2. Train baseline statistical models (GARCH, EGARCH)  
3. Train a deep learning global volatility forecaster (LSTM-based)  
4. Compare validation metrics and export trained models for inference

In [None]:
# Imports

import pandas as pd
import numpy as np
import torch
from pathlib import Path

from volsense_core.models.garch_methods import ARCHForecaster, forecast_arch, fit_arch
from volsense_core.models.lstm_forecaster import BaseLSTM
from volsense_core.models.global_vol_forecaster import (
    GlobalVolForecaster, TrainConfig, train_global_model
)
from volsense_core.utils.metrics import evaluate_forecast

## Step 1: Load the processed dataset
We’ll use the dataset saved in `1_data_preparation.ipynb` and inspect it.

In [None]:
data_path = Path("../data/processed/global_volatility_dataset.csv")
multi_df = pd.read_csv(data_path, parse_dates=["date"])
print(f"✅ Loaded {len(multi_df):,} rows")
multi_df.head()

## Step 2: Quick sanity checks
Confirm that the key columns have no missing values and that realized volatility falls within a plausible range.

In [None]:
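# Count missing values per column
print(multi_df.isna().sum())
# Summary statistics for the two columns the models depend on
print(multi_df[["return", "realized_vol"]].describe())
# Row counts for the most frequent tickers
multi_df["ticker"].value_counts().head()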
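
The printed summaries above rely on visual inspection. As a rough sketch, the same checks can be expressed as a function that returns a list of problems. `check_vol_frame` is a hypothetical helper, not part of `volsense_core`, and the `max_vol` threshold is an arbitrary assumption that should be adjusted to however `realized_vol` is scaled in your dataset. The small `demo` frame is synthetic and only stands in for `multi_df`.

```python
import numpy as np
import pandas as pd


# Hypothetical helper for illustration; not part of volsense_core.
def check_vol_frame(df, required=("date", "ticker", "return", "realized_vol"), max_vol=5.0):
    """Return a list of human-readable problems; an empty list means the frame passed."""
    problems = []

    # 1. All required columns must be present
    missing = [c for c in required if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
        return problems  # the remaining checks need these columns

    # 2. Report missing values per required column
    na_counts = df[list(required)].isna().sum()
    for col, n in na_counts[na_counts > 0].items():
        problems.append(f"{col}: {n} missing values")

    # 3. Volatility must be non-negative and below an assumed upper bound
    vol = df["realized_vol"].dropna()
    if (vol < 0).any():
        problems.append("realized_vol contains negative values")
    if (vol > max_vol).any():
        problems.append(f"realized_vol exceeds {max_vol}")

    # 4. Each (date, ticker) pair should appear only once
    if df.duplicated(subset=["date", "ticker"]).any():
        problems.append("duplicate (date, ticker) rows")

    return problems


# Tiny synthetic frame standing in for multi_df
demo = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-03"]),
    "ticker": ["AAPL", "AAPL", "AAPL"],
    "return": [0.01, np.nan, -0.02],
    "realized_vol": [0.2, 0.25, -0.1],
})
demo_problems = check_vol_frame(demo)
print(demo_problems)
```

In the notebook, calling `check_vol_frame(multi_df)` and asserting that the result is empty would stop execution before any model is fit on a malformed dataset.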

## Step 3: Baseline — GARCH Model
We’ll train a simple GARCH(1, 1) for one ticker (AAPL) to establish a baseline.

In [None]:
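ticker = "AAPL"
# Select this ticker's return series and drop missing observations
returns = multi_df.loc[multi_df["ticker"] == ticker, "return"].dropna()

# Fit the baseline model, then forecast one step ahead
arch_model = fit_arch(returns)
pred_vol = forecast_arch(arch_model, horizon=1)

print(f"Predicted 1-day volatility for {ticker}: {pred_vol:.4f}")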
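
For intuition about what a GARCH(1, 1) forecast represents, the sketch below applies the standard variance recursion, sigma2[t] = omega + alpha * eps[t-1]**2 + beta * sigma2[t-1], with NumPy. It does not show how `fit_arch` or `forecast_arch` work internally: `garch11_one_step` is a hypothetical function, the parameter values are illustrative rather than estimated, and the returns are simulated. A real fit would estimate `omega`, `alpha`, and `beta` by maximum likelihood.

```python
import numpy as np


# Hypothetical illustration of the GARCH(1,1) recursion; not the volsense_core implementation.
def garch11_one_step(returns, omega, alpha, beta):
    """One-step-ahead volatility forecast from fixed GARCH(1,1) parameters."""
    eps = np.asarray(returns, dtype=float)
    eps = eps - eps.mean()   # residuals under an assumed constant-mean model
    sigma2 = eps.var()       # start the recursion at the sample variance
    for e in eps:
        # Each pass maps the variance at time t-1 to the variance at time t
        sigma2 = omega + alpha * e**2 + beta * sigma2
    # After the last observation, sigma2 is the variance forecast for the next period
    return float(np.sqrt(sigma2))


# Simulated daily returns with roughly 1% volatility (synthetic data)
rng = np.random.default_rng(0)
simulated_returns = rng.normal(0.0, 0.01, size=500)

# Illustrative parameters: long-run variance omega / (1 - alpha - beta) = 1e-4
forecast = garch11_one_step(simulated_returns, omega=5e-6, alpha=0.05, beta=0.90)
print(f"One-step volatility forecast: {forecast:.4f}")
```

Because `alpha + beta` is below 1 here, the forecast is pulled toward the long-run volatility level, with recent squared returns shifting it up or down.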