# Deep Hedging — Black-Scholes, Merton & Heston

Complete Deep Hedging pipeline built on the `deep_hedging/` package.

### What's new in v2
- **6 enriched features** (log-moneyness, realized vol, BS delta hint, ...)
- **PolicyLSTM** (recurrent network) in addition to the MLP
- **Heston** (stochastic volatility) in addition to BS and Merton
- **Exotic payoffs**: call, put, asian, straddle, lookback
- **Parametric OCE**: differentiable loss with a learned threshold
- **Richer metrics**: Hedging Error, Sharpe PnL, Cost/Payoff, No Hedge baseline
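
The learned threshold behind the parametric OCE loss can be illustrated with the Rockafellar-Uryasev form of CVaR: `CVaR_alpha(L) = min over w of w + E[max(L - w, 0)] / (1 - alpha)`, where the minimizing `w` is the VaR. The sketch below is plain Python with no dependencies and is not the code in `losses.py`. The function names, the convention that losses are negated gains, and the reading of `alpha` as a confidence level (0.95 keeps the worst 5%) are assumptions made for illustration.

```python
import random


def oce_cvar_objective(losses, w, alpha):
    """Rockafellar-Uryasev objective: w + E[max(L - w, 0)] / (1 - alpha)."""
    mean_excess = sum(max(loss - w, 0.0) for loss in losses) / len(losses)
    return w + mean_excess / (1.0 - alpha)


def oce_cvar_subgradient(losses, w, alpha):
    """Subgradient in w: 1 - P(L > w) / (1 - alpha)."""
    tail_fraction = sum(1 for loss in losses if loss > w) / len(losses)
    return 1.0 - tail_fraction / (1.0 - alpha)


def fit_threshold(losses, alpha, lr=0.2, n_steps=500):
    """Subgradient descent on the threshold w with a decaying step size."""
    w = 0.0
    for step in range(n_steps):
        w -= lr / (step + 1) ** 0.5 * oce_cvar_subgradient(losses, w, alpha)
    return w


# Hypothetical data: standard normal losses (loss = -gain).
rng = random.Random(0)
losses = [rng.gauss(0.0, 1.0) for _ in range(5_000)]
alpha = 0.95

w_star = fit_threshold(losses, alpha)
cvar_estimate = oce_cvar_objective(losses, w_star, alpha)
print(f"learned threshold (approx. VaR): {w_star:.3f}")
print(f"objective at threshold (approx. CVaR): {cvar_estimate:.3f}")
```

In a training loop the threshold would typically be a trainable parameter updated jointly with the policy weights, which is what makes this form convenient compared with sorting the batch.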

### Pipeline
1. Configuration
2. Visualizing the simulated paths (BS, Merton, Heston)
3. Training MLP and LSTM (BS, then Merton)
4. Multi-scenario evaluation
5. Risk analysis (VaR, CVaR, KDE, QQ-plot)
6. Color-coded summary table
7. Bonus: exotic payoffs

## 1. Imports and Configuration

In [None]:
import numpy as np
import torch
from dataclasses import replace

from deep_hedging import (
    DEVICE, DTYPE, N_FEATURES,
    DeepHedgingConfig, MarketConfig, TrainingConfig, RandomConfig,
    SimpleWorldBS, SimpleWorldMerton, SimpleWorldHeston,
    DeepHedgingEnv, PolicyMLP, PolicyLSTM, DeltaBSPolicy,
    MonetaryUtility, OCEUtility,
    train_deep_hedging,
    evaluate_strategies_env_world, build_comparison_table,
    RiskMetrics,
    plot_training_history, plot_gains_hist, plot_payoff_vs_gains,
    plot_simulated_paths, traffic_light_style,
)

print(f"Device: {DEVICE}")
print(f"Features per time step: {N_FEATURES}")

## 2. Global configuration

In [None]:
cfg = DeepHedgingConfig()
print(f"Market  : S0={cfg.market.S0}, sigma={cfg.market.sigma}, K={cfg.market.K}")
print(f"Merton  : jumps={cfg.market.use_jumps}, lambda={cfg.market.lambda_jump}")
print(f"Heston  : kappa={cfg.market.heston_kappa}, theta={cfg.market.heston_theta}, xi={cfg.market.heston_xi}, rho={cfg.market.heston_rho}")
print(f"Training: epochs={cfg.training.n_epochs}, batch={cfg.training.batch_size}, lr={cfg.training.lr}")
print(f"CVaR alpha={cfg.training.cvar_alpha}")

## 3. Visualizing the simulated paths

### 3.1 Black-Scholes

In [None]:
world_bs = SimpleWorldBS(cfg.market)
data_bs = world_bs.simulate_paths(200, seed=42)
plot_simulated_paths(data_bs["S"], n_paths_to_plot=20)

### 3.2 Merton Jump-Diffusion
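
As a reference for what a Merton world simulates, here is a minimal dependency-free sketch of one jump-diffusion path: a log-normal diffusion plus a compound Poisson sum of Gaussian log-jumps, with the drift compensated for the mean jump size. It is not the implementation of `SimpleWorldMerton`. The function names and the parameters `jump_mean` and `jump_std` are hypothetical, and the package may parameterize jumps differently.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's method; adequate for small rates such as lambda * dt."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_merton_path(S0, mu, sigma, lam, jump_mean, jump_std, T, n_steps, rng):
    """One path of S under Merton jump-diffusion, simulated on log S."""
    dt = T / n_steps
    # Compensator: kappa = E[J - 1] for log J ~ N(jump_mean, jump_std^2).
    kappa = math.exp(jump_mean + 0.5 * jump_std**2) - 1.0
    drift = (mu - 0.5 * sigma**2 - lam * kappa) * dt
    log_s, path = math.log(S0), [S0]
    for _ in range(n_steps):
        diffusion = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        n_jumps = sample_poisson(lam * dt, rng)
        jumps = sum(rng.gauss(jump_mean, jump_std) for _ in range(n_jumps))
        log_s += drift + diffusion + jumps
        path.append(math.exp(log_s))
    return path


rng = random.Random(42)
path = simulate_merton_path(S0=100.0, mu=0.0, sigma=0.2, lam=1.0,
                            jump_mean=-0.05, jump_std=0.1, T=1.0, n_steps=30, rng=rng)
print(f"{len(path)} points, terminal price {path[-1]:.2f}")
```

Setting `lam=0.0` recovers a plain Black-Scholes path, which mirrors the `use_jumps` switch used in the configurations of this notebook.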

In [None]:
world_merton = SimpleWorldMerton(cfg.market)
data_merton = world_merton.simulate_paths(200, seed=42)
plot_simulated_paths(data_merton["S"], n_paths_to_plot=20)

### 3.3 Heston Stochastic Volatility
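
For orientation, the Heston dynamics can be sketched with an Euler scheme using full truncation, where the variance is floored at zero wherever it enters the drift or the diffusion. This is an illustrative sketch, not the scheme used by `SimpleWorldHeston`, which is not shown here. The argument names `kappa`, `theta`, `xi` and `rho` follow the `heston_*` fields printed in the configuration cell, while `v0`, `mu` and the function name are assumptions.

```python
import math
import random


def simulate_heston_path(S0, v0, mu, kappa, theta, xi, rho, T, n_steps, rng):
    """One (price, variance) path under Heston, Euler with full truncation."""
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    log_s, v = math.log(S0), v0
    prices, variances = [S0], [v0]
    for _ in range(n_steps):
        # Correlated Gaussian shocks: z1 drives the price, z2 the variance.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        v_pos = max(v, 0.0)  # full truncation
        log_s += (mu - 0.5 * v_pos) * dt + math.sqrt(v_pos) * sqrt_dt * z1
        v += kappa * (theta - v_pos) * dt + xi * math.sqrt(v_pos) * sqrt_dt * z2
        prices.append(math.exp(log_s))
        variances.append(max(v, 0.0))
    return prices, variances


rng = random.Random(42)
prices, variances = simulate_heston_path(
    S0=100.0, v0=0.04, mu=0.0, kappa=1.5, theta=0.04, xi=0.3, rho=-0.7,
    T=1.0, n_steps=30, rng=rng,
)
print(f"terminal price {prices[-1]:.2f}, terminal variance {variances[-1]:.4f}")
```

With `xi=0.0` and `v0=theta` the variance stays constant and the price reduces to a Black-Scholes path with volatility `sqrt(theta)`.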

In [None]:
world_heston = SimpleWorldHeston(cfg.market)
data_heston = world_heston.simulate_paths(200, seed=42)
plot_simulated_paths(data_heston["S"], n_paths_to_plot=20)

# Variance paths
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10, 4))
for i in range(20):
    ax.plot(data_heston["v"][i], linewidth=0.8, alpha=0.7)
ax.set_xlabel("Time steps")
ax.set_ylabel("Variance v_t")
ax.set_title("Heston — Variance Paths")
ax.grid(alpha=0.4)
plt.tight_layout()
plt.show()

## 4. Training Deep Hedging

### 4.1 Training the MLP under Black-Scholes

In [None]:
cfg_bs = DeepHedgingConfig(
    market=replace(cfg.market, use_jumps=False, payoff_type="call"),
    training=cfg.training,
    random=cfg.random,
    device=cfg.device,
    dtype=cfg.dtype,
)

# MLP on the 6 enriched features
policy_mlp_bs = PolicyMLP(d_in=N_FEATURES, d_hidden=32, depth=2, dropout=0.1, clip=2.0)

res_mlp_bs = train_deep_hedging(
    cfg_bs, policy_mlp_bs,
    utility=MonetaryUtility(kind="cvar", alpha=cfg_bs.training.cvar_alpha),
    patience=5, min_delta=1e-3, use_scheduler=True,
)
print("MLP BS training complete.")
plot_training_history(res_mlp_bs["history"])

### 4.2 Training the LSTM under Black-Scholes

In [None]:
policy_lstm_bs = PolicyLSTM(d_in=N_FEATURES, d_hidden=32, n_layers=1, dropout=0.0, clip=2.0)

res_lstm_bs = train_deep_hedging(
    cfg_bs, policy_lstm_bs,
    utility=MonetaryUtility(kind="cvar", alpha=cfg_bs.training.cvar_alpha),
    patience=5, min_delta=1e-3, use_scheduler=True,
)
print("LSTM BS training complete.")
plot_training_history(res_lstm_bs["history"])

### 4.3 Training the MLP under Merton (jumps)

In [None]:
cfg_merton = DeepHedgingConfig(
    market=replace(cfg.market, use_jumps=True, payoff_type="call"),
    training=cfg.training,
    random=cfg.random,
    device=cfg.device,
    dtype=cfg.dtype,
)

policy_mlp_merton = PolicyMLP(d_in=N_FEATURES, d_hidden=32, depth=2, dropout=0.1, clip=2.0)

res_mlp_merton = train_deep_hedging(
    cfg_merton, policy_mlp_merton,
    utility=MonetaryUtility(kind="cvar", alpha=cfg_merton.training.cvar_alpha),
    patience=5, min_delta=1e-3, use_scheduler=True,
)
print("MLP Merton training complete.")
plot_training_history(res_mlp_merton["history"])

### 4.4 Training the LSTM under Merton (jumps)

In [None]:
policy_lstm_merton = PolicyLSTM(d_in=N_FEATURES, d_hidden=32, n_layers=1, dropout=0.0, clip=2.0)

res_lstm_merton = train_deep_hedging(
    cfg_merton, policy_lstm_merton,
    utility=MonetaryUtility(kind="cvar", alpha=cfg_merton.training.cvar_alpha),
    patience=5, min_delta=1e-3, use_scheduler=True,
)
print("LSTM Merton training complete.")
plot_training_history(res_lstm_merton["history"])

## 5. Multi-scenario evaluation

Four main train → test scenarios:
- **BS → BS**: trained and tested under Black-Scholes
- **BS → Merton**: trained under BS, tested with jumps
- **Merton → Merton**: trained and tested with jumps
- **Merton → Heston**: trained under Merton, tested under stochastic volatility
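
Each scenario compares the learned policy with a Black-Scholes delta hedge (the `cvar_delta` values printed below). As a rough picture of that baseline, the sketch below computes the BS call delta and the terminal gain of a short call hedged with it. It assumes zero interest rates, no transaction costs, and a gain defined as premium plus trading P&L minus payoff. The function names are hypothetical, and `DeltaBSPolicy` and the evaluation code in the package may use different conventions.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_delta(S, K, sigma, tau, r=0.0):
    """Black-Scholes delta of a European call with time to maturity tau."""
    if tau <= 0.0:
        return 1.0 if S > K else 0.0
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1)


def delta_hedge_gain(path, K, sigma, T, premium=0.0):
    """Terminal gain of a short call hedged with the BS delta (assumed: r=0, no costs)."""
    n_steps = len(path) - 1
    dt = T / n_steps
    trading_pnl = 0.0
    for t in range(n_steps):
        delta = bs_call_delta(path[t], K, sigma, T - t * dt)
        trading_pnl += delta * (path[t + 1] - path[t])
    payoff = max(path[-1] - K, 0.0)
    return premium + trading_pnl - payoff


# Hypothetical price path with four rebalancing dates.
path = [100.0, 102.0, 99.0, 104.0, 107.0]
print(f"ATM delta: {bs_call_delta(100.0, 100.0, 0.2, 1.0):.4f}")
print(f"hedged gain: {delta_hedge_gain(path, K=100.0, sigma=0.2, T=1.0):.4f}")
```

Under BS → BS the delta hedge is close to optimal for small time steps, so it is a demanding benchmark. Under jumps or stochastic volatility its model assumptions are violated, which is where a learned policy has room to improve.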

### 5.1 MLP BS → BS

In [None]:
eval_mlp_bs_bs = evaluate_strategies_env_world(
    cfg_bs, res_mlp_bs["policy"],
    world_class=SimpleWorldBS,
    n_paths_eval=20_000,
    seed_eval=cfg_bs.random.seed_eval_bs,
)
print("=== MLP : BS -> BS ===")
print(f"CVaR Deep  : {eval_mlp_bs_bs['cvar_deep']:.4f}")
print(f"CVaR Delta : {eval_mlp_bs_bs['cvar_delta']:.4f}")
print(f"CVaR NoHedge: {eval_mlp_bs_bs['cvar_no_hedge']:.4f}")
plot_gains_hist(eval_mlp_bs_bs)

### 5.2 LSTM BS → BS

In [None]:
eval_lstm_bs_bs = evaluate_strategies_env_world(
    cfg_bs, res_lstm_bs["policy"],
    world_class=SimpleWorldBS,
    n_paths_eval=20_000,
    seed_eval=cfg_bs.random.seed_eval_bs,
)
print("=== LSTM : BS -> BS ===")
print(f"CVaR Deep  : {eval_lstm_bs_bs['cvar_deep']:.4f}")
print(f"CVaR Delta : {eval_lstm_bs_bs['cvar_delta']:.4f}")
plot_gains_hist(eval_lstm_bs_bs)

### 5.3 MLP BS → Merton (robustness)

In [None]:
eval_mlp_bs_merton = evaluate_strategies_env_world(
    cfg_bs, res_mlp_bs["policy"],
    world_class=SimpleWorldMerton,
    n_paths_eval=20_000,
    seed_eval=cfg_bs.random.seed_eval_bs_merton,
)
print("=== MLP : BS -> Merton ===")
print(f"CVaR Deep  : {eval_mlp_bs_merton['cvar_deep']:.4f}")
print(f"CVaR Delta : {eval_mlp_bs_merton['cvar_delta']:.4f}")
plot_gains_hist(eval_mlp_bs_merton)

### 5.4 MLP Merton → Merton

In [None]:
eval_mlp_merton_merton = evaluate_strategies_env_world(
    cfg_merton, res_mlp_merton["policy"],
    world_class=SimpleWorldMerton,
    n_paths_eval=20_000,
    seed_eval=cfg_merton.random.seed_eval_merton,
)
print("=== MLP : Merton -> Merton ===")
print(f"CVaR Deep  : {eval_mlp_merton_merton['cvar_deep']:.4f}")
print(f"CVaR Delta : {eval_mlp_merton_merton['cvar_delta']:.4f}")
plot_gains_hist(eval_mlp_merton_merton)

### 5.5 LSTM Merton → Merton

In [None]:
eval_lstm_merton_merton = evaluate_strategies_env_world(
    cfg_merton, res_lstm_merton["policy"],
    world_class=SimpleWorldMerton,
    n_paths_eval=20_000,
    seed_eval=cfg_merton.random.seed_eval_merton,
)
print("=== LSTM : Merton -> Merton ===")
print(f"CVaR Deep  : {eval_lstm_merton_merton['cvar_deep']:.4f}")
print(f"CVaR Delta : {eval_lstm_merton_merton['cvar_delta']:.4f}")
plot_gains_hist(eval_lstm_merton_merton)

### 5.6 MLP Merton → Heston (cross-model)

In [None]:
eval_mlp_merton_heston = evaluate_strategies_env_world(
    cfg_merton, res_mlp_merton["policy"],
    world_class=SimpleWorldHeston,
    n_paths_eval=20_000,
    seed_eval=cfg_merton.random.seed_eval_merton,
)
print("=== MLP : Merton -> Heston ===")
print(f"CVaR Deep  : {eval_mlp_merton_heston['cvar_deep']:.4f}")
print(f"CVaR Delta : {eval_mlp_merton_heston['cvar_delta']:.4f}")
plot_gains_hist(eval_mlp_merton_heston)

## 6. Detailed risk analysis

Focus on the Merton → Merton scenario (MLP vs LSTM vs Delta).
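
The tail statistics plotted in this section can be read as follows: VaR is a quantile of the gains distribution, and CVaR is the average of the gains beyond that quantile. The sketch below is a dependency-free illustration on the left tail of the gains. The function name and the `tail_prob` convention (0.05 means the worst 5% of outcomes) are assumptions, and `RiskMetrics` may define `alpha` and the sign of the result differently.

```python
import random


def left_tail_var_cvar(gains, tail_prob):
    """VaR and CVaR of the worst `tail_prob` fraction of gains (left tail)."""
    ordered = sorted(gains)
    k = max(1, round(tail_prob * len(ordered)))
    tail = ordered[:k]
    var = tail[-1]        # largest gain inside the tail, i.e. the tail_prob-quantile
    cvar = sum(tail) / k  # average gain over the tail
    return var, cvar


# Hypothetical gains from two strategies with the same mean but different spread.
rng = random.Random(7)
gains_tight = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
gains_wide = [rng.gauss(0.0, 2.0) for _ in range(10_000)]

for name, gains in [("tight", gains_tight), ("wide", gains_wide)]:
    var, cvar = left_tail_var_cvar(gains, tail_prob=0.05)
    print(f"{name:>5s}: VaR={var:.3f}, CVaR={cvar:.3f}")
```

Because CVaR averages over the whole tail, it separates two strategies with the same VaR but different extreme losses, which is why the left-tail plots complement the histograms.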

In [None]:
rm = RiskMetrics(alpha=cfg.training.cvar_alpha)

# Gains from the Merton -> Merton evaluations (Delta, MLP, LSTM)
gains_delta = eval_mlp_merton_merton["gains_delta"]
gains_mlp   = eval_mlp_merton_merton["gains_deep"]
gains_lstm  = eval_lstm_merton_merton["gains_deep"]

# Histograms with VaR
rm.plot_hist_with_var(gains_delta, title="Delta Hedging — Merton")
rm.plot_hist_with_var(gains_mlp,  title="Deep Hedging MLP — Merton")
rm.plot_hist_with_var(gains_lstm, title="Deep Hedging LSTM — Merton")

# KDE: Delta vs MLP vs LSTM
rm.plot_kde(gains_delta, gains_mlp, label_a="Delta", label_b="Deep MLP")
rm.plot_kde(gains_delta, gains_lstm, label_a="Delta", label_b="Deep LSTM")
rm.plot_kde(gains_mlp, gains_lstm, label_a="MLP", label_b="LSTM")

# QQ-Plot
rm.plot_qq(gains_mlp, title="Deep MLP — QQ-plot vs Normal")
rm.plot_qq(gains_lstm, title="Deep LSTM — QQ-plot vs Normal")

# Left tail
rm.plot_left_tail(gains_delta, gains_mlp, label_a="Delta", label_b="Deep MLP")
rm.plot_left_tail(gains_delta, gains_lstm, label_a="Delta", label_b="Deep LSTM")

## 7. Summary table

Color-coded comparison of all scenarios using the enriched metrics.

In [None]:
import pandas as pd

# Build one comparison table per scenario
scenarios = {
    "MLP BS->BS": (cfg_bs, eval_mlp_bs_bs),
    "LSTM BS->BS": (cfg_bs, eval_lstm_bs_bs),
    "MLP BS->Merton": (cfg_bs, eval_mlp_bs_merton),
    "MLP Merton->Merton": (cfg_merton, eval_mlp_merton_merton),
    "LSTM Merton->Merton": (cfg_merton, eval_lstm_merton_merton),
    "MLP Merton->Heston": (cfg_merton, eval_mlp_merton_heston),
}

all_tables = []
for scenario_name, (c, ev) in scenarios.items():
    t = build_comparison_table(c, ev).copy()
    t.insert(0, "Scenario", scenario_name)
    all_tables.append(t)

summary_df = pd.concat(all_tables, axis=0)
summary_df = summary_df.reset_index().rename(columns={"index": "Strategy"})
summary_df = summary_df.set_index(["Scenario", "Strategy"])

# Metric columns: everything left once Scenario/Strategy form the index
metric_cols = list(summary_df.columns)
summary_metrics = summary_df[metric_cols].astype(float)

# Display with color coding
styled = (
    summary_metrics
    .style
    .apply(traffic_light_style, axis=None)
    .format("{:.4f}")
)
display(styled)

## 8. Direct comparison: MLP vs LSTM

Performance summary by architecture.

In [None]:
# Extract the key metrics for MLP vs LSTM
comparison_data = {
    "Scenario": ["BS->BS", "BS->BS", "Merton->Merton", "Merton->Merton"],
    "Architecture": ["MLP", "LSTM", "MLP", "LSTM"],
    "CVaR Deep": [
        eval_mlp_bs_bs["cvar_deep"],
        eval_lstm_bs_bs["cvar_deep"],
        eval_mlp_merton_merton["cvar_deep"],
        eval_lstm_merton_merton["cvar_deep"],
    ],
    "Std Deep": [
        eval_mlp_bs_bs["std_deep"],
        eval_lstm_bs_bs["std_deep"],
        eval_mlp_merton_merton["std_deep"],
        eval_lstm_merton_merton["std_deep"],
    ],
    "CVaR Delta (ref)": [
        eval_mlp_bs_bs["cvar_delta"],
        eval_lstm_bs_bs["cvar_delta"],
        eval_mlp_merton_merton["cvar_delta"],
        eval_lstm_merton_merton["cvar_delta"],
    ],
}

comp_df = pd.DataFrame(comparison_data).set_index(["Scenario", "Architecture"])
print(comp_df.to_string())

## 9. Bonus: Deep Hedging on exotic payoffs

Quick test of the MLP on several payoffs (trained under BS).
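
To make the five `payoff_type` values concrete, the sketch below gives one common definition of each on a single price path. These definitions are assumptions: the package may, for example, exclude the initial price from the Asian average or use a floating-strike lookback. The function name `path_payoff` is hypothetical.

```python
def path_payoff(path, K, kind):
    """Payoff of one price path for strike K (assumed textbook definitions)."""
    s_T = path[-1]
    if kind == "call":
        return max(s_T - K, 0.0)
    if kind == "put":
        return max(K - s_T, 0.0)
    if kind == "straddle":
        return abs(s_T - K)  # call + put at the same strike
    if kind == "asian":
        # Arithmetic-average call; the average here includes the initial price.
        return max(sum(path) / len(path) - K, 0.0)
    if kind == "lookback":
        # Fixed-strike lookback call on the running maximum.
        return max(max(path) - K, 0.0)
    raise ValueError(f"unknown payoff kind: {kind}")


# Hypothetical path to compare the payoffs side by side.
path = [100.0, 120.0, 90.0, 110.0]
for kind in ["call", "put", "straddle", "asian", "lookback"]:
    print(f"{kind:>10s}: {path_payoff(path, K=100.0, kind=kind):.1f}")
```

The Asian and lookback payoffs depend on the whole path rather than only on the terminal price, so a BS delta computed for a vanilla call is not a natural hedge for them.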

In [None]:
exotic_results = {}

for ptype in ["call", "put", "straddle", "asian", "lookback"]:
    cfg_exotic = DeepHedgingConfig(
        market=replace(cfg.market, use_jumps=False, payoff_type=ptype,
                       n_paths_train=50_000, n_paths_val=10_000),
        training=replace(cfg.training, n_epochs=30, print_every=10),
        random=cfg.random,
        device=cfg.device,
        dtype=cfg.dtype,
    )

    pol = PolicyMLP(d_in=N_FEATURES, d_hidden=32, depth=2, dropout=0.1, clip=2.0)
    res = train_deep_hedging(cfg_exotic, pol, patience=5)

    ev = evaluate_strategies_env_world(
        cfg_exotic, res["policy"],
        world_class=SimpleWorldBS,
        n_paths_eval=10_000,
        seed_eval=42,
    )
    exotic_results[ptype] = ev
    print(f"Payoff {ptype:>10s} | CVaR Deep={ev['cvar_deep']:.2f}, CVaR Delta={ev['cvar_delta']:.2f}")

print()
print("All exotic payoffs processed.")

## 10. Saving the models

In [None]:
torch.save({
    "policy_mlp_bs": res_mlp_bs["policy"].state_dict(),
    "policy_lstm_bs": res_lstm_bs["policy"].state_dict(),
    "policy_mlp_merton": res_mlp_merton["policy"].state_dict(),
    "policy_lstm_merton": res_lstm_merton["policy"].state_dict(),
    "config_bs": cfg_bs,
    "config_merton": cfg_merton,
    "history_mlp_bs": res_mlp_bs["history"],
    "history_lstm_bs": res_lstm_bs["history"],
    "history_mlp_merton": res_mlp_merton["history"],
    "history_lstm_merton": res_lstm_merton["history"],
}, "deep_hedging_models.pt")

print("Models saved to 'deep_hedging_models.pt'")

## Conclusion

This notebook orchestrates the complete Deep Hedging v2 pipeline:

| Component | Details |
|---|---|
| **Worlds** | Black-Scholes, Merton Jump-Diffusion, Heston Stochastic Vol |
| **Architectures** | MLP (feed-forward) and LSTM (recurrent) |
| **Features** | 6 enriched inputs: log-moneyness, time, prev_delta, realized_vol, BS delta, d1 |
| **Losses** | Empirical CVaR, entropic, parametric OCE |
| **Payoffs** | Call, Put, Straddle, Asian, Lookback |
| **Evaluation** | No Hedge + Delta BS + Deep Hedging, 10 metrics |
| **Risk** | VaR, CVaR, KDE, QQ-plot, left tail |

### Package structure

```
deep_hedging/
  __init__.py      # Public API
  config.py        # Configurations + Heston params + payoff_type
  worlds.py        # BS, Merton, Heston + exotic payoffs
  env.py           # 6 enriched features
  policies.py      # MLP, LSTM, DeltaBS
  losses.py        # CVaR, entropic, parametric OCE
  training.py      # DataLoader + early stop + LR scheduler
  evaluation.py    # No Hedge + Delta + Deep, 10 metrics
  risk_metrics.py  # VaR, CVaR, KDE, QQ-plot
  plotting.py      # Visualizations
  utils.py         # Utility functions
```