# Day 5: Advanced Deep Hedging - CVaR Objectives & Risk-Aware Hedging

## Learning Objectives
- Understand Conditional Value-at-Risk (CVaR) as a coherent risk measure
- Implement CVaR-based objective functions for deep hedging
- Build risk-aware neural network hedging strategies
- Compare mean-focused and CVaR-focused hedging objectives
- Explore spectral risk measures and their implementation

## Topics Covered
1. CVaR (Expected Shortfall) Theory
2. Differentiable CVaR for Neural Networks
3. Risk-Aware Deep Hedging Architecture
4. Multi-Objective Optimization (P&L vs Risk)
5. Practical Implementation with Transaction Costs

In [None]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
import matplotlib.pyplot as plt
from typing import Tuple, Optional, Callable
from dataclasses import dataclass
import warnings
warnings.filterwarnings('ignore')

# Set seeds for reproducibility
np.random.seed(42)
torch.manual_seed(42)

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

## 1. CVaR (Conditional Value-at-Risk) Theory

### Definition
CVaR (also known as Expected Shortfall) at confidence level $\alpha$ is defined as:

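$$\text{CVaR}_\alpha(X) = \mathbb{E}[X \mid X \geq \text{VaR}_\alpha(X)]$$

Here $X$ is a loss (positive = loss) and $\text{VaR}_\alpha(X)$ is its $\alpha$-quantile, so CVaR is the expected loss in the worst $100(1-\alpha)\%$ of cases. The conditional-expectation form is exact for continuous loss distributions.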

### Why CVaR for Hedging?
- **Coherent risk measure**: Satisfies subadditivity (diversification reduces risk)
- **Tail-sensitive**: Focuses on extreme losses
- **Differentiable formulation**: Can be optimized with gradient descent

### Rockafellar-Uryasev Formulation
CVaR can be computed as:

$$\text{CVaR}_\alpha(X) = \min_{\nu} \left\{ \nu + \frac{1}{1-\alpha} \mathbb{E}[(X - \nu)^+] \right\}$$

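The objective is convex in $\nu$ and its minimizer is $\text{VaR}_\alpha(X)$, so $\nu$ can be trained by gradient descent alongside the network weights. This is what makes the formulation useful for neural network optimization.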
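
To make the equivalence concrete, here is a minimal NumPy sketch on a hypothetical toy sample (losses 1 to 20, $\alpha = 0.90$). It compares the average of the worst 10% of losses with the minimum of the Rockafellar-Uryasev objective over a grid of candidate $\nu$. The helper name `ru_objective` and the grid are illustrative choices, not part of the notebook.

```python
import numpy as np

def ru_objective(losses: np.ndarray, nu: float, alpha: float) -> float:
    """Rockafellar-Uryasev objective: nu + E[(X - nu)^+] / (1 - alpha)."""
    return nu + np.maximum(losses - nu, 0.0).mean() / (1.0 - alpha)

# Hypothetical toy sample: losses 1, 2, ..., 20
losses = np.arange(1.0, 21.0)
alpha = 0.90

# Direct definition: average of the worst 10% of losses (the two largest)
n_tail = int(round((1.0 - alpha) * len(losses)))
tail_mean = np.sort(losses)[-n_tail:].mean()

# Rockafellar-Uryasev: minimize the objective over a grid of candidate nu
nu_grid = np.linspace(0.0, 25.0, 2501)
ru_values = np.array([ru_objective(losses, nu, alpha) for nu in nu_grid])
ru_min = ru_values.min()
nu_star = nu_grid[ru_values.argmin()]

print(f"Tail mean: {tail_mean:.4f}")
print(f"R-U min:   {ru_min:.4f} at nu = {nu_star:.2f}")
```

On a discrete sample the objective is flat between the two order statistics around the quantile, so any $\nu$ in that interval attains the minimum.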

In [None]:
@dataclass
class MarketParams:
    """Market parameters for simulation."""
    S0: float = 100.0        # Initial stock price
    K: float = 100.0         # Strike price
    T: float = 1/12          # Time to maturity (1 month)
    r: float = 0.05          # Risk-free rate
    sigma: float = 0.20      # Volatility
    n_steps: int = 21        # Trading days
    transaction_cost: float = 0.001  # 10 bps


class GBMSimulator:
    """Geometric Brownian Motion simulator for stock prices."""
    
    def __init__(self, params: MarketParams):
        self.params = params
        self.dt = params.T / params.n_steps
    
    def simulate_paths(self, n_paths: int) -> np.ndarray:
        """Generate stock price paths."""
        dt = self.dt
        drift = (self.params.r - 0.5 * self.params.sigma**2) * dt
        diffusion = self.params.sigma * np.sqrt(dt)
        
        # Generate random increments
        Z = np.random.randn(n_paths, self.params.n_steps)
        log_returns = drift + diffusion * Z
        
        # Construct paths
        log_prices = np.zeros((n_paths, self.params.n_steps + 1))
        log_prices[:, 0] = np.log(self.params.S0)
        log_prices[:, 1:] = log_returns.cumsum(axis=1) + np.log(self.params.S0)
        
        return np.exp(log_prices)
    
    def get_time_to_maturity(self) -> np.ndarray:
        """Return time to maturity at each step."""
        return np.linspace(self.params.T, 0, self.params.n_steps + 1)


# Initialize
params = MarketParams()
simulator = GBMSimulator(params)

# Test simulation
test_paths = simulator.simulate_paths(1000)
print(f"Simulated {test_paths.shape[0]} paths with {test_paths.shape[1]} time steps")
print(f"Final price range: [{test_paths[:, -1].min():.2f}, {test_paths[:, -1].max():.2f}]")
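
A quick sanity check for a GBM simulator of this kind is that the terminal price has mean $S_0 e^{rT}$ and that the terminal log-return has standard deviation $\sigma\sqrt{T}$. The sketch below is self-contained and re-creates the same scheme with the default parameter values assumed from `MarketParams`; the seeded `default_rng` generator and the path count are illustrative choices.

```python
import numpy as np

# Assumed parameters, mirroring the MarketParams defaults
S0, r, sigma, T, n_steps = 100.0, 0.05, 0.20, 1 / 12, 21
n_paths = 200_000
dt = T / n_steps

rng = np.random.default_rng(0)
Z = rng.standard_normal((n_paths, n_steps))

# Exact log-Euler scheme: each step adds (r - sigma^2 / 2) dt + sigma sqrt(dt) Z
log_returns = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z
S_T = S0 * np.exp(log_returns.sum(axis=1))

expected_mean = S0 * np.exp(r * T)
std_error = S_T.std(ddof=1) / np.sqrt(n_paths)
log_return_std = np.log(S_T / S0).std(ddof=1)

print(f"Sample mean of S_T: {S_T.mean():.4f} (theory: {expected_mean:.4f})")
print(f"Std of log-return:  {log_return_std:.5f} (theory: {sigma * np.sqrt(T):.5f})")
```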

## 2. CVaR Loss Functions for Deep Learning

### Differentiable CVaR Implementation

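We implement CVaR using the Rockafellar-Uryasev formulation, which allows gradient-based optimization. For per-path losses $L_1, \dots, L_N$:

$$\mathcal{L}_{\text{CVaR}} = \nu + \frac{1}{(1-\alpha) \cdot N} \sum_{i=1}^{N} \max(L_i - \nu, 0)$$

where $\nu$ is a trainable scalar that converges to the VaR when it is optimized jointly with the hedging strategy. The value equals the CVaR only at the optimal $\nu$; for any other $\nu$ it is an upper bound, so $\nu$ must be passed to the optimizer along with the network weights.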
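
The role of $\nu$ can be illustrated without a neural network. The sketch below runs plain subgradient descent on $\nu$ alone for a fixed, hypothetical Gaussian loss sample; the step size and iteration count are assumptions chosen for this loss scale. The subgradient of the objective with respect to $\nu$ is $1 - \hat{P}(L > \nu)/(1-\alpha)$, which vanishes when the fraction of losses above $\nu$ is $1-\alpha$.

```python
import numpy as np

rng = np.random.default_rng(42)
losses = rng.normal(loc=5.0, scale=10.0, size=10_000)  # hypothetical loss sample
alpha = 0.95

nu = 0.0          # same starting value as a zero-initialized parameter
step_size = 1.0   # assumed step size, chosen for this loss scale
for _ in range(500):
    # Subgradient of nu + mean(max(L - nu, 0)) / (1 - alpha) with respect to nu
    grad = 1.0 - (losses > nu).mean() / (1.0 - alpha)
    nu -= step_size * grad

ru_value = nu + np.maximum(losses - nu, 0.0).mean() / (1.0 - alpha)
ru_at_zero = np.maximum(losses, 0.0).mean() / (1.0 - alpha)  # objective at nu = 0

empirical_var = np.quantile(losses, alpha)
tail_mean = losses[losses >= empirical_var].mean()

print(f"Learned nu: {nu:.3f} | empirical VaR: {empirical_var:.3f}")
print(f"R-U value:  {ru_value:.3f} | tail mean:     {tail_mean:.3f}")
print(f"R-U value at nu = 0 (upper bound only): {ru_at_zero:.3f}")
```

The last line shows why leaving $\nu$ at its initial value matters: the objective evaluated at $\nu = 0$ is far above the CVaR for this sample.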

In [None]:
class CVaRLoss(nn.Module):
    """Differentiable CVaR (Expected Shortfall) loss function.
    
    Uses Rockafellar-Uryasev formulation for gradient-based optimization.
    """
    
    def __init__(self, alpha: float = 0.95):
        """
        Args:
            alpha: Confidence level (e.g., 0.95 for 95% CVaR)
        """
        super().__init__()
        self.alpha = alpha
        # Learnable VaR parameter
        self.nu = nn.Parameter(torch.tensor(0.0))
    
    def forward(self, losses: torch.Tensor) -> torch.Tensor:
        """Compute CVaR of the loss distribution.
        
        Args:
            losses: Tensor of loss values (positive = loss, negative = profit)
        
        Returns:
            CVaR value
        """
        # Rockafellar-Uryasev formulation
        excess_loss = torch.relu(losses - self.nu)
        cvar = self.nu + excess_loss.mean() / (1 - self.alpha)
        return cvar
    
    def get_var(self) -> float:
        """Return current VaR estimate."""
        return self.nu.item()


class SortingCVaRLoss(nn.Module):
    """CVaR computed via sorting (non-parametric).
    
    More accurate but slightly less smooth gradients.
    """
    
    def __init__(self, alpha: float = 0.95):
        super().__init__()
        self.alpha = alpha
    
    def forward(self, losses: torch.Tensor) -> torch.Tensor:
        """Compute CVaR by sorting and averaging tail losses."""
        sorted_losses, _ = torch.sort(losses, descending=True)
        # round() guards against float error, e.g. (1 - 0.9) * 10000 = 999.9999...
        n_tail = max(1, int(round((1 - self.alpha) * len(losses))))
        cvar = sorted_losses[:n_tail].mean()
        return cvar


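# Test CVaR implementations
torch.manual_seed(42)
test_losses = torch.randn(10000) * 10 + 5  # Mean=5, std=10

cvar_loss = CVaRLoss(alpha=0.95)
sorting_cvar = SortingCVaRLoss(alpha=0.95)

# The Rockafellar-Uryasev value equals the CVaR only at the optimal nu (the VaR).
# The default nu=0 gives an upper bound, so set nu to the empirical 95% quantile.
with torch.no_grad():
    cvar_loss.nu.copy_(torch.quantile(test_losses, 0.95))

print(f"Rockafellar-Uryasev CVaR (95%): {cvar_loss(test_losses).item():.4f}")
print(f"Sorting CVaR (95%): {sorting_cvar(test_losses).item():.4f}")
print(f"Empirical VaR (95%): {np.percentile(test_losses.numpy(), 95):.4f}")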

In [None]:
class SpectralRiskMeasure(nn.Module):
    """Generalized spectral risk measure.
    
    Spectral risk measures are weighted averages of quantiles:
    ρ(X) = ∫₀¹ φ(p) · VaR_p(X) dp
    
    Where X is a loss and φ is a non-negative weight function with ∫φ=1 that is
    non-decreasing in p, so larger losses receive at least as much weight.
    """
    
    def __init__(self, risk_aversion: float = 2.0):
        """
        Args:
            risk_aversion: Higher values = more weight on tail losses
        """
        super().__init__()
        self.risk_aversion = risk_aversion
    
    def forward(self, losses: torch.Tensor) -> torch.Tensor:
        """Compute spectral risk measure."""
        sorted_losses, _ = torch.sort(losses, descending=True)
        n = len(sorted_losses)
        
        # Exponential weighting (more weight on larger losses)
        positions = torch.arange(1, n + 1, dtype=torch.float32, device=losses.device)
        weights = torch.exp(-self.risk_aversion * positions / n)
        weights = weights / weights.sum()  # Normalize
        
        return (weights * sorted_losses).sum()


class MeanCVaRLoss(nn.Module):
    """Combined Mean + λ·CVaR objective.
    
    Balances expected P&L with tail risk control.
    """
    
    def __init__(self, alpha: float = 0.95, lambda_cvar: float = 0.5):
        """
        Args:
            alpha: CVaR confidence level
            lambda_cvar: Weight on CVaR (0=mean only, 1=CVaR only)
        """
        super().__init__()
        self.alpha = alpha
        self.lambda_cvar = lambda_cvar
        self.cvar_module = CVaRLoss(alpha)
    
    def forward(self, losses: torch.Tensor) -> Tuple[torch.Tensor, dict]:
        """Compute combined objective."""
        mean_loss = losses.mean()
        cvar = self.cvar_module(losses)
        
        total_loss = (1 - self.lambda_cvar) * mean_loss + self.lambda_cvar * cvar
        
        metrics = {
            'mean': mean_loss.item(),
            'cvar': cvar.item(),
            'var': self.cvar_module.get_var()
        }
        
        return total_loss, metrics


# Visualize different risk measures
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Loss distribution with risk measures
ax1 = axes[0]
losses_np = test_losses.numpy()
ax1.hist(losses_np, bins=100, density=True, alpha=0.7, label='Loss Distribution')

var_95 = np.percentile(losses_np, 95)
cvar_95 = losses_np[losses_np >= var_95].mean()

ax1.axvline(losses_np.mean(), color='green', linestyle='-', linewidth=2, label=f'Mean: {losses_np.mean():.2f}')
ax1.axvline(var_95, color='orange', linestyle='--', linewidth=2, label=f'VaR 95%: {var_95:.2f}')
ax1.axvline(cvar_95, color='red', linestyle='--', linewidth=2, label=f'CVaR 95%: {cvar_95:.2f}')

ax1.fill_between(np.linspace(var_95, losses_np.max(), 100), 0, 0.02, alpha=0.3, color='red', label='Tail (5%)')
ax1.set_xlabel('Loss')
ax1.set_ylabel('Density')
ax1.set_title('Risk Measures on Loss Distribution')
ax1.legend()

# Plot 2: Spectral weights
ax2 = axes[1]
x = np.linspace(0, 1, 100)
for gamma in [0.5, 1.0, 2.0, 5.0]:
    weights = np.exp(-gamma * x)
    weights = weights / weights.sum() * 100
    ax2.plot(x, weights, label=f'γ = {gamma}')

ax2.set_xlabel('Rank fraction (0 = largest loss)')
ax2.set_ylabel('Weight (%)')
ax2.set_title('Spectral Risk Measure Weights')
ax2.legend()

plt.tight_layout()
plt.show()
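
The exponential weighting can also be checked numerically. The following NumPy sketch mirrors the weighting scheme of `SpectralRiskMeasure` on a hypothetical Gaussian loss sample; the function name `spectral_risk` and the chosen `gamma` values are illustrative. It suggests that the measure moves from near the sample mean (small risk aversion) toward the largest losses (large risk aversion).

```python
import numpy as np

def spectral_risk(losses: np.ndarray, risk_aversion: float) -> float:
    """Exponentially weighted average of losses sorted from largest to smallest."""
    sorted_losses = np.sort(losses)[::-1]
    n = len(sorted_losses)
    positions = np.arange(1, n + 1)
    weights = np.exp(-risk_aversion * positions / n)
    weights /= weights.sum()  # weights sum to one
    return float(weights @ sorted_losses)

rng = np.random.default_rng(0)
losses = rng.normal(5.0, 10.0, size=10_000)  # hypothetical loss sample

gammas = [0.5, 2.0, 5.0, 20.0]
risks = [spectral_risk(losses, g) for g in gammas]

print(f"Sample mean: {losses.mean():.3f}, sample max: {losses.max():.3f}")
for g, rho in zip(gammas, risks):
    print(f"  gamma = {g:>4}: spectral risk = {rho:.3f}")
```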

## 3. Risk-Aware Deep Hedging Neural Network

### Architecture Design
Our network learns the optimal hedging strategy that minimizes CVaR of the hedging P&L.

**Inputs at each time step:**
- Normalized stock price ($S_t / K$)
- Time to maturity
- Log-moneyness ($\log(S_t / K)$)
- Previous hedge position

**Output:**
- Hedge ratio (delta), constrained to $(-1, 1)$ by a `tanh` output layer

In [None]:
class RiskAwareHedgingNetwork(nn.Module):
    """Neural network for risk-aware deep hedging.
    
    Architecture designed to learn CVaR-optimal hedging strategies.
    """
    
    def __init__(
        self,
        input_dim: int = 4,
        hidden_dims: Tuple[int, ...] = (64, 64, 32),  # tuple avoids a mutable default argument
        dropout: float = 0.1
    ):
        super().__init__()
        
        layers = []
        prev_dim = input_dim
        
        for hidden_dim in hidden_dims:
            layers.extend([
                nn.Linear(prev_dim, hidden_dim),
                nn.LayerNorm(hidden_dim),
                nn.LeakyReLU(0.1),
                nn.Dropout(dropout)
            ])
            prev_dim = hidden_dim
        
        # Output layer: hedge ratio in [-1, 1] (can short)
        layers.append(nn.Linear(prev_dim, 1))
        layers.append(nn.Tanh())  # Constrain to [-1, 1]
        
        self.network = nn.Sequential(*layers)
        
        # Initialize weights
        self._init_weights()
    
    def _init_weights(self):
        """Xavier initialization for better gradient flow."""
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                nn.init.zeros_(m.bias)
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Compute hedge ratio.
        
        Args:
            x: Input features [batch, features]
        
        Returns:
            Hedge ratio in [-1, 1]
        """
        return self.network(x)


class RecurrentHedgingNetwork(nn.Module):
    """LSTM-based hedging network for capturing path dependencies."""
    
    def __init__(
        self,
        input_dim: int = 4,
        hidden_dim: int = 64,
        num_layers: int = 2,
        dropout: float = 0.1
    ):
        super().__init__()
        
        self.lstm = nn.LSTM(
            input_size=input_dim,
            hidden_size=hidden_dim,
            num_layers=num_layers,
            dropout=dropout if num_layers > 1 else 0,
            batch_first=True
        )
        
        self.output_layer = nn.Sequential(
            nn.Linear(hidden_dim, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
            nn.Tanh()
        )
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Process sequence and output hedge ratios.
        
        Args:
            x: Input sequence [batch, seq_len, features]
        
        Returns:
            Hedge ratios [batch, seq_len, 1]
        """
        lstm_out, _ = self.lstm(x)
        return self.output_layer(lstm_out)


# Test networks
ffn = RiskAwareHedgingNetwork()
rnn = RecurrentHedgingNetwork()

# FFN test
test_input = torch.randn(32, 4)
ffn_output = ffn(test_input)
print(f"FFN output shape: {ffn_output.shape}, range: [{ffn_output.min():.3f}, {ffn_output.max():.3f}]")

# RNN test
test_seq = torch.randn(32, 21, 4)
rnn_output = rnn(test_seq)
print(f"RNN output shape: {rnn_output.shape}")

## 4. Deep Hedging Training Framework

### P&L Calculation
The hedging P&L consists of:
1. **Option payoff**: What we owe at maturity
2. **Hedging gains/losses**: From delta hedging
3. **Transaction costs**: From rebalancing

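$$\text{P\&L} = -\text{Payoff}(S_N) + \sum_{t=0}^{N-1} \delta_t (S_{t+1} - S_t) - \sum_{t=0}^{N-1} c \, |\delta_t - \delta_{t-1}| \, S_t$$

where $N$ is the number of rebalancing steps, $c$ is the proportional transaction cost rate, and $\delta_{-1} = 0$. The option premium is not included, so the mean P&L of the hedged short call is negative and on the order of the option's value.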
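
As a worked instance of this formula, the sketch below evaluates each term on a hypothetical three-step path with made-up hedge ratios, taking the hedge position before the first trade to be zero (as the training code does). All numbers are illustrative.

```python
import numpy as np

# Hypothetical three-step path and hedge ratios, for illustration only
S = np.array([100.0, 102.0, 101.0, 105.0])  # S_0, ..., S_3
delta = np.array([0.50, 0.60, 0.55])        # delta_0, ..., delta_2
K, c = 100.0, 0.001                         # strike and proportional cost rate

# Hedging gains: sum of delta_t * (S_{t+1} - S_t)
hedging_gains = np.sum(delta * np.diff(S))

# Transaction costs: c * |delta_t - delta_{t-1}| * S_t, with delta_{-1} = 0
prev_delta = np.concatenate(([0.0], delta[:-1]))
costs = np.sum(c * np.abs(delta - prev_delta) * S[:-1])

# Short call payoff owed at maturity
payoff = max(S[-1] - K, 0.0)

pnl = hedging_gains - payoff - costs
print(f"Hedging gains: {hedging_gains:.5f}")
print(f"Costs:         {costs:.5f}")
print(f"Payoff:        {payoff:.5f}")
print(f"P&L:           {pnl:.5f}")
```

By hand: the gains are $0.5 \cdot 2 - 0.6 \cdot 1 + 0.55 \cdot 4 = 2.6$, the costs are $0.001 \cdot (50 + 10.2 + 5.05) = 0.06525$, and the payoff is $5$, giving a P&L of $-2.46525$.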

In [None]:
def black_scholes_delta(S: np.ndarray, K: float, T: np.ndarray, r: float, sigma: float) -> np.ndarray:
    """Compute Black-Scholes delta for call option."""
    from scipy.stats import norm
    
    # Handle T=0 case
    T = np.maximum(T, 1e-10)
    
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    return norm.cdf(d1)


class DeepHedgingTrainer:
    """Training framework for risk-aware deep hedging."""
    
    def __init__(
        self,
        model: nn.Module,
        params: MarketParams,
        risk_measure: str = 'cvar',
        alpha: float = 0.95,
        lambda_risk: float = 0.5
    ):
        self.model = model.to(device)
        self.params = params
        self.simulator = GBMSimulator(params)
        
        # Setup risk measure
        if risk_measure == 'cvar':
            self.risk_module = MeanCVaRLoss(alpha, lambda_risk).to(device)
        elif risk_measure == 'spectral':
            self.risk_module = SpectralRiskMeasure(risk_aversion=2.0).to(device)
        else:
            raise ValueError(f"Unknown risk measure: {risk_measure}")
        
        self.risk_measure = risk_measure
        
    def prepare_features(self, paths: np.ndarray) -> torch.Tensor:
        """Prepare input features for the model.
        
        Features:
        - Normalized stock price (S/K)
        - Time to maturity
        - Log-moneyness
        - Previous delta (initialized to 0)
        """
        n_paths, n_steps = paths.shape
        ttm = self.simulator.get_time_to_maturity()
        
        features = np.zeros((n_paths, n_steps - 1, 4))  # Exclude last step (maturity)
        
        for t in range(n_steps - 1):
            features[:, t, 0] = paths[:, t] / self.params.K  # Normalized price
            features[:, t, 1] = ttm[t]  # Time to maturity
            features[:, t, 2] = np.log(paths[:, t] / self.params.K)  # Log-moneyness
            # Feature 3 (previous delta) will be filled during forward pass
        
        return torch.tensor(features, dtype=torch.float32, device=device)
    
    def compute_hedging_pnl(
        self,
        paths: torch.Tensor,
        features: torch.Tensor,
        include_costs: bool = True
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        """Compute P&L from hedging strategy.
        
        Returns:
            pnl: Hedging P&L for each path
            deltas: Hedge ratios used
        """
        n_paths, n_steps = paths.shape
        n_hedging_steps = n_steps - 1
        
        # Initialize
        deltas = torch.zeros((n_paths, n_hedging_steps), device=device)
        hedging_pnl = torch.zeros(n_paths, device=device)
        transaction_costs = torch.zeros(n_paths, device=device)
        
        prev_delta = torch.zeros(n_paths, device=device)
        
        # Forward pass through time
        for t in range(n_hedging_steps):
            # Update feature with previous delta
            feat = features[:, t, :].clone()
            feat[:, 3] = prev_delta
            
            # Get hedge ratio from model
            delta = self.model(feat).squeeze(-1)
            deltas[:, t] = delta
            
            # Hedging P&L from price change
            price_change = paths[:, t + 1] - paths[:, t]
            hedging_pnl += delta * price_change
            
            # Transaction costs from rebalancing
            if include_costs:
                delta_change = torch.abs(delta - prev_delta)
                transaction_costs += self.params.transaction_cost * delta_change * paths[:, t]
            
            prev_delta = delta
        
        # Option payoff (short call)
        final_prices = paths[:, -1]
        payoff = torch.relu(final_prices - self.params.K)
        
        # Total P&L = Hedging gains - Option liability - Transaction costs
        total_pnl = hedging_pnl - payoff - transaction_costs
        
        # Return LOSS (negative P&L) for minimization
        return -total_pnl, deltas
    
    def train_epoch(
        self,
        optimizer: optim.Optimizer,
        n_paths: int = 10000
    ) -> dict:
        """Train for one epoch."""
        self.model.train()
        
        # Generate new paths
        paths_np = self.simulator.simulate_paths(n_paths)
        paths = torch.tensor(paths_np, dtype=torch.float32, device=device)
        features = self.prepare_features(paths_np)
        
        # Forward pass
        optimizer.zero_grad()
        losses, deltas = self.compute_hedging_pnl(paths, features)
        
        # Compute risk measure
        if self.risk_measure == 'cvar':
            loss, metrics = self.risk_module(losses)
        else:
            loss = self.risk_module(losses)
            metrics = {'spectral_risk': loss.item()}
        
        # Backward pass
        loss.backward()
        
        # Gradient clipping
        torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=1.0)
        
        optimizer.step()
        
        # Add P&L statistics
        pnl = -losses  # Convert back to P&L
        metrics['pnl_mean'] = pnl.mean().item()
        metrics['pnl_std'] = pnl.std().item()
        metrics['avg_delta'] = deltas.mean().item()
        metrics['loss'] = loss.item()
        
        return metrics
    
    @torch.no_grad()
    def evaluate(
        self,
        n_paths: int = 50000
    ) -> Tuple[dict, np.ndarray, np.ndarray]:
        """Evaluate model on test paths."""
        self.model.eval()
        
        paths_np = self.simulator.simulate_paths(n_paths)
        paths = torch.tensor(paths_np, dtype=torch.float32, device=device)
        features = self.prepare_features(paths_np)
        
        losses, deltas = self.compute_hedging_pnl(paths, features)
        pnl = -losses
        
        metrics = {
            'pnl_mean': pnl.mean().item(),
            'pnl_std': pnl.std().item(),
            'pnl_var_95': torch.quantile(pnl, 0.05).item(),
            'pnl_cvar_95': pnl[pnl <= torch.quantile(pnl, 0.05)].mean().item(),
            'sharpe': pnl.mean().item() / (pnl.std().item() + 1e-8)
        }
        
        return metrics, pnl.cpu().numpy(), deltas.cpu().numpy()


print("Training framework initialized!")

In [None]:
def train_deep_hedger(
    risk_measure: str = 'cvar',
    lambda_risk: float = 0.5,
    n_epochs: int = 100,
    lr: float = 1e-3
) -> Tuple[nn.Module, list, DeepHedgingTrainer]:
    """Train a deep hedging model and return (model, history, trainer)."""
    
    # Initialize model and trainer
    model = RiskAwareHedgingNetwork(input_dim=4, hidden_dims=[64, 64, 32])
    trainer = DeepHedgingTrainer(
        model=model,
        params=params,
        risk_measure=risk_measure,
        alpha=0.95,
        lambda_risk=lambda_risk
    )
    
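    # Optimizer: weight decay on the network weights only. The risk module's
    # learnable VaR parameter (nu) is optimized as well; left out, it would stay
    # at 0 and the Rockafellar-Uryasev loss would not equal the CVaR. The larger
    # learning rate for nu is a heuristic so it can reach the VaR scale in time.
    param_groups = [{'params': model.parameters(), 'weight_decay': 1e-4}]
    risk_params = list(trainer.risk_module.parameters())
    if risk_params:  # the spectral measure has no learnable parameters
        param_groups.append({'params': risk_params, 'weight_decay': 0.0, 'lr': 0.05})
    optimizer = optim.AdamW(param_groups, lr=lr)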
    scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=n_epochs)
    
    history = []
    
    for epoch in range(n_epochs):
        metrics = trainer.train_epoch(optimizer, n_paths=5000)
        scheduler.step()
        history.append(metrics)
        
        if (epoch + 1) % 20 == 0:
            print(f"Epoch {epoch+1:3d} | Loss: {metrics['loss']:.4f} | "
                  f"P&L Mean: {metrics['pnl_mean']:.4f} | "
                  f"P&L Std: {metrics['pnl_std']:.4f}")
    
    return model, history, trainer


# Train with different risk preferences
print("Training CVaR-optimized hedger (λ=0.8, risk-averse)...")
model_cvar, history_cvar, trainer_cvar = train_deep_hedger(
    risk_measure='cvar',
    lambda_risk=0.8,
    n_epochs=100
)

In [None]:
print("\nTraining mean-focused hedger (λ=0.2, less risk-averse)...")
model_mean, history_mean, trainer_mean = train_deep_hedger(
    risk_measure='cvar',
    lambda_risk=0.2,
    n_epochs=100
)

In [None]:
# Evaluate both models
print("\n" + "="*60)
print("Model Evaluation (50,000 test paths)")
print("="*60)

metrics_cvar, pnl_cvar, deltas_cvar = trainer_cvar.evaluate()
metrics_mean, pnl_mean, deltas_mean = trainer_mean.evaluate()

print(f"\n{'Metric':<20} {'CVaR-Focused':<15} {'Mean-Focused':<15}")
print("-" * 50)
for key in metrics_cvar:
    print(f"{key:<20} {metrics_cvar[key]:<15.4f} {metrics_mean[key]:<15.4f}")

In [None]:
# Compare with Black-Scholes delta hedging
def evaluate_bs_hedging(n_paths: int = 50000) -> Tuple[dict, np.ndarray]:
    """Evaluate Black-Scholes delta hedging baseline."""
    paths = simulator.simulate_paths(n_paths)
    ttm = simulator.get_time_to_maturity()
    
    hedging_pnl = np.zeros(n_paths)
    transaction_costs = np.zeros(n_paths)
    prev_delta = np.zeros(n_paths)
    
    for t in range(params.n_steps):
        # Black-Scholes delta
        delta = black_scholes_delta(
            paths[:, t], params.K, ttm[t], params.r, params.sigma
        )
        
        # P&L from hedge
        hedging_pnl += delta * (paths[:, t + 1] - paths[:, t])
        
        # Transaction costs
        transaction_costs += params.transaction_cost * np.abs(delta - prev_delta) * paths[:, t]
        prev_delta = delta
    
    # Option payoff
    payoff = np.maximum(paths[:, -1] - params.K, 0)
    
    # Total P&L
    pnl = hedging_pnl - payoff - transaction_costs
    
    metrics = {
        'pnl_mean': pnl.mean(),
        'pnl_std': pnl.std(),
        'pnl_var_95': np.percentile(pnl, 5),
        'pnl_cvar_95': pnl[pnl <= np.percentile(pnl, 5)].mean(),
        'sharpe': pnl.mean() / (pnl.std() + 1e-8)
    }
    
    return metrics, pnl

metrics_bs, pnl_bs = evaluate_bs_hedging()

print("\n" + "="*70)
print("Comparison with Black-Scholes Delta Hedging")
print("="*70)
print(f"\n{'Metric':<20} {'BS Delta':<15} {'CVaR-NN':<15} {'Mean-NN':<15}")
print("-" * 65)
for key in metrics_bs:
    print(f"{key:<20} {metrics_bs[key]:<15.4f} {metrics_cvar[key]:<15.4f} {metrics_mean[key]:<15.4f}")

## 5. Visualization and Analysis

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Plot 1: P&L Distributions
ax1 = axes[0, 0]
ax1.hist(pnl_bs, bins=100, alpha=0.5, density=True, label='BS Delta', color='blue')
ax1.hist(pnl_cvar, bins=100, alpha=0.5, density=True, label='CVaR-NN', color='red')
ax1.hist(pnl_mean, bins=100, alpha=0.5, density=True, label='Mean-NN', color='green')

ax1.axvline(np.percentile(pnl_bs, 5), color='blue', linestyle='--', alpha=0.8)
ax1.axvline(np.percentile(pnl_cvar, 5), color='red', linestyle='--', alpha=0.8)
ax1.axvline(np.percentile(pnl_mean, 5), color='green', linestyle='--', alpha=0.8)

ax1.set_xlabel('P&L')
ax1.set_ylabel('Density')
ax1.set_title('P&L Distribution Comparison')
ax1.legend()

# Plot 2: Training curves
ax2 = axes[0, 1]
epochs = range(1, len(history_cvar) + 1)
ax2.plot(epochs, [h['loss'] for h in history_cvar], label='CVaR-NN Loss', color='red')
ax2.plot(epochs, [h['loss'] for h in history_mean], label='Mean-NN Loss', color='green')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Loss')
ax2.set_title('Training Loss Curves')
ax2.legend()

# Plot 3: Tail comparison (worst 10%)
ax3 = axes[1, 0]
tail_pct = 10
tail_bs = np.sort(pnl_bs)[:int(len(pnl_bs) * tail_pct / 100)]
tail_cvar = np.sort(pnl_cvar)[:int(len(pnl_cvar) * tail_pct / 100)]
tail_mean = np.sort(pnl_mean)[:int(len(pnl_mean) * tail_pct / 100)]

ax3.hist(tail_bs, bins=50, alpha=0.5, density=True, label='BS Delta', color='blue')
ax3.hist(tail_cvar, bins=50, alpha=0.5, density=True, label='CVaR-NN', color='red')
ax3.hist(tail_mean, bins=50, alpha=0.5, density=True, label='Mean-NN', color='green')
ax3.set_xlabel('P&L (Worst 10%)')
ax3.set_ylabel('Density')
ax3.set_title(f'Tail Risk Comparison (Worst {tail_pct}%)')
ax3.legend()

# Plot 4: Average delta by moneyness
ax4 = axes[1, 1]

# Compute delta surface
moneyness = np.linspace(0.8, 1.2, 50)
ttm_grid = np.linspace(0.01, params.T, 5)

for tau in ttm_grid:
    bs_deltas = []
    nn_deltas = []
    
    for m in moneyness:
        S = m * params.K
        bs_delta = black_scholes_delta(S, params.K, tau, params.r, params.sigma)
        bs_deltas.append(bs_delta)
        
        # Neural network delta
        feat = torch.tensor([[m, tau, np.log(m), 0.5]], dtype=torch.float32, device=device)
        with torch.no_grad():
            nn_delta = model_cvar(feat).item()
        nn_deltas.append(nn_delta)
    
    label = f'τ = {tau*252:.0f} days'
    ax4.plot(moneyness, bs_deltas, '--', alpha=0.5, label=f'BS {label}' if tau == ttm_grid[0] else '')
    ax4.plot(moneyness, nn_deltas, '-', alpha=0.8, label=f'NN {label}' if tau == ttm_grid[0] else '')

ax4.set_xlabel('Moneyness (S/K)')
ax4.set_ylabel('Delta')
ax4.set_title('Delta vs Moneyness (CVaR-NN vs BS)')
ax4.legend(['BS', 'NN'])

plt.tight_layout()
plt.show()

## 6. Risk-Return Frontier

Let's explore how different λ values create a risk-return trade-off.

In [None]:
def quick_train_and_eval(lambda_risk: float, n_epochs: int = 50) -> dict:
    """Quickly train and evaluate a model."""
    model = RiskAwareHedgingNetwork(input_dim=4, hidden_dims=[32, 32])
    trainer = DeepHedgingTrainer(
        model=model,
        params=params,
        risk_measure='cvar',
        lambda_risk=lambda_risk
    )
    
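    # Include the risk module's learnable VaR parameter (nu) in the optimizer.
    # The larger learning rate for nu is a heuristic so it can reach the VaR scale.
    optimizer = optim.Adam([
        {'params': model.parameters()},
        {'params': trainer.risk_module.parameters(), 'lr': 0.05},
    ], lr=1e-3)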
    
    for _ in range(n_epochs):
        trainer.train_epoch(optimizer, n_paths=3000)
    
    metrics, _, _ = trainer.evaluate(n_paths=20000)
    metrics['lambda'] = lambda_risk
    return metrics


# Compute risk-return frontier
print("Computing risk-return frontier...")
lambdas = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
frontier_results = []

for lam in lambdas:
    print(f"  λ = {lam}...", end=" ")
    result = quick_train_and_eval(lam)
    frontier_results.append(result)
    print(f"Mean: {result['pnl_mean']:.4f}, CVaR: {result['pnl_cvar_95']:.4f}")

print("Done!")

In [None]:
# Plot risk-return frontier
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

means = [r['pnl_mean'] for r in frontier_results]
cvars = [r['pnl_cvar_95'] for r in frontier_results]
stds = [r['pnl_std'] for r in frontier_results]

# Plot 1: Mean vs CVaR frontier
ax1 = axes[0]
scatter = ax1.scatter([-c for c in cvars], means, c=lambdas, cmap='coolwarm', s=100, edgecolors='black')
ax1.plot([-c for c in cvars], means, 'k--', alpha=0.5)

for i, lam in enumerate(lambdas):
    ax1.annotate(f'λ={lam}', (-cvars[i], means[i]), textcoords="offset points", xytext=(5, 5))

ax1.axhline(metrics_bs['pnl_mean'], color='blue', linestyle=':', label='BS Delta Mean')
ax1.axvline(-metrics_bs['pnl_cvar_95'], color='blue', linestyle='--', label='BS Delta CVaR')

ax1.set_xlabel('Tail Risk (-CVaR 95%)')
ax1.set_ylabel('Expected P&L')
ax1.set_title('Risk-Return Frontier (CVaR Objective)')
ax1.legend()
plt.colorbar(scatter, ax=ax1, label='λ (CVaR weight)')

# Plot 2: Mean vs Std frontier
ax2 = axes[1]
scatter2 = ax2.scatter(stds, means, c=lambdas, cmap='coolwarm', s=100, edgecolors='black')
ax2.plot(stds, means, 'k--', alpha=0.5)

for i, lam in enumerate(lambdas):
    ax2.annotate(f'λ={lam}', (stds[i], means[i]), textcoords="offset points", xytext=(5, 5))

ax2.scatter(metrics_bs['pnl_std'], metrics_bs['pnl_mean'], marker='*', s=200, color='blue', label='BS Delta')

ax2.set_xlabel('P&L Volatility (Std)')
ax2.set_ylabel('Expected P&L')
ax2.set_title('Mean-Variance Frontier')
ax2.legend()
plt.colorbar(scatter2, ax=ax2, label='λ (CVaR weight)')

plt.tight_layout()
plt.show()

## 7. Advanced: Entropic Risk Measure

The entropic risk measure provides exponential penalization of losses:

$$\rho_\gamma(X) = \frac{1}{\gamma} \log\left( \mathbb{E}[e^{\gamma X}] \right)$$

Where $\gamma > 0$ is the risk aversion parameter.
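
For Gaussian losses $X \sim \mathcal{N}(\mu, \sigma^2)$ the entropic risk has the closed form $\mu + \gamma\sigma^2/2$, which gives a convenient check. The NumPy sketch below compares a sample estimate against that value; the function name `entropic_risk`, the sample size, and the parameters are illustrative assumptions.

```python
import numpy as np

def entropic_risk(losses: np.ndarray, gamma: float) -> float:
    """(1 / gamma) * log(mean(exp(gamma * losses))), computed stably."""
    scaled = gamma * losses
    max_val = scaled.max()  # subtract the max before exponentiating to avoid overflow
    return float((max_val + np.log(np.exp(scaled - max_val).mean())) / gamma)

mu, sigma = 5.0, 10.0
rng = np.random.default_rng(0)
losses = rng.normal(mu, sigma, size=200_000)  # hypothetical Gaussian losses

gamma = 0.1
estimate = entropic_risk(losses, gamma)
closed_form = mu + 0.5 * gamma * sigma**2  # Gaussian closed form

near_mean = entropic_risk(losses, 1e-4)  # small gamma: close to the sample mean

print(f"Sample estimate (gamma = {gamma}): {estimate:.4f}")
print(f"Closed form:                     {closed_form:.4f}")
print(f"Small-gamma estimate: {near_mean:.4f} vs mean {losses.mean():.4f}")
```

One caveat: when $\gamma\sigma$ is large (for example $\gamma = 1$ with $\sigma = 10$), the sample average of $e^{\gamma X}$ is dominated by the few largest draws, so a Monte Carlo estimate is likely to sit near the sample maximum and well below the closed form.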

In [None]:
class EntropicRiskMeasure(nn.Module):
    """Entropic (exponential) risk measure.
    
    Provides exponential penalization of large losses.
    Relates to exponential utility maximization.
    """
    
    def __init__(self, gamma: float = 1.0):
        """
        Args:
            gamma: Risk aversion parameter (higher = more risk averse)
        """
        super().__init__()
        self.gamma = gamma
    
    def forward(self, losses: torch.Tensor) -> torch.Tensor:
        """Compute entropic risk measure.
        
        Uses log-sum-exp trick for numerical stability.
        """
        # Log-sum-exp trick: log(mean(exp(γX))) = max(γX) + log(mean(exp(γX - max(γX))))
        scaled = self.gamma * losses
        max_val = scaled.max()
        risk = (1 / self.gamma) * (max_val + torch.log(torch.exp(scaled - max_val).mean()))
        return risk


class DistortionRiskMeasure(nn.Module):
    """Wang distortion risk measure.
    
    Uses a distortion function g:[0,1]->[0,1] to transform the distribution.
    """
    
    def __init__(self, distortion_param: float = 0.5):
        super().__init__()
        self.alpha = distortion_param  # Wang parameter
    
    def forward(self, losses: torch.Tensor) -> torch.Tensor:
        """Compute distortion risk measure."""
        from scipy.stats import norm
        
        # Sort from largest to smallest so that u = i/n is the empirical
        # survival probability at the i-th largest loss
        sorted_losses, _ = torch.sort(losses, descending=True)
        n = len(sorted_losses)
        
        # Wang distortion of the survival function: g(u) = Φ(Φ^{-1}(u) + α).
        # For α > 0, g is concave, so the largest losses get the largest weights.
        u = np.linspace(1 / n, 1, n)
        g = norm.cdf(norm.ppf(u) + self.alpha)
        weights = torch.tensor(
            np.diff(g, prepend=0.0),
            dtype=losses.dtype,
            device=losses.device
        )
        
        return (weights * sorted_losses).sum()


# Test different risk measures
test_losses = torch.randn(10000) * 10 + 5

entropic_1 = EntropicRiskMeasure(gamma=0.1)
entropic_2 = EntropicRiskMeasure(gamma=1.0)
cvar = SortingCVaRLoss(alpha=0.95)  # sorting estimator; CVaRLoss needs nu fitted to the VaR first

print("Risk Measure Comparison:")
print(f"  Mean:              {test_losses.mean().item():.4f}")
print(f"  CVaR 95%:          {cvar(test_losses).item():.4f}")
print(f"  Entropic (γ=0.1):  {entropic_1(test_losses).item():.4f}")
print(f"  Entropic (γ=1.0):  {entropic_2(test_losses).item():.4f}")
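
For comparison, here is a standard-library and NumPy sketch of the Wang transform applied to the empirical survival function, which is the usual convention when $X$ is a loss. The helper `wang_weights` and the parameter name `lam` are hypothetical. For Gaussian losses the Wang measure is known to equal $\mu + \lambda\sigma$, so with $\mu = 5$, $\sigma = 10$, $\lambda = 0.5$ the estimate should land near 10, up to sampling noise.

```python
from statistics import NormalDist

import numpy as np

def wang_weights(n: int, lam: float) -> np.ndarray:
    """Wang-transform weights for n losses sorted from largest to smallest."""
    std_normal = NormalDist()
    # g(u) = Phi(Phi^{-1}(u) + lam) on u = 0, 1/n, ..., 1, with g(0) = 0 and g(1) = 1
    g = np.empty(n + 1)
    g[0], g[n] = 0.0, 1.0
    for i in range(1, n):
        g[i] = std_normal.cdf(std_normal.inv_cdf(i / n) + lam)
    return np.diff(g)  # weight of the i-th largest loss: g(i/n) - g((i-1)/n)

rng = np.random.default_rng(0)
losses = rng.normal(5.0, 10.0, size=2_000)  # hypothetical Gaussian losses
sorted_desc = np.sort(losses)[::-1]

weights = wang_weights(len(losses), lam=0.5)
wang_risk = float(weights @ sorted_desc)

# With lam = 0 the distortion is the identity and the measure is the sample mean
undistorted = float(wang_weights(len(losses), lam=0.0) @ sorted_desc)

print(f"Sample mean:           {losses.mean():.4f}")
print(f"Wang risk (lam = 0.5): {wang_risk:.4f}")
print(f"Wang risk (lam = 0.0): {undistorted:.4f}")
```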

## 8. Summary and Key Takeaways

### What We Learned

1. **CVaR as a Hedging Objective**
   - CVaR (Expected Shortfall) is a coherent risk measure focusing on tail losses
   - The Rockafellar-Uryasev formulation enables gradient-based optimization
   - CVaR objectives penalize only the loss tail, whereas mean-variance objectives penalize gains and losses symmetrically

2. **Risk-Aware Neural Networks**
   - Deep hedging networks can learn to optimize any differentiable risk measure
   - The λ parameter controls the mean-CVaR trade-off
   - Higher λ → more conservative hedging → typically smaller tails but lower average P&L

3. **Comparison with Black-Scholes**
   - BS delta hedging is optimal only under BS assumptions (no costs, continuous trading)
   - Deep hedging can outperform BS when:
     - Transaction costs are significant
     - Volatility is stochastic
     - Tail risk is important

4. **Advanced Risk Measures**
   - Spectral risk measures: weighted quantile averages
   - Entropic risk: exponential utility-based
   - Choice depends on risk preferences and regulatory requirements

### Practical Considerations

- **Training stability**: Use gradient clipping, layer normalization
- **Sample size**: Need many scenarios to estimate tail risk accurately
- **Model capacity**: Simple networks often suffice; overparameterization can hurt
- **Transaction costs**: Critical for realistic performance assessment
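
The sample-size point can be quantified with a small experiment. The sketch below repeatedly estimates the 95% CVaR of a hypothetical Gaussian loss at several sample sizes and reports the spread of the estimates; the helper `empirical_cvar`, the sample sizes, and the repetition count are illustrative assumptions.

```python
import numpy as np

def empirical_cvar(losses: np.ndarray, alpha: float) -> float:
    """Mean of the worst (1 - alpha) fraction of losses."""
    n_tail = max(1, int(round((1.0 - alpha) * len(losses))))
    return float(np.sort(losses)[-n_tail:].mean())

rng = np.random.default_rng(0)
alpha, n_repeats = 0.95, 200

spread = {}
for n_paths in (500, 5_000, 50_000):
    # Re-estimate CVaR on fresh samples to see how much the estimate varies
    estimates = [
        empirical_cvar(rng.normal(5.0, 10.0, size=n_paths), alpha)
        for _ in range(n_repeats)
    ]
    spread[n_paths] = float(np.std(estimates))
    print(f"n_paths = {n_paths:>6}: CVaR estimate std = {spread[n_paths]:.4f}")
```

Only about 5% of the paths enter a 95% CVaR estimate, so the noise in the training loss depends on the tail count rather than on the nominal batch size.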

In [None]:
# Final summary statistics
print("="*70)
print("FINAL SUMMARY: Advanced Deep Hedging with CVaR")
print("="*70)
print(f"\nMarket Parameters:")
print(f"  S0 = {params.S0}, K = {params.K}, T = {params.T*252:.0f} days")
print(f"  σ = {params.sigma*100:.1f}%, r = {params.r*100:.1f}%")
print(f"  Transaction costs = {params.transaction_cost*10000:.0f} bps")

print(f"\nModel Comparison (50k test scenarios):")
print(f"\n{'Strategy':<25} {'Mean P&L':<12} {'P&L Vol':<12} {'VaR 5%':<12} {'CVaR 5%':<12}")
print("-" * 75)
print(f"{'Black-Scholes Delta':<25} {metrics_bs['pnl_mean']:<12.4f} {metrics_bs['pnl_std']:<12.4f} "
      f"{metrics_bs['pnl_var_95']:<12.4f} {metrics_bs['pnl_cvar_95']:<12.4f}")
print(f"{'CVaR-NN (λ=0.8)':<25} {metrics_cvar['pnl_mean']:<12.4f} {metrics_cvar['pnl_std']:<12.4f} "
      f"{metrics_cvar['pnl_var_95']:<12.4f} {metrics_cvar['pnl_cvar_95']:<12.4f}")
print(f"{'Mean-NN (λ=0.2)':<25} {metrics_mean['pnl_mean']:<12.4f} {metrics_mean['pnl_std']:<12.4f} "
      f"{metrics_mean['pnl_var_95']:<12.4f} {metrics_mean['pnl_cvar_95']:<12.4f}")

print("\nPatterns to check against the table above:")
print("  - CVaR-focused strategy should reduce tail risk, possibly at the cost of average P&L")
print("  - Deep hedging can adapt its rebalancing to transaction costs")
print("  - The risk-return trade-off is controlled by the λ parameter")

## Exercises

1. **Modify the CVaR confidence level**: Train models with α = 0.90, 0.95, 0.99. How does the hedging strategy change?

2. **Add stochastic volatility**: Implement a Heston model simulator and compare deep hedging vs BS delta.

3. **Multi-asset hedging**: Extend to hedging a basket option with multiple underlyings.

4. **LSTM architecture**: Use the RecurrentHedgingNetwork and compare to the feedforward model.

5. **Transaction cost sensitivity**: How does optimal λ change with different transaction cost levels?