# Monte Carlo & Adjoint-Mode Greeks  
**Jackson Pfaff – May 2025**

---
# 1. Monte Carlo Setup (One-Step GBM)

We simulate $N$ correlated assets over the interval $[0,T]$ in a single log-Euler step:

1. **Correlation and normal draws**  
   Let $\Sigma \in \mathbb{R}^{N \times N}$ be the assets’ correlation matrix.  
   Compute its lower-triangular Cholesky factor $L$ so that $LL^\top = \Sigma$.  
   Draw:

   $$
   Z \sim \mathcal{N}(\mathbf{0}, I_N), \quad Y = LZ \quad (\text{then } \mathrm{Cov}[Y] = \Sigma)
   $$

2. **Log-price increment**  
   For each asset $i = 1, \dots, N$, compute:

   $$
   \text{drift}_i = \left( r - \tfrac{1}{2} \sigma_i^2 \right) T, \quad
   \text{diffusion}_i = \sigma_i \sqrt{T} Y_i
   $$

3. **Terminal price**  
   Define the log-argument:

   $$
   G_i = \ln S_{0,i} + \text{drift}_i + \text{diffusion}_i
   $$

   Then:

   $$
   S_i(T) = e^{G_i}
   $$

4. **Worst-of payoff**  
   $$
   S^* = \min_{1 \le i \le N} S_i(T), \quad
   A = S^* - K, \quad
   h = \max(A, 0), \quad
   D = e^{-rT}, \quad
   P = D h
   $$

5. **Monte Carlo estimator**  
   Over $M$ independent paths:

   $$
   \widehat{V} = \frac{1}{M} \sum_{m=1}^M P^{(m)} =
   \frac{e^{-rT}}{M} \sum_{m=1}^M \left( \min_i S_i^{(m)}(T) - K \right)^+
   $$
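
Steps 1 to 5 map almost line for line onto NumPy. The following is a minimal sketch rather than the notebook's own implementation: the function name `worst_of_call_mc`, the two-asset scenario, and the reported standard error are assumptions added for illustration.

```python
import numpy as np

def worst_of_call_mc(S0, sigma, corr, K, r, T, n_paths, seed=0):
    """One-step GBM Monte Carlo price of a worst-of call (steps 1-5)."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(corr)                   # step 1: L @ L.T == corr
    Z = rng.standard_normal((n_paths, len(S0)))    # each row ~ N(0, I_N)
    Y = Z @ L.T                                    # each row has covariance corr
    drift = (r - 0.5 * sigma**2) * T               # step 2
    diffusion = sigma * np.sqrt(T) * Y
    S_T = np.exp(np.log(S0) + drift + diffusion)   # step 3
    P = np.exp(-r * T) * np.maximum(S_T.min(axis=1) - K, 0.0)  # step 4, per path
    price = P.mean()                               # step 5
    std_err = P.std(ddof=1) / np.sqrt(n_paths)     # sampling error of the mean
    return price, std_err

# Hypothetical two-asset scenario, chosen only for illustration.
S0 = np.array([110.0, 120.0])
sigma = np.array([0.2, 0.3])
corr = np.array([[1.0, 0.5], [0.5, 1.0]])
price, std_err = worst_of_call_mc(S0, sigma, corr, K=100.0, r=0.03, T=1.0,
                                  n_paths=200_000)
print(f"price = {price:.4f} +/- {std_err:.4f}")
```

Reporting the standard error alongside $\widehat{V}$ makes it easier to judge whether a difference between two estimates is signal or sampling noise.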

---

# 2. Adjoint-Mode Delta

We wish to compute the pathwise sensitivity $\Delta_i = \partial P / \partial S_{0,i}$ via reverse-mode (adjoint) differentiation.

1. **Initialize**  
   $\bar{P} = 1$

2. **Back through discount**  
   $P = D h$ gives:

   $$
   \bar{h} = D \bar{P}, \quad
   \bar{D} = h \bar{P}
   $$

3. **Back through ReLU payoff**  
   $h = \max(A, 0)$ gives:

   $$
   \bar{A} = \mathbb{1}_{\{A > 0\}} \bar{h}
   $$

4. **Back through subtraction**  
   $A = S^* - K$ gives:

   $$
   \bar{S}^* = \bar{A}, \quad
   \bar{K} = -\bar{A}
   $$

5. **Back through minimum**  
   $S^* = \min_i S_i(T)$ gives, for each $i$:

   $$
   \bar{S}_i(T) = \mathbb{1}_{\{i = i^*\}} \bar{S}^*, \quad
   i^* = \operatorname*{arg\,min}_i S_i(T)
   $$

6. **Back through exponential**  
   $S_i(T) = e^{G_i}$ gives:

   $$
   \bar{G}_i = S_i(T) \bar{S}_i(T)
   $$

7. **Back through log-Euler step**  
   $G_i = \ln S_{0,i} + (r - \tfrac{1}{2} \sigma_i^2)T + \sigma_i \sqrt{T} Y_i$ gives:

   $$
   \bar{S}_{0,i} = \frac{1}{S_{0,i}} \bar{G}_i
   $$

Hence the pathwise Delta is:

$$
\boxed{
\Delta_i = \frac{\partial P}{\partial S_{0,i}} =
e^{-rT} \mathbb{1}_{\{A > 0\}} \mathbb{1}_{\{i = i^*\}} \frac{S_i(T)}{S_{0,i}}
}
$$
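
The reverse sweep can be written out by hand, one line per step, and checked against a central finite difference that reuses the same normal draws. This is a sketch under assumptions: the helper names `forward` and `pathwise_delta` and the two-asset scenario are hypothetical, and the sweep is vectorised over paths before averaging.

```python
import numpy as np

def forward(S0, sigma, L, K, r, T, Z):
    """Forward pass on fixed normals Z; returns per-path P, S(T) and A."""
    Y = Z @ L.T
    S_T = np.exp(np.log(S0) + (r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Y)
    A = S_T.min(axis=1) - K
    P = np.exp(-r * T) * np.maximum(A, 0.0)
    return P, S_T, A

def pathwise_delta(S0, sigma, L, K, r, T, Z):
    """Reverse sweep of steps 1-7, vectorised over paths, then averaged."""
    P, S_T, A = forward(S0, sigma, L, K, r, T, Z)
    P_bar = np.ones_like(P)                        # step 1
    h_bar = np.exp(-r * T) * P_bar                 # step 2
    A_bar = (A > 0) * h_bar                        # step 3
    S_star_bar = A_bar                             # step 4
    S_T_bar = np.zeros_like(S_T)                   # step 5: only i* receives the adjoint
    S_T_bar[np.arange(len(P)), S_T.argmin(axis=1)] = S_star_bar
    G_bar = S_T * S_T_bar                          # step 6
    S0_bar = G_bar / S0                            # step 7
    return S0_bar.mean(axis=0)

# Hypothetical two-asset scenario, chosen only for illustration.
rng = np.random.default_rng(0)
S0 = np.array([110.0, 120.0])
sigma = np.array([0.2, 0.3])
L = np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 1.0]]))
K, r, T = 100.0, 0.03, 1.0
Z = rng.standard_normal((100_000, 2))

delta = pathwise_delta(S0, sigma, L, K, r, T, Z)

# Central finite difference on the SAME draws Z (common random numbers).
eps = 1e-2
fd = np.zeros_like(S0)
for i in range(len(S0)):
    bump = np.zeros_like(S0)
    bump[i] = eps
    up = forward(S0 + bump, sigma, L, K, r, T, Z)[0].mean()
    dn = forward(S0 - bump, sigma, L, K, r, T, Z)[0].mean()
    fd[i] = (up - dn) / (2 * eps)

print("pathwise delta:", delta)
print("central FD    :", fd)
```

Because $S_i(T)$ is linear in $S_{0,i}$, the per-path payoff is piecewise linear in the bumped input, so the two estimates differ only on the few paths that cross a kink ($A = 0$ or a change of $i^*$) inside the bump.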

---

# 3. Adjoint-Mode Vega

We now compute $\mathrm{Vega}_i = \partial P / \partial \sigma_i$ by continuing the reverse sweep:

8. **Back through log-Euler step ($\sigma$ branch)**  
   From the full expression:

   $$
   G_i = \ln S_{0,i} + \left( r - \tfrac{1}{2} \sigma_i^2 \right) T + \sigma_i \sqrt{T} Y_i
   $$

   Differentiate with respect to $\sigma_i$:

   $$
   \frac{\partial G_i}{\partial \sigma_i} = -\sigma_i T + \sqrt{T} Y_i
   $$

   Then the adjoint is:

   $$
   \bar{\sigma}_i = \left( -\sigma_i T + \sqrt{T} Y_i \right) \bar{G}_i
   $$

Substituting $\bar{G}_i = S_i(T) \bar{S}_i(T)$ and the definition of $Y_i = (LZ)_i$:

$$
\boxed{
\mathrm{Vega}_i = e^{-rT} \mathbb{1}_{\{A > 0\}} \mathbb{1}_{\{i = i^*\}}
S_i(T) \left( -\sigma_i T + \sqrt{T} (LZ)_i \right)
}
$$
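
The boxed Vega can be evaluated directly and compared with a central finite difference in $\sigma_i$ on the same normal draws. As before this is only a sketch: the helper names (`terminal_prices`, `mc_price`, `pathwise_vega`) and the two-asset scenario are assumptions made for illustration.

```python
import numpy as np

def terminal_prices(S0, sigma, L, r, T, Z):
    """Return S(T) and the correlated normals Y = Z L^T."""
    Y = Z @ L.T
    S_T = np.exp(np.log(S0) + (r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Y)
    return S_T, Y

def mc_price(S0, sigma, L, K, r, T, Z):
    S_T, _ = terminal_prices(S0, sigma, L, r, T, Z)
    return np.exp(-r * T) * np.maximum(S_T.min(axis=1) - K, 0.0).mean()

def pathwise_vega(S0, sigma, L, K, r, T, Z):
    """Average of the boxed per-path Vega over all paths."""
    S_T, Y = terminal_prices(S0, sigma, L, r, T, Z)
    in_money = S_T.min(axis=1) > K                           # 1{A > 0}
    is_worst = np.zeros_like(S_T)                            # 1{i = i*}
    is_worst[np.arange(len(S_T)), S_T.argmin(axis=1)] = 1.0
    dG_dsigma = -sigma * T + np.sqrt(T) * Y                  # dG_i / dsigma_i
    per_path = np.exp(-r * T) * in_money[:, None] * is_worst * S_T * dG_dsigma
    return per_path.mean(axis=0)

# Hypothetical two-asset scenario, chosen only for illustration.
rng = np.random.default_rng(0)
S0 = np.array([110.0, 120.0])
sigma = np.array([0.2, 0.3])
L = np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 1.0]]))
K, r, T = 100.0, 0.03, 1.0
Z = rng.standard_normal((100_000, 2))

vega = pathwise_vega(S0, sigma, L, K, r, T, Z)

# Central finite difference in sigma_i on the SAME draws Z.
eps = 1e-5
fd = np.zeros_like(sigma)
for i in range(len(sigma)):
    bump = np.zeros_like(sigma)
    bump[i] = eps
    fd[i] = (mc_price(S0, sigma + bump, L, K, r, T, Z)
             - mc_price(S0, sigma - bump, L, K, r, T, Z)) / (2 * eps)

print("pathwise vega:", vega)
print("central FD   :", fd)
```

Note that the per-path Vega of a worst-of call can take either sign, since raising $\sigma_i$ lowers the drift term $-\tfrac{1}{2}\sigma_i^2 T$ while scaling the shock $\sqrt{T} Y_i$.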

---

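This completes the adjoint derivations of the per-path Delta and Vega. Averaging each over the $M$ simulated paths gives the Monte Carlo estimate of the corresponding Greek of $\widehat{V}$.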

---
## References

- Capriotti, L. (2010) *Fast Greeks by Algorithmic Differentiation*
- Ferguson, R. & Green, A. (2018) *Deeply Learning Derivatives*


# Monte Carlo & Greek Calculations

This notebook demonstrates worst-of basket option pricing under correlated GBM, and compares Greeks computed via finite differences and via adjoint differentiation.

In [15]:
import numpy as np, torch, math

# --- global knobs ---
torch.set_default_dtype(torch.float64)       # keep high precision
torch.manual_seed(0)                         # will be overwritten by `seed` later
N_ASSETS = 3
R_RATE   = 0.03

def cvine_corr_np(d, a=5.0, b=2.0):
    P = np.eye(d)
    for k in range(d-1):
        for i in range(k+1, d):
            rho = 2*np.random.beta(a,b)-1
            for m in range(k-1, -1, -1):
                rho = rho*np.sqrt((1-P[m,i]**2)*(1-P[m,k]**2)) + P[m,i]*P[m,k]
            P[k,i] = P[i,k] = rho
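    # Floor the eigenvalues at 1e-6 so the matrix is positive definite, then
    # rescale to unit diagonal so the result is still a correlation matrix.
    ev, evec = np.linalg.eigh(P)
    P = evec @ np.diag(np.clip(ev,1e-6,None)) @ evec.T
    d = np.sqrt(np.diag(P))
    P = P / np.outer(d, d)
    return P            # NumPy array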

def fg_sample():
    z     = np.random.normal(0.5, np.sqrt(0.25), N_ASSETS)
    S0    = 100*np.exp(z)
    sigma = np.random.uniform(0, 1, N_ASSETS)
    T     = (np.random.randint(1, 44)**2) / 252.0
    return dict(
        S0    = S0,
        sigma = sigma,
        T     = T,
        rho   = cvine_corr_np(N_ASSETS),
        K     = 100.0,
        r     = R_RATE
    )

def gbm_paths_np(S0, sigma, T, r, corr, n_paths, rng):
    L   = np.linalg.cholesky(corr)
    Z   = rng.normal(size=(n_paths, len(S0)))
    Y   = Z @ L.T
    drift = (r - 0.5*sigma**2)*T
    diff  = sigma*np.sqrt(T)*Y
    return np.exp(np.log(S0) + drift + diff)     # shape (n_paths, N)


In [33]:
def delta_vega_fd(p, n_paths, eps=1e-2, rng=None):
    S0, sigma = p['S0'], p['sigma']
    T, r, K   = p['T'],  p['r'],   p['K']
    corr      = p['rho']
    rng = rng or np.random.default_rng()

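    # Base price. Note: `rng` advances on every gbm_paths_np call, so the base
    # price and each bumped price below are computed on different random draws.
    ST = gbm_paths_np(S0, sigma, T, r, corr, n_paths, rng)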
    pay = np.maximum(ST.min(axis=1) - K, 0.0)
    disc = math.exp(-r*T)
    base_price = disc * pay.mean()

    delta, vega = np.zeros_like(S0), np.zeros_like(sigma)

    for i in range(len(S0)):
        # --- Delta bump ---
        S_up = S0.copy(); S_up[i] += eps
        ST_up = gbm_paths_np(S_up, sigma, T, r, corr, n_paths, rng)
        price_up = disc * np.maximum(ST_up.min(axis=1) - K, 0).mean()
        delta[i] = (price_up - base_price) / eps

        # --- Vega bump ---
        sig_up = sigma.copy(); sig_up[i] += eps
        ST_up  = gbm_paths_np(S0, sig_up, T, r, corr, n_paths, rng)
        price_up = disc * np.maximum(ST_up.min(axis=1) - K, 0).mean()
        vega[i] = (price_up - base_price) / eps

    return base_price, delta, vega
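
Because `delta_vega_fd` passes the same advancing `rng` to every `gbm_paths_np` call, each bumped price is computed on fresh normals. With per-path payoff standard deviation $s$, a one-sided difference of two independent $M$-path prices has standard error of roughly $\sqrt{2}\, s / (\varepsilon \sqrt{M})$, which grows as the bump $\varepsilon$ shrinks. Reusing the same draws for the base and bumped runs (common random numbers) removes most of that noise. The sketch below compares the two on a small hypothetical scenario; `mc_price`, `fd_delta_asset0` and all parameter values are assumptions for illustration, not part of the notebook.

```python
import numpy as np

def mc_price(S0, sigma, L, K, r, T, Z):
    """Discounted worst-of call price on fixed normals Z."""
    Y = Z @ L.T
    S_T = np.exp(np.log(S0) + (r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Y)
    return np.exp(-r * T) * np.maximum(S_T.min(axis=1) - K, 0.0).mean()

def fd_delta_asset0(S0, sigma, L, K, r, T, n_paths, rng, eps=1e-2, common=True):
    """One-sided FD Delta for asset 0, with or without common random numbers."""
    Z_base = rng.standard_normal((n_paths, len(S0)))
    # common=True reuses the base draws; common=False mimics fresh draws per run.
    Z_bump = Z_base if common else rng.standard_normal((n_paths, len(S0)))
    S_up = S0.copy()
    S_up[0] += eps
    return (mc_price(S_up, sigma, L, K, r, T, Z_bump)
            - mc_price(S0, sigma, L, K, r, T, Z_base)) / eps

# Hypothetical two-asset scenario, chosen only for illustration.
rng = np.random.default_rng(0)
S0 = np.array([110.0, 120.0])
sigma = np.array([0.2, 0.3])
L = np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 1.0]]))
args = (S0, sigma, L, 100.0, 0.03, 1.0, 20_000, rng)

# Repeat each estimator to measure its run-to-run spread.
crn = [fd_delta_asset0(*args, common=True) for _ in range(20)]
ind = [fd_delta_asset0(*args, common=False) for _ in range(20)]
print("std of FD Delta, common random numbers:", np.std(crn))
print("std of FD Delta, independent draws    :", np.std(ind))
```

With common random numbers the difference is taken path by path, so its variance scales with the variance of the per-path derivative rather than with $s^2 / \varepsilon^2$.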


In [28]:
def delta_vega_aad(p, n_paths, seed):
    S0, sigma = p['S0'], p['sigma']
    T, r, K   = p['T'],  p['r'],   p['K']
    corr      = p['rho']

    torch.manual_seed(seed)
    S0_t  = torch.tensor(S0,    requires_grad=True)
    sig_t = torch.tensor(sigma, requires_grad=True)
    r_t   = torch.tensor(r)
    T_t   = torch.tensor(T)
    K_t   = torch.tensor(K)
    L_t   = torch.linalg.cholesky(torch.tensor(corr))

    Z = torch.randn(n_paths, len(S0))
    Y = Z @ L_t.T

    drift = (r_t - 0.5*sig_t**2)*T_t
    diff  = sig_t*torch.sqrt(T_t)*Y
    logS  = torch.log(S0_t) + drift + diff
    ST    = torch.exp(logS)

    payoff = torch.clamp(ST.min(dim=1).values - K_t, min=0.0)
    price  = torch.exp(-r_t*T_t) * payoff.mean()

    delta, vega = torch.autograd.grad(price, (S0_t, sig_t))
    return price.item(), delta.detach().numpy(), vega.detach().numpy()


In [34]:
import time, numpy as np, torch, math
# (assume fg_sample, gbm_paths_np, delta_vega_fd, delta_vega_aad are already defined)

# ---- user knobs ----
seed     = 1234
n_paths  = 100_000_000
np.random.seed(seed)
rng      = np.random.default_rng(seed)

# ---- sample one scenario (reproducible via np.random.seed above) ----
params = fg_sample()
print("Scenario parameters:")
print({k: params[k] for k in ['S0','sigma','T','r','K']}, "\n")

# ---- finite-difference (timed) ----
t0 = time.perf_counter()
fd_price, fd_delta, fd_vega = delta_vega_fd(params, n_paths, rng=rng)
fd_time = time.perf_counter() - t0

# ---- adjoint-mode (timed) ----
t1 = time.perf_counter()
aad_price, aad_delta, aad_vega = delta_vega_aad(params, n_paths, seed)
aad_time = time.perf_counter() - t1

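# ---- pick the component to report ----
# Note: this is the arg-min of ONE freshly simulated path; the worst performer
# differs from path to path, so i_star only selects which Greek gets printed.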
i_star = np.argmin(
    gbm_paths_np(params['S0'], params['sigma'], params['T'],
                 params['r'], params['rho'], 1, rng)[0]
)

# ---- report ----
print(f"MC paths        : {n_paths:,}")
print(f"Discount price  : {fd_price:.6f}\n")

print("Finite-Difference:")
print(f"  Δ(i*={i_star}) = {fd_delta[i_star]:.6f}")
print(f"  ν(i*={i_star}) = {fd_vega[i_star]:.6f}")
print(f"  time           : {fd_time:.3f} s\n")

print("Adjoint-Mode (AAD):")
print(f"  Δ(i*={i_star}) = {aad_delta[i_star]:.6f}")
print(f"  ν(i*={i_star}) = {aad_vega[i_star]:.6f}")
print(f"  time           : {aad_time:.3f} s\n")

speedup = fd_time / aad_time if aad_time else float('inf')
print(f"AAD speed-up ≈ {speedup:.1f}×")


Scenario parameters:
{'S0': array([208.69790336,  90.89294075, 337.48587926]), 'sigma': array([0.77997581, 0.27259261, 0.27646426]), 'T': 3.8134920634920637, 'r': 0.03, 'K': 100.0} 

MC paths        : 100,000,000
Discount price  : 12.684097

Finite-Difference:
  Δ(i*=1) = 0.453741
  ν(i*=1) = 39.497323
  time           : 87.835 s

Adjoint-Mode (AAD):
  Δ(i*=1) = 0.329725
  ν(i*=1) = 39.779329
  time           : 12.130 s

AAD speed-up ≈ 7.2×
