# Capstone — Options & Risk Lab (Projects 1→10 Combined) — Interactive

This **single notebook** covers the whole path from **Brownian motion** to a **complete options workflow**:
- **Modeling**: BM / GBM simulation, stylized facts (fat tails, clustering)
- **Pricing**: Black–Scholes (analytic + Greeks), Monte Carlo, Binomial CRR, American option (LSM)
- **Volatility**: Historical vs **GARCH** forecast vs **Implied Vol**
- **Surface**: Market (yfinance) or synthetic **Implied Vol Surface**, **SVI** fit per maturity
- **Hedging**: Delta-hedging P&L and how it changes with volatility inputs
- **Risk**: VaR/CVaR on returns and hedged P&L
- **Strategy**: A volatility-timing example (risk-managed exposure)

> Written in **English**, with **interactive Plotly charts** + **ipywidgets** dashboards.  
> Charts are exported to `assets/` as HTML when possible.

---

## Table of contents
1. Setup & Data
2. Stylized facts (returns, fat tails, clustering)
3. BM/GBM simulation (interactive)
4. Pricing engines: BS / MC / CRR / LSM
5. Volatility modeling: Historical vs GARCH
6. Implied vol + Vol surface + SVI (market or synthetic)
7. Hedging lab: delta hedging P&L
8. Risk lab: VaR/CVaR
9. Strategy lab: volatility timing (toy)
10. Interview pitch (what to say)

---

> **Note:** In this notebook, `pxl` refers to `plotly.express` (to avoid name clashes with the `px` price series).


## 0) Setup (paths + reproducibility)

In [1]:
from __future__ import annotations
from pathlib import Path
import numpy as np

SEED = 42
rng = np.random.default_rng(SEED)

PROJECT_DIR = Path.cwd()
ASSETS_DIR = PROJECT_DIR / "assets"
ASSETS_DIR.mkdir(parents=True, exist_ok=True)

print("CWD:", PROJECT_DIR)
print("ASSETS_DIR:", ASSETS_DIR.resolve())

CWD: c:\Users\Karim\Desktop\quant-finance-portfolio\projects\Capstone 01-10
ASSETS_DIR: C:\Users\Karim\Desktop\quant-finance-portfolio\projects\Capstone 01-10\assets


## 1) Imports

In [2]:
import math
import numpy as np
import pandas as pd
import plotly.express as pxl
import plotly.graph_objects as go

import ipywidgets as widgets
from IPython.display import display, clear_output

from scipy.stats import norm
from scipy.optimize import brentq, minimize

# Optional market data
try:
    import yfinance as yf
    HAVE_YFINANCE = True
except Exception:
    HAVE_YFINANCE = False

# Optional volatility modeling (ARCH)
try:
    from arch import arch_model
    HAVE_ARCH = True
except Exception:
    HAVE_ARCH = False

HAVE_YFINANCE, HAVE_ARCH

(True, True)

## 2) Data utilities: prices → log-returns

We use daily log-returns:
\[
r_t = \log(S_t) - \log(S_{t-1}).
\]

If market download fails, we generate synthetic clustered returns (GARCH-like).

In [3]:
def _today_utc_naive() -> pd.Timestamp:
    # avoid tz-aware vs tz-naive issues
    return pd.Timestamp.utcnow().tz_localize(None).normalize()

def download_prices(ticker: str, start: str = "2015-01-01") -> pd.Series:
    if not HAVE_YFINANCE:
        raise RuntimeError("yfinance not installed. pip install yfinance")
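    px = yf.download(ticker, start=start, auto_adjust=True, progress=False)["Close"].dropna()
    # Some yfinance versions return a one-column DataFrame here; reduce it to a Series.
    if isinstance(px, pd.DataFrame):
        px = px.squeeze("columns")
    # Test the price series itself. The original tested `pxl.empty` (plotly.express),
    # which raised AttributeError and forced the synthetic fallback recorded below.
    if px.empty: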
        raise RuntimeError("Empty price series")
    return px.rename(ticker)

def log_returns(px: pd.Series) -> pd.Series:
    return np.log(px).diff().dropna().rename("r")

def synthetic_clustered_returns(n_days: int = 2500, seed: int = 42) -> pd.Series:
    rng_ = np.random.default_rng(seed)
    omega, alpha, beta = 1e-6, 0.07, 0.92
    z = rng_.standard_normal(n_days)
    sig2 = np.zeros(n_days)
    r = np.zeros(n_days)
    sig2[0] = omega / (1 - alpha - beta)
    r[0] = np.sqrt(sig2[0]) * z[0]
    for t in range(1, n_days):
        sig2[t] = omega + alpha*(r[t-1]**2) + beta*sig2[t-1]
        r[t] = np.sqrt(sig2[t]) * z[t]
    idx = pd.bdate_range("2015-01-01", periods=n_days)
    return pd.Series(r, index=idx, name="r_SYNTH")

def get_market_data(ticker: str = "SPY", start: str = "2015-01-01"):
    if HAVE_YFINANCE:
        try:
            px = download_prices(ticker, start=start)
            r = log_returns(px)
            if len(r) < 400:
                raise RuntimeError("Not enough data")
            return px, r, False
        except Exception as e:
            print("⚠️ Market download failed, using synthetic clustered returns. Reason:", e)
            r = synthetic_clustered_returns(seed=SEED)
            # build a pseudo price from returns
            px = pd.Series(100*np.exp(np.cumsum(r.values)), index=r.index, name="SYNTH")
            return px, r, True
    else:
        print("⚠️ yfinance not available, using synthetic clustered returns.")
        r = synthetic_clustered_returns(seed=SEED)
        px = pd.Series(100*np.exp(np.cumsum(r.values)), index=r.index, name="SYNTH")
        return px, r, True

px, r, used_synth = get_market_data("SPY", "2015-01-01")
px.head(), r.head(), used_synth

⚠️ Market download failed, using synthetic clustered returns. Reason: module 'plotly.express' has no attribute 'empty'


(2015-01-01    100.305182
 2015-01-02     99.300750
 2015-01-05    100.026847
 2015-01-06    100.930474
 2015-01-07     99.071677
 Freq: B, Name: SYNTH, dtype: float64,
 2015-01-01    0.003047
 2015-01-02   -0.010064
 2015-01-05    0.007285
 2015-01-06    0.008993
 2015-01-07   -0.018588
 Freq: B, Name: r_SYNTH, dtype: float64,
 True)
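
As a small sketch of why log-returns are convenient, the snippet below checks that they add across time: the sum of daily log-returns equals the log of the total price ratio. The price list is made up for illustration and is not market data.

```python
import math

# Hypothetical closing prices (illustrative values only)
prices = [100.0, 101.5, 99.8, 102.3, 103.0]

# Daily log-returns: r_t = log(S_t) - log(S_{t-1})
log_rets = [math.log(b) - math.log(a) for a, b in zip(prices[:-1], prices[1:])]

# Additivity: summing log-returns gives the log of the total price ratio
total_log_ret = sum(log_rets)
direct = math.log(prices[-1] / prices[0])

# The last price can be rebuilt from the first price and the summed log-returns
rebuilt_last = prices[0] * math.exp(total_log_ret)

print(f"sum of log-returns : {total_log_ret:.6f}")
print(f"log(S_T / S_0)     : {direct:.6f}")
print(f"rebuilt last price : {rebuilt_last:.4f}")
```

This is the same identity the fallback branch relies on when it rebuilds a pseudo price series with `np.exp(np.cumsum(r))`.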

## 3) Stylized facts (interactive)

We visualize:
- daily returns time series
- rolling volatility proxy (annualized)
- histogram of returns

**Interpretation**
- volatility clustering → motivates GARCH-family models
- heavy tails → motivates Student-t innovations or stochastic volatility
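
The two stylized facts can also be checked numerically. The sketch below is a standalone illustration in pure Python (not the notebook's own code): it simulates GARCH(1,1) returns with the same parameters as the synthetic fallback, then computes the excess kurtosis (fat tails) and the lag-1 autocorrelation of returns and of squared returns (clustering). The helper names are hypothetical.

```python
import random

def simulate_garch11(n, omega=1e-6, alpha=0.07, beta=0.92, seed=0):
    """Simulate GARCH(1,1) returns with Gaussian shocks (illustrative parameters)."""
    rnd = random.Random(seed)
    sig2 = omega / (1.0 - alpha - beta)  # start at the long-run variance
    rets = []
    for _ in range(n):
        r_t = (sig2 ** 0.5) * rnd.gauss(0.0, 1.0)
        rets.append(r_t)
        sig2 = omega + alpha * r_t**2 + beta * sig2  # next day's conditional variance
    return rets

def excess_kurtosis(x):
    """Sample excess kurtosis (0 for a normal distribution)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 / m2**2 - 3.0

def autocorr(x, lag):
    """Sample autocorrelation at the given lag."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x)
    cov = sum((x[i] - m) * (x[i - lag] - m) for i in range(lag, n))
    return cov / var

rets = simulate_garch11(20_000, seed=1)
squared = [v * v for v in rets]

kurt = excess_kurtosis(rets)
acf_ret = autocorr(rets, 1)
acf_sq = autocorr(squared, 1)

print(f"excess kurtosis of returns    : {kurt:.3f}")
print(f"lag-1 autocorr of returns     : {acf_ret:.3f}")
print(f"lag-1 autocorr of squared ret : {acf_sq:.3f}")
```

Returns themselves should show almost no autocorrelation, while squared returns should be positively autocorrelated; that contrast is what "volatility clustering" refers to.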

In [4]:
df = pd.DataFrame({"r": r})
df["roll_vol_21d"] = df["r"].rolling(21).std(ddof=1) * np.sqrt(252)

fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=df.index, y=df["r"], mode="lines", name="log-returns"))
fig1.update_layout(template="plotly_dark", title="Daily log-returns", xaxis_title="Date", yaxis_title="r_t")
fig1.show()
fig1.write_html(ASSETS_DIR / "stylized_returns.html")

fig2 = go.Figure()
fig2.add_trace(go.Scatter(x=df.index, y=df["roll_vol_21d"], mode="lines", name="Rolling vol (ann.)"))
fig2.update_layout(template="plotly_dark", title="Rolling volatility (21d, annualized)", xaxis_title="Date", yaxis_title="Vol")
fig2.show()
fig2.write_html(ASSETS_DIR / "stylized_rolling_vol.html")

fig3 = pxl.histogram(df, x="r", nbins=120, template="plotly_dark", title="Return distribution (histogram)")
fig3.update_layout(xaxis_title="Daily log-return", yaxis_title="Count")
fig3.show()
fig3.write_html(ASSETS_DIR / "stylized_hist.html")









## 4) BM/GBM simulation (interactive)

GBM:
\[
dS_t = \mu S_t\,dt + \sigma S_t\,dW_t
\]
Discrete form:
\[
S_{t+\Delta t} = S_t\exp\left((\mu-\tfrac{1}{2}\sigma^2)\Delta t + \sigma\sqrt{\Delta t}\,Z\right).
\]

**Interpretation**
- \(\mu\) moves the trend; \(\sigma\) controls dispersion
- increasing time horizon widens the fan of possible outcomes
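
A quick sanity check of the discrete formula, written as a standalone pure-Python sketch with illustrative parameters: under GBM the terminal mean is \(S_0 e^{\mu T}\), while the median is \(S_0 e^{(\mu-\frac{1}{2}\sigma^2)T}\), so the median sits below the mean.

```python
import math
import random
import statistics

# Illustrative parameters (assumptions, not calibrated to data)
S0, mu, sigma, T = 100.0, 0.08, 0.20, 1.0
n_paths = 100_000
rnd = random.Random(42)

# Exact one-step simulation of S_T under GBM
terminal = [
    S0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * rnd.gauss(0.0, 1.0))
    for _ in range(n_paths)
]

sample_mean = statistics.fmean(terminal)
sample_median = statistics.median(terminal)
theory_mean = S0 * math.exp(mu * T)
theory_median = S0 * math.exp((mu - 0.5 * sigma**2) * T)

print(f"mean   : sample {sample_mean:.3f} vs theory {theory_mean:.3f}")
print(f"median : sample {sample_median:.3f} vs theory {theory_median:.3f}")
```

The gap between mean and median grows with \(\sigma\) and \(T\), which is one reason the percentile fan is not symmetric around the expected path.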

In [5]:
def simulate_gbm_paths(S0: float, mu: float, sigma: float, T: float, steps: int, n_paths: int, seed: int = 42):
    rng_ = np.random.default_rng(seed)
    dt = T / steps
    Z = rng_.standard_normal((steps, n_paths))
    increments = (mu - 0.5*sigma**2)*dt + sigma*np.sqrt(dt)*Z
    logS = np.log(S0) + np.vstack([np.zeros(n_paths), np.cumsum(increments, axis=0)])
    S = np.exp(logS)
    tgrid = np.linspace(0, T, steps+1)
    return tgrid, S

def plot_gbm_paths(t, S, title="GBM paths"):
    n_paths = S.shape[1]
    fig = go.Figure()
    for i in range(min(n_paths, 30)):
        fig.add_trace(go.Scatter(x=t, y=S[:, i], mode="lines", name=f"path {i+1}", showlegend=False))
    # percentile bands
    p05 = np.percentile(S, 5, axis=1)
    p50 = np.percentile(S, 50, axis=1)
    p95 = np.percentile(S, 95, axis=1)
    fig.add_trace(go.Scatter(x=t, y=p50, mode="lines", name="median"))
    fig.add_trace(go.Scatter(x=t, y=p05, mode="lines", name="5%"))
    fig.add_trace(go.Scatter(x=t, y=p95, mode="lines", name="95%"))
    fig.update_layout(template="plotly_dark", title=title, xaxis_title="t", yaxis_title="S_t")
    return fig

S0 = float(px.iloc[-1])
t, S = simulate_gbm_paths(S0=S0, mu=0.08, sigma=0.2, T=1.0, steps=252, n_paths=200, seed=SEED)
fig = plot_gbm_paths(t, S, title="GBM simulation (sample paths + percentile fan)")
fig.show()
fig.write_html(ASSETS_DIR / "gbm_paths.html")

## 5) Pricing engines

### Black–Scholes call price
\[
C = S_0 e^{-qT}N(d_1)-Ke^{-rT}N(d_2)
\]
\[
d_1 = \frac{\ln(S_0/K)+(r-q+\tfrac{1}{2}\sigma^2)T}{\sigma\sqrt{T}},\quad d_2=d_1-\sigma\sqrt{T}.
\]

We implement:
- Black–Scholes (call/put) + Greeks (Delta, Vega)
- Monte Carlo pricing under GBM
- Binomial CRR (European + American)
- Longstaff–Schwartz (LSM) for American put (fast and interview-friendly)

**Interpretation**
- BS is closed-form under constant vol
- MC shows convergence and supports extensions
- CRR/LSM show early exercise / American pricing
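
A useful check on any Black–Scholes implementation is put-call parity, \(C - P = S_0 e^{-qT} - K e^{-rT}\). The sketch below is standalone and uses `math.erf` for the normal CDF instead of `scipy.stats.norm`; the function name `bs_prices` and the inputs are illustrative.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_prices(S0, K, T, r, q, sigma):
    """European call and put under Black-Scholes with continuous dividend yield q."""
    v = sigma * math.sqrt(T)
    d1 = (math.log(S0 / K) + (r - q + 0.5 * sigma**2) * T) / v
    d2 = d1 - v
    call = S0 * math.exp(-q * T) * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)
    put = K * math.exp(-r * T) * norm_cdf(-d2) - S0 * math.exp(-q * T) * norm_cdf(-d1)
    return call, put

# Illustrative inputs (assumptions)
S0, K, T, r, q, sigma = 100.0, 105.0, 0.5, 0.03, 0.01, 0.25
call, put = bs_prices(S0, K, T, r, q, sigma)

parity_lhs = call - put
parity_rhs = S0 * math.exp(-q * T) - K * math.exp(-r * T)

print(f"call = {call:.6f}, put = {put:.6f}")
print(f"C - P = {parity_lhs:.6f}, S0*exp(-qT) - K*exp(-rT) = {parity_rhs:.6f}")
```

Parity holds for European options only; an American put is worth at least as much as its European counterpart.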

In [6]:
def bs_call_put(S0: float, K: float, T: float, r: float, q: float, sigma: float):
    if T <= 0:
        call = max(0.0, S0 - K)
        put = max(0.0, K - S0)
        return call, put
    if sigma <= 0:
        DF_r = math.exp(-r*T)
        DF_q = math.exp(-q*T)
        call = DF_q*S0 - DF_r*K
        call = max(call, 0.0)
        put = call + DF_r*K - DF_q*S0
        return call, put
    vol = sigma*math.sqrt(T)
    d1 = (math.log(S0/K) + (r - q + 0.5*sigma*sigma)*T)/vol
    d2 = d1 - vol
    DF_r = math.exp(-r*T)
    DF_q = math.exp(-q*T)
    call = DF_q*S0*norm.cdf(d1) - DF_r*K*norm.cdf(d2)
    put  = DF_r*K*norm.cdf(-d2) - DF_q*S0*norm.cdf(-d1)
    return float(call), float(put)

def bs_greeks_call(S0: float, K: float, T: float, r: float, q: float, sigma: float):
    if T <= 0 or sigma <= 0:
        return {"delta": float("nan"), "vega": float("nan")}
    vol = sigma*math.sqrt(T)
    d1 = (math.log(S0/K) + (r - q + 0.5*sigma*sigma)*T)/vol
    DF_q = math.exp(-q*T)
    delta = DF_q*norm.cdf(d1)
    vega = DF_q*S0*norm.pdf(d1)*math.sqrt(T)
    return {"delta": float(delta), "vega": float(vega)}

def mc_call_gbm(S0: float, K: float, T: float, r: float, q: float, sigma: float, n_paths: int = 200_000, seed: int = 42):
    rng_ = np.random.default_rng(seed)
    Z = rng_.standard_normal(n_paths)
    ST = S0*np.exp((r - q - 0.5*sigma*sigma)*T + sigma*np.sqrt(T)*Z)
    payoff = np.maximum(ST - K, 0.0)
    return float(np.exp(-r*T)*np.mean(payoff))

def crr_price(S0: float, K: float, T: float, r: float, q: float, sigma: float, steps: int = 200, is_call: bool = True, is_american: bool = False):
    dt = T/steps
    u = math.exp(sigma*math.sqrt(dt))
    d = 1/u
    disc = math.exp(-r*dt)
    p = (math.exp((r-q)*dt) - d) / (u - d)
    # terminal prices
    j = np.arange(steps+1)
    ST = S0*(u**j)*(d**(steps-j))
    if is_call:
        V = np.maximum(ST - K, 0.0)
    else:
        V = np.maximum(K - ST, 0.0)
    # backward induction
    for n in range(steps-1, -1, -1):
        V = disc*(p*V[1:] + (1-p)*V[:-1])
        if is_american:
            j = np.arange(n+1)
            S_n = S0*(u**j)*(d**(n-j))
            intrinsic = np.maximum(S_n - K, 0.0) if is_call else np.maximum(K - S_n, 0.0)
            V = np.maximum(V, intrinsic)
    return float(V[0])

def lsm_american_put(S0: float, K: float, T: float, r: float, q: float, sigma: float, steps: int = 100, n_paths: int = 50_000, seed: int = 42, degree: int = 2):
    """
    Longstaff–Schwartz American put (polynomial basis on S).
    Returns price estimate.
    """
    rng_ = np.random.default_rng(seed)
    dt = T/steps
    disc = math.exp(-r*dt)

    # simulate GBM under risk-neutral drift (r-q)
    Z = rng_.standard_normal((steps, n_paths))
    increments = (r - q - 0.5*sigma*sigma)*dt + sigma*math.sqrt(dt)*Z
    logS = np.log(S0) + np.vstack([np.zeros(n_paths), np.cumsum(increments, axis=0)])
    S = np.exp(logS)  # shape (steps+1, n_paths)

    # cashflows at maturity
    cf = np.maximum(K - S[-1], 0.0)
    exercise_time = np.full(n_paths, steps, dtype=int)

    # backwards
    for t in range(steps-1, 0, -1):
        St = S[t]
        itm = (K - St) > 0
        if not np.any(itm):
            cf *= disc
            continue

        X = St[itm]
        Y = cf[itm] * disc  # discounted continuation values

        # regression basis [1, X, X^2, ...]
        A = np.vstack([X**k for k in range(degree+1)]).T
        # least squares
        beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
        cont = A @ beta

        ex = (K - X)
        exercise_now = ex > cont

        # update CF: if exercise, take payoff now; else keep discounted continuation
        idx_itm = np.where(itm)[0]
        ex_idx = idx_itm[exercise_now]

        cf = cf * disc
        cf[ex_idx] = (K - St[ex_idx])
        exercise_time[ex_idx] = t

    # The loop stops at t=1, so one more discount step brings the cashflows back to t=0.
    price = float(np.mean(cf) * disc)
    return price

### Quick pricing demo (BS vs MC vs CRR vs LSM)

In [7]:
S0 = float(px.iloc[-1])
K = S0
T = 30/365
r_rf = 0.02
q_div = 0.0
sigma = float(r.rolling(63).std(ddof=1).iloc[-1] * np.sqrt(252))

bsC, bsP = bs_call_put(S0, K, T, r_rf, q_div, sigma)
mcC = mc_call_gbm(S0, K, T, r_rf, q_div, sigma, n_paths=80_000, seed=SEED)
crrC = crr_price(S0, K, T, r_rf, q_div, sigma, steps=200, is_call=True, is_american=False)
am_put_crr = crr_price(S0, K, T, r_rf, q_div, sigma, steps=200, is_call=False, is_american=True)
am_put_lsm = lsm_american_put(S0, K, T, r_rf, q_div, sigma, steps=80, n_paths=30_000, seed=SEED)

pd.DataFrame({
    "value": [bsC, mcC, crrC, bsP, am_put_crr, am_put_lsm]
}, index=["BS Call", "MC Call", "CRR Call (Eur)", "BS Put (Eur)", "CRR Put (Amer)", "LSM Put (Amer)"]).round(6)

| | value |
|---|---|
| BS Call | 0.823461 |
| MC Call | 0.822023 |
| CRR Call (Eur) | 0.822466 |
| BS Put (Eur) | 0.769234 |
| CRR Put (Amer) | 0.771644 |
| LSM Put (Amer) | 0.768005 |
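
Small gaps between the Monte Carlo and closed-form values are expected: a Monte Carlo price is an estimate with a standard error. The standalone sketch below (pure Python, illustrative inputs, hypothetical helper names) reports the standard error and a 95% confidence interval next to the Black–Scholes value.

```python
import math
import random
import statistics

def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, T, r, sigma):
    """Black-Scholes call without dividends."""
    v = sigma * math.sqrt(T)
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / v
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d1 - v)

# Illustrative inputs (assumptions)
S0, K, T, r, sigma = 100.0, 100.0, 30 / 365, 0.02, 0.20
n_paths = 50_000
rnd = random.Random(42)
disc = math.exp(-r * T)

# Discounted payoffs under the risk-neutral GBM
payoffs = []
for _ in range(n_paths):
    z = rnd.gauss(0.0, 1.0)
    ST = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
    payoffs.append(disc * max(ST - K, 0.0))

mc_price = statistics.fmean(payoffs)
std_err = statistics.stdev(payoffs) / math.sqrt(n_paths)   # shrinks like 1/sqrt(n_paths)
ci_low, ci_high = mc_price - 1.96 * std_err, mc_price + 1.96 * std_err
bs_price = bs_call(S0, K, T, r, sigma)

print(f"MC price : {mc_price:.4f} +/- {std_err:.4f} (1 s.e.)")
print(f"95% CI   : [{ci_low:.4f}, {ci_high:.4f}]")
print(f"BS price : {bs_price:.4f}")
```

Halving the standard error takes roughly four times as many paths, which is why variance-reduction techniques matter for larger problems.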


## 6) Volatility modeling: Historical vs GARCH forecast

We fit a simple **GARCH(1,1)** (if `arch` is installed).  
Then we compare:
- rolling historical vol
- conditional vol (GARCH)
- 1-step ahead forecast

**Interpretation**
- GARCH reacts to shocks and models volatility clustering
- Using forecast vol can change prices and hedging P&L
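
Beyond the 1-step forecast, a GARCH(1,1) variance forecast reverts geometrically to its long-run level: \(\mathbb{E}[\sigma^2_{t+h}] = \bar\sigma^2 + (\alpha+\beta)^{h-1}(\sigma^2_{t+1}-\bar\sigma^2)\) with \(\bar\sigma^2 = \omega/(1-\alpha-\beta)\). The sketch below is standalone and does not use the `arch` package; the parameters are those of the synthetic generator and the starting variance is an assumption.

```python
import math

# Parameters of the synthetic generator; next-day variance is an assumed shock level
omega, alpha, beta = 1e-6, 0.07, 0.92
long_run_var = omega / (1.0 - alpha - beta)
next_day_var = 0.02 ** 2   # 2% daily vol, i.e. a stressed starting point

def forecast_var_closed(h: int) -> float:
    """Closed-form h-step-ahead conditional variance forecast (h >= 1)."""
    return long_run_var + (alpha + beta) ** (h - 1) * (next_day_var - long_run_var)

def forecast_var_recursive(h: int) -> float:
    """Same forecast obtained by iterating E[s2_{k+1}] = omega + (alpha+beta) * E[s2_k]."""
    v = next_day_var
    for _ in range(h - 1):
        v = omega + (alpha + beta) * v
    return v

def ann_vol(daily_var: float) -> float:
    return math.sqrt(252.0 * daily_var)

for h in (1, 5, 21, 63, 252):
    print(f"h={h:>3d}  forecast vol (ann.) = {ann_vol(forecast_var_closed(h)):.4f}")
print(f"long-run vol (ann.) = {ann_vol(long_run_var):.4f}")
```

With \(\alpha+\beta = 0.99\) the decay is slow, so a volatility shock still affects forecasts several weeks ahead.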

In [8]:
def rolling_vol_ann(r: pd.Series, window: int = 63) -> pd.Series:
    return (r.rolling(window).std(ddof=1) * np.sqrt(252)).rename(f"hist_vol_{window}d")

def fit_garch11(r: pd.Series):
    if not HAVE_ARCH:
        return None
    # arch expects percent returns
    r_pct = 100*r
    am = arch_model(r_pct, vol="Garch", p=1, q=1, dist="normal", mean="Zero")
    res = am.fit(disp="off")
    cond_vol_ann = (res.conditional_volatility/100.0) * np.sqrt(252)
    # forecast 1 step
    fc = res.forecast(horizon=1, reindex=False)
    sigma1d = float(np.sqrt(fc.variance.values[-1,0]) / 100.0)
    sigma_forecast_ann = sigma1d * np.sqrt(252)
    return {"res": res, "cond_vol_ann": pd.Series(cond_vol_ann, index=r.index, name="garch_vol"), "forecast_ann": sigma_forecast_ann}

hist_vol = rolling_vol_ann(r, 63)
garch_fit = fit_garch11(r)

cmp = pd.DataFrame({"hist_vol_63d": hist_vol}).dropna()
if garch_fit is not None:
    cmp["garch_vol"] = garch_fit["cond_vol_ann"].reindex(cmp.index)

fig = go.Figure()
for c in cmp.columns:
    fig.add_trace(go.Scatter(x=cmp.index, y=cmp[c], mode="lines", name=c))
fig.update_layout(template="plotly_dark", title="Historical vs GARCH conditional volatility (annualized)", xaxis_title="Date", yaxis_title="Vol")
fig.show()
fig.write_html(ASSETS_DIR / "vol_hist_vs_garch.html")

garch_forecast = garch_fit["forecast_ann"] if garch_fit is not None else float("nan")
{"garch_available": garch_fit is not None, "latest_hist_vol": float(hist_vol.dropna().iloc[-1]), "garch_1step_forecast_ann": garch_forecast}

{'garch_available': True,
 'latest_hist_vol': 0.21102180017206418,
 'garch_1step_forecast_ann': np.float64(0.1966731206046939)}

## 7) Implied vol + Vol surface + SVI (market or synthetic)

We attempt to fetch option chains from **yfinance** and compute implied vols.  
If that fails, we build a synthetic surface (so the notebook always runs).

Then we fit **SVI** per maturity and plot:
- market IV scatter + SVI surface mesh (3D)
- smile slices and residuals

(Implementation is adapted from the Project 10 fixes to avoid timezone issues.)
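
Implied volatility is defined by inverting the pricing formula, so a round trip (vol → price → vol) is a natural test. The sketch below is standalone and uses plain bisection rather than the Brent solver used in the notebook; function names and inputs are illustrative.

```python
import math

def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black76_call(F, K, T, r, sigma):
    """Discounted call price written on the forward F."""
    v = sigma * math.sqrt(T)
    d1 = (math.log(F / K) + 0.5 * v * v) / v
    d2 = d1 - v
    return math.exp(-r * T) * (F * norm_cdf(d1) - K * norm_cdf(d2))

def implied_vol_bisect(price, F, K, T, r, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma; relies on the call price being increasing in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black76_call(F, K, T, r, mid) > price:
            hi = mid   # model price too high, so vol is too high
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Illustrative inputs (assumptions)
S0, K, T, r = 100.0, 110.0, 0.5, 0.02
F = S0 * math.exp(r * T)
true_vol = 0.25

price = black76_call(F, K, T, r, true_vol)
iv = implied_vol_bisect(price, F, K, T, r)

print(f"price = {price:.6f}, recovered implied vol = {iv:.8f}")
```

Bisection is slower than Brent's method but cannot diverge as long as the price lies between the no-arbitrage bounds.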

In [9]:
def bs_call_forward(F: float, K: float, T: float, r: float, sigma: float) -> float:
    if T <= 0:
        return max(0.0, F - K)
    DF = math.exp(-r * T)
    if sigma <= 0:
        return DF * max(0.0, F - K)
    vol_sqrt = sigma * math.sqrt(T)
    d1 = (math.log(F / K) + 0.5 * sigma * sigma * T) / vol_sqrt
    d2 = d1 - vol_sqrt
    return DF * (F * norm.cdf(d1) - K * norm.cdf(d2))

def implied_vol_call_forward(price: float, F: float, K: float, T: float, r: float, lo: float = 1e-8, hi: float = 5.0) -> float:
    if T <= 0 or F <= 0 or K <= 0:
        return float("nan")
    DF = math.exp(-r*T)
    intrinsic = DF * max(0.0, F - K)
    upper = DF * F
    if not (intrinsic - 1e-12 <= price <= upper + 1e-12):
        return float("nan")
    def f(sig): return bs_call_forward(F, K, T, r, sig) - price
    a, b = lo, hi
    fa, fb = f(a), f(b)
    if fa*fb > 0:
        for b_try in [7.5, 10.0, 15.0, 20.0]:
            b = b_try
            fb = f(b)
            if fa*fb <= 0:
                break
    if fa*fb > 0:
        return float("nan")
    try:
        return float(brentq(f, a, b, maxiter=200))
    except Exception:
        return float("nan")

def fetch_option_chain_yf(ticker: str, r: float = 0.02, max_expiries: int = 6) -> tuple[float, pd.DataFrame]:
    if not HAVE_YFINANCE:
        raise RuntimeError("yfinance not installed")
    tk = yf.Ticker(ticker)
    hist = tk.history(period="5d", auto_adjust=True)
    if hist.empty:
        raise RuntimeError("Empty price history")
    S0 = float(hist["Close"].iloc[-1])

    expiries = list(tk.options)[:max_expiries]
    if len(expiries) == 0:
        raise RuntimeError("No option expiries returned")

    today = _today_utc_naive()
    rows = []
    for exp in expiries:
        exp_dt = pd.to_datetime(exp, utc=True).tz_convert(None)  # tz-naive
        T = max(0.0, (exp_dt - today).days/365.0)
        if T <= 1e-6:
            continue

        chain = tk.option_chain(exp)
        calls = chain.calls.copy()
        if calls.empty:
            continue

        bid = calls.get("bid", pd.Series([np.nan]*len(calls)))
        ask = calls.get("ask", pd.Series([np.nan]*len(calls)))
        last = calls.get("lastPrice", pd.Series([np.nan]*len(calls)))

        mid = (bid + ask)/2.0
        # if bid/ask unusable, use last
        good_ba = (bid > 0) & (ask > 0) & (ask >= bid)
        mid = mid.where(good_ba, last)

        DF = math.exp(-r*T)
        F = S0 * math.exp(r*T)

        for i in range(len(calls)):
            K = float(calls["strike"].iloc[i])
            C = float(mid.iloc[i]) if pd.notna(mid.iloc[i]) else np.nan
            if not np.isfinite(C) or C <= 0:
                continue
            iv = implied_vol_call_forward(C, F, K, T, r)
            if not np.isfinite(iv) or iv <= 0:
                continue
            k = math.log(K/F)
            rows.append({
                "ticker": ticker, "expiry": exp_dt.date().isoformat(), "T": T,
                "K": K, "mid": C, "iv": iv, "k": k, "F": F, "DF": DF,
                "volume": float(calls.get("volume", pd.Series([np.nan]*len(calls))).iloc[i]) if "volume" in calls else np.nan,
                "openInterest": float(calls.get("openInterest", pd.Series([np.nan]*len(calls))).iloc[i]) if "openInterest" in calls else np.nan,
                "bid": float(bid.iloc[i]) if pd.notna(bid.iloc[i]) else np.nan,
                "ask": float(ask.iloc[i]) if pd.notna(ask.iloc[i]) else np.nan,
            })
    df = pd.DataFrame(rows)
    if df.empty:
        raise RuntimeError("No usable options after cleaning")
    df = df[df["iv"].between(0.01, 5.0)].copy()
    return S0, df

def synthetic_surface(S0: float = 100.0, r: float = 0.02, seed: int = 42) -> tuple[float, pd.DataFrame]:
    rng_ = np.random.default_rng(seed)
    today = _today_utc_naive()
    Ts = np.array([0.08, 0.16, 0.33, 0.5, 1.0, 2.0])
    strikes = np.linspace(0.6*S0, 1.4*S0, 31)
    rows = []
    for T in Ts:
        DF = math.exp(-r*T); F = S0*math.exp(r*T)
        for K in strikes:
            k = math.log(K/F)
            base = 0.18 + 0.05*np.exp(-T) + 0.02*np.sqrt(T)
            skew = -0.22*(1.0/(1.0+2.0*T))
            smile = 0.35
            iv = max(0.03, base + skew*k + smile*(k**2) + 0.005*rng_.standard_normal())
            C = bs_call_forward(F, K, T, r, iv)
            iv_back = implied_vol_call_forward(C, F, K, T, r)
            rows.append({
                "ticker":"SYNTH", "expiry":(today + pd.Timedelta(days=int(round(365*T)))).date().isoformat(), "T": float(T),
                "K": float(K), "mid": float(C), "iv": float(iv_back), "k": float(k), "F": float(F), "DF": float(DF),
                "volume": np.nan, "openInterest": np.nan, "bid": np.nan, "ask": np.nan
            })
    return S0, pd.DataFrame(rows).dropna(subset=["iv"])

# SVI
def svi_total_variance(k: np.ndarray, a: float, b: float, rho: float, m: float, sig: float) -> np.ndarray:
    return a + b*(rho*(k-m) + np.sqrt((k-m)**2 + sig**2))

def _unpack_svi(x: np.ndarray):
    a = x[0]
    b = math.exp(x[1])
    rho = math.tanh(x[2])
    m = x[3]
    sig = math.exp(x[4])
    return a, b, rho, m, sig

def fit_svi_one_maturity(k: np.ndarray, iv: np.ndarray, T: float, n_starts: int = 8, seed: int = 42):
    rng_ = np.random.default_rng(seed)
    w_obs = (iv**2)*T
    a0 = max(1e-8, float(np.min(w_obs)*0.8))
    x0 = np.array([a0, math.log(0.1), np.arctanh(0.0), 0.0, math.log(0.2)])
    def obj(x):
        a,b,rho,m,sig = _unpack_svi(x)
        w = svi_total_variance(k,a,b,rho,m,sig)
        if np.any(w<=0): return 1e6
        return float(np.mean((w-w_obs)**2))
    best, bestv = None, float("inf")
    for s in range(n_starts):
        if s==0: x = x0.copy()
        else:
            x = x0.copy()
            x[0] = max(1e-8, a0*(0.5+1.5*rng_.random()))
            x[1] = math.log(0.05+0.5*rng_.random())
            rho0 = np.clip((rng_.random()*2-1)*0.8, -0.95, 0.95)
            x[2] = np.arctanh(rho0)
            x[3] = (rng_.random()*2-1)*0.4
            x[4] = math.log(0.05+0.5*rng_.random())
        res = minimize(obj, x, method="L-BFGS-B")
        if res.success and float(res.fun)<bestv:
            bestv = float(res.fun); best = res.x.copy()
    if best is None:
        res = minimize(obj, x0, method="L-BFGS-B")
        best = res.x.copy(); bestv = float(res.fun)
    a,b,rho,m,sig = _unpack_svi(best)
    return {"T": float(T), "a": a, "b": b, "rho": rho, "m": m, "sig": sig, "mse": bestv}

def calibrate_svi_surface(df: pd.DataFrame, min_points: int = 12) -> pd.DataFrame:
    out=[]
    for T,g in df.groupby("T"):
        g=g.dropna(subset=["k","iv"])
        if len(g)<min_points: continue
        k = g["k"].values.astype(float)
        iv = g["iv"].values.astype(float)
        mask = np.isfinite(k) & np.isfinite(iv) & (iv>0.01) & (iv<5.0)
        k=k[mask]; iv=iv[mask]
        if len(k)<min_points: continue
        out.append(fit_svi_one_maturity(k, iv, float(T), n_starts=10, seed=SEED))
    return pd.DataFrame(out).sort_values("T")

def svi_iv(k: np.ndarray, T: float, p: dict) -> np.ndarray:
    w = svi_total_variance(k, p["a"], p["b"], p["rho"], p["m"], p["sig"])
    w = np.maximum(w, 1e-12)
    return np.sqrt(w/T)

def build_surface_grids(df: pd.DataFrame, svi_params: pd.DataFrame, n_k: int = 71):
    Ts = np.array(sorted(svi_params["T"].unique()))
    kmin = float(df["k"].quantile(0.02)); kmax = float(df["k"].quantile(0.98))
    k_grid = np.linspace(kmin, kmax, n_k)
    Z = np.zeros((len(Ts), len(k_grid)))
    for i,T in enumerate(Ts):
        p = svi_params.loc[svi_params["T"]==T].iloc[0].to_dict()
        Z[i,:] = svi_iv(k_grid, float(T), p)
    return Ts, k_grid, Z

def plot_surface_market_and_svi(df: pd.DataFrame, Ts: np.ndarray, k_grid: np.ndarray, Z: np.ndarray):
    fig = go.Figure()
    fig.add_trace(go.Scatter3d(x=df["T"], y=df["k"], z=df["iv"], mode="markers", name="Market IV", marker=dict(size=3)))
    fig.add_trace(go.Surface(x=Ts, y=k_grid, z=Z.T, showscale=False, opacity=0.65, name="SVI surface"))
    fig.update_layout(template="plotly_dark", title="Implied Vol Surface — Market scatter + SVI mesh",
                      scene=dict(xaxis_title="T", yaxis_title="k=ln(K/F)", zaxis_title="IV"), height=700)
    return fig

# Try market, else synthetic
ticker_for_surface = "AAPL"
rr = 0.02
try:
    S0_s, df_opt = fetch_option_chain_yf(ticker_for_surface, r=rr, max_expiries=6)
    source = "market"
except Exception as e:
    print("⚠️ Market option chain failed -> synthetic surface. Reason:", e)
    S0_s, df_opt = synthetic_surface(S0=100.0, r=rr, seed=SEED)
    source = "synthetic"

# keep maturities with enough points
counts = df_opt.groupby("T").size()
good_T = counts[counts >= 12].index.values
df_opt = df_opt[df_opt["T"].isin(good_T)].copy()

svi_params = calibrate_svi_surface(df_opt, min_points=12)
Ts, k_grid, Z = build_surface_grids(df_opt, svi_params, n_k=71)

fig = plot_surface_market_and_svi(df_opt, Ts, k_grid, Z)
fig.show()
fig.write_html(ASSETS_DIR / "capstone_vol_surface.html")

print("Surface source:", source)
display(svi_params)

Surface source: market


| | T | a | b | rho | m | sig | mse |
|---|---|---|---|---|---|---|---|
| 0 | 0.019178 | -0.082165 | 0.283724 | -0.619175 | -0.241561 | 0.37562 | 9.76822e-06 |
| 1 | 0.038356 | -0.082152 | 0.27059 | -0.516495 | -0.171445 | 0.360555 | 4.592842e-06 |
| 2 | 0.057534 | -0.089428 | 0.26883 | -0.465554 | -0.146159 | 0.381751 | 3.034646e-06 |
| 3 | 0.076712 | -0.027107 | 0.203503 | -0.695523 | 0.469014 | 0.189234 | 0.04055978 |
| 4 | 0.09589 | -0.046019 | 0.183101 | -0.378023 | -0.043277 | 0.28626 | 7.905831e-07 |
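
Several fitted values of `a` are negative, which is allowed as long as the total variance stays positive. For raw SVI the minimum of \(w(k)\) is \(a + b\,\sigma\sqrt{1-\rho^2}\), reached at \(k^* = m - \rho\sigma/\sqrt{1-\rho^2}\). The standalone sketch below checks this for the first row of the table above; it is a diagnostic only, not a full no-arbitrage test.

```python
import math

def svi_w(k, a, b, rho, m, sig):
    """Raw SVI total variance w(k)."""
    return a + b * (rho * (k - m) + math.sqrt((k - m) ** 2 + sig ** 2))

# First row of the fitted-parameter table (T = 0.019178)
T = 0.019178
p = dict(a=-0.082165, b=0.283724, rho=-0.619175, m=-0.241561, sig=0.37562)

# Closed-form minimum of w and its location
root = math.sqrt(1.0 - p["rho"] ** 2)
w_min = p["a"] + p["b"] * p["sig"] * root
k_star = p["m"] - p["rho"] * p["sig"] / root

# Cross-check against the SVI function itself
w_at_kstar = svi_w(k_star, **p)
atm_iv = math.sqrt(svi_w(0.0, **p) / T)

print(f"min total variance = {w_min:.6f} at k* = {k_star:.4f}")
print(f"w(k*) from formula = {w_at_kstar:.6f}")
print(f"ATM implied vol    = {atm_iv:.4f}")
```

A positive minimum means the slice never produces a negative variance; the fourth row, with a much larger `mse`, would deserve the same check before being used.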


## 8) Hedging lab: Delta hedging P&L (toy but very instructive)

We simulate an option under GBM and delta-hedge using a chosen volatility input:
- Historical vol (rolling)
- GARCH forecast (if available)
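- Implied vol (from the surface: ATM, nearest maturity)

**Interpretation**
- If the hedging vol differs from the realized vol, the deltas are mis-sized and the hedged P&L becomes noisier; a systematic bias appears when the option is also priced at the wrong vol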
- More frequent hedging reduces discretization error (but costs more in practice)
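
The effect of the rebalancing frequency can be seen in a small experiment. The sketch below is a standalone pure-Python illustration (hypothetical helper names, illustrative parameters, no dividends, hedging vol equal to the true vol), separate from the notebook's `delta_hedge_pnl_gbm`: it compares the dispersion of the hedged P&L of a short call with 5 versus 80 rebalancing dates.

```python
import math
import random
import statistics

def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, tau, r, sigma):
    if tau <= 0:
        return max(S - K, 0.0)
    v = sigma * math.sqrt(tau)
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / v
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d1 - v)

def bs_delta(S, K, tau, r, sigma):
    if tau <= 0:
        return 1.0 if S > K else 0.0
    v = sigma * math.sqrt(tau)
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / v
    return norm_cdf(d1)

def hedge_error(S0, K, T, r, sigma, steps, rnd):
    """Terminal P&L of a short call, delta-hedged at `steps` equally spaced dates."""
    dt = T / steps
    S = S0
    delta = bs_delta(S, K, T, r, sigma)
    cash = bs_call(S, K, T, r, sigma) - delta * S   # premium received minus cost of shares
    for i in range(1, steps + 1):
        z = rnd.gauss(0.0, 1.0)
        S *= math.exp((r - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
        cash *= math.exp(r * dt)                    # cash accrues at the risk-free rate
        if i < steps:                               # no rebalancing at expiry
            new_delta = bs_delta(S, K, T - i * dt, r, sigma)
            cash -= (new_delta - delta) * S
            delta = new_delta
    return delta * S + cash - max(S - K, 0.0)

def pnl_std(steps, n_paths=400, seed=7):
    rnd = random.Random(seed)
    pnls = [hedge_error(100.0, 100.0, 0.25, 0.02, 0.20, steps, rnd) for _ in range(n_paths)]
    return statistics.stdev(pnls)

std_coarse = pnl_std(steps=5)
std_fine = pnl_std(steps=80)
print(f"P&L std with  5 rebalances: {std_coarse:.4f}")
print(f"P&L std with 80 rebalances: {std_fine:.4f}")
```

In this idealized setting the dispersion shrinks roughly like one over the square root of the number of rebalances; transaction costs, which are ignored here, push in the opposite direction.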

In [10]:
def bs_delta_call(S: float, K: float, T: float, r: float, q: float, sigma: float) -> float:
    if T <= 0 or sigma <= 0:
        return 1.0 if S > K else 0.0
    vol = sigma*math.sqrt(T)
    d1 = (math.log(S/K) + (r-q+0.5*sigma*sigma)*T)/vol
    return float(math.exp(-q*T)*norm.cdf(d1))

def delta_hedge_pnl_gbm(S0: float, K: float, T: float, r: float, q: float, sigma_true: float, sigma_hedge: float,
                        steps: int = 252, seed: int = 42):
    # simulate one path under true sigma
    t, S = simulate_gbm_paths(S0, mu=r-q, sigma=sigma_true, T=T, steps=steps, n_paths=1, seed=seed)
    S = S[:,0]
    dt = T/steps
    # initial option price
    C0, _ = bs_call_put(S0, K, T, r, q, sigma_true)
    # hedge portfolio: hold delta shares financed by cash B
    delta0 = bs_delta_call(S0, K, T, r, q, sigma_hedge)
    B = C0 - delta0*S0  # self-financing initial
    delta = delta0
    # rebalance
    for i in range(1, steps+1):
        # accrue cash at r
        B *= math.exp(r*dt)
        # portfolio value before rebalance
        V = delta*S[i] + B
        # compute new delta for remaining time
        tau = max(0.0, T - t[i])
        delta_new = bs_delta_call(S[i], K, tau, r, q, sigma_hedge)
        # rebalance: buy/sell shares, adjust cash
        B = V - delta_new*S[i]
        delta = delta_new
    # final payoff and hedged PnL
    payoff = max(S[-1] - K, 0.0)
    pnl = (delta*S[-1] + B) - payoff
    return pnl

# Example: compare hedging PnL distribution by running many paths
def hedge_pnl_distribution(S0: float, K: float, T: float, r: float, q: float, sigma_true: float, sigma_hedge: float,
                           steps: int = 252, n_paths: int = 2000, seed: int = 42):
    rng_ = np.random.default_rng(seed)
    pnls = np.array([delta_hedge_pnl_gbm(S0, K, T, r, q, sigma_true, sigma_hedge, steps=steps, seed=int(rng_.integers(0, 1e9)))
                     for _ in range(n_paths)])
    return pnls

S0 = float(px.iloc[-1]); K = S0; T = 30/365; r_rf = 0.02; q_div = 0.0
sigma_true = float(r.rolling(63).std(ddof=1).iloc[-1]*np.sqrt(252))
sigma_hedge_hist = sigma_true
sigma_hedge_garch = garch_forecast if (not np.isnan(garch_forecast)) else sigma_true

pnls_hist = hedge_pnl_distribution(S0, K, T, r_rf, q_div, sigma_true, sigma_hedge_hist, steps=60, n_paths=1200, seed=SEED)
pnls_garch = hedge_pnl_distribution(S0, K, T, r_rf, q_div, sigma_true, sigma_hedge_garch, steps=60, n_paths=1200, seed=SEED+1)

dfp = pd.DataFrame({"PNL_histVol": pnls_hist, "PNL_garchVol": pnls_garch})
fig = pxl.histogram(dfp, x=["PNL_histVol","PNL_garchVol"], barmode="overlay", nbins=80, template="plotly_dark",
                   title="Delta-hedging P&L distribution (toy) — compare hedging vols")
fig.show()
fig.write_html(ASSETS_DIR / "hedging_pnl_hist.html")

dfp.describe().round(6)

| | PNL_histVol | PNL_garchVol |
|---|---|---|
| count | 1200.0 | 1200.0 |
| mean | 0.003848 | 0.001136 |
| std | 0.091298 | 0.089734 |
| min | -0.456473 | -0.448172 |
| 25% | -0.051064 | -0.048399 |
| 50% | 0.005866 | 0.012388 |
| 75% | 0.05375 | 0.056101 |
| max | 0.304355 | 0.299643 |


## 9) Risk lab: VaR and CVaR

We compute VaR/CVaR at confidence level \(\alpha\) for:
- returns \(r_t\)
- hedging P&L distribution

Definitions:
- VaR\(_\alpha\): \(\alpha\)-quantile of losses
- CVaR\(_\alpha\): expected loss beyond VaR

**Interpretation**
- CVaR is more informative for tail risk than VaR: it averages the losses beyond the VaR threshold, while VaR only marks where the tail begins
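
A worked example on ten made-up P&L values makes the two definitions concrete. The sketch below is standalone; it writes out the linear-interpolation quantile by hand (the default rule of `np.quantile`) so that no external package is needed. The function names are hypothetical.

```python
def quantile_linear(sorted_vals, alpha):
    """Empirical quantile with linear interpolation between order statistics."""
    pos = alpha * (len(sorted_vals) - 1)
    lo = int(pos)                              # floor, since pos >= 0
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])

def var_cvar_demo(pnl, alpha):
    """VaR = alpha-quantile of losses; CVaR = mean of losses at or beyond VaR."""
    losses = sorted(-x for x in pnl)           # loss = -P&L
    var = quantile_linear(losses, alpha)
    tail = [l for l in losses if l >= var]
    cvar = sum(tail) / len(tail)
    return var, cvar

# Made-up P&L sample (illustrative values only)
pnl = [-5.0, -3.0, -2.0, -1.0, 0.0, 1.0, 1.0, 2.0, 3.0, 4.0]
var80, cvar80 = var_cvar_demo(pnl, alpha=0.80)

print(f"VaR(80%)  = {var80:.2f}")
print(f"CVaR(80%) = {cvar80:.2f}")
```

Here the 80% VaR interpolates between the losses 2 and 3, while the CVaR averages the two losses beyond it (3 and 5), so CVaR is always at least as large as VaR.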

In [11]:
def var_cvar(x: np.ndarray, alpha: float = 0.99) -> dict:
    x = np.asarray(x)
    x = x[np.isfinite(x)]
    if len(x) == 0:
        return {"VaR": np.nan, "CVaR": np.nan}
    # losses are -x
    losses = -x
    var = np.quantile(losses, alpha)
    cvar = losses[losses >= var].mean() if np.any(losses >= var) else np.nan
    return {"VaR": float(var), "CVaR": float(cvar)}

risk_returns = var_cvar(r.values, alpha=0.99)
risk_hedge_hist = var_cvar(pnls_hist, alpha=0.99)
risk_hedge_garch = var_cvar(pnls_garch, alpha=0.99)

pd.DataFrame([risk_returns, risk_hedge_hist, risk_hedge_garch],
             index=["Daily returns (log)", "Hedging PnL (hist vol)", "Hedging PnL (garch vol)"]).round(6)

| | VaR | CVaR |
|---|---|---|
| Daily returns (log) | 0.02538 | 0.02964 |
| Hedging PnL (hist vol) | 0.239408 | 0.308418 |
| Hedging PnL (garch vol) | 0.245583 | 0.327512 |


## 10) Strategy lab (toy): volatility timing on the underlying

Simple idea:
- When forecast vol is high, reduce exposure; when low, increase exposure.
- Position \(w_t = \min(w_{max}, \frac{\sigma_{target}}{\hat\sigma_t})\).

This is a common “risk control” concept used in volatility-targeting funds.

**Interpretation**
- It tends to stabilize realized risk and can reduce drawdowns
- But it can underperform in strong trends (because exposure is cut after shocks)
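
The weighting rule is easy to trace by hand. The sketch below is standalone, with made-up volatility forecasts and returns: it applies \(w_t = \min(w_{max}, \sigma_{target}/\hat\sigma_t)\) and lags the weight by one day so that each return is scaled by the exposure decided the day before.

```python
# Made-up inputs (illustrative values only)
target_vol, w_max = 0.15, 2.0
vol_forecast = [0.10, 0.15, 0.30, 0.05, 0.20]      # annualized vol estimates
daily_ret = [0.010, -0.020, 0.005, 0.015, -0.010]  # simple returns of the underlying

# Exposure rule: scale to the target, capped at w_max
weights = [min(w_max, target_vol / v) for v in vol_forecast]

# Lag by one day to avoid look-ahead: day t uses the weight decided on day t-1
lagged = [0.0] + weights[:-1]
strat_ret = [w * r for w, r in zip(lagged, daily_ret)]

# Compound the strategy returns into an equity curve end value
equity = 1.0
for x in strat_ret:
    equity *= 1.0 + x

print("weights      :", [round(w, 4) for w in weights])
print("strategy ret :", [round(x, 4) for x in strat_ret])
print(f"final equity : {equity:.6f}")
```

The fourth forecast (5% vol) would imply a weight of 3.0, so the cap at `w_max = 2.0` binds; without the cap, very calm periods would produce very large leverage.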

In [12]:
def vol_target_strategy(px: pd.Series, vol_est_ann: pd.Series, target_vol: float = 0.15, w_max: float = 2.0):
    r_simple = px.pct_change().dropna()
    vol = vol_est_ann.reindex(r_simple.index).replace(0, np.nan).ffill()
    w = (target_vol / vol).clip(0, w_max).fillna(0.0)
    # lag weights to avoid look-ahead
    strat_ret = (w.shift(1).fillna(0.0) * r_simple).rename("ret")
    equity = (1 + strat_ret).cumprod()
    bh = (1 + r_simple).cumprod().rename("buy_hold")
    out = pd.DataFrame({"ret": strat_ret, "equity": equity, "buy_hold": bh, "weight": w})
    return out

# Use GARCH vol if available, else rolling vol
vol_est = (garch_fit["cond_vol_ann"] if garch_fit is not None else hist_vol).reindex(px.index).ffill()
out = vol_target_strategy(px, vol_est, target_vol=0.15, w_max=2.0).dropna()

fig = go.Figure()
fig.add_trace(go.Scatter(x=out.index, y=out["equity"], mode="lines", name="Vol-target strategy"))
fig.add_trace(go.Scatter(x=out.index, y=out["buy_hold"], mode="lines", name="Buy & Hold"))
fig.update_layout(template="plotly_dark", title="Volatility timing (toy) — equity curve", xaxis_title="Date", yaxis_title="Equity")
fig.show()
fig.write_html(ASSETS_DIR / "strategy_vol_target_equity.html")

fig2 = go.Figure()
fig2.add_trace(go.Scatter(x=out.index, y=out["weight"], mode="lines", name="Weight"))
fig2.update_layout(template="plotly_dark", title="Vol-target strategy — exposure weight", xaxis_title="Date", yaxis_title="Weight")
fig2.show()
fig2.write_html(ASSETS_DIR / "strategy_vol_target_weight.html")

out[["ret","equity","weight"]].tail()

| | ret | equity | weight |
|---|---|---|---|
| 2024-07-25 | 0.01986 | 0.345968 | 0.878653 |
| 2024-07-26 | -0.001221 | 0.345546 | 0.804157 |
| 2024-07-29 | -0.019121 | 0.338939 | 0.828705 |
| 2024-07-30 | -0.005587 | 0.337045 | 0.755959 |
| 2024-07-31 | 0.011273 | 0.340845 | 0.772594 |


## 11) Interactive dashboard (one place to play)

Use this dashboard to:
- choose ticker, start date, r, q
- choose option parameters (K as % of spot, maturity)
- compare pricing methods (BS/MC/CRR/LSM)
- compare vol inputs (historical vs GARCH)
- refresh vol surface (market vs synthetic)

This is designed to be **demo-friendly** for GitHub / interviews.
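
As a standalone illustration of the "compare pricing methods" step, the sketch below prices one European call with the Black–Scholes formula and with a one-step GBM Monte Carlo, then checks that they agree within sampling error. The function names `bs_call_sketch` and `mc_call_sketch` and the parameter values are hypothetical and independent of the notebook's own `bs_call_put` and `mc_call_gbm`.

```python
import math
import numpy as np

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_sketch(S0, K, T, r, q, sigma):
    """Black-Scholes European call with continuous dividend yield q."""
    d1 = (math.log(S0 / K) + (r - q + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * math.exp(-q * T) * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def mc_call_sketch(S0, K, T, r, q, sigma, n_paths=200_000, seed=42):
    """Monte Carlo call price under risk-neutral GBM; returns (price, std error)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    # Exact terminal distribution of GBM, so a single time step suffices
    ST = S0 * np.exp((r - q - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
    payoff = np.maximum(ST - K, 0.0)
    disc = math.exp(-r * T)
    price = disc * payoff.mean()
    stderr = disc * payoff.std(ddof=1) / math.sqrt(n_paths)
    return price, stderr

# Assumed inputs: at-the-money call, 30 calendar days, 25% vol
bs = bs_call_sketch(100.0, 100.0, 30 / 365, 0.02, 0.0, 0.25)
mc, se = mc_call_sketch(100.0, 100.0, 30 / 365, 0.02, 0.0, 0.25)
print(f"BS: {bs:.4f} | MC: {mc:.4f} +/- {se:.4f}")
```

Reporting the Monte Carlo standard error alongside the price makes the comparison meaningful: a gap of a few standard errors is sampling noise, while a larger gap points to a modeling or implementation difference.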

In [13]:
ticker_w = widgets.Text(value="ABVX", description="Ticker")
start_w  = widgets.Text(value="2020-01-01", description="Start")
r_w = widgets.FloatSlider(value=0.02, min=0.0, max=0.08, step=0.0025, description="r")
q_w = widgets.FloatSlider(value=0.00, min=0.0, max=0.05, step=0.001, description="q")

Kpct_w = widgets.FloatSlider(value=1.00, min=0.7, max=1.3, step=0.01, description="K/S0")
Tdays_w = widgets.IntSlider(value=30, min=7, max=365, step=1, description="T (days)")

mc_paths_w = widgets.IntSlider(value=80000, min=10000, max=200000, step=10000, description="MC paths")
crr_steps_w = widgets.IntSlider(value=200, min=50, max=600, step=50, description="CRR steps")
lsm_paths_w = widgets.IntSlider(value=30000, min=5000, max=80000, step=5000, description="LSM paths")

run_btn = widgets.Button(description="Run Capstone Summary", button_style="success")
dash_out = widgets.Output()  # separate name so the strategy DataFrame `out` is not overwritten

display(widgets.VBox([
    widgets.HBox([ticker_w, start_w, r_w, q_w]),
    widgets.HBox([Kpct_w, Tdays_w]),
    widgets.HBox([mc_paths_w, crr_steps_w, lsm_paths_w]),
    run_btn,
    dash_out
]))

def run_capstone(_):
    with dash_out:
        clear_output(wait=True)
        ticker = ticker_w.value.strip().upper()
        start = start_w.value.strip()
        rr = float(r_w.value)
        qq = float(q_w.value)
        px_, r_, used_synth_ = get_market_data(ticker, start)
        S0_ = float(px_.iloc[-1])
        K_ = float(Kpct_w.value) * S0_
        T_ = float(Tdays_w.value)/365.0

        sigma_hist = float(r_.rolling(63).std(ddof=1).iloc[-1] * np.sqrt(252))
        gfit = fit_garch11(r_)
        sigma_garch = float(gfit["forecast_ann"]) if (gfit is not None and np.isfinite(gfit["forecast_ann"])) else sigma_hist

        # Pricing (all engines use historical vol unless stated otherwise)
        bsC_hist, bsP = bs_call_put(S0_, K_, T_, rr, qq, sigma_hist)
        mcC = mc_call_gbm(S0_, K_, T_, rr, qq, sigma_hist, n_paths=int(mc_paths_w.value), seed=SEED)
        crrC = crr_price(S0_, K_, T_, rr, qq, sigma_hist, steps=int(crr_steps_w.value), is_call=True, is_american=False)
        am_put_lsm = lsm_american_put(S0_, K_, T_, rr, qq, sigma_hist, steps=80, n_paths=int(lsm_paths_w.value), seed=SEED)

        # Same BS call repriced with the GARCH forecast, to compare vol inputs
        bsC_garch, _ = bs_call_put(S0_, K_, T_, rr, qq, sigma_garch)

        tbl = pd.DataFrame({
            "value": [bsC_hist, bsC_garch, mcC, crrC, bsP, am_put_lsm],
        }, index=["BS Call (hist vol)", "BS Call (garch vol)", "MC Call (hist vol)", "CRR Call (hist vol)", "BS Put (hist vol)", "LSM Put (Amer, hist vol)"])
        display(tbl.round(6))

        # Rolling volatility quick view (21-day window, annualized)
        df_tmp = pd.DataFrame({"r": r_})
        df_tmp["roll_vol_21d"] = df_tmp["r"].rolling(21).std(ddof=1) * np.sqrt(252)
        fig = go.Figure()
        fig.add_trace(go.Scatter(x=df_tmp.index, y=df_tmp["roll_vol_21d"], mode="lines", name="Rolling vol"))
        fig.update_layout(template="plotly_dark", title="Rolling vol (21d, ann.)", xaxis_title="Date", yaxis_title="Vol")
        fig.show()

        print("Used synthetic data:", used_synth_)
        print("sigma_hist:", f"{sigma_hist:.2%}", "| sigma_garch(1-step):", f"{sigma_garch:.2%}")

run_btn.on_click(run_capstone)
run_capstone(None)


## 12) Interview pitch (short)

- Built a complete “Options & Risk Lab” from scratch: modeling → pricing → surface → hedging → risk → strategy.
- Implemented multiple pricing engines (BS, MC, CRR, LSM) and compared outputs.
- Modeled volatility with rolling historical estimates and GARCH(1,1), and connected each vol input to pricing and hedging P&L.
- Built an implied vol surface (market/synthetic), calibrated SVI, and visualized it interactively.
- Added VaR/CVaR and volatility timing as a risk/portfolio extension.