# Notebook 2: Delta-Hedged 30-Day Variance Strip (SPX), t₀ = 2020-06-15 → T* = 2020-07-13 (28d)

**Goal.**

Show that a static VIX-style option strip, delta-hedged from quotes alone, tracks a 30-day variance-swap payoff over the same span.

**Inputs.**

* From Notebook 1: `01_variance_strike/strip_30d.csv`, `01_variance_strike/variance_strike_summary.csv`
* Options panel (mids): `02_hedge/spx_option_quotes_2020-06-15_to_2020-07-17.csv`
* SPX closes: `02_hedge/spx_close_2020-06-01_to_2020-08-15.csv`
* Conventions: `STRIKE_DIVISOR = 1000` (raw `strike_price` is divided by it to obtain strikes); r = q = 0; quotes-only replication at mids; daily hedging at closes; no transaction costs.

**Methods.**

1. **Build the tradable strip.** Load the NB1 legs; drop the audit rows for the boundary term $B$ (any row whose `side` is not `C` or `P`); apply the NB1 interpolation weights (`w_near`, `w_next`) so that leg weights are in 30-day variance units.
2. **Daily valuation with strict coverage.** From quotes, keep only the two NB1 expiries. A day is valid only if every required `(expiry, strike, side)` has a finite mid. On valid days, mark
   $V_t = \sum_i \text{weight}_i \cdot \text{mid}_{i,t}$ (puts use `mid_P`, calls use `mid_C`).
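3. **Quotes-only delta.** Estimate delta as the rolling OLS slope of $V$ on $S$ over the last $k=5$ valid days (at least 3 observations are required); hedge with yesterday’s slope $\Delta^{\text{hedge}}_t = \widehat{\beta}_{t-1}$, and stay unhedged ($\Delta^{\text{hedge}}_t = 0$) until a lagged estimate exists.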
4. **Hedge P&L and cumulation.** $\text{HPnL}_t = - \Delta^{\text{hedge}}_t (S_t - S_{t-1})$; $\Delta V^{\text{hedged}}_t = (V_t - V_{t-1}) + \text{HPnL}_t$. Cumulate to $T^*$.
5. **Comparator and stop rule.** Pick $T^*$ as the valid marking date (full strip coverage, step 2) closest to $t_0 + 30$ calendar days; here that is 2020-07-13, 28 days after $t_0$. Compute realized variance from daily closes over the same span, annualized by the calendar days from $t_0$ to $T^*$:
   $RV = \big(\sum (\Delta \ln S)^2\big)\cdot \dfrac{365}{\text{calendar days}}$. Compare the strip’s hedged total to $RV - \sigma^2_{30\text{d}}$ ($\sigma^2_{30\text{d}}$ from NB1).
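
Steps 3 and 4 can be sketched on a toy path. The numbers below are synthetic (not SPX data), and `hedged_increments` is a hypothetical helper, not a function from this notebook. The sketch assumes the same conventions as above: yesterday's rolling slope hedges today's move, and days without a lagged estimate are unhedged. It computes the slope as rolling covariance over rolling variance, which is the OLS slope written in vectorized form.

```python
import numpy as np
import pandas as pd

def hedged_increments(S, V, k=5, min_obs=3):
    """Lagged rolling-OLS hedge on a toy path (hypothetical helper)."""
    df = pd.DataFrame({"S": np.asarray(S, float), "V": np.asarray(V, float)})
    roll = df["S"].rolling(k, min_periods=min_obs)
    # OLS slope of V on S = cov(S, V) / var(S) over the trailing window
    df["delta_est"] = roll.cov(df["V"]) / roll.var()
    # hedge with yesterday's estimate; no estimate yet means no hedge
    df["delta_hedge"] = df["delta_est"].shift(1).fillna(0.0)
    df["dS"] = df["S"].diff()
    df["dV"] = df["V"].diff()
    df["hedge_pnl"] = -df["delta_hedge"] * df["dS"]
    df["dV_hedged"] = df["dV"] + df["hedge_pnl"]
    return df

# synthetic path: V is exactly linear in S with slope -0.002
S = [100.0, 101.0, 99.0, 102.0, 103.0, 101.0, 104.0]
V = [0.5 - 0.002 * s for s in S]
out = hedged_increments(S, V)
print(out[["S", "V", "delta_hedge", "dV_hedged"]])
```

Because the toy `V` is linear in `S`, the hedged increments vanish (up to floating-point noise) once the lagged slope is available, while the first three rows carry a zero hedge. On real quotes `V` is convex in `S`, and the hedged increments that remain are what the notebook cumulates.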

**Outputs.**

* `02_hedge/hedge_log.csv` — daily log: `date, S, V, dS, dV, delta_hedge, hedge_pnl, dV_hedged, V_unhedged_cum, V_hedged_cum`
* `02_hedge/hedge_summary.csv` — one row: `t0, T_star, days, sigma2_30d_from_NB1, RV_30d, RV_minus_sigma2, strip_hedged_total, tracking_error`

**Results (2020-06-15 → 2020-07-13).**

* $\sigma^2_{30\text{d}}$ (NB1): **0.125807**
* $RV_{30\text{d}}$: **0.039284**
* Target $RV - \sigma^2_{30\text{d}}$: **−0.086523**
* Strip hedged total: **−0.131393**
* Tracking error (hedged − target): **−0.044870**
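
The reported figures can be cross-checked with a few lines of arithmetic. The inputs are copied from the bullets above; the ratio of hedged total to target is an extra diagnostic added here for illustration, not a quantity the notebook writes out.

```python
sigma2_30d = 0.125807          # NB1 variance strike (from the results above)
rv_30d = 0.039284              # realized variance over the span
strip_hedged_total = -0.131393

target = rv_30d - sigma2_30d
tracking_error = strip_hedged_total - target
ratio = strip_hedged_total / target   # illustrative diagnostic only

print(f"{target:.6f} {tracking_error:.6f} {ratio:.2f}")
# prints: -0.086523 -0.044870 1.52
```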

**Interpretation.**

The quotes-only delta-hedged strip matches the 30-day variance-swap payoff in sign and order of magnitude, but overshoots it: the hedged total (−0.131) is about 1.5 times the target (−0.087). The residual of −0.045 is consistent with the finite strike grid, the timing mismatch between option mids and index closes, the excluded non-tradable VIX boundary term $B(T)$, and daily rather than continuous hedging.

**Setup.**

In [1]:
import os
import numpy as np
import pandas as pd
from datetime import timedelta

STRIKE_DIVISOR = 1000

DIR_VS = "01_variance_strike"
DIR_H2 = "02_hedge"
OPTS_CSV = r"02_hedge/spx_option_quotes_2020-06-15_to_2020-07-17.csv"
CLOSES_CSV = r"02_hedge/spx_close_2020-06-01_to_2020-08-15.csv"

# Notebook 1 outputs
STRIP_30D_CSV = os.path.join(DIR_VS, "strip_30d.csv")
VS_SUMMARY = os.path.join(DIR_VS, "variance_strike_summary.csv")

In [2]:
# Load Notebook 1 strip and summary, apply interpolation weights
strip = pd.read_csv(STRIP_30D_CSV)
vs = pd.read_csv(VS_SUMMARY)

strip["expiry"] = pd.to_datetime(strip["expiry"]).dt.date
strip = strip[strip["side"].isin(["C","P"])].copy()
expiries = sorted(strip["expiry"].unique())
exp_near, exp_next = expiries  # the NB1 strip must contain exactly two expiries (near, next)

t0 = pd.to_datetime(vs.loc[0, "t0"]).date()
sigma2_30d = float(vs.loc[0, "sigma2_30d"])
w_near = float(vs.loc[0, "w_near"])
w_next = float(vs.loc[0, "w_next"])

# apply per-expiry interpolation weights
strip["expiry_weight"] = np.where(strip["expiry"] == exp_near, w_near, w_next)
strip["weight"] = strip["weight"] * strip["expiry_weight"]

**Market data.** Load option mids and underlying closes; pivot the mids into `mid_C` / `mid_P` columns.


In [3]:
# option quotes
opts = pd.read_csv(OPTS_CSV, low_memory=False)

opts["date"] = pd.to_datetime(opts["date"]).dt.date
opts["exdate"] = pd.to_datetime(opts["exdate"]).dt.date
opts["K"] = pd.to_numeric(opts["strike_price"], errors="coerce") / STRIKE_DIVISOR
bb = pd.to_numeric(opts["best_bid"], errors="coerce")
bo = pd.to_numeric(opts["best_offer"], errors="coerce")
opts["mid"] = (bb + bo) / 2.0
opts = opts.dropna(subset=["date", "exdate", "K", "mid", "cp_flag"])
opts = opts[opts["exdate"].isin(expiries)].copy()

pvt = (opts
       .pivot_table(index=["date","exdate","K"], columns="cp_flag", values="mid", aggfunc="mean")
       .reset_index()
       .rename(columns={"C":"mid_C","P":"mid_P"}))

# closes
cl = pd.read_csv(CLOSES_CSV)
cl["date"] = pd.to_datetime(cl["date"]).dt.date
cl = cl.dropna(subset=["date","close"]).sort_values("date")
S_map = dict(zip(cl["date"], pd.to_numeric(cl["close"], errors="coerce")))
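
To make the shape of the pivoted panel concrete, here is a minimal sketch on synthetic quotes. The dates, expiry label, strikes, and mids are placeholders chosen for illustration; only the column names follow the cell above.

```python
import pandas as pd

# synthetic long-format quotes: one row per (date, expiry, strike, side)
toy = pd.DataFrame({
    "date": ["2020-06-15"] * 3,
    "exdate": ["EXP_NEAR"] * 3,      # placeholder expiry label
    "K": [3000.0, 3000.0, 3100.0],
    "cp_flag": ["C", "P", "C"],      # the 3100 put is deliberately missing
    "mid": [95.0, 60.0, 40.0],
})

wide = (toy
        .pivot_table(index=["date", "exdate", "K"], columns="cp_flag",
                     values="mid", aggfunc="mean")
        .reset_index()
        .rename(columns={"C": "mid_C", "P": "mid_P"}))
print(wide)
```

A strike quoted on only one side ends up with `NaN` in the other column, which is why the marking step checks each required `(expiry, strike, side)` for a finite mid before valuing the strip.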

**Marking.** Build $V_t$ only on days with a full cross‑section.

In [4]:
# build daily strip value V_t in variance units only on days with a strict full cross‑section
strip_legs = strip[["expiry","side","strike","weight"]].copy()
need_keys = strip_legs.rename(columns={"strike":"K","expiry":"exdate"})[["exdate","K","side"]]
need_by_ex = {ex: need_keys[need_keys["exdate"]==ex][["K","side"]].drop_duplicates() for ex in expiries}

# strict full cross‑section
def day_has_full_cross_section(pvt_day: pd.DataFrame) -> bool:
    for ex, need in need_by_ex.items():
        sub = pvt_day[pvt_day["exdate"]==ex]
        have_pairs = set()
        for _, r in sub.iterrows():
            if np.isfinite(r["mid_C"]): have_pairs.add((round(float(r["K"]),6), "C"))
            if np.isfinite(r["mid_P"]): have_pairs.add((round(float(r["K"]),6), "P"))
        req = set(zip(need["K"].round(6), need["side"]))
        if not req.issubset(have_pairs):
            return False
    return True

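def strip_value_variance_units(pvt_day: pd.DataFrame, strip_df: pd.DataFrame) -> float:
    # round strikes on both sides (as the coverage check does) so float noise cannot silently drop a leg
    legs = strip_df.rename(columns={"strike":"K","expiry":"exdate"}).assign(K=lambda d: d["K"].round(6))
    m = pvt_day.assign(K=pvt_day["K"].round(6)).merge(legs, on=["exdate","K"], how="inner")
    # each leg is marked at the mid of its own side
    m["mid_px"] = np.where(m["side"]=="C", m["mid_C"], m["mid_P"])
    return float((m["weight"] * m["mid_px"]).sum())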

dates = sorted(set(pvt["date"].unique()) & set(S_map.keys()))
rows = []
for d in dates:
    day = pvt[pvt["date"]==d]
    if not day_has_full_cross_section(day): 
        continue
    V = strip_value_variance_units(day, strip_legs)
    S = float(S_map[d])
    rows.append({"date": d, "S": S, "V": V})

path = pd.DataFrame(rows).sort_values("date").reset_index(drop=True)
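
When a day fails the strict coverage test, it helps to see which legs were missing. The following is a minimal diagnostic sketch on synthetic data; `missing_legs` is a hypothetical helper that is not part of the notebook, and it assumes the day's frame carries both `mid_C` and `mid_P` columns.

```python
import numpy as np
import pandas as pd

def missing_legs(day_wide: pd.DataFrame, required: pd.DataFrame) -> pd.DataFrame:
    """Return the required (exdate, K, side) rows that lack a finite mid."""
    tall = day_wide.melt(id_vars=["exdate", "K"], value_vars=["mid_C", "mid_P"],
                         var_name="side", value_name="mid")
    tall["side"] = tall["side"].str[-1]          # "mid_C" -> "C", "mid_P" -> "P"
    tall = tall[np.isfinite(tall["mid"])]        # keep quoted legs only
    chk = required.merge(tall, on=["exdate", "K", "side"], how="left")
    return chk.loc[chk["mid"].isna(), ["exdate", "K", "side"]].reset_index(drop=True)

# synthetic day: the 3100 put has no quote
day = pd.DataFrame({"exdate": ["EXP_NEAR", "EXP_NEAR"], "K": [3000.0, 3100.0],
                    "mid_C": [95.0, 40.0], "mid_P": [60.0, np.nan]})
required = pd.DataFrame({"exdate": ["EXP_NEAR"] * 3,
                         "K": [3000.0, 3100.0, 3100.0],
                         "side": ["P", "C", "P"]})
gaps = missing_legs(day, required)
print(gaps)
```

An empty result corresponds to a valid day in the sense of the check above; any row returned names a leg that would force the day to be skipped.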

**Hedge.** Rolling OLS delta from quotes ($k=5$); cumulative hedged P&L.

In [5]:
# Quotes-only rolling OLS delta (k=5)
def ols_slope(y: np.ndarray, x: np.ndarray) -> float:
    x = x.astype(float)
    y = y.astype(float)
    xm = x.mean()
    ym = y.mean()
    num = ((x - xm) * (y - ym)).sum()
    den = ((x - xm)**2).sum()
    return np.nan if den == 0 else float(num / den)

path["dS"] = path["S"].diff()
path["dV"] = path["V"].diff()

k = 5
path["delta_est"] = np.nan
for i in range(len(path)):
    j0 = max(0, i - (k - 1))
    window = path.iloc[j0:i+1]
    if len(window) >= 3:
        path.loc[path.index[i], "delta_est"] = ols_slope(window["V"].values, window["S"].values)

# hedge with yesterday's slope; days without a lagged estimate stay unhedged (delta = 0)
path["delta_hedge"] = path["delta_est"].shift(1).fillna(0.0)
# cumulative hedged P&L
path["hedge_pnl"] = - path["delta_hedge"] * path["dS"]
path["dV_hedged"] = path["dV"] + path["hedge_pnl"]
path["V_unhedged_cum"] = path["dV"].fillna(0.0).cumsum()
path["V_hedged_cum"] = path["dV_hedged"].fillna(0.0).cumsum()
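
As a sanity check on the slope estimator, the closed-form OLS slope can be compared with `np.polyfit` on a small window. The window values below are synthetic, and the function is restated so the snippet runs on its own.

```python
import numpy as np

def ols_slope(y, x):
    """Closed-form OLS slope of y on x; NaN when x has no variation."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    den = ((x - x.mean()) ** 2).sum()
    if den == 0:
        return np.nan
    return float(((x - x.mean()) * (y - y.mean())).sum() / den)

# synthetic 5-day window: index level and strip value
x = np.array([3050.0, 3066.0, 3041.0, 3098.0, 3113.0])
y = np.array([0.131, 0.128, 0.134, 0.121, 0.119])

slope = ols_slope(y, x)
ref = np.polyfit(x, y, 1)[0]                 # degree-1 fit, leading coefficient is the slope
flat = ols_slope(y, np.full(5, 3000.0))      # degenerate window: S never moves

print(slope, ref, flat)
```

The degenerate case matters for the hedge: a `NaN` slope shifts into the next day's `delta_hedge` and is then replaced by zero, so that day is left unhedged.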

**Finalize.** Stop at $T^* \approx t_0 + 30\text{d}$, compute RV over the exact span, and write `hedge_log.csv` and `hedge_summary.csv`.

In [6]:
# stop at T* ~ t0 + 30 days
t_star_target = t0 + timedelta(days=30)
path["abs_diff"] = path["date"].apply(lambda d: abs(d - t_star_target))
t_star = path.loc[path["abs_diff"].idxmin(), "date"]
log_df = path[path["date"] <= t_star].copy().drop(columns=["abs_diff"])

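# realized variance on the same span: squared daily log returns from the first marked day through T*,
# annualized by the calendar days from t0 to T* (the two spans coincide only if t0 is itself a marked day)
cl_use = cl[(cl["date"] >= log_df["date"].min()) & (cl["date"] <= t_star)].copy().sort_values("date")
days_calendar = (t_star - t0).days
rv_30d = np.log(cl_use["close"]).diff().pow(2).sum() * (365.0 / days_calendar)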

summary = pd.DataFrame([{
    "t0": t0,
    "T_star": t_star,
    "days": days_calendar,
    "sigma2_30d_from_NB1": sigma2_30d,
    "RV_30d": rv_30d,
    "RV_minus_sigma2": rv_30d - sigma2_30d,
    "strip_hedged_total": log_df["dV_hedged"].sum(),
    "tracking_error": log_df["dV_hedged"].sum() - (rv_30d - sigma2_30d)
}])

os.makedirs(DIR_H2, exist_ok=True)
log_fp = os.path.join(DIR_H2, "hedge_log.csv")
sum_fp = os.path.join(DIR_H2, "hedge_summary.csv")
log_cols = ["date", "S", "V", "dS", "dV", "delta_hedge", "hedge_pnl", "dV_hedged", "V_unhedged_cum", "V_hedged_cum"]
log_df[log_cols].to_csv(log_fp, index=False)
summary.to_csv(sum_fp, index=False)

# outputs
print(summary.to_string(index=False))

        t0     T_star  days  sigma2_30d_from_NB1   RV_30d  RV_minus_sigma2  strip_hedged_total  tracking_error
2020-06-15 2020-07-13    28             0.125807 0.039284        -0.086523           -0.131393        -0.04487
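
The stop rule and the annualization can be traced on a toy series. The closes and dates below are synthetic, and the 5-day horizon is chosen only to keep the example short (the notebook uses 30).

```python
import numpy as np
import pandas as pd
from datetime import date, timedelta

# synthetic daily closes (weekend gap between Jan 3 and Jan 6)
toy = pd.DataFrame({
    "date": [date(2020, 1, 2), date(2020, 1, 3), date(2020, 1, 6),
             date(2020, 1, 7), date(2020, 1, 8)],
    "close": [100.0, 101.0, 99.5, 100.5, 102.0],
})

t0_toy = toy["date"].iloc[0]
target = t0_toy + timedelta(days=5)                              # toy horizon
t_star_toy = min(toy["date"], key=lambda d: abs(d - target))     # closest available date
days_toy = (t_star_toy - t0_toy).days

span = toy[(toy["date"] >= t0_toy) & (toy["date"] <= t_star_toy)]
log_ret = np.log(span["close"]).diff().dropna()
rv_toy = float((log_ret ** 2).sum() * 365.0 / days_toy)          # annualized by calendar days

print(t_star_toy, days_toy, rv_toy)
```

Annualizing by calendar days rather than by the number of returns keeps the result in the same units as the NB1 variance strike, which is what makes the difference $RV - \sigma^2_{30\text{d}}$ meaningful.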
