# Implied vs Realized Volatility Comparison

This notebook compares **Implied Volatility (IV)**, a forward-looking metric derived from option prices, with **Realized Volatility (RV)**, a backward-looking metric derived from historical returns.

## Objectives
1. **Fetch historical prices** (cached locally).
2. **Compute Realized Volatility** using a rolling window.
3. **Simulate Implied Volatility** using a regime-switching model and synthetic option prices.
4. **Analyze** the lag and differences between the two.

In [None]:
# Setup
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from qpl.market.data import get_prices
from qpl.market.stats import rolling_realized_volatility
from qpl.models.black_scholes import bs_price
from qpl.engines.analytic.black_scholes import implied_volatility
from qpl.instruments.options import EuropeanOption
from qpl.models.black_scholes import BlackScholesModel
from qpl.market.market import Market, FlatRateCurve, FlatDividendCurve
import warnings
warnings.filterwarnings('ignore')  # suppresses ALL warnings, not only pandas ones

## 1. Fetch Historical Data
We use `SPY` as test data. The first run requires internet access; subsequent runs use the local cache.

In [None]:
ticker = "SPY"
start_date = "2021-01-01"
end_date = "2023-12-31"

try:
    df = get_prices(ticker, start=start_date, end=end_date)
    print(f"Loaded {len(df)} rows for {ticker}")
except Exception as e:
    print(f"Could not fetch data: {e}. Using synthetic random walk data.")
    dates = pd.date_range(start_date, end_date, freq="B")
    prices = 300 * np.exp(np.cumsum(np.random.normal(0, 0.01, size=len(dates))))
    df = pd.DataFrame({"Close": prices}, index=dates)

# Handle MultiIndex columns if present (common with yfinance >= 0.2)
if isinstance(df.columns, pd.MultiIndex):
    df.columns = df.columns.get_level_values(0)

prices = df["Close"]
df.head()

## 2. Compute Realized Volatility
We calculate the rolling volatility over a window of 30 trading-day observations, annualized with 252 trading days per year.
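
For reference, here is a minimal, dependency-free sketch of one common way to compute this quantity: the sample standard deviation of log returns over the window, scaled by $\sqrt{252}$. The function name `rolling_realized_vol`, the use of log returns, and the `ddof=1` convention are assumptions made for illustration; `qpl`'s `rolling_realized_volatility` may use different conventions.

```python
import math

def rolling_realized_vol(prices, window=30, annualization=252):
    """Rolling annualized volatility of log returns (sample std, ddof=1).

    Returns a list aligned with `prices`; entries are None until `window`
    returns are available.
    """
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    out = [None] * len(prices)
    for end in range(window, len(log_returns) + 1):
        chunk = log_returns[end - window:end]  # the `window` returns ending at price index `end`
        mean = sum(chunk) / window
        var = sum((x - mean) ** 2 for x in chunk) / (window - 1)
        out[end] = math.sqrt(var * annualization)
    return out

# Toy check: log returns alternate between +1% and -1%.
prices = [100.0]
for i in range(40):
    prices.append(prices[-1] * math.exp(0.01 if i % 2 == 0 else -0.01))
rv = rolling_realized_vol(prices, window=30)
print(rv[29], rv[30])  # first entry is still None, second is the first full-window estimate
```

The first `window` entries are undefined because a window of 30 returns needs 31 prices.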

In [None]:
window_30 = 30
# Use .values to ensure we pass 1D array to stats function
rv_30 = rolling_realized_volatility(prices.values, window=window_30, annualization=252)

# Add to dataframe for easier plotting
df["RV_30"] = rv_30

## 3. Simulate Implied Volatility (Synthetic)

To demonstrate the relationship without needing flaky option chain data, we simulate a "True Volatility" regime that drives option prices.

**Scenario**:
- **Regime 1 (Calm)**: Jan 2021 - Dec 2021, $\sigma = 15\%$
- **Regime 2 (Crisis)**: Jan 2022 - Jun 2022, $\sigma = 30\%$
- **Regime 3 (Recovery)**: Jul 2022 - Dec 2023, $\sigma = 20\%$

We compute synthetic option prices using these volatilities, then invert each price for IV to check that the solver recovers the input volatility.
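
The round trip can be sketched without `qpl` at all. The snippet below is an illustrative stand-in, not the library's implementation: `bs_call` is the textbook Black-Scholes call price with a continuous dividend yield, and `implied_vol_bisect` is a plain bisection solver (the names, bracket, and tolerance are assumptions chosen for this example). The spot of 400 is arbitrary.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, q, sigma):
    """Black-Scholes price of a European call with continuous dividend yield q."""
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * math.exp(-q * T) * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol_bisect(price, S, K, T, r, q, lo=1e-6, hi=5.0, tol=1e-10, max_iter=200):
    """Solve bs_call(sigma) = price by bisection; the call price is increasing in sigma."""
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, q, mid) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

S = K = 400.0                      # ATM option on an arbitrary spot
T, r, q = 30 / 365.0, 0.03, 0.0    # same market parameters as the notebook
results = {}
for sigma_true in (0.15, 0.30, 0.20):
    price = bs_call(S, K, T, r, q, sigma_true)
    results[sigma_true] = implied_vol_bisect(price, S, K, T, r, q)
    print(f"sigma_true={sigma_true:.2f}  recovered={results[sigma_true]:.6f}")
```

Bisection is slower than Newton's method but cannot diverge, which makes it a reasonable default for a consistency check like this one.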

In [None]:
# Define True Volatility Regimes
sigma_true = pd.Series(0.15, index=df.index)

# Crisis regime
mask_crisis = (df.index >= "2022-01-01") & (df.index < "2022-07-01")
sigma_true[mask_crisis] = 0.30

# Recovery regime
mask_recovery = (df.index >= "2022-07-01")
sigma_true[mask_recovery] = 0.20

df["Sigma_True"] = sigma_true

In [None]:
# Invert synthetic option prices to get IV
# This step is pedagogical: we prove we can recover the parameter from the price.

iv_series = []

# Market parameters for synthetic pricing
r = 0.03
q = 0.00
T = 30/365.0 # time to expiry in years (30 calendar days)

for date, row in df.iterrows():
    try:
        S = float(row["Close"])
        sig = float(row["Sigma_True"])
        K = S # ATM option
        
        # Price the option with TRUE sigma
        true_price = bs_price(S=S, K=K, T=T, r=r, q=q, sigma=sig, kind="call")
        
        # Invert to find IV. implied_volatility takes Option and Market
        # objects rather than raw (S, K, T, r, q) arguments.
        opt = EuropeanOption(strike=K, expiry=T, kind="call")
        mkt = Market(spot=S, rate_curve=FlatRateCurve(r), dividend_curve=FlatDividendCurve(q))
        
        vol_sol = implied_volatility(price=true_price, option=opt, market=mkt)
        iv_series.append(vol_sol)
    except Exception:
        # Record NaN for rows where pricing or inversion fails (e.g. a missing Close)
        iv_series.append(np.nan)

df["Implied_Vol"] = iv_series

## 4. Analysis: Lag vs Lead

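Plotting the results reveals a key dynamic:
1. **Implied Volatility (Green)** jumps *instantly* when the regime changes. Here this holds by construction, since the synthetic IV is recovered from `Sigma_True`; in real markets, option prices reprice as soon as expectations shift, which is what makes IV forward-looking.
2. **Realized Volatility (Blue)** lags behind. A 30-observation rolling window needs up to 30 trading days before it contains only post-shift returns. Note that `RV_30` is computed from the loaded prices, not from returns generated by the synthetic regimes, so its moves need not line up with the regime boundaries.

This illustrates why IV is often treated as a forward-looking "fear gauge", while RV measures past turbulence.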
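
The lag of a rolling estimator can be isolated with a small, fully deterministic sketch. The returns below are hypothetical (alternating sign, with a magnitude that doubles at `SHIFT_DAY`), and the estimator uses the zero-mean convention $\sqrt{252 \cdot \overline{r^2}}$; both are assumptions made so the result is reproducible, not the notebook's data or `qpl`'s method.

```python
import math

WINDOW = 30
ANNUALIZATION = 252
SHIFT_DAY = 60  # index of the first return drawn from the high-volatility regime

# Deterministic returns: alternate sign, magnitude = daily vol of the active regime.
daily_vol = [0.01 if t < SHIFT_DAY else 0.02 for t in range(120)]
returns = [v if t % 2 == 0 else -v for t, v in enumerate(daily_vol)]

def zero_mean_rv(window_returns):
    """Annualized RV assuming a zero mean return: sqrt(252 * mean(r^2))."""
    mean_sq = sum(r * r for r in window_returns) / len(window_returns)
    return math.sqrt(ANNUALIZATION * mean_sq)

# rv[t] uses the WINDOW returns ending at index t (inclusive).
rv = {t: zero_mean_rv(returns[t - WINDOW + 1:t + 1])
      for t in range(WINDOW - 1, len(returns))}

old_level = 0.01 * math.sqrt(ANNUALIZATION)
new_level = 0.02 * math.sqrt(ANNUALIZATION)
for t in (SHIFT_DAY - 1, SHIFT_DAY, SHIFT_DAY + 14, SHIFT_DAY + WINDOW - 1):
    print(f"t={t:3d}  RV={rv[t]:.4f}  (old={old_level:.4f}, new={new_level:.4f})")
```

In this toy setup the estimate sits at the old level until the shift, rises monotonically as post-shift returns enter the window, and only reaches the new level once all 30 returns in the window come from the new regime.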

In [None]:
plt.figure(figsize=(12, 6))
plt.plot(df.index, df["Implied_Vol"], label="Implied Volatility (Forward-Looking)", color='green', linewidth=2)
plt.plot(df.index, df["RV_30"], label="Realized Volatility (30D Rolling)", color='blue', alpha=0.7)

plt.title("Implied vs Realized Volatility (Synthetic Regimes)")
plt.ylabel("Annualized Volatility")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()