
# ES & VaR Scenario Analysis
**Created:** 2025-08-31 20:23:13

This notebook provides modular functions for computing **Value at Risk (VaR)** and **Expected Shortfall (ES/CVaR)** with several methods and for running **stress/scenario** analyses. It also includes a **backtesting** utility (the Kupiec POF test) and basic plots.

### What you can do here
- Load prices/returns from CSV or quickly **simulate** a synthetic asset path
- Compute **Historical**, **Gaussian (Parametric)**, **Cornish–Fisher**, and **Monte Carlo** VaR & ES
- Scale results to **multi-day horizons** and run **shock** scenarios (volatility scaling, heavy-tailed distributions)
- **Backtest** VaR with the Kupiec Proportion of Failures test
- Plot loss distributions and rolling VaR overlays


In [None]:

# ==== Setup ====
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from scipy.stats import norm, t, skew, kurtosis
import warnings

warnings.filterwarnings("ignore")
pd.options.display.float_format = '{:,.6f}'.format

# For reproducibility in examples
np.random.seed(42)



## 1. Data: Load or Simulate
You can either:
- **Load a CSV** with a price series (set `csv_path` and `price_col`), or
- **Simulate** a synthetic Geometric Brownian Motion (GBM) series.
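
As a minimal sketch of the CSV route, the snippet below builds a tiny in-memory price file and converts it to log returns. The column names `Date` and `Close` and the three sample prices are assumptions for illustration; a real file would be read from disk instead.

```python
import io
import numpy as np
import pandas as pd

# Hypothetical three-row price file (illustrative values only).
csv_text = "Date,Close\n2024-01-02,100.0\n2024-01-03,101.0\n2024-01-04,99.5\n"

demo_prices = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"]).set_index("Date")
demo_rets = np.log(demo_prices["Close"]).diff().dropna()  # r_t = ln(P_t / P_{t-1})
print(demo_rets)
```

The first price has no predecessor, so three prices yield two returns.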


In [None]:

def load_returns_from_csv(csv_path: str, price_col: str = "Close", date_col: str = None):
    """
    Load a price series from CSV and convert to daily log returns.
    """
    df = pd.read_csv(csv_path)
    if date_col and date_col in df.columns:
        df[date_col] = pd.to_datetime(df[date_col])
        df = df.sort_values(date_col).set_index(date_col)
    else:
        df = df.sort_index()
    if price_col not in df.columns:
        raise ValueError(f"Column '{price_col}' not in CSV. Available: {list(df.columns)}")
    prices = df[price_col].astype(float).dropna()
    rets = np.log(prices).diff().dropna()
    rets.name = "log_return"
    return rets

def simulate_gbm_returns(n_days: int = 1500, mu: float = 0.08, sigma: float = 0.20, dt: float = 1/252):
    """
    Simulate daily log returns of a Geometric Brownian Motion.
    mu and sigma are annualized drift and volatility; dt is the step in years.
    """
    eps = np.random.normal(0, 1, size=n_days)
    log_rets = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * eps
    return pd.Series(log_rets, name="log_return")

# Example toggle: set to a file path to use your own data
csv_path = None  # e.g., "/mnt/data/my_prices.csv"
price_col = "Close"
date_col = None

if csv_path:
    returns = load_returns_from_csv(csv_path, price_col=price_col, date_col=date_col)
else:
    returns = simulate_gbm_returns(n_days=2000, mu=0.07, sigma=0.25)

returns = returns.dropna().astype(float)
returns.tail()



## 2. VaR & ES Utilities
We provide implementations for multiple approaches. By convention here:
- **Losses** are negated returns (`loss = -return`), so a positive loss means money lost.
- VaR is quoted as a **positive** number (magnitude of the quantile loss).
- ES is the **average loss beyond** the VaR threshold.
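
The sign convention can be checked on a toy sample. This is a sketch with ten made-up returns and an 80% confidence level chosen only so the tail contains more than one point; the names `toy_returns`, `toy_alpha`, `toy_var`, and `toy_es` are illustrative.

```python
import numpy as np

# Ten made-up daily returns (illustrative only).
toy_returns = np.array([0.01, -0.02, 0.005, -0.03, 0.015, -0.01, 0.02, -0.05, 0.0, 0.01])
toy_alpha = 0.80

toy_losses = -toy_returns                           # losses are negated returns
toy_var = np.quantile(toy_losses, toy_alpha)        # loss threshold, quoted as a positive number
toy_es = toy_losses[toy_losses >= toy_var].mean()   # mean loss at or beyond the VaR threshold
print(f"VaR: {toy_var:.4f} | ES: {toy_es:.4f}")
```

ES is never smaller than VaR at the same confidence level, because it averages only losses at or beyond the threshold.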


In [None]:

def historical_var_es(returns: pd.Series, alpha: float = 0.99):
    """
    Historical VaR/ES using empirical quantiles of the loss distribution.
    Returns VaR, ES (both positive magnitudes).
    """
    losses = -returns.dropna().values
    q = np.quantile(losses, alpha)
    es = losses[losses >= q].mean() if np.any(losses >= q) else q
    return q, es

def gaussian_var_es(returns: pd.Series, alpha: float = 0.99):
    """
    Parametric (Gaussian) VaR/ES using sample mean & std of returns.
    """
    mu = returns.mean()
    sd = returns.std(ddof=1)
    z = norm.ppf(alpha)
    var = -(mu - z * sd)  # VaR as positive magnitude of loss
    es = -(mu - sd * norm.pdf(z) / (1 - alpha))
    return float(var), float(es)

def cornish_fisher_var_es(returns: pd.Series, alpha: float = 0.99):
    """
    Cornish-Fisher adjusted VaR/ES accounting for skewness & excess kurtosis.
    The expansion is applied to the loss distribution (loss = -return), so the
    skewness of returns enters with its sign flipped.
    ES is approximated numerically as the average CF quantile over a grid that
    is uniform in probability on [alpha, 1).
    """
    x = returns.dropna().values
    mu, sd = x.mean(), x.std(ddof=1)
    # Skewness of losses is minus the skewness of returns; excess kurtosis is unchanged.
    s, k = -skew(x, bias=False), kurtosis(x, fisher=True, bias=False)

    def cf_quantile(z):
        # Cornish-Fisher expansion of the standardized quantile
        return (z
                + (1/6)*(z**2 - 1)*s
                + (1/24)*(z**3 - 3*z)*k
                - (1/36)*(2*z**3 - 5*z)*(s**2))

    var = -mu + cf_quantile(norm.ppf(alpha)) * sd

    # ES = 1/(1-alpha) * integral of the loss quantile over [alpha, 1).
    # A grid uniform in probability (not in z) weights the tail correctly.
    ps = np.linspace(alpha, 0.999999, 2000)
    tail_losses = -mu + cf_quantile(norm.ppf(ps)) * sd
    es = tail_losses.mean()
    return float(var), float(es)

def monte_carlo_var_es(returns: pd.Series, alpha: float = 0.99, n_sims: int = 200000, dist: str = "normal", nu: int = 5):
    """
    Monte Carlo VaR/ES by simulating returns from a chosen distribution.
    dist: "normal" or "student_t" (heavy tails via df=nu).
    """
    mu = returns.mean()
    sd = returns.std(ddof=1)

    if dist == "normal":
        sims = np.random.normal(mu, sd, size=n_sims)
    elif dist == "student_t":
        draws = t.rvs(df=nu, size=n_sims)
        if nu > 2:
            # Rescale to unit variance so sd keeps its meaning as the return std
            draws = draws / np.sqrt(nu / (nu - 2))
        sims = mu + sd * draws
    else:
        raise ValueError("dist must be 'normal' or 'student_t'")

    losses = -sims
    q = np.quantile(losses, alpha)
    es = losses[losses >= q].mean()
    return float(q), float(es)



## 3. Horizons & Portfolio Scaling
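We include square-root-of-time scaling for multi-day horizons, which assumes i.i.d. returns with a negligible mean, and scaling of return-based figures to a notional portfolio size.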
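
To get a feel for how far the square-root-of-time rule can drift when the mean is not zero, the sketch below compares it with the exact multi-day VaR for i.i.d. Gaussian returns. The daily mean and volatility are hypothetical values picked for illustration.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical daily parameters (assumptions for illustration).
mu_d, sd_d = 0.0003, 0.015
conf, h = 0.99, 10
z = norm.ppf(conf)

var_1d = -mu_d + z * sd_d                      # one-day Gaussian VaR
var_sqrt = var_1d * np.sqrt(h)                 # square-root-of-time rule
var_exact = -h * mu_d + z * sd_d * np.sqrt(h)  # i.i.d. Gaussian: mean scales with h, std with sqrt(h)

print(f"sqrt-time: {var_sqrt:.6f} | exact i.i.d. Gaussian: {var_exact:.6f}")
```

Under these assumptions the two figures differ by `mu_d * (h - sqrt(h))`, so the rule is exact only when the daily mean is zero.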


In [None]:

def scale_to_horizon(var: float, es: float, horizon_days: int = 1, method: str = "sqrt_time"):
    """
    Scale daily VaR/ES to multi-day horizon.
    method="sqrt_time": multiply by sqrt(horizon_days).
    """
    if horizon_days <= 0:
        raise ValueError("horizon_days must be >= 1")
    if method == "sqrt_time":
        f = np.sqrt(horizon_days)
        return var * f, es * f
    else:
        raise ValueError("Unsupported scaling method")

def scale_to_notional(var: float, es: float, notional: float = 1_000_000):
    """
    Convert return-based VaR/ES magnitudes to currency using notional exposure.
    """
    return var * notional, es * notional



## 4. Scenarios & Stress Testing
Create bespoke shocks (e.g., volatility up, mean down), simulate from heavy-tailed distributions, or apply single-day crash scenarios.
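
One way to apply a mean/volatility shock while keeping the empirical shape of a sample is an affine transform of the returns. The following is a sketch on hypothetical heavy-tailed data; `base`, `vol_scale`, and `mean_shift` are illustrative names and values.

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.standard_t(df=5, size=5000) * 0.01   # hypothetical heavy-tailed daily returns

vol_scale, mean_shift = 2.0, -0.0005            # assumed shock sizes
# Scale deviations from the mean, then shift the mean
shocked = (base - base.mean()) * vol_scale + base.mean() + mean_shift

print(f"std ratio: {shocked.std(ddof=1) / base.std(ddof=1):.3f}")
print(f"mean change: {shocked.mean() - base.mean():.6f}")
```

Because the transform is linear, skewness and kurtosis of the shocked sample match those of the original.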


In [None]:

def scenario_var_es(returns: pd.Series, alpha: float = 0.99, vol_scale: float = 1.0, mean_shift: float = 0.0,
                    method: str = "monte_carlo", dist: str = "student_t", nu: int = 5, n_sims: int = 200000):
    """
    Scenario engine: apply a mean/vol shock, then compute VaR/ES with chosen method.
    The shock is an affine transform of the observed returns, so the empirical
    shape (skewness, tails) is preserved while the mean and std are shifted/scaled.
    """
    r = returns.dropna()
    base_mu = r.mean()
    # Scale deviations from the mean by vol_scale, then shift the mean
    shocked = (r - base_mu) * vol_scale + base_mu + mean_shift

    if method == "historical":
        return historical_var_es(shocked, alpha)
    elif method == "gaussian":
        return gaussian_var_es(shocked, alpha)
    elif method == "cornish_fisher":
        return cornish_fisher_var_es(shocked, alpha)
    elif method == "monte_carlo":
        return monte_carlo_var_es(shocked, alpha=alpha, n_sims=n_sims, dist=dist, nu=nu)
    else:
        raise ValueError("Unknown method")

def one_day_crash_var_es(returns: pd.Series, alpha: float = 0.99, crash_return: float = -0.10):
    """
    Inject a single crash observation into the historical sample and recompute historical VaR/ES.
    """
    augmented = pd.concat([returns.dropna(), pd.Series([crash_return])], ignore_index=True)
    return historical_var_es(augmented, alpha)



## 5. VaR Backtesting (Kupiec POF)
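The **Proportion of Failures** test compares the observed number of VaR exceedances with the number expected at the chosen confidence level. It uses a likelihood-ratio statistic that is asymptotically chi-squared with one degree of freedom.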
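
A worked example of the likelihood-ratio calculation, as a sketch with hypothetical counts (250 observations, 6 exceedances, 99% VaR). The variable names are illustrative, and the direct `log` form used here is valid only when the exceedance count is strictly between 0 and the sample size.

```python
import numpy as np
from scipy.stats import chi2

n_obs, n_exc, conf = 250, 6, 0.99   # hypothetical backtest counts
p0 = 1 - conf                       # expected exceedance rate under the null
p_hat = n_exc / n_obs               # observed exceedance rate

# Binomial log-likelihoods under the null rate and under the observed rate
ll_null = (n_obs - n_exc) * np.log(1 - p0) + n_exc * np.log(p0)
ll_alt = (n_obs - n_exc) * np.log(1 - p_hat) + n_exc * np.log(p_hat)

lr_stat = -2 * (ll_null - ll_alt)
p_val = chi2.sf(lr_stat, df=1)      # upper-tail probability of chi-squared(1)
print(f"LR = {lr_stat:.3f}, p-value = {p_val:.4f}")
```

A small p-value indicates that the observed exceedance rate is inconsistent with the VaR confidence level.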


In [None]:

def rolling_var(returns: pd.Series, alpha: float = 0.99, window: int = 250, method: str = "historical"):
    """
    Compute rolling daily VaR using specified method.
    Returns a Pandas Series aligned with input returns.
    """
    vals = []
    r = returns.dropna()
    for i in range(len(r)):
        if i < window:
            vals.append(np.nan)
            continue
        sample = r.iloc[i-window:i]
        if method == "historical":
            var, _ = historical_var_es(sample, alpha)
        elif method == "gaussian":
            var, _ = gaussian_var_es(sample, alpha)
        else:
            raise ValueError("method must be 'historical' or 'gaussian'")
        vals.append(var)
    return pd.Series(vals, index=r.index, name=f"VaR_{method}_{int(alpha*100)}")

def kupiec_pof_test(returns: pd.Series, var_series: pd.Series, alpha: float = 0.99):
    """
    Kupiec Proportion of Failures (POF) test.
    Returns (LR, p_value, n_exceed, expected_exceed).
    """
    from scipy.stats import chi2
    from scipy.special import xlogy  # xlogy(0, 0) == 0, which covers x == 0 and x == n

    aligned = pd.concat([returns, var_series], axis=1).dropna()
    losses = -aligned.iloc[:,0].values
    var_vals = aligned.iloc[:,1].values
    exceed = (losses > var_vals).astype(int)
    n = len(exceed)
    if n == 0:
        raise ValueError("No overlapping observations between returns and var_series")
    x = exceed.sum()
    pi = 1 - alpha      # exceedance probability under the null
    pi_hat = x / n      # observed exceedance rate
    # Work with log-likelihoods to avoid underflow for large n
    ll_null = xlogy(n - x, 1 - pi) + xlogy(x, pi)
    ll_alt = xlogy(n - x, 1 - pi_hat) + xlogy(x, pi_hat)
    lr = -2 * (ll_null - ll_alt)
    p = chi2.sf(lr, df=1)
    return float(lr), float(p), int(x), float(n * pi)



## 6. Visualizations
Charts are kept simple and clean (one chart per cell).


In [None]:

def plot_loss_histogram_with_var(returns: pd.Series, var_val: float, bins: int = 60, title: str = "Loss Distribution & VaR"):
    losses = -returns.dropna().values
    plt.figure(figsize=(9,5))
    plt.hist(losses, bins=bins, alpha=0.7)
    plt.axvline(var_val, linestyle="--", linewidth=2)
    plt.title(title)
    plt.xlabel("Loss")
    plt.ylabel("Frequency")
    plt.show()

def plot_rolling_var(returns: pd.Series, var_series: pd.Series, title: str = "Rolling VaR vs PnL"):
    aligned = pd.concat([returns, var_series], axis=1).dropna()
    plt.figure(figsize=(11,5))
    plt.plot(aligned.index, aligned.iloc[:,0].cumsum(), linewidth=1.2, label="Cumulative PnL")
    plt.plot(aligned.index, -aligned.iloc[:,1], linewidth=1.2, label="-(VaR)")
    plt.title(title)
    plt.xlabel("Date Index")
    plt.ylabel("Value")
    plt.legend()
    plt.show()



## 7. Example: Quick Run
Below we compute VaR/ES by multiple methods on the current `returns` series and show a histogram.


In [None]:

alpha = 0.99

h_var, h_es = historical_var_es(returns, alpha)
g_var, g_es = gaussian_var_es(returns, alpha)
cf_var, cf_es = cornish_fisher_var_es(returns, alpha)
mc_var, mc_es = monte_carlo_var_es(returns, alpha, n_sims=200000, dist="student_t", nu=5)

print("=== Daily VaR/ES (return space, positive magnitudes) ===")
print(f"Historical    VaR: {h_var:,.6f} | ES: {h_es:,.6f}")
print(f"Gaussian      VaR: {g_var:,.6f} | ES: {g_es:,.6f}")
print(f"CornishFisher VaR: {cf_var:,.6f} | ES: {cf_es:,.6f}")
print(f"MonteCarlo t5 VaR: {mc_var:,.6f} | ES: {mc_es:,.6f}")

# Scale to 10-day horizon and 1m notional
var_10d, es_10d = scale_to_horizon(h_var, h_es, horizon_days=10, method="sqrt_time")
var_cash, es_cash = scale_to_notional(var_10d, es_10d, notional=1_000_000)
print("\n=== 10-Day Historical (sqrt-time) in currency for 1,000,000 notional ===")
print(f"VaR: {var_cash:,.2f} | ES: {es_cash:,.2f}")

plot_loss_histogram_with_var(returns, h_var, bins=60, title=f"Loss Dist. & Historical VaR ({int(alpha*100)}%)")



## 8. Scenario & Backtest Example
- Apply a **volatility ×2** stress scenario (Monte Carlo, Student‑t).
- Compute a **rolling historical VaR** and run **Kupiec POF**.


In [None]:

# Scenario: volatility doubled, mean shifted by -0.5 x the sample mean (i.e., the mean is halved)
s_var, s_es = scenario_var_es(returns, alpha=alpha, vol_scale=2.0, mean_shift=-0.5*returns.mean(),
                              method="monte_carlo", dist="student_t", nu=5, n_sims=150000)
print(f"Scenario (volx2, mean down): VaR {s_var:,.6f} | ES {s_es:,.6f}")

# One-day -10% crash injection (historical)
c_var, c_es = one_day_crash_var_es(returns, alpha=alpha, crash_return=-0.10)
print(f"One-Day Crash (-10%) Historical: VaR {c_var:,.6f} | ES {c_es:,.6f}")

# Backtest
roll = rolling_var(returns, alpha=alpha, window=250, method="historical")
lr, p, x, exp_x = kupiec_pof_test(returns, roll, alpha=alpha)
print(f"Kupiec POF: LR={lr:,.3f}, p={p:,.4f}, exceed={x}, expected={exp_x:,.2f}")

plot_rolling_var(returns, roll, title=f"Rolling Historical VaR ({int(alpha*100)}%) vs CumPnL")



## 9. Using with Your Portfolio
- Replace the synthetic data by setting `csv_path` to your own file in the **Data** cell.
- If you have **multiple assets**, compute portfolio returns with weights:

```python
# df_prices: DataFrame of prices (columns = tickers)
log_returns = np.log(df_prices).diff().dropna()
weights = np.array([0.4, 0.3, 0.3])  # example
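# Note: a weighted sum of log returns only approximates the portfolio log return (close for small daily moves)
port_rets = (log_returns @ weights).rename("log_return")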
```
- Then run the same VaR/ES functions on `port_rets`.
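
Log returns do not aggregate linearly across assets, so the weighted sum of log returns above is an approximation. A sketch of the exact calculation follows, assuming fixed weights that are rebalanced daily and using a hypothetical three-ticker price table (tickers and prices are made up):

```python
import numpy as np
import pandas as pd

# Hypothetical prices for three tickers (illustrative values only).
df_prices = pd.DataFrame(
    {"AAA": [100.0, 102.0, 101.0, 103.0],
     "BBB": [50.0, 49.5, 50.5, 51.0],
     "CCC": [20.0, 20.2, 19.8, 20.1]}
)
weights = np.array([0.4, 0.3, 0.3])

simple_rets = (df_prices / df_prices.shift(1) - 1).dropna()  # arithmetic returns add across assets
port_simple = simple_rets @ weights                          # assumes daily rebalancing to fixed weights
port_rets = np.log1p(port_simple).rename("log_return")       # convert back to log returns
print(port_rets)
```

The resulting `port_rets` series has the same name and shape as the single-asset `returns` series, so it can be passed to the VaR/ES functions unchanged.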
