### 1. Stress Event Model (Operational Risk)

**Question it answers**

> *“What is the probability that the pipeline is under operational stress?”*

This model focuses on **operational conditions**, not price behavior.

### What goes into the model
- Pipeline notices and alerts
- Indicators of operational disruption
- Weather-driven demand pressure (HDD)
- Storage tightness (supply buffer)
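
Heating degree days (HDD) measure how far a day's mean temperature falls below a base temperature. The sketch below assumes a 65 °F base, a common U.S. convention, and a daily mean temperature as input. How this project derives HDD from the NOAA data is not shown in this notebook, so the function name and the base are illustrative assumptions.

```python
def heating_degree_days(mean_temp_f: float, base_temp_f: float = 65.0) -> float:
    """Return HDD for one day: degrees by which the mean temperature is below the base.

    Assumption: 65 °F base (a common U.S. convention); the project may use another.
    """
    return max(0.0, base_temp_f - mean_temp_f)


# A cold day adds heating demand; a mild day adds none.
cold_day = heating_degree_days(30.0)
mild_day = heating_degree_days(70.0)
print(cold_day, mild_day)
```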

### What the model produces
- A probability between **0 and 1** representing the likelihood of operational stress

### Interpretation
- A high value means **conditions are abnormal**
- A low value means **operations appear normal**

This model acts as an **early-warning system** for potential pipeline issues.


In [1]:
import sys
from pathlib import Path

PROJECT_ROOT = Path.cwd().parent
sys.path.insert(0, str(PROJECT_ROOT))

In [2]:
import os
import pandas as pd
from dotenv import load_dotenv
import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score

In [3]:
load_dotenv()

True

In [12]:
from models.features.build_model_frame import ModelFrameConfig, build_model_frame
from models.features.derive_forecast_inputs import derive_forecast_inputs
from models.stress.stress_event_model import fit_stress_event_model
from models.stress.forecast_stress_event import forecast_stress_event_prob

In [5]:
from data.mongo import (
    load_henry_hub_daily, load_storage_weekly, load_noaa_region_daily,
    load_capacity_df, load_notices_df,
)

In [6]:
MONGO_DB = os.getenv("MONGO_DB")
MONGO_URI = os.getenv("MONGO_URI")

### 1. Build Model Frame

In [7]:
cfg = ModelFrameConfig(
    pipeline="algonquin",
    capacity_collection="ebb_algonquin_capacity",
    notices_collection="ebb_algonquin_notices",
    start="2025-10-01",
)

In [8]:
df = build_model_frame(cfg, mongo_uri=MONGO_URI, mongo_db=MONGO_DB)

In [None]:
# -----------------------------
# Train / Test Split (time-based)
# -----------------------------

train_end = pd.Timestamp("2025-11-01")

df_train = df[df["date"] <= train_end].copy()
df_test = df[df["date"] > train_end].copy()

print(f"Train rows: {len(df_train)}")
print(f"Test rows: {len(df_test)}")
print(df_train["date"].min(), "→", df_train["date"].max())
print(df_test["date"].min(), "→", df_test["date"].max())
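
A time-based split only guards against look-ahead if the two partitions never overlap in time. The check below is a minimal sketch on a toy calendar, not the notebook's data; `check_time_split` is a hypothetical helper introduced for illustration.

```python
from datetime import date, timedelta


def check_time_split(train_dates, test_dates):
    """Return True if every training date is strictly before every test date."""
    if not train_dates or not test_dates:
        return False  # an empty partition means the split is unusable
    return max(train_dates) < min(test_dates)


# Toy daily calendar standing in for df["date"]; the cutoff mirrors train_end.
start = date(2025, 10, 1)
days = [start + timedelta(days=i) for i in range(45)]
cutoff = date(2025, 11, 1)

train = [d for d in days if d <= cutoff]
test = [d for d in days if d > cutoff]

print(len(train), len(test), check_time_split(train, test))
```

On the notebook's frames, the equivalent one-line check would be `assert df_train["date"].max() < df_test["date"].min()`.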

### 2. Fit Stress Event Model

In [9]:
model_event, idata_event, scalers_event = fit_stress_event_model(df_train)

Sampling 4 chains for 1_500 tune and 1_500 draw iterations (6_000 + 6_000 draws total) took 17 seconds.


### 4. Out-of-Sample Checks

In [None]:
results = []

for i in range(len(df_test)):
    window = df_test.iloc[: i + 1]  # expanding window over the test period

    x_event = derive_forecast_inputs(window)

    p_event = forecast_stress_event_prob(
        model_event,
        idata_event,
        x=x_event,
        scalers=scalers_event,
    )[1]

    results.append(
        {
            "date": window.iloc[-1]["date"],
            "p_stress": p_event,
        }
    )

df_oos = pd.DataFrame(results)


### 5. Diagnostics & Sanity Checks

#### 5.1 Sanity checks

In [None]:
df_oos.describe()

##### Time series visualization

In [None]:
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(df_oos["date"], df_oos["p_stress"], label="Stress Probability")
ax.legend()
ax.set_title("Out-of-Sample Risk Signals")
plt.show()

#### 5.2 Event discrimination (does the stress model work?)

##### Join with realized stress

In [None]:
df_eval = df_oos.merge(
    df[["date", "stress_event"]],
    on="date",
    how="left",
)

##### Compare distributions

In [None]:
df_eval.groupby("stress_event")["p_stress"].describe()


##### Simple AUC

In [None]:
auc = roc_auc_score(df_eval["stress_event"], df_eval["p_stress"])
auc
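
ROC AUC can be read as the probability that a randomly chosen stress day receives a higher score than a randomly chosen normal day, with ties counted as one half. The pure-Python sketch below computes that pairwise definition directly on made-up scores. `pairwise_auc` is an illustrative helper, not part of the project, and `roc_auc_score` remains the call to use in practice.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: two normal days, two stress days.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(pairwise_auc(labels, scores))  # 0.75: three of four pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking. With only a few stress days in the test window, the estimate rests on few pairs and should be read as noisy.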

#### 5.3 Threshold-based evaluation

In [None]:
STRESS_ALERT = 0.30

df_eval["stress_alert"] = df_eval["p_stress"] > STRESS_ALERT



In [None]:
pd.crosstab(
    df_eval["stress_alert"],
    df_eval["stress_event"],
    normalize="index",
)
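
With `normalize="index"`, each row of the crosstab sums to 1, so the `True` alert row gives the share of alerts that coincided with a stress event (precision). It does not show what share of stress events raised an alert (recall). The sketch below computes both from toy alert and event flags; the helper name and the data are illustrative.

```python
def alert_precision_recall(alerts, events):
    """Return (precision, recall) for boolean alert flags against realized events."""
    tp = sum(1 for a, e in zip(alerts, events) if a and e)
    fp = sum(1 for a, e in zip(alerts, events) if a and not e)
    fn = sum(1 for a, e in zip(alerts, events) if not a and e)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Toy data: 4 alerts, 3 of them on stress days; 1 stress day missed.
alerts = [True, True, True, True, False, False, False, False]
events = [True, True, True, False, True, False, False, False]
precision, recall = alert_precision_recall(alerts, events)
print(precision, recall)
```

On the notebook's frame, passing `normalize="columns"` to `pd.crosstab` gives the recall view of the same table.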


#### 5.4 Compare against a naive baseline (critical)

In [None]:
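# Compute the trailing 30-day stress rate on the full frame, lagged one day
# so each value uses only past data, then align to df_eval by date
# (assigning the Series directly would align on row index, not date).
baseline = (
    df.sort_values("date")
    .set_index("date")["stress_event"]
    .rolling(30)
    .mean()
    .shift(1)
)
df_eval["baseline_stress"] = df_eval["date"].map(baseline)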
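
The baseline is the trailing 30-day stress frequency, shifted by one day so that the value for day t uses only days before t. The sketch below reproduces that roll-then-shift logic in plain Python on a toy series with a window of 3; `lagged_rolling_mean` is a hypothetical helper introduced for illustration.

```python
def lagged_rolling_mean(values, window):
    """Mean of the `window` observations strictly before each position.

    Returns None where fewer than `window` prior observations exist,
    matching rolling(window).mean().shift(1) in pandas.
    """
    out = []
    for t in range(len(values)):
        if t < window:
            out.append(None)  # not enough history yet
        else:
            past = values[t - window:t]  # excludes values[t] itself
            out.append(sum(past) / window)
    return out


events = [0, 1, 1, 0, 0, 1]
result = lagged_rolling_mean(events, window=3)
print(result)
```

The first `window` entries are empty by construction, so any comparison against the model should be restricted to rows where the baseline is defined.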


In [None]:
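# Drop rows jointly so labels and scores stay aligned and equal in length.
valid = df_eval.dropna(subset=["stress_event", "baseline_stress", "p_stress"])

(
    roc_auc_score(valid["stress_event"], valid["baseline_stress"]),
    roc_auc_score(valid["stress_event"], valid["p_stress"]),
)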