# Monitoring & Drift

## Notebook Guide

- **Purpose:** Check for data and model drift and surface retraining signals.
- **Inputs:** `data/processed/features.parquet` and the train split (`data/processed/splits/train.parquet`).
- **Outputs:** A drift summary.
- **Run:** Execute cells top‑to‑bottom. If a file is missing, run the earlier pipeline notebook first.


## Monitoring & Drift

Run data/model drift checks and review retraining decisions.

In [None]:
from pathlib import Path
repo_root = Path.cwd().parent if Path.cwd().name == 'notebooks' else Path.cwd()
print('repo_root:', repo_root)

In [None]:
from gridpulse.monitoring.retraining import (
    load_monitoring_config,
    compute_data_drift,
    compute_model_metrics_gbm,
    evaluate_model_drift,
    retraining_decision,
)
import pandas as pd

cfg = load_monitoring_config()
features_path = repo_root / 'data' / 'processed' / 'features.parquet'
train_path = repo_root / 'data' / 'processed' / 'splits' / 'train.parquet'
if not features_path.exists() or not train_path.exists():
    print('Missing processed data. Run pipeline first.')
else:
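    df = pd.read_parquet(features_path).sort_values('timestamp')
    train_df = pd.read_parquet(train_path)
    # Most recent 7*24 rows: one week, if the data is hourly.
    current_df = df.tail(7 * 24)
    # Exclude the timestamp and the raw MW series from the feature list.
    feature_cols = [c for c in df.columns if c not in {'timestamp', 'load_mw', 'wind_mw', 'solar_mw'}]
    drift = compute_data_drift(train_df, current_df, feature_cols)
    # A bare expression inside an if/else block is not auto-displayed, so print it.
    print(drift)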
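
The internals of `compute_data_drift` are not shown in this notebook. As a rough illustration of what a per-feature data drift score can look like, the sketch below computes the Population Stability Index (PSI) between a reference sample and a current sample. The function name `psi`, its parameters, and the choice of PSI itself are assumptions made for illustration and are not taken from `gridpulse`.

```python
import math
from bisect import bisect_right


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples (illustrative sketch)."""
    ref = sorted(reference)
    # Bin edges at reference quantiles; the set removes duplicate edges.
    edges = sorted({ref[int(len(ref) * i / n_bins)] for i in range(1, n_bins)})

    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Clip at eps so the logarithm stays finite for empty bins.
        return [max(c / len(values), eps) for c in counts]

    p, q = shares(reference), shares(current)
    # Every term is non-negative, so the score is zero only when the bin shares match.
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical data: the current window is the reference shifted upward.
reference = [i / 100 for i in range(1000)]
shifted = [x + 5 for x in reference]
print('same sample:', psi(reference, reference))
print('shifted sample:', psi(reference, shifted))
```

With real data, the two arguments would be one feature column from `train_df` and the same column from `current_df`, for example `psi(train_df[col].tolist(), current_df[col].tolist())`. A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but any threshold should be checked against the monitoring config rather than taken from this sketch.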