# Weather Features (Optional)


## Notebook Guide

- **Purpose:** Optional weather ingestion and feature augmentation.
- **Inputs:** Hourly weather CSV (`data/raw/weather_berlin_hourly.csv`) and the raw OPSD data in `data/raw`.
- **Outputs:** `data/processed/features.parquet` with weather columns added.
- **Run:** Execute cells top‑to‑bottom. If a file is missing, run the earlier pipeline notebook first.


This optional step downloads hourly weather data and merges it into the feature set, which may improve forecasting.
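
To make the merge step concrete, here is a minimal sketch of joining hourly weather onto a feature table by timestamp. It is not the `build_features` implementation, which this notebook only invokes. The function name `merge_weather` and the column `temperature_c` are hypothetical; only `timestamp` and `load_mw` appear elsewhere in this notebook.

```python
import pandas as pd


def merge_weather(features: pd.DataFrame, weather: pd.DataFrame) -> pd.DataFrame:
    """Left-join hourly weather columns onto the feature table by timestamp."""
    features = features.copy()
    weather = weather.copy()
    # Normalise both keys to timezone-aware UTC so the join keys compare equal.
    features['timestamp'] = pd.to_datetime(features['timestamp'], utc=True)
    weather['timestamp'] = pd.to_datetime(weather['timestamp'], utc=True)
    # Drop duplicate weather hours so the left join cannot multiply feature rows.
    weather = weather.drop_duplicates(subset='timestamp')
    return features.merge(weather, on='timestamp', how='left')


# Toy data; 'temperature_c' is a hypothetical weather column name.
one_hour = pd.Timedelta(hours=1)
features = pd.DataFrame({
    'timestamp': pd.date_range('2020-01-01', periods=3, freq=one_hour, tz='UTC'),
    'load_mw': [50000.0, 48000.0, 47000.0],
})
weather = pd.DataFrame({
    'timestamp': pd.date_range('2020-01-01', periods=2, freq=one_hour, tz='UTC'),
    'temperature_c': [1.5, 1.2],
})
merged = merge_weather(features, weather)
print(merged)
```

A left join keeps every feature row even when a weather hour is missing, so gaps show up as `NaN` in the weather columns rather than as dropped rows.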

In [None]:
# Environment setup
import sys, subprocess
from pathlib import Path

print('Python:', sys.executable)
repo_root = Path.cwd().parent if Path.cwd().name == 'notebooks' else Path.cwd()
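# Install with the kernel's own interpreter so the package lands in this environment.
subprocess.run([sys.executable, '-m', 'pip', 'install', '-e', str(repo_root)], check=True)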

In [None]:
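import subprocess
import sys
from pathlib import Path

# Use the kernel's interpreter and run from the repo root so the relative data/ paths resolve.
repo_root = Path.cwd().parent if Path.cwd().name == 'notebooks' else Path.cwd()
subprocess.run([sys.executable, '-m', 'gridpulse.data_pipeline.download_weather', '--out', 'data/raw', '--start', '2017-01-01', '--end', '2020-12-31'], check=True, cwd=repo_root)
subprocess.run([sys.executable, '-m', 'gridpulse.data_pipeline.build_features', '--in', 'data/raw', '--out', 'data/processed', '--weather', 'data/raw/weather_berlin_hourly.csv'], check=True, cwd=repo_root)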
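
After the download, it can help to confirm that the hourly series has no holes before trusting the merged features. The sketch below is one way to do that. The helper name `missing_hours` is hypothetical, and it assumes the timestamps fall exactly on the hour; the weather CSV's actual column layout is not shown in this notebook.

```python
import pandas as pd


def missing_hours(timestamps: pd.Series) -> pd.DatetimeIndex:
    """Return the hourly timestamps absent between the first and last observation."""
    # Assumes every timestamp already sits exactly on the hour.
    ts = pd.DatetimeIndex(pd.to_datetime(timestamps, utc=True))
    expected = pd.date_range(ts.min(), ts.max(), freq=pd.Timedelta(hours=1))
    return expected.difference(ts)


# Toy series with the 02:00 observation missing.
stamps = pd.Series(pd.to_datetime([
    '2020-01-01 00:00',
    '2020-01-01 01:00',
    '2020-01-01 03:00',
]))
gaps = missing_hours(stamps)
print(gaps)
```

On the real file, the series passed in would be the parsed timestamp column of the weather CSV, whatever it is named.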


## Visual Sanity Checks

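These plots show the last 7 × 24 rows of load, wind, and solar from `features.parquet` (seven days if the data is hourly) as a quick visual check that the pipeline output looks plausible.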

In [None]:
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt

repo_root = Path.cwd().parent if Path.cwd().name == 'notebooks' else Path.cwd()
features_path = repo_root / 'data' / 'processed' / 'features.parquet'
if not features_path.exists():
    print('features.parquet not found. Run the feature pipeline first.')
else:
    df_viz = pd.read_parquet(features_path).sort_values('timestamp')
    required = {'load_mw', 'wind_mw', 'solar_mw'}
    if required.issubset(df_viz.columns):
        # Last 7 * 24 rows, i.e. seven days if the data is hourly.
        recent = df_viz.tail(7 * 24)
        fig, ax = plt.subplots(3, 1, figsize=(12, 7), sharex=True)
        recent.plot(x='timestamp', y='load_mw', ax=ax[0], color='#1f77b4', title='Load (last 7 days)')
        recent.plot(x='timestamp', y='wind_mw', ax=ax[1], color='#2ca02c', title='Wind (last 7 days)')
        recent.plot(x='timestamp', y='solar_mw', ax=ax[2], color='#ff7f0e', title='Solar (last 7 days)')
        plt.tight_layout()
    else:
        # Name the missing columns so the failure is easy to trace.
        print('Missing columns in features.parquet:', sorted(required - set(df_viz.columns)))
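
The plots above cover only load, wind, and solar, so they do not show whether the weather columns actually arrived. A small check like the following sketch could fill that gap. The helper `weather_coverage` and the column names `temperature_c` and `wind_speed_ms` are hypothetical; substitute whatever columns the weather CSV really provides.

```python
import pandas as pd


def weather_coverage(df: pd.DataFrame, weather_cols: list) -> tuple:
    """Return (share of missing values per present column, list of absent columns)."""
    present = [c for c in weather_cols if c in df.columns]
    absent = [c for c in weather_cols if c not in df.columns]
    null_share = df[present].isna().mean()
    return null_share, absent


# Toy frame standing in for features.parquet; column names are assumptions.
df_demo = pd.DataFrame({
    'load_mw': [50000.0, 48000.0, 47000.0, 46000.0],
    'temperature_c': [1.0, None, 3.0, 4.0],
})
null_share, absent = weather_coverage(df_demo, ['temperature_c', 'wind_speed_ms'])
print(null_share)
print('Absent columns:', absent)
```

A high missing share usually points to a timestamp mismatch in the merge (for example, differing time zones) rather than to missing source data.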
