# MetDataPy Quickstart Tutorial

This notebook walks through an end-to-end MetDataPy workflow, from raw weather data to an exported Parquet file:

1. **Data Ingestion** - Load raw weather data with automatic column mapping
2. **Quality Control** - Detect and flag anomalies (spikes, flatlines, out-of-range values)
3. **Derived Metrics** - Calculate dew point, VPD, heat index, and wind chill
4. **Data Preparation** - Handle gaps, resample, add calendar features
5. **ML Preparation** - Create a supervised dataset with lag features and time-safe splits
6. **Export** - Save processed data to Parquet

## Prerequisites

```bash
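# Run from the MetDataPy repository root to install the package in editable mode
pip install -e .
# Extra packages for running this notebook and plotting
pip install jupyter matplotlib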
```
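
To confirm the installation before running the rest of the notebook, a quick check like the following can help. It is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper, not part of MetDataPy, and the package list simply mirrors the imports used in this notebook.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level packages from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Top-level packages this notebook imports (jupyter itself is never imported).
required = ["pandas", "numpy", "matplotlib", "metdatapy"]

missing = missing_packages(required)
if missing:
    print(f"Missing packages: {', '.join(missing)}")
else:
    print("All required packages are importable")
```

If `metdatapy` is reported as missing, the editable install was most likely run from a directory other than the project root.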


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path

# MetDataPy imports
from metdatapy.mapper import Detector, Mapper
from metdatapy.core import WeatherSet
from metdatapy.mlprep import make_supervised, time_split, scale

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 120)

print("✓ Imports successful")


## 1. Load Raw Data

We'll use the sample weather dataset generated for this tutorial, read from `../data/sample_weather_2024.csv` (a path relative to the notebook's working directory).


In [None]:
# Load raw CSV data
data_path = "../data/sample_weather_2024.csv"
df_raw = pd.read_csv(data_path)

print(f"Loaded {len(df_raw):,} records")
print(f"\nColumns: {list(df_raw.columns)}")
print("\nFirst few rows:")
df_raw.head()
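
Because `data_path` is relative, `pd.read_csv` raises `FileNotFoundError` when the notebook is started from a different working directory. The sketch below shows one way to check the path and peek at the header before loading, using only the standard library. `peek_csv` is a hypothetical helper, not part of MetDataPy or pandas.

```python
import csv
from itertools import islice
from pathlib import Path


def peek_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) of a CSV file as lists of strings."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found (resolved to {path.resolve()}); "
            "check the notebook's working directory"
        )
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no lines
        rows = list(islice(reader, n_rows))  # read at most n_rows data rows
    return header, rows
```

Calling `peek_csv(data_path)` before `pd.read_csv(data_path)` surfaces a path problem with a clearer message and shows the raw column names without loading the whole file.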
