# Open Electricity Data Toolkit — Quick Start

This notebook demonstrates the core workflow: **collect** electricity market data,
**store** it locally as Parquet, and **query** it back as clean DataFrames.

I'll use AESO (Alberta) data as an example. The same pattern works for IESO (Ontario).

In [1]:
from elec_data import Toolkit

tk = Toolkit(data_dir="./data")

## 1. Collect AESO price data

The `collect()` method fetches data from the upstream API, stores it in
year-partitioned Parquet files, and logs the collection. Large date ranges
are automatically chunked by month.

In [2]:
tk.collect(["AESO"], ["prices"], "2024-06-01", "2024-07-01")

Collecting data: 100%|██████████| 1/1 [00:01<00:00,  1.57s/chunk, AESO/prices 2024-06-01]
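
To illustrate the month-based chunking mentioned above, here is a minimal sketch of how a date range could be split at calendar-month boundaries. `month_chunks` is a hypothetical helper, not part of `elec_data`, and it assumes a half-open `[start, end)` range; the toolkit's actual splitting logic may differ.

```python
from datetime import date


def month_chunks(start: date, end: date) -> list[tuple[date, date]]:
    """Split the half-open range [start, end) at calendar-month boundaries."""
    chunks = []
    cursor = start
    while cursor < end:
        # First day of the month that follows `cursor`.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        # A chunk never runs past the requested end date.
        chunk_end = min(next_month, end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end
    return chunks


# A single calendar month yields one chunk; a longer range yields one per month touched.
print(month_chunks(date(2024, 6, 1), date(2024, 7, 1)))
print(len(month_chunks(date(2024, 1, 15), date(2024, 4, 10))))
```

Under this assumption the June request above maps to a single chunk, which is consistent with the `1/1` shown in the progress bar.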


## 2. Query prices

`get_prices()` reads from the local Parquet store. If the requested range
isn't stored yet, it fetches the missing data from the API automatically.

In [3]:
prices = tk.get_prices(["AESO"], "2024-06-01", "2024-07-01")
prices.head(10)

|   | timestamp_utc | market | price | currency | price_type | resolution_minutes | source |
|---|---|---|---|---|---|---|---|
| 0 | 2024-06-01 06:00:00+00:00 | AESO | 30.83 | CAD | pool | 60 | gridstatus_aeso |
| 1 | 2024-06-01 07:00:00+00:00 | AESO | 20.23 | CAD | pool | 60 | gridstatus_aeso |
| 2 | 2024-06-01 08:00:00+00:00 | AESO | 13.56 | CAD | pool | 60 | gridstatus_aeso |
| 3 | 2024-06-01 09:00:00+00:00 | AESO | 10.53 | CAD | pool | 60 | gridstatus_aeso |
| 4 | 2024-06-01 10:00:00+00:00 | AESO | 12.43 | CAD | pool | 60 | gridstatus_aeso |
| 5 | 2024-06-01 11:00:00+00:00 | AESO | 17.28 | CAD | pool | 60 | gridstatus_aeso |
| 6 | 2024-06-01 12:00:00+00:00 | AESO | 20.34 | CAD | pool | 60 | gridstatus_aeso |
| 7 | 2024-06-01 13:00:00+00:00 | AESO | 21.92 | CAD | pool | 60 | gridstatus_aeso |
| 8 | 2024-06-01 14:00:00+00:00 | AESO | 22.15 | CAD | pool | 60 | gridstatus_aeso |
| 9 | 2024-06-01 15:00:00+00:00 | AESO | 21.46 | CAD | pool | 60 | gridstatus_aeso |
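
The fetch-on-miss behaviour of `get_prices()` can be pictured as a coverage check: compare the requested range with what is already stored, then fetch only the gaps. The sketch below is an assumption about how such a check might look, not the toolkit's implementation. `missing_ranges` is a hypothetical helper, and it treats the stored data as one contiguous half-open range.

```python
from datetime import date
from typing import Optional

DateRange = tuple[date, date]  # half-open: [start, end)


def missing_ranges(stored: Optional[DateRange], requested: DateRange) -> list[DateRange]:
    """Return the parts of `requested` that `stored` does not cover.

    `stored` is None when nothing has been collected yet.
    """
    req_start, req_end = requested
    if stored is None:
        return [requested]
    have_start, have_end = stored
    gaps = []
    # Gap before the stored data begins.
    if req_start < have_start:
        gaps.append((req_start, min(req_end, have_start)))
    # Gap after the stored data ends.
    if req_end > have_end:
        gaps.append((max(req_start, have_end), req_end))
    return gaps


june = (date(2024, 6, 1), date(2024, 7, 1))
print(missing_ranges(june, june))                                    # nothing to fetch
print(missing_ranges(june, (date(2024, 5, 15), date(2024, 7, 10))))  # one gap on each side
print(missing_ranges(None, june))                                    # empty store: fetch it all
```

Only the returned gaps would need a trip to the upstream API; a fully covered request is served from disk.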


In [4]:
print(f"Rows:  {len(prices):,}")
print(f"Range: {prices['timestamp_utc'].min()} → {prices['timestamp_utc'].max()}")
print(f"Mean price: ${prices['price'].mean():.2f}/MWh")

Rows:  714
Range: 2024-06-01 06:00:00+00:00 → 2024-06-30 23:00:00+00:00
Mean price: $31.91/MWh
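
The timestamps above are in UTC. Alberta observes Mountain Daylight Time (UTC-6) in June, so 06:00 UTC on June 1 is local midnight. The sketch below converts to local time with plain pandas; it rebuilds the first three rows of the output by hand so it runs on its own, and `timestamp_local` is a hypothetical column name, not part of the toolkit's schema.

```python
import pandas as pd

# First three rows of the query output, rebuilt by hand for illustration.
prices = pd.DataFrame(
    {
        "timestamp_utc": pd.to_datetime(
            ["2024-06-01 06:00", "2024-06-01 07:00", "2024-06-01 08:00"], utc=True
        ),
        "price": [30.83, 20.23, 13.56],
    }
)

# Convert the tz-aware UTC column to Alberta local time (IANA zone America/Edmonton).
prices["timestamp_local"] = prices["timestamp_utc"].dt.tz_convert("America/Edmonton")

print(prices[["timestamp_utc", "timestamp_local", "price"]])
```

The same `tz_convert` call should work on the full `prices` frame returned by `get_prices()`, since its `timestamp_utc` column is already timezone-aware.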


## 3. Plot prices

Alberta's pool price is set hourly and can spike dramatically during
periods of high demand or tight supply. The chart below shows the
intra-month price pattern for June 2024.

In [5]:
import plotly.express as px

fig = px.line(
    prices,
    x="timestamp_utc",
    y="price",
    title="AESO Pool Price — June 2024",
    labels={"timestamp_utc": "Date (UTC)", "price": "Price (CAD/MWh)"},
    template="plotly_white",
)
fig.update_traces(line_width=0.8)
fig.show()

## 4. Collect and view demand data

In [6]:
tk.collect(["AESO"], ["demand"], "2024-06-01", "2024-07-01")
demand = tk.get_demand(["AESO"], "2024-06-01", "2024-07-01")

fig = px.line(
    demand,
    x="timestamp_utc",
    y="demand_mw",
    title="AESO System Demand — June 2024",
    labels={"timestamp_utc": "Date (UTC)", "demand_mw": "Demand (MW)"},
    template="plotly_white",
)
fig.update_traces(line_width=0.8)
fig.show()

Collecting data: 100%|██████████| 1/1 [00:00<00:00,  1.56chunk/s, AESO/demand 2024-06-01]


## 5. Check data status

`status()` summarizes what's in the local store — which markets and data
types are available, and their date ranges.

In [7]:
tk.status()

|   | market | data_type | start | end |
|---|---|---|---|---|
| 0 | AESO | demand | 2024-06-01 06:00:00+00:00 | 2024-07-02 05:00:00+00:00 |
| 1 | AESO | prices | 2024-06-01 06:00:00+00:00 | 2024-07-02 05:00:00+00:00 |
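
A summary of this shape can also be derived from DataFrames already in memory. The sketch below is not how `status()` works internally (that is not shown here); `summarize` is a hypothetical helper, and the demand values are made up purely so the example runs.

```python
import pandas as pd


def summarize(frames: dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Return one row per (market, data_type) with its first and last timestamp."""
    # Tag each frame with its data type so the frames can be stacked together.
    tagged = [df.assign(data_type=name) for name, df in frames.items()]
    combined = pd.concat(tagged, ignore_index=True)
    return (
        combined.groupby(["market", "data_type"])["timestamp_utc"]
        .agg(start="min", end="max")
        .reset_index()
    )


hours = pd.date_range("2024-06-01 06:00", periods=3, freq="h", tz="UTC")
frames = {
    "prices": pd.DataFrame(
        {"timestamp_utc": hours, "market": "AESO", "price": [30.83, 20.23, 13.56]}
    ),
    # Demand values below are placeholders, not real AESO data.
    "demand": pd.DataFrame(
        {"timestamp_utc": hours, "market": "AESO", "demand_mw": [1.0, 2.0, 3.0]}
    ),
}

summary = summarize(frames)
print(summary)
```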


## Next steps

- **Add more markets:** `tk.collect(["IESO"], ["prices"], "2024-01-01", "2024-07-01")`
- **Generation data:** `tk.get_generation(["AESO"], "2024-06-01", "2024-07-01")`
- **Multi-market comparison** and **resampling** are on the roadmap (Phase 2).

All data is stored locally as Parquet in `./data/raw/`.
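
Until resampling is available in the toolkit, plain pandas can aggregate the hourly frames. The sketch below uses a synthetic two-day series (not real prices) with the same `timestamp_utc` and `price` columns and computes the daily mean and maximum. Note that the daily bins follow UTC midnight, not Alberta local time.

```python
import pandas as pd

# Synthetic hourly series standing in for the `prices` DataFrame; values are not real.
idx = pd.date_range("2024-06-01 06:00", periods=48, freq="h", tz="UTC")
prices = pd.DataFrame({"timestamp_utc": idx, "price": [float(i % 24) for i in range(48)]})

# Resampling needs a datetime index, so move the timestamp column there first.
daily = (
    prices.set_index("timestamp_utc")["price"]
    .resample("1D")
    .agg(["mean", "max"])
)

print(daily)
```

The daily `max` column is a quick way to spot the days on which the pool price spiked.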