## Unit Validation and Conversion with TimeDB

This notebook demonstrates TimeDB's unit handling using pint-pandas:
1. Uploading data with units via `dtype="pint[MW]"`
2. Reading data back with units preserved in dtypes
3. Automatic conversion between compatible units (kW -> MW)
4. Rejection of incompatible units (MWh vs MW)

In [1]:
import pandas as pd
import pint_pandas  # importing registers the "pint[...]" dtype with pandas
from datetime import datetime, timezone, timedelta
from dotenv import load_dotenv
from timedb import TimeDataClient
load_dotenv()  # load environment variables from a local .env file

td = TimeDataClient()
td.delete()
td.create()

Creating database schema...
✓ Schema created successfully


### Upload Data with Units

Create each series with its unit, then insert data whose columns carry pint-pandas dtypes. The unit is read from the column dtype, so `insert()` needs no separate unit argument.

In [2]:
base_time = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
times = [base_time + timedelta(hours=i) for i in range(24)]

# Create series with specific units
series_defs = [
    {"name": "power", "unit": "MW"},
    {"name": "wind_speed", "unit": "m/s"},
    {"name": "temperature", "unit": "degree_Celsius"},
]
for s in series_defs:
    td.create_series(**s)

# Insert data with pint-pandas dtypes
for name, unit, values in [
    ("power", "MW", [1.0 + i * 0.05 for i in range(24)]),
    ("wind_speed", "m/s", [5.0 + i * 0.2 for i in range(24)]),
    ("temperature", "degree_Celsius", [20.0 + i * 0.5 for i in range(24)]),
]:
    df = pd.DataFrame({
        "valid_time": times,
        name: pd.Series(values, dtype=f"pint[{unit}]"),
    })
    td.series(name).insert(df)

print("Inserted 3 series with units")

Inserted 3 series with units


In [3]:
# Verify what was created
for s in td.series().list_series():
    print(f"  {s['name']}: unit={s['unit']}  overlapping={s['overlapping']}")

  power: unit=MW  overlapping=False
  temperature: unit=degree_Celsius  overlapping=False
  wind_speed: unit=m/s  overlapping=False


### Read Data with Units

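When read back, each column carries a pint-pandas dtype that records its unit. pint prints units by their full names, so `MW` appears as `megawatt` and `m/s` as `meter / second`.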

In [4]:
# Read each series back
df_power = td.series("power").read()
df_wind = td.series("wind_speed").read()
df_temp = td.series("temperature").read()
df_all = pd.concat([df_power, df_wind, df_temp], axis=1)

print("Column dtypes (units preserved):")
print(df_all.dtypes)
print()
df_all.head()

Column dtypes (units preserved):
name
power                pint[megawatt][Float64]
wind_speed     pint[meter / second][Float64]
temperature    pint[degree_Celsius][Float64]
dtype: object



| valid_time | power | wind_speed | temperature |
|---|---|---|---|
| 2025-01-01 00:00:00+00:00 | 1.0 | 5.0 | 20.0 |
| 2025-01-01 01:00:00+00:00 | 1.05 | 5.2 | 20.5 |
| 2025-01-01 02:00:00+00:00 | 1.1 | 5.4 | 21.0 |
| 2025-01-01 03:00:00+00:00 | 1.15 | 5.6 | 21.5 |
| 2025-01-01 04:00:00+00:00 | 1.2 | 5.8 | 22.0 |


### Unit Conversion

Units with the same dimensionality are converted automatically on insert: kW values written to a MW series are stored as MW.

In [5]:
# Insert kilowatt values into a megawatt series - auto-converted
new_times = [base_time + timedelta(hours=i) for i in range(24, 48)]

df_kw = pd.DataFrame({
    "valid_time": new_times,
    "power": pd.Series([500.0] * 24, dtype="pint[kW]"),
})

td.series("power").insert(df_kw)

# Read back - should show 0.5 MW (500 kW converted)
df_check = td.series("power").read(start_valid=new_times[0], end_valid=new_times[0] + timedelta(hours=1))
print(f"Inserted 500 kW, stored as: {df_check.iloc[0]['power']} (auto-converted)")

Inserted 500 kW, stored as: 0.5 megawatt (auto-converted)
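
The arithmetic behind the 0.5 MW result can be reproduced by hand. The sketch below is not TimeDB's or pint's implementation. It assumes a small, hypothetical table of scale factors (`WATTS_PER_UNIT`) and a hypothetical helper (`convert_power`) that rescales a value through the base unit, the watt.

```python
# Hypothetical scale factors relative to the watt (illustration only,
# not part of TimeDB or pint).
WATTS_PER_UNIT = {"W": 1.0, "kW": 1e3, "MW": 1e6}


def convert_power(value, from_unit, to_unit):
    """Convert a power value by rescaling through the base unit (watt)."""
    return value * WATTS_PER_UNIT[from_unit] / WATTS_PER_UNIT[to_unit]


# 500 kW -> 500,000 W -> 0.5 MW, matching the value read back above.
converted = convert_power(500.0, "kW", "MW")
print(converted)
```

Because kW and MW differ only by a constant factor, the conversion loses no information, which is why it can safely happen without user intervention.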


### Unit Validation

Incompatible units are rejected. MWh (energy) cannot be stored in a MW (power) series.

In [6]:
# Try inserting MWh (energy) into MW (power) series - should fail
df_mwh = pd.DataFrame({
    "valid_time": new_times,
    "power": pd.Series([10.0] * 24, dtype="pint[MWh]"),
})

try:
    td.series("power").insert(df_mwh)
    print("Unexpected: should have failed")
except Exception as e:
    print(f"Rejected: {type(e).__name__}")
    print(f"  {e}")

Rejected: IncompatibleUnitError
  Cannot convert megawatt_hour to MW: incompatible dimensionality (Cannot convert from 'megawatt_hour' ([mass] * [length] ** 2 / [time] ** 2) to 'megawatt' ([mass] * [length] ** 2 / [time] ** 3))
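
The error message compares the two units by dimensionality: energy is `[mass] * [length] ** 2 / [time] ** 2`, while power has `[time] ** 3` in the denominator. The sketch below illustrates that comparison with a hypothetical table of exponents (`DIMENSIONS`) and a hypothetical helper (`is_compatible`). It only shows the idea and is not how TimeDB or pint implement the check.

```python
# Exponents of (mass, length, time) per unit, taken from the error message
# above. The table and helper are hypothetical, for illustration only.
DIMENSIONS = {
    "kW": (1, 2, -3),   # power: mass * length^2 / time^3
    "MW": (1, 2, -3),   # power: mass * length^2 / time^3
    "MWh": (1, 2, -2),  # energy: mass * length^2 / time^2
}


def is_compatible(unit_a, unit_b):
    """Two units are convertible only if their dimension exponents match."""
    return DIMENSIONS[unit_a] == DIMENSIONS[unit_b]


print(is_compatible("kW", "MW"))   # same dimensionality, convertible
print(is_compatible("MWh", "MW"))  # time exponent differs, not convertible
```

No scale factor can bridge a difference in exponents, so rejecting the insert is the only safe outcome.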


### Summary

- Use `pd.Series(values, dtype="pint[MW]")` to attach units to DataFrame columns
- Units are extracted from pint-pandas dtypes and stored with the series
- Compatible units (kW, MW, W) are automatically converted on insert
- Incompatible units (MW vs MWh) raise `IncompatibleUnitError`
- `read()` returns DataFrames with pint-pandas dtypes preserving unit information