## Unit Validation and Conversion with TimeDB

This notebook demonstrates TimeDB's unit handling using pint-pandas:
1. Uploading data with units via `dtype="pint[MW]"`
2. Reading data back with units preserved in dtypes
3. Automatic conversion between compatible units (kW -> MW)
4. Rejection of incompatible units (MWh vs MW)

In [1]:
import pandas as pd
import pint_pandas
from datetime import datetime, timezone, timedelta
from dotenv import load_dotenv
from timedb import TimeDataClient
load_dotenv()

td = TimeDataClient()
td.delete()
td.create()

Creating database schema...
✓ Schema created successfully


### Upload Data with Units

Create series and insert data using pint-pandas dtypes. Units are extracted from the dtype automatically.

In [2]:
base_time = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
times = [base_time + timedelta(hours=i) for i in range(24)]

# Create series with specific units
series_defs = [
    {"name": "power", "unit": "MW"},
    {"name": "wind_speed", "unit": "m/s"},
    {"name": "temperature", "unit": "degree_Celsius"},
]
for s in series_defs:
    td.create_series(**s)

# Insert data with pint-pandas dtypes
for name, unit, values in [
    ("power", "MW", [1.0 + i * 0.05 for i in range(24)]),
    ("wind_speed", "m/s", [5.0 + i * 0.2 for i in range(24)]),
    ("temperature", "degree_Celsius", [20.0 + i * 0.5 for i in range(24)]),
]:
    df = pd.DataFrame({
        "valid_time": times,
        "value": pd.Series(values, dtype=f"pint[{unit}]"),
    })
    td.series(name).insert(df)

print("Inserted 3 series with units")

Inserted 3 series with units


In [3]:
# Verify what was created
for s in td.series().list_series():
    print(f"  {s['name']}: unit={s['unit']}  overlapping={s['overlapping']}")

  power: unit=MW  overlapping=False
  temperature: unit=degree_Celsius  overlapping=False
  wind_speed: unit=m/s  overlapping=False


### Read Data with Units

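When reading back, unit information is returned together with the data. In the output captured below, `value` is plain `float64` and the unit appears in a separate `unit` column alongside `name` and `labels`.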

In [4]:
# Read each series back
df_power = td.series("power").read()
df_wind = td.series("wind_speed").read()
df_temp = td.series("temperature").read()

# Rename columns to distinguish series when concatenating
df_power = df_power.rename(columns={'value': 'power'})
df_wind = df_wind.rename(columns={'value': 'wind_speed'})
df_temp = df_temp.rename(columns={'value': 'temperature'})

df_all = pd.concat([df_power, df_wind, df_temp], axis=1)

print("Column dtypes (units preserved):")
print(df_all.dtypes)
print()
df_all.head()

Column dtypes (units preserved):
value     float64
name          str
unit          str
labels     object
value     float64
name          str
unit          str
labels     object
value     float64
name          str
unit          str
labels     object
dtype: object



| valid_time | series_id | value | name | unit | labels | value | name | unit | labels | value | name | unit | labels |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2025-01-01 00:00:00+00:00 | 1 | 1.0 | power | MW | {} | | | | | | | | |
| 2025-01-01 01:00:00+00:00 | 1 | 1.05 | power | MW | {} | | | | | | | | |
| 2025-01-01 02:00:00+00:00 | 1 | 1.1 | power | MW | {} | | | | | | | | |
| 2025-01-01 03:00:00+00:00 | 1 | 1.15 | power | MW | {} | | | | | | | | |
| 2025-01-01 04:00:00+00:00 | 1 | 1.2 | power | MW | {} | | | | | | | | |
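In the preview above only the `power` columns are populated. A likely cause is that each frame is indexed by `(valid_time, series_id)`, so frames with different series IDs share no index rows when concatenated with `axis=1`. The sketch below reproduces that index shape with plain pandas and shows one way to align on `valid_time`. `fake_read` is a hypothetical stand-in for `read()`, and the index layout is assumed from the output shown in this notebook.

```python
from datetime import datetime, timezone

import pandas as pd

times = [datetime(2025, 1, 1, h, tzinfo=timezone.utc) for h in range(3)]


def fake_read(series_id, name, values):
    """Hypothetical stand-in for read(): indexed by (valid_time, series_id)."""
    index = pd.MultiIndex.from_arrays(
        [times, [series_id] * len(times)], names=["valid_time", "series_id"]
    )
    return pd.DataFrame({"value": values, "name": name}, index=index)


frames = {
    "power": fake_read(1, "power", [1.0, 1.05, 1.1]),
    "wind_speed": fake_read(2, "wind_speed", [5.0, 5.2, 5.4]),
}

# Keep only the value column, name it after the series, and drop series_id
# so every frame shares the same valid_time index before concatenating.
aligned = pd.concat(
    [df["value"].droplevel("series_id").rename(name) for name, df in frames.items()],
    axis=1,
)
print(aligned)
```

Dropping `series_id` discards the per-series metadata columns, so this layout suits side-by-side comparison of values rather than round-tripping back into the database.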


### Unit Conversion

Compatible units are converted automatically on insert. Values given in kW and inserted into a MW series are expected to be stored in MW, so 500 kW becomes 0.5 MW.

In [6]:
# Insert kilowatt values into a megawatt series - auto-converted
new_times = [base_time + timedelta(hours=i) for i in range(24, 48)]

df_kw = pd.DataFrame({
    "valid_time": new_times,
    "value": pd.Series([500.0] * 24, dtype="pint[kW]"),
})

td.series("power").insert(df_kw)

# Read back - should show 0.5 MW (500 kW converted)
df_check = td.series("power").read(start_valid=new_times[0], end_valid=new_times[0] + timedelta(hours=1))
print(f"Inserted 500 kW, stored as: {df_check['value'].iloc[0]} (auto-converted)")

                                     value   name unit labels
valid_time                series_id                          
2025-01-02 00:00:00+00:00 1          500.0  power   MW     {}


KeyError: 'power'

### Unit Validation

Incompatible units are rejected. MWh (energy) cannot be stored in a MW (power) series.

In [None]:
# Try inserting MWh (energy) into MW (power) series - should fail
df_mwh = pd.DataFrame({
    "valid_time": new_times,
    "value": pd.Series([10.0] * 24, dtype="pint[MWh]"),
})

try:
    td.series("power").insert(df_mwh)
    print("Unexpected: should have failed")
except Exception as e:
    print(f"Rejected: {type(e).__name__}")
    print(f"  {e}")

### Summary

- Use `pd.Series(values, dtype="pint[MW]")` to attach units to DataFrame columns
- Units are extracted from pint-pandas dtypes and stored with the series
- Compatible units (kW, MW, W) are automatically converted on insert
- Incompatible units (MW vs MWh) raise `IncompatibleUnitError`
- `read()` returns unit information with the data; in the output captured above it appears as a `unit` column next to a `float64` `value` column