# Run a trained checkpoint on `data/processed/SIM_RPT`

This notebook loads an **existing trained model checkpoint** and runs inference
on the simulated sparse RPT cells stored under `data/processed/SIM_RPT`.

Important: the model was trained on transformed features and labels (e.g. Z-score
normalization). The fitted transformation parameters are **not stored in the checkpoint**.

So we reuse the `DataBundle` snapshot that BatteryML saves during evaluation
(`predictions_seed_*.pkl`), which contains the fitted transforms.
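
To illustrate why the fitted parameters matter, here is a minimal sketch using only the standard library. The `ZScore` class is a hypothetical stand-in, not BatteryML's transformation class, and the SOH values are made up for the example. It shows that a prediction in normalized space only maps back to the right SOH when the training-time statistics are reused.

```python
from statistics import mean, pstdev


class ZScore:
    """Toy stand-in for a fitted transformation (hypothetical, not BatteryML's class)."""

    def fit(self, values):
        # The statistics are learned once, from the training data only.
        self.mean = mean(values)
        self.std = pstdev(values) or 1.0  # guard against zero variance
        return self

    def transform(self, values):
        return [(v - self.mean) / self.std for v in values]

    def inverse_transform(self, values):
        return [v * self.std + self.mean for v in values]


# Made-up training labels, for illustration only.
train_soh = [0.80, 0.85, 0.90, 0.95]
scaler = ZScore().fit(train_soh)

# A model trained on normalized labels predicts in normalized space ...
normalized_pred = scaler.transform([0.85])[0]
# ... and only the training-time statistics map it back to SOH.
recovered = scaler.inverse_transform([normalized_pred])[0]

# Refitting on new data would use different statistics and shift the result.
refit = ZScore().fit([0.70, 0.72])
wrong = refit.inverse_transform([normalized_pred])[0]
print(recovered, wrong)
```

The second inverse transform lands far from the original value, which is the failure mode that reusing the saved snapshot avoids.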

In [1]:
from __future__ import annotations

import os
import pickle
from pathlib import Path

import pandas as pd
import torch
import yaml

from batteryml.builders import FEATURE_EXTRACTORS, LABEL_ANNOTATORS, MODELS
from batteryml.data.databundle import Dataset
from batteryml.data.battery_data import BatteryData

print('cwd:', os.getcwd())
print('torch:', torch.__version__)

cwd: c:\Users\FanWang\Documents\GitHub\BatteryML
torch: 2.10.0+cu126


## Locate the checkpoint workspace

By default we look under `workspaces/rpt_soh600_demo/` for the most recently
modified `latest.ckpt`. To use a specific run instead, set `WORKSPACE_DIR` explicitly.

In [2]:
# Option A: auto-find
WS_ROOT = Path('workspaces/rpt_soh600_demo')

# Option B: manually set
# WORKSPACE_DIR = Path('workspaces/rpt_soh600_demo/<your_run_folder>')
WORKSPACE_DIR = None

def newest_file(glob_iter):
    files = list(glob_iter)
    if not files:
        return None
    return max(files, key=lambda p: p.stat().st_mtime)

if WORKSPACE_DIR is None:
    ckpt = newest_file(WS_ROOT.rglob('latest.ckpt'))
    if ckpt is None:
        raise FileNotFoundError(f'No latest.ckpt found under: {WS_ROOT}')
    WORKSPACE_DIR = ckpt.parent
else:
    ckpt = WORKSPACE_DIR / 'latest.ckpt'
    if not ckpt.exists():
        raise FileNotFoundError(f'Missing checkpoint: {ckpt}')

print('Using workspace:', WORKSPACE_DIR)
print('Using checkpoint:', ckpt)

Using workspace: workspaces\rpt_soh600_demo\rwth_soh600_rptnorm_rf
Using checkpoint: workspaces\rpt_soh600_demo\rwth_soh600_rptnorm_rf\latest.ckpt


## Load the training config + fitted transforms snapshot

BatteryML copies the config into the workspace as `config_<timestamp>.yaml` and
writes evaluation artifacts as `predictions_seed_<seed>_<timestamp>.pkl`,
which include a `DataBundle` object with the fitted transforms.

In [3]:
cfg_path = newest_file(WORKSPACE_DIR.glob('config_*.yaml'))
pred_path = newest_file(WORKSPACE_DIR.glob('predictions_seed_*.pkl'))

if cfg_path is None:
    raise FileNotFoundError(f'No config_*.yaml found in {WORKSPACE_DIR}')
if pred_path is None:
    raise FileNotFoundError(
        f'No predictions_seed_*.pkl found in {WORKSPACE_DIR}. '
        'Run evaluation once to generate it (batteryml run ... --eval).'
    )

with open(cfg_path, 'r', encoding='utf-8') as f:
    cfg = yaml.safe_load(f)

with open(pred_path, 'rb') as f:
    pred_obj = pickle.load(f)

trained_bundle = pred_obj['data']  # DataBundle with fitted transforms

print('Config:', cfg_path)
print('Pred snapshot:', pred_path)
print('Has feature_transformation:', trained_bundle.feature_transformation is not None)
print('Has label_transformation:', trained_bundle.label_transformation is not None)

Config: workspaces\rpt_soh600_demo\rwth_soh600_rptnorm_rf\config_20260201144646.yaml
Pred snapshot: workspaces\rpt_soh600_demo\rwth_soh600_rptnorm_rf\predictions_seed_0_20260201144648.pkl
Has feature_transformation: True
Has label_transformation: True
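
`newest_file` selects by modification time, which can change when a workspace is copied or synced. As an alternative sketch, the helper below picks the newest file by the timestamp embedded in its name. `newest_by_stamp` is a hypothetical helper, and it assumes the 14-digit `YYYYMMDDHHMMSS` suffix seen in the filenames printed above.

```python
import re
from pathlib import Path

# Assumption: the stem ends in "_" followed by a 14-digit YYYYMMDDHHMMSS stamp.
STAMP = re.compile(r'_(\d{14})$')


def newest_by_stamp(paths):
    """Return the path with the largest filename timestamp, or None."""
    stamped = []
    for p in map(Path, paths):
        m = STAMP.search(p.stem)
        if m:
            stamped.append((m.group(1), p))
    if not stamped:
        return None
    # Fixed-width digit strings sort chronologically when compared as text.
    return max(stamped, key=lambda item: item[0])[1]


candidates = [
    'predictions_seed_0_20260201144648.pkl',
    'predictions_seed_0_20260115090000.pkl',
]
picked = newest_by_stamp(candidates)
print(picked)
```

Files without a recognizable stamp are ignored, so the helper returns `None` rather than guessing.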


## Load the model from checkpoint

This uses the model class defined in the config and loads `latest.ckpt`.

In [4]:
model = MODELS.build(cfg['model'])
model.load_checkpoint(str(ckpt))
model = model.to('cpu')

print('Model:', type(model).__name__)

Model: RandomForestRULPredictor


## Run inference on `data/processed/SIM_RPT`

We compute features with the same feature extractor as in the training config,
apply the fitted feature transformation from the snapshot `DataBundle`, then
predict and inverse-transform the outputs back to SOH space.

In [6]:
SIM_DIR = Path('data/processed/SIM_RPT')
sim_paths = sorted(SIM_DIR.glob('*.pkl'))
if not sim_paths:
    raise FileNotFoundError(f'No .pkl files found in: {SIM_DIR}')

cells = [BatteryData.load(p) for p in sim_paths]
cell_ids = [c.cell_id for c in cells]
print('Loaded cells:', len(cells))
print('\n'.join(cell_ids))  # one ID per line; '' would run multiple IDs together

feature_extractor = FEATURE_EXTRACTORS.build(cfg['feature'])
label_annotator = LABEL_ANNOTATORS.build(cfg['label'])

X = feature_extractor(cells).to('cpu')
y_true = label_annotator(cells).to('cpu')

# Apply the SAME fitted transforms used during training
if trained_bundle.feature_transformation is not None:
    X = trained_bundle.feature_transformation.transform(X)

y_for_dataset = y_true
if trained_bundle.label_transformation is not None:
    y_for_dataset = trained_bundle.label_transformation.transform(y_true.float())

# Create a minimal dataset object by reusing the trained DataBundle
trained_bundle = trained_bundle.to('cpu')
trained_bundle.test_data = Dataset(X, y_for_dataset)

pred = model.predict(trained_bundle, data_type='test')

pred_soh = pred
true_soh = y_true
if trained_bundle.label_transformation is not None:
    pred_soh = trained_bundle.label_transformation.inverse_transform(pred_soh)

out = pd.DataFrame({
    'cell_id': cell_ids,
    'true_soh600': true_soh.numpy(),
    'pred_soh600': pred_soh.numpy(),
})
out['abs_err'] = (out['pred_soh600'] - out['true_soh600']).abs()
out

Loaded cells: 1
RPT_SIM_RWTH_002


Extracting features: 100%|██████████| 1/1 [00:00<00:00, 142.50it/s]


|   | cell_id | true_soh600 | pred_soh600 | abs_err |
|---|---------|-------------|-------------|---------|
| 0 | RPT_SIM_RWTH_002 | 0.848514 | 0.84588 | 0.002634 |
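
When `SIM_RPT` holds more than one cell, a per-row `abs_err` column is easier to read alongside aggregate metrics. The sketch below computes MAE, RMSE, and the maximum absolute error from plain lists. `summarize_errors` is a hypothetical helper, not part of BatteryML, and the example reuses the single-cell values from the result above.

```python
import math


def summarize_errors(true_vals, pred_vals):
    """Return MAE, RMSE, and max absolute error for paired values."""
    if len(true_vals) != len(pred_vals) or not true_vals:
        raise ValueError('need two non-empty sequences of equal length')
    errs = [p - t for t, p in zip(true_vals, pred_vals)]
    abs_errs = [abs(e) for e in errs]
    return {
        'mae': sum(abs_errs) / len(errs),
        'rmse': math.sqrt(sum(e * e for e in errs) / len(errs)),
        'max_abs_err': max(abs_errs),
    }


# Values copied from the single-cell result in this notebook.
stats = summarize_errors([0.848514], [0.84588])
print(stats)
```

With the notebook's DataFrame, the same function could be called as `summarize_errors(out['true_soh600'].tolist(), out['pred_soh600'].tolist())`.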


## Notes

- If this fails due to missing `predictions_seed_*.pkl`, run one evaluation pass
  on the same workspace to generate the DataBundle snapshot.
- If `SIM_RPT` covers a different voltage window than the training data, predictions may be poor
  unless your feature extractor config sets appropriate `min_voltage_in_V` and `max_voltage_in_V` values.
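
One way to check the voltage-window note before trusting predictions is to measure how much of the extractor's window the simulated data actually spans. The sketch below works on a plain list of voltages. `window_coverage` is a hypothetical helper, the numbers are made up, and pulling the voltages out of a loaded cell is left out because it depends on how your data is stored.

```python
def window_coverage(voltages, v_min, v_max):
    """Fraction of the [v_min, v_max] window spanned by the observed voltages."""
    if v_max <= v_min:
        raise ValueError('v_max must be greater than v_min')
    if not voltages:
        return 0.0
    lo = max(min(voltages), v_min)
    hi = min(max(voltages), v_max)
    # No overlap between the data and the window gives a coverage of zero.
    return max(hi - lo, 0.0) / (v_max - v_min)


# Hypothetical numbers: a cell cycled between 3.0 V and 4.1 V,
# checked against an extractor window of 3.5 V to 4.2 V.
observed = [3.0, 3.4, 3.8, 4.1]
coverage = window_coverage(observed, v_min=3.5, v_max=4.2)
print(f'coverage: {coverage:.2f}')
```

A coverage well below 1.0 suggests the extractor window reaches outside the range the simulated cell was cycled over.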