<img src="https://raw.githubusercontent.com/daniel-om-weber/identibench/main/assets/logo.svg" width="200" align="left" alt="identibench logo">

## identibench
[![PyPI version](https://badge.fury.io/py/identibench.svg)](https://badge.fury.io/py/identibench)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Docs Status](https://img.shields.io/badge/docs-up_to_date-brightgreen.svg)](https://daniel-om-weber.github.io/identibench/)
[![Python Versions](https://img.shields.io/pypi/pyversions/identibench)](https://pypi.org/project/identibench/)

The `identibench` package provides standardized data loaders for common system identification benchmark datasets. It downloads each dataset, prepares it, and converts it into a unified HDF5 format, so the data is ready to use in machine learning and system identification workflows.

## Install

```sh
pip install identibench
```

## Features

- Downloads benchmark datasets from various sources
- Converts data to standardized HDF5 format
- Splits data into train/validation/test sets
- Provides consistent interface across different benchmarks
- Handles setup and cleanup of downloaded files

## Available Benchmarks

The package includes loaders for the following benchmark datasets:

### Nonlinear System Identification Workshop Benchmarks
- **Wiener-Hammerstein**: Electronic nonlinear system
- **Silverbox**: Electronic circuit with nonlinear feedback
- **Cascaded Tanks**: Fluid dynamics system
- **EMPS**: Electro-Mechanical Positioning System
- **Noisy Wiener-Hammerstein**: WH system with process noise

### Robotic Systems
- **Industrial Robot**: Forward and inverse identification models
- **Quad Pelican**: Quadrotor UAV system
- **Quad Pi**: Raspberry Pi-based quadrotor system

### Other Systems
- **Ship**: Ship propulsion and steering dynamics
- **Broad**: Broad spectrum system identification dataset

```python
# Basic usage
import identibench as idb
from pathlib import Path

# Example: download a single dataset
# Note: always pass a Path object, not a string
save_path = Path('./tmp/wh')
idb.datasets.workshop.wiener_hammerstein(save_path)
```
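To check what a loader wrote to disk, it can help to count the files in each subdirectory of the save path. The following is a minimal sketch using only the standard library; `summarize_tree` is a hypothetical helper, not part of `identibench`, and it makes no assumption about the folder or file names the loader produces.

```python
from collections import Counter
from pathlib import Path


def summarize_tree(root):
    """Count the files in each subdirectory of root (keys are relative paths)."""
    root = Path(root)
    per_dir = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            # Key each file by its parent folder, relative to root
            per_dir[path.parent.relative_to(root).as_posix()] += 1
    return dict(per_dir)


# Inspect the directory used in the download example, if it exists
save_path = Path('./tmp/wh')
if save_path.is_dir():
    for folder, n_files in sorted(summarize_tree(save_path).items()):
        print(f"{folder}: {n_files} file(s)")
```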

```python
from sysidentpy.model_structure_selection import FROLS
from sysidentpy.parameter_estimation import LeastSquares


def build_frols_model(context):
    # Fit on the first training sequence provided by the benchmark context
    u_train, y_train, _ = next(context.get_train_sequences())

    # Read hyperparameters, falling back to defaults when a key is absent
    ylag = context.hyperparameters.get('ylag', 5)
    xlag = context.hyperparameters.get('xlag', 5)
    n_terms = context.hyperparameters.get('n_terms', 10)
    estimator = context.hyperparameters.get('estimator', LeastSquares())

    _model = FROLS(xlag=xlag, ylag=ylag, n_terms=n_terms, estimator=estimator)
    _model.fit(X=u_train, y=y_train)

    def model(u_test, y_init):
        # Seed the prediction with the first max_lag samples of y_init
        yhat_full = _model.predict(X=u_test, y=y_init[:_model.max_lag])
        # Drop the seeded samples so only predicted values are returned
        return yhat_full[_model.max_lag:]

    return model
```

```python
hyperparams = {
    'ylag': 2,
    'xlag': 2,
    'n_terms': 10,  # Number of terms for FROLS
    'estimator': LeastSquares()
}

idb.run_benchmark(
    spec=idb.BenchmarkWH_Simulation,
    build_model=build_frols_model,
    hyperparameters=hyperparams
)
```

Example output:

```text
{'benchmark_name': 'BenchmarkWH_Simulation',
 'dataset_id': 'wh',
 'hyperparameters': {'ylag': 2,
  'xlag': 2,
  'n_terms': 10,
  'estimator': <sysidentpy.parameter_estimation.estimators.LeastSquares>},
 'seed': 3194310919,
 'training_time_seconds': 0.9640552089986159,
 'test_time_seconds': 1.0245963339984883,
 'benchmark_type': 'BenchmarkSpecSimulation',
 'metric_score': 0.2059879758473603}
```
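Because each run returns a plain dictionary, results from several hyperparameter settings can be compared with ordinary Python. The sketch below assumes every result has a `'metric_score'` key, as in the output above, and that a lower score is better; that direction is an assumption to check against the benchmark's metric. `rank_results` is a hypothetical helper, not part of `identibench`.

```python
def rank_results(results, lower_is_better=True):
    """Sort result dicts by their 'metric_score' entry, best first."""
    return sorted(
        results,
        key=lambda result: result["metric_score"],
        reverse=not lower_is_better,
    )


# Usage sketch (needs identibench and build_frols_model from the example above):
# results = [
#     idb.run_benchmark(
#         spec=idb.BenchmarkWH_Simulation,
#         build_model=build_frols_model,
#         hyperparameters={'ylag': lag, 'xlag': lag},
#     )
#     for lag in (2, 3, 5)
# ]
# best = rank_results(results)[0]
# print(best['hyperparameters'], best['metric_score'])
```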

## HDF5 Data Format

Each dataset is converted to a standard HDF5 format with the following structure:
- Train/valid/test split in separate directories
- Input data stored as 'u0', 'u1', etc. (one per input dimension)
- Output data stored as 'y0', 'y1', etc. (one per output dimension)
- Data converted to 32-bit float (f4) for consistency

This standardized format makes it easy to use these datasets with any machine learning framework that supports HDF5 files.
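As an illustration of the layout described above, the following sketch reads one split back into arrays. It makes several assumptions that should be checked against the files actually written: `h5py` and `numpy` are installed, the signals sit at the root of each file, the files use the `.hdf5` extension, and the split folder is named `train`. The helper names are hypothetical and not part of `identibench`.

```python
import re
from pathlib import Path


def signal_names(keys, prefix):
    """Return names such as 'u0', 'u1', ... sorted by their numeric index."""
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    matches = [(int(m.group(1)), key) for key in keys if (m := pattern.match(key))]
    return [name for _, name in sorted(matches)]


def load_sequence(file_path):
    """Load one file into (u, y) arrays of shape (n_samples, n_channels)."""
    # Imported lazily; h5py and numpy are assumed to be installed
    import h5py
    import numpy as np

    with h5py.File(file_path, "r") as f:
        keys = list(f.keys())
        u = np.stack([f[name][:] for name in signal_names(keys, "u")], axis=-1)
        y = np.stack([f[name][:] for name in signal_names(keys, "y")], axis=-1)
    return u, y


def iter_split(dataset_dir, split="train"):
    """Yield (file_path, (u, y)) for every file in one split folder."""
    # Assumed layout: <dataset_dir>/<split>/*.hdf5
    for file_path in sorted(Path(dataset_dir, split).glob("*.hdf5")):
        yield file_path, load_sequence(file_path)
```

Sorting by the numeric suffix rather than alphabetically keeps `u10` after `u2` for systems with many channels.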
