MuViS Pipeline

MuViS: Multimodal Virtual Sensing Benchmark

This repository contains the MuViS codebase: dataset preprocessing, unified time-series I/O, configuration-driven experiment runners, and logging utilities for reproducible benchmarking across datasets. The corresponding paper is available at: <PLACEHOLDER>

Abstract: Virtual sensing infers hard-to-measure quantities from accessible measurements and is central to perception and control in physical systems. Despite rapid progress, from first-principle and hybrid models to modern data-driven methods, research remains siloed, leaving no established default approach that transfers across processes, modalities, and sensing configurations. We introduce MuViS, a domain-agnostic benchmarking suite for multimodal virtual sensing that consolidates diverse datasets into a unified interface for standardized preprocessing and evaluation. Using this framework, we benchmark representative approaches spanning gradient-boosted decision trees and deep neural network (NN) architectures, and quantify how close current methods come to a broadly useful default. MuViS is released as an open-source, extensible platform for reproducible comparison and future integration of new datasets and model classes.

Overview

Virtual sensing aims to infer hard-to-measure quantities from accessible primary measurements and is central to perceiving and controlling physical systems. Despite rapid progress, research is typically siloed in narrow application domains, limiting insight into how well approaches generalize.

MuViS is a comprehensive, domain-agnostic benchmarking suite for multimodal virtual sensing. It addresses the heterogeneity in file formats, split definitions, and sequence lengths by providing a framework that:

  • Standardizes data preprocessing: Converts raw datasets from data/raw/<dataset>/ into a consistent .ts format in data/processed/<dataset>/ with predefined train-test splits (train.ts, test.ts), uniform sample shapes (X: N×T×C, y: N), and consistent missing value treatment.
  • Enables reproducible experiments: Provides config-driven training pipelines to systematically benchmark neural networks and tree-based models across multiple datasets.
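The .ts files follow the common sktime/UEA time-series text convention ("@"-prefixed header lines, then one case per line with ":"-separated channels and the target value last); the exact dialect MuViS emits is an assumption here. A minimal reader for such a regression .ts file could look like:

```python
def parse_ts(text):
    """Parse a minimal regression .ts file: skip '@'-header lines, then read
    one case per line with ':'-separated channels and the target last.
    (Sketch only -- the exact dialect MuViS writes is an assumption.)"""
    X, y = [], []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.lower().startswith("@data"):
            in_data = True
            continue
        if line.startswith("@") or not in_data:
            continue
        *channels, target = line.split(":")
        case = [[float(v) for v in ch.split(",")] for ch in channels]
        # transpose channel-major lists to (T, C), matching the N x T x C convention
        X.append(list(map(list, zip(*case))))
        y.append(float(target))
    return X, y

sample = "\n".join([
    "@problemName Demo",
    "@data",
    "1,2,3:10,20,30:0.5",
    "4,5,6:40,50,60:0.7",
])
X, y = parse_ts(sample)  # X has shape (N=2, T=3, C=2); y has length 2
```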

Model-specific preprocessing operations (e.g., standardization, sequence flattening) are performed within training scripts to maintain flexibility in model architecture design.
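For illustration, per-channel standardization followed by sequence flattening (as a tree-based model would require) can be sketched as below; the choice of statistics and layout is an assumption, not MuViS's exact implementation:

```python
def standardize_and_flatten(X):
    """Standardize each channel over all samples and timesteps, then flatten
    each (T, C) window into a length T*C vector (illustrative sketch only)."""
    n, t, c = len(X), len(X[0]), len(X[0][0])
    # per-channel mean and std computed over the whole set
    means, stds = [], []
    for ch in range(c):
        vals = [X[i][j][ch] for i in range(n) for j in range(t)]
        m = sum(vals) / len(vals)
        var = sum((v - m) ** 2 for v in vals) / len(vals)
        means.append(m)
        stds.append(var ** 0.5 or 1.0)  # guard against zero-variance channels
    flat = []
    for case in X:
        row = [(case[j][ch] - means[ch]) / stds[ch]
               for j in range(t) for ch in range(c)]
        flat.append(row)
    return flat  # shape: N x (T*C)
```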

Datasets

(Figure: MuViS pipeline overview)

MuViS aggregates six benchmark datasets spanning environmental monitoring, health sensing, vehicle dynamics, tire thermodynamics, chemical process monitoring, and electrochemical energy systems.

| Dataset             | Domain        | Target                  | Inputs                           | Features (D) | Steps (T) |
|---------------------|---------------|-------------------------|----------------------------------|--------------|-----------|
| Beijing Air Quality | Environmental | PM2.5 / PM10            | Pollutants & meteorology         | 9            | 24        |
| Revs Program        | Automotive    | Lateral velocity (v_y)  | Driver inputs, IMU, wheel speeds | 12           | 20        |
| Tire Temperature    | Automotive    | Tire temp. (t_tire)     | Vehicle motion, control inputs   | 11           | 50        |
| Tennessee Eastman   | Industrial    | Chemical concentration  | Process vars & manipulated vars  | 33           | 20        |
| Panasonic 18650PF   | Energy        | State-of-Charge (SoC)   | Voltage, current, temperature    | 7            | 120       |
| PPG-DaLiA           | Health        | Heart rate (BPM)        | BVP, EDA, temp, accel            | 6            | 512       |
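Under the unified interface, the (D, T) values above fix each dataset's per-sample window shape: T timesteps by D channels. A quick sanity map using the table's numbers (dataset keys abbreviated here for illustration):

```python
# (D, T) per dataset, taken from the table above; key spellings are illustrative
DATASETS = {
    "BeijingAirQuality": (9, 24),
    "RevsProgram": (12, 20),
    "TireTemperature": (11, 50),
    "TennesseeEastman": (33, 20),
    "Panasonic18650PF": (7, 120),
    "PPG-DaLiA": (6, 512),
}

def window_shape(name):
    """Return the (T, D) shape of one sample window for a dataset."""
    d, t = DATASETS[name]
    return (t, d)
```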

Baselines

We benchmark representative learning approaches spanning gradient-boosted decision trees and deep neural network (NN) architectures:

  • Tree-based: XGBoost, CatBoost
  • Neural Networks: MLP, ResNet1D, LSTM, Transformer

Prerequisites

  • Python >=3.13
  • Set up the environment and install MuViS:
# Create and activate environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies and the package
pip install -r requirements.txt
pip install -e .

Repository Structure

MuViS/
├── ...
├── data/
│   ├── raw/                             
│   │   ├── BeijingPM10Quality/
│   │   │   ├── BeijingPM10Quality_TEST.ts
│   │   │   └── BeijingPM10Quality_TRAIN.ts
│   │   ├── BeijingPM25Quality/
│   │   │   ├── BeijingPM25Quality_TEST.ts
│   │   │   └── BeijingPM25Quality_TRAIN.ts
│   │   ├── Panasonic18650PFData/
│   │   │   ├── Normalization/
│   │   │   ├── Test/
│   │   │   ├── Train/
│   │   │   └── Validation/
│   │   ├── PPGDalia/
│   │   │   ├── PPG_FieldStudy/
│   │   │   ├── data.zip
│   │   │   └── readme.pdf
│   │   ├── REVS/
│   │   │   ├── 2013_Monterey_Motorsports_Reunion/
│   │   │   │   └── *.csv
│   │   │   ├── 2013_Targa_Sixty_Six/
│   │   │   │   └── *.csv
│   │   │   └── 2014_Targa_Sixty_Six/
│   │   │       └── *.csv
│   │   ├── TennesseeEastmanProcess/
│   │   │   ├── TEP_FaultFree_Testing.RData
│   │   │   └── TEP_FaultFree_Training.RData
│   │   ├── VehicleDynamicsDataset/
│   │   │   ├── Nov2023/
│   │   │   │   └── *.csv
│   │   │   └── Oct2023/
│   │   │       └── *.csv
│   │   └── ...
│   └── processed/                       
└── ...

Preprocessing

To preprocess the raw datasets into the standardized .ts format, run:

python src/muvis/data_utils/preprocess.py
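After preprocessing, each dataset directory under data/processed/ should contain exactly the train.ts / test.ts pair. A small check along these lines can confirm that (a convenience sketch, not part of MuViS):

```python
from pathlib import Path

def check_processed(root):
    """Report which dataset dirs under `root` are missing train.ts or test.ts."""
    missing = {}
    for ds in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        absent = [f for f in ("train.ts", "test.ts") if not (ds / f).is_file()]
        if absent:
            missing[ds.name] = absent
    return missing  # empty dict means every dataset is complete
```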

Run the training

Single experiment

Execute the following command to run a single experiment:

python main.py single --runconf configs/<DATASET_NAME>/<MODEL_NAME>.yaml

Multiple Experiments

To run multiple experiments and save the results to a CSV file, use the command below:

python main.py batch \
  --configs \
    configs/<DATASET_NAME_1>/<MODEL_NAME_1>.yaml \
    configs/<DATASET_NAME_2>/<MODEL_NAME_2>.yaml \
  --metric test_rmse \
  --output experiment_results.csv
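The batch output is a plain CSV with one row per configuration. Collecting results into that shape can be sketched as follows; the column names are assumptions for illustration, not the exact schema MuViS writes:

```python
import csv
import io

def results_to_csv(results, metric="test_rmse"):
    """Serialize per-config results to CSV text, one row per experiment.
    `results` maps a config path to its metric value (sketch only)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["config", metric])
    for config, value in results.items():
        writer.writerow([config, f"{value:.4f}"])
    return buf.getvalue()
```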

Reproduce Results

Our evaluation demonstrates that while gradient-boosted ensembles remain highly competitive, the landscape is nuanced, with specific NN architectures excelling in distinct domains. No single architecture attains a statistically superior edge across the entire benchmark, underscoring the need for specialized architectures in virtual sensing.

To reproduce the results from the paper, run:

bash run.sh

Contributing

Add your own dataset

Each dataset must ultimately produce two files:

  • train.ts
  • test.ts

Step 1: Place Raw Data

Copy your raw dataset files into: data/raw/<YourDatasetName>/

Note: MuViS does not impose any restrictions on the raw data format.

Step 2: Add a Dataset Converter

MuViS handles different raw dataset formats by converting them into a common .ts representation using dataset-specific converters. All converters live in:

src/muvis/data_utils/converters.py

Each dataset is implemented as a subclass of BaseConverter. To add a new dataset, create a new class that inherits from BaseConverter and implement the load_raw() method.

At a minimum, every converter must:

  1. Read raw files from data/raw/<YourDataset>/
  2. Split data into train and test sequences
  3. Generate fixed-length sliding windows
  4. Return data in MuViS’s internal case format
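Steps 3 and 4 above amount to slicing each recording into fixed-length windows, each paired with an aligned target. A generic helper might look like this (the last-timestep target alignment and unit stride are illustrative assumptions; MuViS's converters may differ):

```python
def sliding_windows(series, targets, length, stride=1):
    """Cut a (T_total, C) series into fixed-length windows, pairing each
    window with the target at its final timestep (illustrative sketch)."""
    X, y = [], []
    for start in range(0, len(series) - length + 1, stride):
        end = start + length
        X.append(series[start:end])
        y.append(targets[end - 1])
    return X, y
```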

Step 3: Run Preprocessing

Once the converter is implemented, add it to the command-line interface at src/muvis/data_utils/preprocess.py and run:

python src/muvis/data_utils/preprocess.py --dataset <YourDatasetName> 

Add your own model

MuViS supports both neural and tree-based models.

Step 1: Implement the Model

Model implementations are located in the following files:

Step 2: Create a Configuration File

Each experiment is controlled via a YAML configuration file. Create configs/<YourDatasetName>/<YourModelName>.yaml specifying the model type, hyperparameters, and training settings.
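A minimal configuration might look like the following; every key name here is an assumption for illustration, so match the structure to the existing files under configs/:

```yaml
# configs/<YourDatasetName>/<YourModelName>.yaml  (illustrative keys only)
dataset: YourDatasetName
model:
  type: lstm          # e.g. mlp | resnet1d | lstm | transformer | xgboost | catboost
  hidden_size: 128
  num_layers: 2
training:
  epochs: 50
  batch_size: 64
  learning_rate: 1.0e-3
  seed: 42
```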
