Offline RL Benchmark for VSL Data

This benchmark folder contains implementations of offline reinforcement learning algorithms trained on Variable Speed Limit (VSL) data. The benchmark includes multiple state-of-the-art offline RL algorithms and provides the datasets and tools needed for training. Evaluation is conducted in TransModeler, a commercial microsimulation software.

Folder Structure

i24-vsl-orl/
├── code/                   # Source code
│   ├── algorithm/         # Offline RL algorithm implementations
│   │   ├── BC.py         # Behavior Cloning
│   │   ├── BCQ.py        # Batch Constrained Q-learning
│   │   ├── CQL.py        # Conservative Q-learning
│   │   ├── IQL.py        # Implicit Q-learning
│   │   └── TD3_BC.py     # TD3 + Behavior Cloning
│   ├── training.py       # Main training script
│   └── utils.py          # Utility functions and ReplayBuffer
├── dataset/              # Preprocessed datasets
│   ├── expert_small.pkl  # Small expert dataset
│   ├── expert_medium.pkl # Medium expert dataset
│   ├── expert_large.pkl  # Large expert dataset
│   ├── mixed_small.pkl   # Small mixed dataset
│   ├── mixed_medium.pkl  # Medium mixed dataset
│   └── mixed_large.pkl   # Large mixed dataset
├── saved_models/         # Trained model checkpoints
│   └── training_with_*/  # Models organized by dataset size
│       └── run_*/        # Multiple training runs
└── README.md            # This file

Datasets

Full Dataset Availability

The complete VSL operation dataset is available on Zenodo: Offline Reinforcement Learning Dataset from a Field Deployed Variable Speed Limit Control System (DOI: 10.5281/zenodo.16376854). The dataset contains ~100 million samples from 18 months of continuous VSL operation on Interstate 24 (I-24), covering March 9, 2024 to September 9, 2025, with 67 VSL controllers operating in both directions of the freeway. The data has been converted to offline RL format for research purposes.

Preprocessed Benchmark Datasets

The benchmark includes three preprocessed dataset sizes for different experimental needs:

Dataset            Size            Description
expert_small.pkl   ~5K samples     Small expert dataset for quick testing
expert_medium.pkl  ~10K samples    Medium expert dataset for moderate experiments
expert_large.pkl   ~100K samples   Large expert dataset for full evaluation
mixed_small.pkl    ~5K samples     Small mixed dataset (expert + suboptimal)
mixed_medium.pkl   ~10K samples    Medium mixed dataset
mixed_large.pkl    ~100K samples   Large mixed dataset

Each dataset contains:

  • States: 5-dimensional state vectors
  • Actions: Discrete actions (5 possible values)
  • Rewards: Scalar reward values
  • Next States: 5-dimensional next state vectors

Quick Start

Prerequisites

  • Python 3.7+
  • PyTorch
  • NumPy
  • Pickle (built-in)

Installation

  1. Navigate to the project root directory:

    cd /path/to/i24-vsl-orl
  2. Install required dependencies:

    pip install torch numpy

Running the Benchmark

The main training script supports command-line arguments for easy configuration:

python code/training.py [OPTIONS]

Command Line Options

  • --dataset_size {small,medium,large}: Choose dataset size (default: large)
  • --max_training_iters INT: Number of training iterations (default: 100000)
  • --seed INT: Random seed for reproducibility (default: 123)
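The options above can be reproduced with a small argparse setup. The sketch below mirrors the documented flags and defaults; the actual parser lives in `code/training.py` and may differ in detail.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the documented CLI; the real parser in code/training.py may differ.
    p = argparse.ArgumentParser(description="Offline RL benchmark for VSL data")
    p.add_argument("--dataset_size", choices=["small", "medium", "large"],
                   default="large", help="which preprocessed dataset size to use")
    p.add_argument("--max_training_iters", type=int, default=100_000,
                   help="number of training iterations")
    p.add_argument("--seed", type=int, default=123,
                   help="random seed for reproducibility")
    return p

# Parse a sample invocation; unspecified flags fall back to their defaults.
args = build_parser().parse_args(["--dataset_size", "small",
                                  "--max_training_iters", "5000"])
print(args.dataset_size, args.max_training_iters, args.seed)
```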

Examples

# Train with small dataset, 5000 iterations
python code/training.py --dataset_size small --max_training_iters 5000

# Train with medium dataset, custom seed
python code/training.py --dataset_size medium --seed 456

# Train with large dataset (default settings)
python code/training.py

Model Management

Saving Models

Trained models are automatically saved to:

saved_models/training_with_{dataset_size}/run_{i}/
├── BC_expert_{iterations}.pth
├── BC_mixed_{iterations}.pth
├── BCQ_expert_{iterations}.pth
├── BCQ_mixed_{iterations}.pth
├── CQL_expert_{iterations}.pth
├── CQL_mixed_{iterations}.pth
├── IQL_expert_{iterations}.pth
├── IQL_mixed_{iterations}.pth
├── TD3_BC_expert_{iterations}.pth
└── TD3_BC_mixed_{iterations}.pth
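Because the checkpoint names follow a fixed `{algorithm}_{dataset}_{iterations}.pth` scheme, checkpoints for evaluation can be collected with a simple glob. The sketch below builds a dummy directory tree matching the layout above (the run index and iteration count are illustrative) and then selects all expert checkpoints from one run.

```python
from pathlib import Path
import tempfile

# Build a dummy checkpoint tree matching the documented naming scheme.
root = Path(tempfile.mkdtemp())
run_dir = root / "saved_models" / "training_with_large" / "run_0"
run_dir.mkdir(parents=True)
for algo in ["BC", "BCQ", "CQL", "IQL", "TD3_BC"]:
    for flavor in ["expert", "mixed"]:
        (run_dir / f"{algo}_{flavor}_100000.pth").touch()

# Collect all expert-trained checkpoints from the run.
expert_ckpts = sorted(p.name for p in run_dir.glob("*_expert_*.pth"))
print(expert_ckpts)
```

The same pattern works against a real `saved_models/` tree: swap `root` for the repository path and iterate over `run_*` directories to compare runs.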

Notes

  • All algorithms use the same neural network architecture for fair comparison
  • Training is performed offline using pre-collected datasets
  • Models are saved automatically after training completion
