STSR: Interpretable Symbolic Regression for Spatio-Temporal Traffic Prediction

This repository hosts the source code, datasets, supplementary materials, and the full appendix for the paper:

Interpretable Symbolic Regression for Spatio-Temporal Traffic Prediction
Mingxi Li, Zhengmin Shi, Guoyang Qin, Wei Ma
IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2026.


Overview

STSR is an end-to-end symbolic regression framework that learns explicit, human-readable mathematical expressions directly from spatio-temporal traffic data. It couples discrete optimization (GOMEA) for discovering function structures with continuous optimization (L-BFGS) for tuning free coefficients, and exploits CPU parallelism for scalability. STSR matches or surpasses state-of-the-art deep learning models on multiple real-world traffic-speed and traffic-flow benchmarks while offering full interpretability and substantially lower computational cost.
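To make the continuous half of this coupling concrete, the following is a minimal sketch of tuning the free coefficients of one fixed expression structure with L-BFGS. It is not the repository's implementation: `fit_coefficients` and `linear_structure` are hypothetical names, the structure is written by hand rather than discovered by GOMEA, the data are synthetic, and SciPy's `minimize` with `method="L-BFGS-B"` is assumed as the optimizer.

```python
import numpy as np
from scipy.optimize import minimize


def fit_coefficients(structure, n_coeffs, X, y, seed=0):
    """Tune the free coefficients of a fixed expression by minimizing MSE."""
    rng = np.random.default_rng(seed)

    def loss(c):
        residual = structure(X, c) - y
        return float(np.mean(residual ** 2))

    c0 = rng.normal(size=n_coeffs)  # random starting point
    result = minimize(loss, c0, method="L-BFGS-B")
    return result.x, float(result.fun)


def linear_structure(X, c):
    """Hypothetical structure: c0 * own_speed + c1 * neighbor_speed + c2."""
    return c[0] * X[:, 0] + c[1] * X[:, 1] + c[2]


# Synthetic, normalized readings (illustrative only, not a real dataset).
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = 0.7 * X[:, 0] + 0.2 * X[:, 1] + 0.3

coeffs, mse = fit_coefficients(linear_structure, 3, X, y)
print(np.round(coeffs, 3), mse)
```

In a full symbolic regression loop, a structure search would propose many candidate expressions and a fit like this one would score each of them; only the scoring step is shown here.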

Highlights

  • Interpretable — produces transparent equations rather than black-box weights.
  • Accurate — matches or surpasses state-of-the-art deep models on six benchmarks.
  • Efficient — runs on CPUs with parallelization, without requiring GPU clusters.
  • General — applicable to both traffic-speed and traffic-flow prediction tasks.
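The CPU parallelism mentioned above is natural when each sensor gets its own expression, since the per-sensor fits are independent tasks. The sketch below illustrates that pattern under stated assumptions: the function names echo `train_one_node` and `run_experiment`, but the bodies are placeholders (a one-coefficient least-squares fit), and the standard-library `concurrent.futures` is used here in place of any parallel backend the repository may use.

```python
from concurrent.futures import ThreadPoolExecutor


def train_one_node(node_id, series):
    """Placeholder per-sensor fit: least-squares a in y[t] ~ a * y[t-1]."""
    prev, nxt = series[:-1], series[1:]
    a = sum(p * n for p, n in zip(prev, nxt)) / sum(p * p for p in prev)
    mae = sum(abs(a * p - n) for p, n in zip(prev, nxt)) / len(prev)
    return {"node": node_id, "coef": a, "mae": mae}


def run_experiment(series_by_node, max_workers=4, executor_cls=ThreadPoolExecutor):
    """Fit every sensor independently and collect results in submission order."""
    with executor_cls(max_workers=max_workers) as pool:
        futures = [
            pool.submit(train_one_node, node_id, series)
            for node_id, series in series_by_node.items()
        ]
        return [f.result() for f in futures]


# Toy data: two sensors with four readings each (illustrative only).
toy = {0: [60.0, 60.0, 60.0, 60.0], 1: [10.0, 20.0, 40.0, 80.0]}
results = run_experiment(toy, max_workers=2)
for row in results:
    print(row)
```

Threads keep the sketch dependency-free and easy to run anywhere. For CPU-bound fits written in pure Python, a process-based executor (passed through `executor_cls`) is the more likely choice.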

Repository Structure

STSR/
├── README.md                  # This file
├── LICENSE
├── appendix.pdf               # 📄 Full appendix PDF (auto-built by CI)
├── appendix/                  # Full appendix sources (moved from the main paper)
│   ├── appendix.md            # Markdown version (renders inline on GitHub)
│   ├── appendix.tex           # Original LaTeX source
│   ├── appendix.pdf           # Same PDF as the root copy
│   └── standalone.tex         # Build wrapper used by GitHub Actions
├── code/                      # Source code for the STSR framework
│   ├── README.md              # Code-level usage guide
│   ├── stsr/                  # Core library
│   │   ├── __init__.py        # Public re-exports
│   │   ├── data.py            # Dataset loading + shape_data + datamodule
│   │   ├── model.py           # GPGConfig + build_regressor
│   │   ├── metrics.py         # MAE / RMSE / MAPE / expression complexity
│   │   ├── trainer.py         # train_one_node + run_experiment (joblib parallel)
│   │   └── io.py              # YAML config loader + Excel / JSONL writers
│   ├── scripts/               # CLI entry points
│   │   ├── train.py           # Single-run training
│   │   ├── analyze_results.py # Cross-run aggregation
│   │   ├── smoke_test.py      # One-node end-to-end self-check
│   │   └── run_all.sh         # One-shot driver for all six datasets
│   ├── configs/               # Per-dataset YAML (la, pemsbay, pems03/04/07/08)
│   ├── data/                  # Data layout + download sources (README.md)
│   ├── requirements.txt
│   └── environment.yml        # Conda env (python>=3.9)
└── data/                      # Dataset download/preprocessing scripts
    └── README.md              # Instructions to obtain METR-LA, PEMS-BAY, PEMS03/04/07/08
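The tree lists MAE, RMSE, and MAPE as the error metrics in `metrics.py`. For readers unfamiliar with them, here is a dependency-free sketch of the three definitions. It is not the repository's code: the function signatures are hypothetical, and skipping near-zero ground-truth values in MAPE is an assumption borrowed from common practice on traffic benchmarks.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred, eps=1e-8):
    """Mean absolute percentage error, skipping near-zero targets (assumption)."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if abs(t) > eps]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)


# Toy values (illustrative only); the third target is zero and is skipped by MAPE.
y_true = [50.0, 60.0, 0.0, 40.0]
y_pred = [45.0, 66.0, 3.0, 40.0]

print(mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

MAE and RMSE are in the units of the data (speed or flow), while MAPE is a percentage, which is why it needs special handling when the true value is zero.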

Datasets

Six public real-world traffic datasets are used:

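| Dataset  | # Sensors | # Time Steps | Time Range        |
|----------|-----------|--------------|-------------------|
| METR-LA  | 207       | 34,272       | 03/2012 – 06/2012 |
| PEMS-BAY | 325       | 52,116       | 01/2017 – 05/2017 |
| PEMS03   | 358       | 26,209       | 05/2012 – 07/2012 |
| PEMS04   | 307       | 16,992       | 01/2018 – 02/2018 |
| PEMS07   | 883       | 28,224       | 05/2017 – 08/2017 |
| PEMS08   | 170       | 17,856       | 07/2016 – 08/2016 |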

Download instructions are provided in data/README.md.
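Each dataset is a multivariate time series with one column per sensor, so a supervised sample has to be cut out of it as a window of past readings plus a target some steps ahead. The sketch below shows one way to do that for a single sensor. It is not the repository's `shape_data`: `make_windows` and its parameters are hypothetical, and the 5-minute sampling interval (so that a 60-minute horizon equals 12 steps) is an assumption to verify against data/README.md.

```python
def make_windows(series, n_lags, horizon_steps):
    """Pair each window of n_lags past values with the value horizon_steps ahead."""
    samples = []
    last_start = len(series) - n_lags - horizon_steps
    for start in range(last_start + 1):
        inputs = series[start:start + n_lags]
        target = series[start + n_lags + horizon_steps - 1]
        samples.append((inputs, target))
    return samples


STEP_MINUTES = 5                      # assumed sampling interval
horizon_steps = 60 // STEP_MINUTES    # a 60-minute horizon is 12 steps

# Toy series where each value equals its own index, to make the offsets visible.
series = list(range(30))
windows = make_windows(series, n_lags=12, horizon_steps=horizon_steps)

print(len(windows))   # number of samples
print(windows[0])     # first (inputs, target) pair
```

With these settings a series of T steps yields T - 23 samples per sensor, so the usable sample count is slightly smaller than the raw length of the dataset.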

Installation

git clone https://github.com/MingxiLii/STSR.git
cd STSR
pip install -r code/requirements.txt

Quick Start

# Train STSR on PEMS04
python code/scripts/train.py --dataset PEMS04 --horizon 60

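# Evaluate a trained model
# (evaluate.py is not in the code/scripts/ listing above; check code/README.md if the script is missing)
python code/scripts/evaluate.py --dataset PEMS04 --checkpoint path/to/ckpt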

📄 Appendix

The complete appendix that originally accompanied the manuscript has been moved to this repository to keep the published paper concise. It is available as a single PDF inside the repo (appendix.pdf at the root, with an identical copy at appendix/appendix.pdf) and as part of the latest GitHub Release (see the sidebar on the right for a direct download link).

The appendix contains:

  • A. Detailed dataset descriptions
  • B. Full model parameter settings
  • C. Pseudocode and analysis of the GOMEA structure search
  • D. Pseudocode and analysis of the L-BFGS coefficient optimizer
  • E. Theoretical computational complexity comparison
  • F. Long-term prediction (up to 120 min) results
  • G. Supplementary figures, expressions, and ablation tables

Citation

If you find this work useful, please cite:

@article{li2026stsr,
  title   = {Interpretable Symbolic Regression for Spatio-Temporal Traffic Prediction},
  author  = {Li, Mingxi and Shi, Zhengmin and Qin, Guoyang and Ma, Wei},
  journal = {IEEE Transactions on Intelligent Transportation Systems},
  year    = {2026}
}

Contact

For questions or collaboration, please contact the corresponding authors:

  • Guoyang Qin
  • Wei Ma

License

This project is released under the MIT License. See LICENSE for details.
