This repository hosts the source code, datasets, supplementary materials, and the full appendix for the paper:
"Interpretable Symbolic Regression for Spatio-Temporal Traffic Prediction," Mingxi Li, Zhengmin Shi, Guoyang Qin, Wei Ma. IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2026.
STSR is an end-to-end symbolic regression framework that learns explicit, human-readable mathematical expressions directly from spatio-temporal traffic data. It couples discrete optimization (GOMEA) for discovering function structures with continuous optimization (L-BFGS) for tuning free coefficients, and exploits CPU parallelism for scalability. STSR matches or surpasses state-of-the-art deep learning models on multiple real-world traffic-speed and traffic-flow benchmarks while offering full interpretability and substantially lower computational cost.
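To make the coefficient-tuning half of this coupling concrete, here is a minimal sketch that fits the free coefficients of one fixed expression with L-BFGS. It is an illustration under stated assumptions, not the STSR implementation: the expression form, the synthetic data, and the names `expression` and `mse_loss` are hypothetical, and SciPy's `L-BFGS-B` method (the bounded variant, used here without bounds) stands in for the optimizer. In the framework described above, the structure would come from the GOMEA search rather than being written by hand.

```python
import numpy as np
from scipy.optimize import minimize


def expression(coeffs, x1, x2):
    """Hypothetical fixed structure: y_hat = c0 * x1 + c1 * sin(c2 * x2)."""
    c0, c1, c2 = coeffs
    return c0 * x1 + c1 * np.sin(c2 * x2)


def mse_loss(coeffs, x1, x2, y):
    """Mean squared error of the expression for a given coefficient vector."""
    residual = expression(coeffs, x1, x2) - y
    return float(np.mean(residual ** 2))


# Synthetic, noiseless data generated from known coefficients (illustration only).
rng = np.random.default_rng(0)
x1 = rng.uniform(-1.0, 1.0, size=200)
x2 = rng.uniform(-1.0, 1.0, size=200)
y = expression(np.array([2.0, 0.5, 3.0]), x1, x2)

# Continuous optimization of the free coefficients; the structure stays fixed.
initial = np.array([1.0, 1.0, 2.5])
initial_loss = mse_loss(initial, x1, x2, y)
result = minimize(mse_loss, initial, args=(x1, x2, y), method="L-BFGS-B")

print("loss before:", initial_loss)
print("loss after: ", result.fun)
print("coefficients:", result.x)
```

A structure search would repeat this inner fit for each candidate expression and rank the candidates by their tuned loss, which is why the two optimizers are described as coupled.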
- Interpretable — produces transparent equations rather than black-box weights.
- Accurate — matches or surpasses state-of-the-art deep models on six benchmarks.
- Efficient — runs on CPUs with parallelization, without requiring GPU clusters.
- General — applicable to both traffic-speed and traffic-flow prediction tasks.
STSR/
├── README.md # This file
├── LICENSE
├── appendix.pdf # 📄 Full appendix PDF (auto-built by CI)
├── appendix/ # Full appendix sources (moved from the main paper)
│ ├── appendix.md # Markdown version (renders inline on GitHub)
│ ├── appendix.tex # Original LaTeX source
│ ├── appendix.pdf # Same PDF as the root copy
│ └── standalone.tex # Build wrapper used by GitHub Actions
├── code/ # Source code for the STSR framework
│ ├── README.md # Code-level usage guide
│ ├── stsr/ # Core library
│ │ ├── __init__.py # Public re-exports
│ │ ├── data.py # Dataset loading + shape_data + datamodule
│ │ ├── model.py # GPGConfig + build_regressor
│ │ ├── metrics.py # MAE / RMSE / MAPE / expression complexity
│ │ ├── trainer.py # train_one_node + run_experiment (joblib parallel)
│ │ └── io.py # YAML config loader + Excel / JSONL writers
│ ├── scripts/ # CLI entry points
│ │ ├── train.py # Single-run training
│ │ ├── analyze_results.py # Cross-run aggregation
│ │ ├── smoke_test.py # One-node end-to-end self-check
│ │ └── run_all.sh # One-shot driver for all six datasets
│ ├── configs/ # Per-dataset YAML (la, pemsbay, pems03/04/07/08)
│ ├── data/ # Data layout + download sources (README.md)
│ ├── requirements.txt
│ └── environment.yml # Conda env (python>=3.9)
└── data/ # Dataset download/preprocessing scripts
    └── README.md # Instructions to obtain METR-LA, PEMS-BAY, PEMS03/04/07/08
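The layout above lists `metrics.py` as computing MAE, RMSE, and MAPE. For readers unfamiliar with these error measures, the sketch below gives their standard definitions. It is not the code in `stsr/metrics.py`: the function names, the `eps` parameter, and the choice to skip near-zero targets in MAPE are assumptions made for illustration.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_pred - y_true)))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


def mape(y_true, y_pred, eps=1e-8):
    """Mean absolute percentage error, in percent.

    Near-zero targets are skipped to avoid division by zero
    (an assumption; other masking conventions exist).
    """
    mask = np.abs(y_true) > eps
    ratio = np.abs((y_pred[mask] - y_true[mask]) / y_true[mask])
    return float(np.mean(ratio) * 100.0)


# Toy speed readings (made-up values, for illustration only).
y_true = np.array([60.0, 50.0, 40.0])
y_pred = np.array([57.0, 52.0, 44.0])

print("MAE :", mae(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
```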
Six public real-world traffic datasets are used:
| Dataset | # Sensors | # Time Steps | Time Range |
|---|---|---|---|
| METR-LA | 207 | 34,272 | 03/2012 – 06/2012 |
| PEMS-BAY | 325 | 52,116 | 01/2017 – 05/2017 |
| PEMS03 | 358 | 26,209 | 05/2012 – 07/2012 |
| PEMS04 | 307 | 16,992 | 01/2018 – 02/2018 |
| PEMS07 | 883 | 28,224 | 05/2017 – 08/2017 |
| PEMS08 | 170 | 17,856 | 07/2016 – 08/2016 |
Download instructions are provided in data/README.md.
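Each dataset can be viewed as a matrix of readings with one row per time step and one column per sensor. The sketch below shows one common way to turn such a matrix into supervised samples for a fixed prediction horizon. It is an assumption-laden illustration, not the repository's data pipeline: `make_windows`, `n_lags`, and `n_ahead` are hypothetical names, and the horizon is expressed in steps rather than minutes.

```python
import numpy as np


def make_windows(series, n_lags, n_ahead):
    """Build sliding-window samples from a (time steps, sensors) array.

    Returns X with shape (samples, n_lags, sensors), holding the most recent
    n_lags readings, and y with shape (samples, sensors), holding the reading
    n_ahead steps after the last input step.
    """
    n_samples = series.shape[0] - n_lags - n_ahead + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window")
    X = np.stack([series[i : i + n_lags] for i in range(n_samples)])
    first_target = n_lags + n_ahead - 1
    y = series[first_target : first_target + n_samples]
    return X, y


# Toy matrix: 10 time steps, 2 sensors (made-up values).
series = np.arange(20, dtype=float).reshape(10, 2)
X, y = make_windows(series, n_lags=3, n_ahead=2)

print(X.shape, y.shape)
print("first input window:\n", X[0])
print("first target:", y[0])
```

If readings were sampled every 5 minutes (an assumption not stated here), a 60-minute horizon would correspond to `n_ahead=12`.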
git clone https://github.com/MingxiLii/STSR.git
cd STSR
pip install -r code/requirements.txt

# Train STSR on PEMS04
python code/scripts/train.py --dataset PEMS04 --horizon 60

# Evaluate a trained model
python code/scripts/evaluate.py --dataset PEMS04 --checkpoint path/to/ckpt

The complete appendix originally accompanying the manuscript has been relocated to this repository to keep the published paper concise. It is also available as the latest GitHub Release (see the sidebar on the right for a direct download link) and as a single PDF inside the repo: appendix/appendix.pdf.
The appendix contains:
- A. Detailed dataset descriptions
- B. Full model parameter settings
- C. Pseudocode and analysis of the GOMEA structure search
- D. Pseudocode and analysis of the L-BFGS coefficient optimizer
- E. Theoretical computational complexity comparison
- F. Long-term prediction (up to 120 min) results
- G. Supplementary figures, expressions, and ablation tables
If you find this work useful, please cite:
@article{li2026stsr,
title = {Interpretable Symbolic Regression for Spatio-Temporal Traffic Prediction},
author = {Li, Mingxi and Shi, Zhengmin and Qin, Guoyang and Ma, Wei},
journal = {IEEE Transactions on Intelligent Transportation Systems},
year = {2026}
}

For questions or collaboration, please contact the corresponding authors:
- Guoyang Qin
- Wei Ma
This project is released under the MIT License. See LICENSE for details.