Empowering LLMs for Spatio-Temporal Reasoning in Time Series via Spatial-Aware Reinforcement Learning
- 📊 **Multimodal Benchmark** — ST-Bench for spatio-temporal reasoning in time series
- 🔗 **Time Series Encoder Integration** — LLM with a dedicated time series encoder. Supports: Qwen3-8B, Qwen3-4B-Instruct, Qwen2.5-14B-Instruct
- 🚀 **Full Training Pipeline** — First SFT + RL training pipeline for an LLM with a time series encoder

- [2025/01/06] 🎉 Released full pipeline training code for STReasoner
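The encoder integration can be pictured as a patch-and-project step: the raw series is split into fixed-length patches, and each patch is mapped into the LLM's embedding space as an extra token. The sketch below is a minimal illustration of that idea only, not the repository's actual encoder (which follows ChatTS); `patch_len`, `d_model`, and the random linear projection are all illustrative assumptions.

```python
import numpy as np

def encode_time_series(series, patch_len=16, d_model=64, rng=None):
    """Toy patch-and-project encoder: split a 1-D series into fixed-length
    patches and map each patch to a d_model-dim token embedding."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = len(series) // patch_len * patch_len          # drop any trailing remainder
    patches = np.asarray(series[:n], dtype=float).reshape(-1, patch_len)
    W = rng.standard_normal((patch_len, d_model)) / np.sqrt(patch_len)  # linear projection
    return patches @ W  # (num_patches, d_model) "time series tokens" for the LLM

tokens = encode_time_series(np.sin(np.linspace(0, 10, 128)))
print(tokens.shape)  # (8, 64)
```

In the real system the projection is learned jointly with the LLM during Stage 1 alignment; the toy version uses a fixed random matrix purely to show the shapes involved.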
STReasoner is a framework for spatio-temporal reasoning over time series data, trained through a three-stage pipeline:
| Stage | Method | Description |
|---|---|---|
| 1 | SFT | Supervised fine-tuning for time series alignment |
| 2 | SFT | Supervised fine-tuning for cold-start reasoning |
| 3 | RL | Reinforcement learning with S-GRPO (Spatial-aware Group Relative Policy Optimization) |
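For context on Stage 3: GRPO samples a group of completions per prompt and replaces a learned critic with group-relative advantages; S-GRPO additionally shapes the reward with a spatial-aware term (defined by the training scripts, not reproduced here). A minimal sketch of the vanilla group-relative advantage that S-GRPO builds on:

```python
import numpy as np

def group_relative_advantage(rewards, eps=1e-8):
    """GRPO-style advantage: score each sampled completion relative to the
    mean and std of its own group, so no value network is needed."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# One prompt, four sampled answers with scalar rewards:
adv = group_relative_advantage([1.0, 0.0, 0.5, 0.5])
print(adv.round(3))  # above-mean answers get positive advantage
```

The spatial-aware variant presumably changes how the scalar rewards are computed; the group normalization itself is the standard GRPO step.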
Hardware: 8 × NVIDIA A100-SXM4-80GB (or equivalent)
Software: CUDA 12.8
```bash
conda create --name str python=3.10
conda activate str
pip install -r requirements.txt
```

Pull the Docker image used for RL training (Stage 3):

```bash
docker pull hiyouga/verl:ngc-th2.8.0-cu12.9-vllm0.11.0
```

Download the ST-Bench dataset from 🤗 HuggingFace:

```bash
python download_dataset.py
```

Download the base model:

```bash
python download_model.py --repo_id Qwen/Qwen3-8B
```

Copy the time series model configuration into the base model and initialize it:

```bash
cp -rf base_model/Config-Qwen3-8B/* base_model/Qwen3-8B/
python initial_model.py
```

Run the two SFT stages:

```bash
bash scripts/qwen3-8b/train_stage1.sh    # → STReasoner-8B-Align
bash scripts/qwen3-8b/train_stage1+2.sh  # → STReasoner-8B-CoT
```

📦 SFT Checkpoints: STReasoner-8B-Align · STReasoner-8B-CoT
Launch the Docker container:
```bash
docker run -it --gpus all \
  --name verl_env \
  --shm-size=40g \
  -v .:/workspace/SpatialTemporalReasoning \
  hiyouga/verl:ngc-th2.8.0-cu12.9-vllm0.11.0 bash
```

Inside the container:

```bash
cd STReasoner

# With Spatial-aware GRPO
bash scripts/qwen3-8b/train_stage1+2+3_w_spatial.sh

# Or with vanilla GRPO
bash scripts/qwen3-8b/train_stage1+2+3.sh
```

After RL training, copy the custom modeling file into the actor checkpoint and merge the trained weights:

```bash
cp base_model/Config-Qwen3-8B/modeling_qwen3_ts.py \
  checkpoints/easy_r1/qwen3_8b_grpo_stage1+2+3_w_spatial/global_step_51/actor/huggingface
python model_merger.py \
  --local_dir checkpoints/easy_r1/qwen3_8b_grpo_stage1+2+3_w_spatial/global_step_51/actor/
```

Run inference across all reasoning tasks:
```bash
for task in reasoning_forecasting reasoning_entity reasoning_etiological reasoning_correlation; do
  python inference/inference_tsmllm_vllm.py \
    --task $task \
    --model_path checkpoints/easy_r1/qwen3_8b_grpo_stage1+2+3_w_spatial/global_step_51/actor/huggingface
done
```

Evaluate model performance on each task:
```bash
for task in reasoning_forecasting reasoning_entity reasoning_etiological reasoning_correlation; do
  python evaluation/evaluate.py \
    --task $task \
    --exp_path exp/$task-qwen3_8b_grpo_stage1+2+3_w_spatial
done
```

| Modality | Stage 2 Script | Stage 2+3 Script |
|---|---|---|
| Text | `scripts/qwen3-8b/train_stage2_only_text.sh` | `scripts/qwen3-8b/train_stage2+3_w_spatial_only_text.sh` |
| Image | `scripts/qwen3-vl-8b-instruct/train_stage2_only_image.sh` | `scripts/qwen3-vl-8b-instruct/train_stage2+3_w_spatial_only_image.sh` |
We thank the following projects for their valuable contributions:
- EasyR1 — Reinforcement learning framework for our RL training setup
- Verl — Reinforcement learning framework and environment for Stage 3 training
- ChatTS — Temporal-spatial encoder and HuggingFace/vLLM implementations
- LLaMA-Factory — Supervised fine-tuning framework for SFT stages
- vLLM — Fast model inference engine
If you find this work useful, please consider giving it a ⭐!
