Plankson/BiTrajDiff

BiTrajDiff: Bidirectional Trajectory Generation with Diffusion Models for Offline Reinforcement Learning

Official codebase for the paper BiTrajDiff: Bidirectional Trajectory Generation with Diffusion Models for Offline Reinforcement Learning.

🌍 Overview

TLDR: This paper presents Bi-directional Trajectory Diffusion (BiTrajDiff), a novel data augmentation framework for offline reinforcement learning (RL) that improves trajectory connectivity and dataset diversity through bidirectional diffusion-based trajectory generation. Unlike prior single-direction augmentation methods, BiTrajDiff generates both forward and backward trajectory bridges between disconnected states, enabling more effective recovery of missing trajectory-level transitions in sparse and conservative offline datasets. By leveraging a dual diffusion process, the method synthesizes high-quality intermediate trajectories that better preserve behavioral consistency while enhancing long-horizon compositionality. Extensive experiments demonstrate that BiTrajDiff consistently improves the performance of multiple offline RL algorithms and outperforms existing state-of-the-art data augmentation and trajectory stitching baselines.

⚙️ Getting Started

BiTrajDiff is built on the CleanDiffuser repo. Follow the CleanDiffuser guidelines to install the dependencies for BiTrajDiff.

📦 Usage

1. Train bidirectional diffusion models

BiTrajDiff model training can be reproduced with:

python src/bitrajdiff_pipeline.py task=<env_name> mode=train_diffusion

More detailed hyperparameters are provided in the config directory.

2. Generate data for reinforcement learning

After BiTrajDiff model training has finished, you can use the trained model to generate your own dataset for enhancing offline RL algorithms:

python src/bitrajdiff_pipeline.py task=<env_name> mode=stitch
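
The two stages above must run in order, training first and stitching second. The following is a minimal sketch of a wrapper that chains them for one task. It is not part of the repo: `build_commands` and `run_pipeline` are hypothetical helper names, and only the script path and the `task=` / `mode=` arguments are taken from the commands shown in this README. By default it only prints the commands (a dry run) so you can check them before launching a long training job.

```python
import subprocess
import sys

# Script path and mode names are copied from the commands in this README.
PIPELINE = "src/bitrajdiff_pipeline.py"
MODES = ("train_diffusion", "stitch")  # training must finish before stitching


def build_commands(task, python=sys.executable):
    """Return one argv list per pipeline stage, in the order they should run.

    Hypothetical helper for illustration; not provided by the repo.
    """
    return [[python, PIPELINE, f"task={task}", f"mode={mode}"] for mode in MODES]


def run_pipeline(task, dry_run=True):
    """Print each stage command and execute it only when dry_run is False."""
    for cmd in build_commands(task):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises on a non-zero exit code, so stitching is
            # never attempted after a failed training run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "<env_name>" is the README's placeholder; replace it with your task.
    run_pipeline("<env_name>", dry_run=True)
```

Run it from the repository root so the relative script path resolves, and pass `dry_run=False` once the printed commands look right.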

3. Offline RL training

As illustrated in the experiment section of the original paper, we directly use the JAX-CORL repo without modification to run and evaluate downstream offline RL.

🙏 Acknowledgement

📄 Citation

If you find this work useful for your research, please cite our paper:

@article{qing2025bitrajdiff,
  title={Bitrajdiff: Bidirectional trajectory generation with diffusion models for offline reinforcement learning},
  author={Qing, Yunpeng and Chi, Yixiao and Chen, Shuo and Liu, Shunyu and Yao, Kelu and Lin, Sixu and Liu, Litao and Zou, Changqing},
  journal={arXiv preprint arXiv:2506.05762},
  year={2025}
}

✉️ Contact

Please feel free to contact me by email at qingyunpeng@zju.edu.cn if you are interested in our research :)

About

[ICML 2026] BiTrajDiff: Bidirectional Trajectory Generation with Diffusion Models for Offline Reinforcement Learning