Offline Bisimulation

Official PyTorch implementation of "Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning".

Requirements

The exact requirements depend on the baseline method. For example, to apply the bisimulation methods to TD3BC, follow the requirements of the official TD3BC repository.

Usage

Pretrain with the baseline method SimSR:

    python main.py --env hopper-medium-expert-v2 --obj simsr --slope 0.5 --reward_norm False --reward_scale False

Pretrain with the baseline method MiCO:

    python main.py --env hopper-medium-expert-v2 --obj mico --slope 0.5 --reward_norm False --reward_scale False

Pretrain with our method:

    python main.py --env hopper-medium-expert-v2 --obj simsr --slope 0.7 --reward_norm True --reward_scale True
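The three commands in this section differ only in --obj, --slope, and the two reward flags. When running several configurations, it can help to build the command lines from one table. The following is a minimal sketch, assuming main.py accepts the flags exactly as written in this section; PRESETS, its preset names, and build_command are hypothetical and not part of the repository.

```python
import shlex

# Flag values copied from the Usage section of this README.
# The preset names ("simsr_baseline", ...) are made up for this sketch.
PRESETS = {
    "simsr_baseline": {"obj": "simsr", "slope": 0.5, "reward_norm": False, "reward_scale": False},
    "mico_baseline": {"obj": "mico", "slope": 0.5, "reward_norm": False, "reward_scale": False},
    "ours": {"obj": "simsr", "slope": 0.7, "reward_norm": True, "reward_scale": True},
}


def build_command(preset, env="hopper-medium-expert-v2", python="python"):
    """Return the argv list for one pretraining run (hypothetical helper)."""
    argv = [python, "main.py", "--env", env]
    # Dicts keep insertion order, so flags appear in the same order as in the README.
    for flag, value in PRESETS[preset].items():
        argv += [f"--{flag}", str(value)]
    return argv


if __name__ == "__main__":
    # Only print the commands; nothing is launched here.
    for name in PRESETS:
        print(f"{name}: {shlex.join(build_command(name))}")
```

To launch a run instead of printing it, the returned list could be passed to subprocess.run(argv, check=True) from the repository root.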

Citation

If you find this open-source release useful, please cite it in your paper:

@inproceedings{zang2023understanding,
  title={Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning},
  author={Hongyu Zang and Xin Li and Leiji Zhang and Yang Liu and Baigui Sun and Riashat Islam and Remi Tachet des Combes and Romain Laroche},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=sQyRQjun46}
}
