
Provable Benefits of Unsupervised Data Sharing in Offline RL

This is a JAX implementation of PDS on D4RL (Datasets for Deep Data-Driven Reinforcement Learning). The corresponding paper is The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning (ICLR 2023).

Quick Start

For experiments on D4RL, our code is implemented on top of IQL (Implicit Q-Learning). For example:

$ python3 train_data_sharing.py --env_name=walker2d-expert-v2 --source_name=walker2d-random-v2 --config=configs/mujoco_config.py --data_share=learn  --target_split=0.05  --source_split=0.1
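With --data_share=learn, the unlabeled source data (e.g. walker2d-random-v2) is relabeled with a pessimistic learned reward before offline training on the combined dataset. Below is a minimal sketch of that relabeling step, assuming an ensemble-based lower-confidence-bound estimate; the names (reward_apply_fn, params_list, kappa) are illustrative and not this repository's actual API.

    import jax.numpy as jnp

    def pessimistic_reward(params_list, reward_apply_fn, observations, actions, kappa=1.0):
        """Ensemble mean minus kappa * std: a lower-confidence-bound reward label.

        Hypothetical sketch: reward_apply_fn(params, obs, act) -> per-transition
        reward predictions for one ensemble member; params_list holds the
        parameters of each member trained on the labeled target data.
        """
        # Stack predictions from each ensemble member: shape (ensemble_size, batch)
        preds = jnp.stack([
            reward_apply_fn(p, observations, actions) for p in params_list
        ])
        mean = preds.mean(axis=0)
        std = preds.std(axis=0)
        # Penalize uncertain predictions so shared source data cannot inflate returns.
        return mean - kappa * std

The relabeled source transitions can then be mixed with the labeled target data and passed to the IQL trainer unchanged.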

Citing

If you find this open-source release useful, please cite it in your paper:

@article{hu2023provable,
  title={The provable benefits of unsupervised data sharing for offline reinforcement learning},
  author={Hu, Hao and Yang, Yiqin and Zhao, Qianchuan and Zhang, Chongjie},
  journal={arXiv preprint arXiv:2302.13493},
  year={2023}
}

