SRPO

[NeurIPS 2023] The official code for the paper "State Regularized Policy Optimization on Data with Dynamics Shift".

Installation

Follow the installation steps in OfflineRL.

Prepare Offline Dataset

Download the dataset files from Google Drive, then change the path parameter on line 15 of examples/train_d4rl.py to point to them.

Run the SRPO algorithm

python examples/train_d4rl.py --algo_name=maple_st --exp_name=maple_st --seed 1 --task density_10,body_mass@walker2d-medium-expert-v0 --rew_reg_eta 0.1 --out_train_epoch 200 --device cuda:1

walker2d-medium-expert-v0 can be replaced with other offline RL environments. To run baseline algorithms, replace maple_st with maple, mopo, cql, etc.
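
When comparing SRPO against baselines over several seeds, it can help to generate the command lines rather than edit them by hand. The following is a minimal sketch, not part of this repository: build_command is a hypothetical helper, and it only reuses the flags that appear in the command above. It assumes that --exp_name should mirror --algo_name, that the baselines accept the same flags as maple_st, and that the part of --task before the @ (called shift_spec here, a made-up name) can be passed through unchanged.

```python
import shlex


def build_command(algo_name, env_name, seed,
                  shift_spec="density_10,body_mass",
                  rew_reg_eta=0.1, out_train_epoch=200, device="cuda:1"):
    """Return the argument list for one run of examples/train_d4rl.py.

    Hypothetical helper: the flag names are copied from the README command,
    and everything else (defaults, exp_name == algo_name) is an assumption.
    """
    return [
        "python", "examples/train_d4rl.py",
        f"--algo_name={algo_name}",
        f"--exp_name={algo_name}",           # assumption: experiment named after the algorithm
        "--seed", str(seed),
        "--task", f"{shift_spec}@{env_name}",  # same "<prefix>@<environment>" layout as above
        "--rew_reg_eta", str(rew_reg_eta),
        "--out_train_epoch", str(out_train_epoch),
        "--device", device,
    ]


if __name__ == "__main__":
    # Print one command per (algorithm, seed) pair; nothing is executed here.
    for algo in ("maple_st", "maple", "mopo", "cql"):
        for seed in (1, 2, 3):
            cmd = build_command(algo, "walker2d-medium-expert-v0", seed)
            print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument list can be passed to subprocess.run to launch the run directly. Whether --rew_reg_eta has any effect for the baseline algorithms is not stated here, so check examples/train_d4rl.py before relying on it.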

Citation

If you find our code repository or paper useful, please cite with:

@article{xue2023state,
  title={State Regularized Policy Optimization on Data with Dynamics Shift},
  author={Xue, Zhenghai and Cai, Qingpeng and Liu, Shuchang and Zheng, Dong and Jiang, Peng and Gai, Kun and An, Bo},
  journal={arXiv preprint arXiv:2306.03552},
  year={2023}
}
