Official implementation of the NeurIPS 2023 paper *Constrained Policy Optimization with Explicit Behavior Density for Offline Reinforcement Learning* (CPED).
- Install MuJoCo version 2.1.0
- Install D4RL
- Install [rlkit](https://github.com/vitchyr/rlkit)
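A minimal installation sketch is shown below, assuming a Linux machine with the MuJoCo 2.1.0 binaries unpacked to `~/.mujoco/mujoco210`; the exact package sources and versions may differ from the ones used in this repository.

```bash
# Download and unpack the MuJoCo 2.1.0 binaries (Linux example)
mkdir -p ~/.mujoco
wget https://github.com/deepmind/mujoco/releases/download/2.1.0/mujoco210-linux-x86_64.tar.gz
tar -xzf mujoco210-linux-x86_64.tar.gz -C ~/.mujoco
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco210/bin

# Install D4RL from its GitHub repository
pip install git+https://github.com/Farama-Foundation/D4RL.git

# Install rlkit in editable mode
git clone https://github.com/vitchyr/rlkit.git
cd rlkit && pip install -e . && cd ..
```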
To run CPED, example commands are:
```bash
python examples/flow_hdf5_d4rl.py --env='halfcheetah-medium-v2' --policy_lr=1e-4 --model_type='nice' --flowgan_train_epoch=500
python examples/flow_hdf5_d4rl.py --env='antmaze-umaze-v2' --policy_lr=3e-4 --model_type='nice' --flowgan_train_epoch=300
```
For the antmaze-large tasks, the number of FlowGAN training epochs should be increased to 400 or 500 by setting flowgan_train_epoch=400 or flowgan_train_epoch=500, as in the sketch below.
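For instance, a run on an antmaze-large task might look like the following; the environment name `antmaze-large-play-v2` is an assumption for illustration and should be replaced with the D4RL task you intend to train on.

```bash
# Hypothetical antmaze-large run; raise flowgan_train_epoch to 400-500 for these tasks
python examples/flow_hdf5_d4rl.py --env='antmaze-large-play-v2' --policy_lr=3e-4 --model_type='nice' --flowgan_train_epoch=500
```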
If you find this code useful for your research, please cite our paper as:
```bibtex
@article{zhang2024constrained,
  title={Constrained Policy Optimization with Explicit Behavior Density For Offline Reinforcement Learning},
  author={Zhang, Jing and Zhang, Chi and Wang, Wenjia and Jing, Bingyi},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}
```
This codebase is built on the official implementation of BEAR (https://github.com/rail-berkeley/d4rl_evaluations/tree/master/bear), rlkit (https://github.com/vitchyr/rlkit/), and Flow GAN (https://github.com/ermongroup/flow-gan).