Clipped Action Policy Gradient

This repository contains an implementation of Clipped Action Policy Gradient (CAPG, https://arxiv.org/abs/1802.07564) combined with PPO and TRPO.
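CAPG builds on the observation that actions sampled from a Gaussian policy are clipped to the environment's bounded action space, so the distribution that actually reaches the environment is a clipped Gaussian: it keeps the Gaussian density in the interior of the bounds but concentrates probability mass at the bounds themselves. Differentiating the log-density of this clipped distribution gives the clipped action policy gradient, which the paper shows is unbiased with variance no larger than the ordinary estimator. The sketch below illustrates the clipped-Gaussian log-density in plain NumPy/SciPy; it is only a sketch of the idea behind clipped_gaussian.py, and the function name is illustrative rather than this repository's API.

import numpy as np
from scipy.stats import norm

def clipped_gaussian_log_prob(action, mean, std, low, high):
    # Log-density of clip(N(mean, std^2), low, high) evaluated at `action`.
    # Clipping piles up probability mass at the bounds:
    #   P(action = low)  = Phi((low - mean) / std)
    #   P(action = high) = 1 - Phi((high - mean) / std)
    # while interior actions keep the ordinary Gaussian density.
    log_p = norm.logpdf(action, loc=mean, scale=std)
    log_p = np.where(action <= low,
                     norm.logcdf((low - mean) / std), log_p)
    log_p = np.where(action >= high,
                     norm.logsf((high - mean) / std), log_p)
    return log_p

# Example: an action pinned at the upper bound of [-1, 1]
print(clipped_gaussian_log_prob(np.array([1.0]), mean=0.5, std=1.0, low=-1.0, high=1.0))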

Dependencies

  • Chainer v4.1.0
  • ChainerRL latest master
  • OpenAI Gym v0.9.4 with MuJoCo envs

Use requirements.txt to install dependencies.

pip install -r requirements.txt
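For reference only, a requirements.txt consistent with the versions listed above might look like the lines below; the ChainerRL Git URL and the Gym MuJoCo extra are assumptions, and the repository's actual file is authoritative.

chainer==4.1.0
git+https://github.com/chainer/chainerrl.git
gym[mujoco]==0.9.4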

How to run

# Run PPO with PG and CAPG for 1M steps
python train_ppo_gym.py --env Humanoid-v1
python train_ppo_gym.py --env Humanoid-v1 --use-clipped-gaussian

# Run TRPO with PG and CAPG for 10M steps
python train_trpo_gym.py --env Humanoid-v1 --steps 10000000
python train_trpo_gym.py --env Humanoid-v1 --steps 10000000 --use-clipped-gaussian

The figure below shows the average returns of training episodes for TRPO with PG and with CAPG, both trained for 10M timesteps on Humanoid-v1. See the paper for more results.

BibTeX entry

@inproceedings{Fujita2018Clipped,
  author = {Fujita, Yasuhiro and Maeda, Shin-ichi},
  booktitle = {ICML},
  title = {{Clipped Action Policy Gradient}},
  year = {2018}
}

License

MIT License.
