PyTorch implementation of reinforcement learning algorithms

This repository contains:

Policy gradient methods (TRPO, PPO, A2C)
Generative Adversarial Imitation Learning (GAIL)

Important notes

The code currently works only with PyTorch 0.3.x.
To run MuJoCo environments, first install mujoco-py and my modified version of gym, which supports MuJoCo 1.50.
If you have a GPU, I recommend setting OMP_NUM_THREADS to 1. PyTorch creates additional threads when performing computations, which can hurt the performance of multiprocessing. The problem is most serious on Linux, where multiprocessing can be even slower than a single thread:

export OMP_NUM_THREADS=1
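
If exporting the variable in the shell is inconvenient, the same limit can usually be applied from inside a script. The snippet below is a minimal sketch, not code from this repository. It assumes the variable is set before PyTorch (or any other OpenMP-backed library) is first imported, since the OpenMP runtime typically reads it only once, at initialization.

```python
import os

# Set the limit before the first `import torch`; a value that was already
# exported in the shell is left untouched by setdefault().
os.environ.setdefault("OMP_NUM_THREADS", "1")

# Report the value that worker processes will inherit.
effective_threads = os.environ["OMP_NUM_THREADS"]
print("OMP_NUM_THREADS =", effective_threads)
```

Child processes started afterwards inherit the environment, so sampling workers see the same setting.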

Features

Supports CUDA (10x faster than the CPU implementation).
Supports discrete and continuous action spaces.
Supports multiprocessing, so the agent can collect samples in multiple environments simultaneously (8x faster than a single thread).
Fast Fisher vector product calculation. Ankur kindly wrote a blog post explaining the implementation details.
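
To illustrate why a Fisher vector product matters for TRPO, here is a toy sketch in plain Python. It is not the implementation used in this repository (see the blog post for that). It assumes a single categorical policy parameterized by its logits, where the Fisher matrix has the closed form diag(p) - p p^T, so the product F v can be computed without ever building F. The damping value and the gradient vector are made up for illustration. Conjugate gradient then solves (F + damping * I) x = g using only such products.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fisher_vector_product(probs, v, damping=0.1):
    """Return (F + damping * I) v for a categorical policy over its logits.

    With F = diag(p) - p p^T, F v = p * v - p * (p . v), which costs O(n)
    and never forms the n x n matrix.
    """
    pv = sum(p * x for p, x in zip(probs, v))
    return [p * x - p * pv + damping * x for p, x in zip(probs, v)]


def conjugate_gradient(matvec, b, iters=50, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A, given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, with x starting at zero
    p = list(b)  # current search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(iters):
        if rs_old < tol:
            break
        ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


probs = softmax([0.5, -1.0, 2.0])
gradient = [0.3, -0.2, 0.1]  # made-up policy gradient, for illustration only
step_dir = conjugate_gradient(
    lambda v: fisher_vector_product(probs, v), gradient
)
```

The damping term keeps the system positive definite; without it, this particular F is singular because adding a constant to all logits does not change the policy.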

Policy gradient methods

Example

python examples/ppo_gym.py --env-name Hopper-v1

Reference

Generative Adversarial Imitation Learning (GAIL)

To save trajectory

python gail/save_expert_traj.py --model-path assets/expert_traj/Hopper-v1_ppo.p

To do imitation learning

python gail/gail_gym.py --env-name Hopper-v1 --expert-traj-path assets/expert_traj/Hopper-v1_expert_traj.p
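
For readers new to GAIL, the core idea is that a discriminator learns to tell expert state-action pairs from the policy's, and its output replaces the environment reward. The sketch below is a toy illustration in plain Python, not this repository's code. The logistic discriminator, its weights, the feature vectors, and the label convention (output near 1 means "generated by the policy") are all assumptions made for the example.

```python
import math


def discriminator(state_action, weights, bias):
    """Hypothetical logistic discriminator: P(pair came from the policy)."""
    z = sum(w * x for w, x in zip(weights, state_action)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def surrogate_reward(d_prob, eps=1e-8):
    """Reward is high when the pair looks expert-like to the discriminator.

    Assumed convention: d_prob near 1 means "policy", near 0 means "expert",
    so the reward is -log(d_prob).
    """
    return -math.log(max(d_prob, eps))


weights, bias = [1.5, -0.5, 2.0], 0.0  # made-up parameters
expert_like = [-1.0, 0.5, -1.0]        # made-up (state, action) features
policy_like = [1.0, -0.5, 1.0]

r_expert = surrogate_reward(discriminator(expert_like, weights, bias))
r_policy = surrogate_reward(discriminator(policy_like, weights, bias))
print(r_expert, r_policy)
```

Under this convention the policy is pushed toward state-action pairs that the discriminator cannot distinguish from the expert's; other implementations flip the labels and use -log(1 - D) instead.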


BillMatrix/PyTorch-RL
