pytorch-acer

A PyTorch implementation of Sample Efficient Actor-Critic with Experience Replay (ACER).

The paper proposes an off-policy actor-critic algorithm with experience replay (ACER) that improves the sample efficiency of actor-critic methods.
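At the core of the algorithm is a Retrace-style action-value target computed from replayed trajectories with truncated importance weights. Below is a minimal sketch of that recursion for a single rollout with discrete actions; the function name, argument layout, and defaults are illustrative assumptions, not this repository's actual API.

```python
import torch

def q_retrace(rewards, dones, q_taken, values, rho_taken, bootstrap_value, gamma=0.99):
    """Illustrative Q_ret recursion from the ACER paper (single rollout of length T).

    rho_taken is the importance weight pi(a|x)/mu(a|x) of the action actually
    taken under the behaviour policy mu; values is V(x) under the current policy.
    """
    q_ret = bootstrap_value
    targets = torch.zeros_like(rewards)
    for t in reversed(range(rewards.shape[0])):
        q_ret = rewards[t] + gamma * (1.0 - dones[t]) * q_ret
        targets[t] = q_ret
        # Truncating the importance weight keeps the off-policy correction bounded.
        rho_bar = torch.clamp(rho_taken[t], max=1.0)
        q_ret = rho_bar * (q_ret - q_taken[t]) + values[t]
    return targets
```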

This repository is based on pytorch-a2c-ppo-acktr and baselines/acer, and much of the code is borrowed from them. If there are any license or attribution problems, please contact me.

A few notes:

  • The original paper uses an A3C-like asynchronous update, but this implementation adopts the A2C-like batched update used in OpenAI's ACER implementation.

  • This implementation does not support continuous action spaces; it only works with Atari environments, which have discrete action spaces.

  • torch.autograd.grad and torch.autograd.backward are used for the TRPO (trust region) update; see the sketch after this list.

  • Current RAM requirements are:

    • --num-processes 4 : 7.8GB of RAM required
    • --num-processes 8 : 13.3GB of RAM required
    • --num-processes 16: 24.3GB of RAM required
    • --num-processes 32: 46.5GB of RAM required
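For the trust region step, a gradient g of the policy objective and a gradient k of the KL term are both taken with respect to the policy's output statistics (not the parameters), g is rescaled along k when the constraint is violated, and the adjusted gradient is then pushed back into the parameters. The following is a minimal sketch of that pattern; the helper name, the delta value, and the tensor shapes are assumptions for illustration, not this repository's exact code.

```python
import torch

def trust_region_backward(policy_loss, kl_avg_to_current, phi, delta=1.0):
    """Illustrative ACER-style trust region update.

    phi: the policy head's output statistics (e.g. action probabilities),
         shape [batch, num_actions], still attached to the graph.
    policy_loss: scalar loss to minimize.
    kl_avg_to_current: scalar KL(average policy || current policy).
    """
    # g: ascent direction of the policy objective w.r.t. phi
    g = -torch.autograd.grad(policy_loss, phi, retain_graph=True)[0]
    # k: gradient of the KL term w.r.t. phi
    k = torch.autograd.grad(kl_avg_to_current, phi, retain_graph=True)[0]

    # If the step would move too far from the average policy, scale g back along k.
    k_dot_g = (k * g).sum(dim=-1, keepdim=True)
    k_norm_sq = (k * k).sum(dim=-1, keepdim=True)
    scale = torch.clamp((k_dot_g - delta) / (k_norm_sq + 1e-10), min=0.0)
    z = g - scale * k

    # Push the adjusted gradient back through the network parameters
    # (negated, because optimizers minimize).
    torch.autograd.backward(phi, grad_tensors=-z)
```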

Requirements

  • PyTorch 0.4.1
  • numpy
  • gym
  • baselines
  • Python 3.6

You can install gym and baselines for Atari games as follows:

git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .
pip install 'gym[atari]'

Examples

python main.py --env 'PongNoFrameskip-v4' --num-processes 4 --recurrent-policy

Training curve on PongNoFrameskip-v4

python main.py --env 'BreakoutNoFrameskip-v4' --num-processes 16

Training curve on BreakoutNoFrameskip-v4

Acknowledgements

  • pytorch-a2c-ppo-acktr
  • baselines/acer
