
A3C-LSTM algorithm tested on the CartPole OpenAI Gym environment


bekerov/A3C-LSTM

 
 


Implementation of the Asynchronous Advantage Actor-Critic algorithm using Long Short-Term Memory networks (A3C-LSTM)

Modified from the work of Arthur Juliani: "Simple Reinforcement Learning with Tensorflow Part 8: Asynchronous Actor-Critic Agents (A3C)".

Original paper: "Asynchronous Methods for Deep Reinforcement Learning", Mnih et al., 2016.

Tested on CartPole
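
The README does not reproduce the model, but for orientation, the LSTM actor-critic head and the A3C loss from Mnih et al. can be sketched against the TensorFlow 1.x API that Juliani's tutorial uses. Every name, layer size, and coefficient below is an illustrative assumption, not this repository's actual code:

```python
import tensorflow as tf  # TensorFlow 1.x API

n_actions = 2  # CartPole has two discrete actions (push left / push right)
obs_dim = 4    # cart position, cart velocity, pole angle, pole angular velocity

state_in = tf.placeholder(tf.float32, [None, obs_dim])
actions = tf.placeholder(tf.int32, [None])       # actions actually taken
target_v = tf.placeholder(tf.float32, [None])    # discounted returns R
advantages = tf.placeholder(tf.float32, [None])  # A = R - V(s)

# Recurrent core: treat the whole rollout as one sequence through an LSTM
lstm = tf.nn.rnn_cell.BasicLSTMCell(64)
rnn_in = tf.expand_dims(state_in, 0)  # (batch=1, time, features)
rnn_out, _ = tf.nn.dynamic_rnn(lstm, rnn_in, dtype=tf.float32)
rnn_out = tf.reshape(rnn_out, [-1, 64])

# Actor (policy) and critic (value) heads share the LSTM features
policy = tf.layers.dense(rnn_out, n_actions, activation=tf.nn.softmax)
value = tf.reshape(tf.layers.dense(rnn_out, 1), [-1])

# Standard A3C objective (Mnih et al., 2016): value regression,
# advantage-weighted policy gradient, and an entropy bonus
responsible = tf.reduce_sum(policy * tf.one_hot(actions, n_actions), axis=1)
value_loss = 0.5 * tf.reduce_sum(tf.square(target_v - value))
policy_loss = -tf.reduce_sum(tf.log(responsible + 1e-8) * advantages)
entropy = -tf.reduce_sum(policy * tf.log(policy + 1e-8))
loss = 0.5 * value_loss + policy_loss - 0.01 * entropy
```

In A3C, each worker thread computes this loss on its own rollout and asynchronously applies the resulting gradients to a shared set of global network parameters.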

Requirements

OpenAI Gym and TensorFlow.

Usage

Training only happens on minibatches of more than 30 samples, which effectively prevents poorly performing episodes from influencing training (in CartPole, an episode's length equals its total reward, so short rollouts come from poor episodes). A reward factor is used to scale rewards, allowing effective training at faster learning rates. A sketch of this gating logic is shown below.
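
A minimal sketch of that gating-and-scaling logic, assuming hypothetical names (`MIN_BATCH`, `REWARD_FACTOR`, `train_fn`) and a made-up scale value; the repository's actual constants and structure may differ:

```python
MIN_BATCH = 30        # rollouts at or below this length are discarded
REWARD_FACTOR = 0.01  # hypothetical scale; shrinks returns so a larger
                      # learning rate stays stable

def train_on_episode(episode_buffer, train_fn):
    """Gate training on episode length.

    In CartPole an episode's length equals its total reward, so
    skipping short rollouts keeps poor episodes out of the gradient.
    """
    if len(episode_buffer) <= MIN_BATCH:
        return False  # short episode: contributes no gradients
    # Scale rewards before computing returns and advantages
    scaled = [(s, a, r * REWARD_FACTOR) for (s, a, r) in episode_buffer]
    train_fn(scaled)
    return True
```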

Models are saved every 100 episodes. A saved model can be reloaded for further training, or visualised for testing, by setting the corresponding global parameter to True.
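
As a sketch of how such save/reload flags typically interact with a TensorFlow 1.x `Saver` (the flag names, path, and checkpoint cadence below are illustrative assumptions, not necessarily this repo's):

```python
import tensorflow as tf  # TensorFlow 1.x API

load_model = False      # set True to resume training from the last checkpoint
test_model = False      # set True to visualise a saved model instead of training
model_path = './model'  # hypothetical checkpoint directory

# The graph needs at least one variable before a Saver can be built
episodes = tf.Variable(0, dtype=tf.int32, trainable=False)
saver = tf.train.Saver(max_to_keep=5)

with tf.Session() as sess:
    if load_model or test_model:
        # Restore the most recent checkpoint from model_path
        ckpt = tf.train.get_checkpoint_state(model_path)
        saver.restore(sess, ckpt.model_checkpoint_path)
    else:
        sess.run(tf.global_variables_initializer())

    # Inside the training loop, checkpoint periodically, e.g.:
    # if episode_count % 100 == 0:
    #     saver.save(sess, model_path + '/model-%d.ckpt' % episode_count)
```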

This is just example code for testing an A3C-LSTM implementation; it should not be considered the optimal way to learn this environment!
