A3C Continuous Reinforcement Learning

TensorFlow implementation of the asynchronous advantage actor-critic (A3C) reinforcement learning algorithm (paper) for continuous action spaces. The code is mostly based on the work of Morvan Zhou (github).

Components

ACNet: This class contains the actor-critic neural network, which estimates an action for a given state (actor) and a value for each state (critic). For continuous action spaces, the action is given as a distribution with expected value mu and variance sigma.
Worker: The A3C algorithm employs multiple workers, each with its own environment and its own ACNet, which train asynchronously. Every few steps, each worker pushes its update to the global ACNet and syncs its weights with it.
Main: The main function creates the global ACNet and multiple workers. The workers train until a defined number of training episodes is reached. The reward is then plotted over all steps.
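
The interplay of the three components can be sketched in plain Python without TensorFlow. This is only an illustration of the pattern, not the code in a3c.py: `GlobalNet`, `push_and_pull`, `sample_action`, `worker` and `train` are hypothetical names, the "gradient" is that of a toy quadratic objective standing in for the actor and critic losses, and sigma is treated as a standard deviation here for simplicity.

```python
import random
import threading


class GlobalNet:
    """Hypothetical stand-in for the global ACNet: shared weights plus a lock."""

    def __init__(self, n_weights):
        self.weights = [0.0] * n_weights
        self.lock = threading.Lock()

    def snapshot(self):
        # Return a copy of the shared weights for a worker's local network.
        with self.lock:
            return list(self.weights)

    def push_and_pull(self, grads, lr):
        # Push: apply a worker's accumulated gradients to the shared weights.
        # Pull: hand back a fresh copy for the worker's local network.
        with self.lock:
            for i, g in enumerate(grads):
                self.weights[i] -= lr * g
            return list(self.weights)


def sample_action(mu, sigma, low, high, rng):
    """Draw a continuous action around mu and clip it to the action bounds.

    Assumption: sigma is used as a standard deviation in this sketch.
    """
    return max(low, min(high, rng.gauss(mu, sigma)))


def worker(global_net, target, n_steps, update_every, lr, seed):
    """One asynchronous worker with its own local weights and random stream."""
    rng = random.Random(seed)
    local = global_net.snapshot()
    grads = [0.0] * len(local)
    for step in range(1, n_steps + 1):
        # Toy gradient of sum((w - t)^2) plus noise; a real worker would
        # compute policy and value gradients from its environment rollout.
        for i, (w, t) in enumerate(zip(local, target)):
            grads[i] += 2.0 * (w - t) + rng.gauss(0.0, 0.01)
        if step % update_every == 0:
            local = global_net.push_and_pull(grads, lr)
            grads = [0.0] * len(local)


def train(target, n_workers=4, n_steps=500, update_every=5, lr=0.01):
    """Main: create the global net, start the workers, wait for them to finish."""
    global_net = GlobalNet(len(target))
    threads = [
        threading.Thread(
            target=worker,
            args=(global_net, target, n_steps, update_every, lr, seed),
        )
        for seed in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return global_net.snapshot()


if __name__ == "__main__":
    print("final global weights:", train(target=[1.0, -2.0]))
    # Action bounds of -2 and 2 are an assumption chosen for illustration.
    print("sampled action:", sample_action(0.0, 1.0, -2.0, 2.0, random.Random(0)))
```

The lock around the push and pull keeps each update to the shared weights consistent, while the workers otherwise run independently and may act on slightly stale local copies, which is the trade-off that asynchronous training accepts.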

Results

Pendulum environment before training:

After 1500 episodes:


mendezVKI/A3C-Continuous
