Skip to content

Implementations of some of the most well known Deep Reinforcement Learning algorithms

Notifications You must be signed in to change notification settings

lorenzomancini1/DeepRL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 

Repository files navigation

DeepRL

Implementations of some of the most well known Deep Reinforcement Learning algorithms, namely:

  • Advantage Actor Critic (A2C), which make us of the Generalized Advantage Estimation procedure;
  • Proximal Policy Optimization (PPO);
  • Deep Q-Network;
  • Coming soon

To run a specific algorithm with default options, run:

python3 src/<algorithm name>.py 

Some command-line options supported:

  • --episodes, sets the number of episodes for the training;
  • --max_steps or -s, sets the maximum length of a trajectory;
  • --gamma or -g, sets the discount factor.

For example, to execute A2C with 500 episodes and 150-steps trajectories:

python3 src/a2c.py --episodes 500 --max_steps 150 

Advantage Actor Critic (A2C)

To test it with the OpenAI gym CartPole-v1 environment run:

python3 src/a2c.py -vv

Proximal Policy Optimization (PPO)

Specific options:

  • --epsilon, to set the clip parameter;
  • --epochs, number of updates of the Actor and Critic networks per episode;
  • --mini_batch_size, number of samples to be used for the updates;

About

Implementations of some of the most well known Deep Reinforcement Learning algorithms

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages