High-quality implementations of Deep Reinforcement Learning algorithms written in PyTorch

DeepRL-Tutorials

The intent of these IPython Notebooks is mostly to help me practice and understand the papers I read; thus, I will opt for readability over efficiency in some cases. Each implementation is uploaded first, followed by markdown explaining each portion of the code. I'll assign credit for any borrowed code in the Acknowledgements section of this README.

Relevant Papers:

  1. Human Level Control Through Deep Reinforcement Learning [Publication][code]
  2. Multi-Step Learning (from Reinforcement Learning: An Introduction, Chapter 7) [Publication][code]
  3. Deep Reinforcement Learning with Double Q-learning [Publication][code]
  4. Dueling Network Architectures for Deep Reinforcement Learning [Publication][code]
  5. Noisy Networks for Exploration [Publication][code]
  6. Prioritized Experience Replay [Publication][code]
  7. A Distributional Perspective on Reinforcement Learning [Publication][code]
  8. Rainbow: Combining Improvements in Deep Reinforcement Learning [Publication][code]
  9. Distributional Reinforcement Learning with Quantile Regression [Publication][code]
  10. Rainbow with Quantile Regression [code]
  11. Deep Recurrent Q-Learning for Partially Observable MDPs [Publication][code]
  12. Advantage Actor Critic (A2C) [Publication1][Publication2][code]
  13. High-Dimensional Continuous Control Using Generalized Advantage Estimation [Publication][code]
  14. Proximal Policy Optimization Algorithms [Publication][code]
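The first two entries above introduce the value targets that the later notebooks build on. A minimal sketch of the one-step DQN target and the multi-step return follows; the function and variable names here are illustrative and not taken from the notebooks, which implement these ideas in PyTorch.

```python
import numpy as np

def td_target(reward, gamma, next_q_values, done):
    """One-step DQN target: r + gamma * max_a Q(s', a),
    with the bootstrap zeroed out at terminal states."""
    return reward + gamma * (1.0 - done) * np.max(next_q_values)

def n_step_return(rewards, gamma, bootstrap_value):
    """Multi-step target (entry 2): the discounted sum of n rewards
    plus gamma**n times a bootstrapped value of the state n steps ahead.
    Accumulates backwards so each reward picks up the right discount."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For example, with gamma = 0.9, three rewards of 1, and a bootstrap value of 10, `n_step_return` gives 1 + 0.9 + 0.81 + 0.729 * 10.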

Requirements:

  • Python 3.6
  • NumPy
  • Gym
  • PyTorch 0.4.0
  • Matplotlib
  • OpenCV
  • Baselines

Acknowledgements:

  • Credit to @baselines for the environment wrappers and inspiration for the prioritized replay code used only in the development code
  • Credit to @higgsfield for the plotting code, epsilon annealing code, and inspiration for the prioritized replay implementation in the IPython notebook
  • Credit to @Kaixhin for factorized Noisy Linear Layer implementation and the projection_distribution function found in Categorical-DQN.ipynb
  • Credit to @ikostrikov for A2C, GAE, PPO and visdom plotting code implementation reference