
End of Asynchronous Methods

@ShangtongZhang released this on 04 Apr, 21:05

I found that the current Atari wrapper I use is not fully compatible with the one in OpenAI baselines, resulting in degraded performance for most games (except Pong). So I plan to do a major update to fix this issue. (To be more specific, OpenAI baselines tracks the return of the original episode, which usually has more than one life; however, I track the return of an episode that only has a single life.)
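For reference, the difference comes down to how lives are handled at episode boundaries. Below is a minimal sketch of the baselines-style behaviour (modelled on its `EpisodicLifeEnv` wrapper and assuming the old 4-tuple `gym` step API of that era): a lost life marks the training episode as done, but the underlying game is only reset on a real game over, so the logged return still covers the full multi-life episode.

```python
import gym


class EpisodicLifeEnv(gym.Wrapper):
    """Sketch of baselines-style life handling: end the training episode on a
    lost life, but only reset the game on a true game over, so the episode
    return reported by the monitor spans all lives."""

    def __init__(self, env):
        super(EpisodicLifeEnv, self).__init__(env)
        self.lives = 0
        self.was_real_done = True

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.was_real_done = done
        lives = self.env.unwrapped.ale.lives()
        if 0 < lives < self.lives:
            # A life was lost: signal the end of a training episode,
            # but do not reset the underlying game.
            done = True
        self.lives = lives
        return obs, reward, done, info

    def reset(self, **kwargs):
        if self.was_real_done:
            obs = self.env.reset(**kwargs)
        else:
            # Continue from the current state after a lost life
            # by taking a no-op step instead of resetting.
            obs, _, _, _ = self.env.step(0)
        self.lives = self.env.unwrapped.ale.lives()
        return obs
```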

Moreover, asynchronous methods are getting deprecated nowadays, so I will remove them and switch to A2C-style algorithms in the next version.

I made this tag in case someone still wants the old implementations.

To be more specific, the following algorithms are implemented in this release:

  • Deep Q-Learning (DQN)
  • Double DQN
  • Dueling DQN
  • (Async) Advantage Actor Critic (A3C / A2C)
  • Async One-Step Q-Learning
  • Async One-Step Sarsa
  • Async N-Step Q-Learning
  • Continuous A3C
  • Distributed Deep Deterministic Policy Gradient (Distributed DDPG, aka D3PG)
  • Parallelized Proximal Policy Optimization (P3O, similar to DPPO)
  • Action Conditional Video Prediction
  • Categorical DQN (C51, Distributional DQN with KL Distance)
  • Quantile Regression DQN (Distributional DQN with Wasserstein Distance)
  • N-Step DQN (similar to A2C)

Most of them are compatible with both Python 2 and Python 3; however, almost all of the asynchronous methods only work in Python 2.