


MIT License

Rainbow: Combining Improvements in Deep Reinforcement Learning [1].

Results and pretrained models can be found in the releases.

  • DQN [2]
  • Double DQN [3]
  • Prioritised Experience Replay [4]
  • Dueling Network Architecture [5]
  • Multi-step Returns [6]
  • Distributional RL [7]
  • Noisy Nets [8]

Run the original Rainbow with the default arguments:


Data-efficient Rainbow [9] can be run using the following options (note that the "unbounded" memory is implemented here in practice by manually setting the memory capacity to be the same as the maximum number of timesteps):

python --target-update 2000 \
               --T-max 100000 \
               --learn-start 1600 \
               --memory-capacity 100000 \
               --replay-frequency 1 \
               --multi-step 20 \
               --architecture data-efficient \
               --hidden-size 256 \
               --learning-rate 0.0001 \
               --evaluation-interval 10000
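
The "unbounded" memory note above can be illustrated with a minimal sketch of a fixed-capacity circular buffer. The `RingBuffer` class and the `count_overwrites` helper are hypothetical names introduced here for illustration and are not taken from this repository's replay memory. The sketch only shows why setting the capacity equal to the maximum number of timesteps means no transition is ever overwritten.

```python
class RingBuffer:
    """Hypothetical fixed-capacity buffer that overwrites the oldest entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = [None] * capacity
        self.index = 0        # next write position
        self.overwritten = 0  # number of stored entries replaced so far

    def append(self, item):
        # A non-empty slot at the write position means an old entry is evicted.
        if self.data[self.index] is not None:
            self.overwritten += 1
        self.data[self.index] = item
        self.index = (self.index + 1) % self.capacity


def count_overwrites(capacity, t_max):
    """Store one (hypothetical) transition per timestep and count evictions."""
    buffer = RingBuffer(capacity)
    for t in range(t_max):
        buffer.append(("transition", t))
    return buffer.overwritten


if __name__ == "__main__":
    # Capacity equal to the number of timesteps: nothing is evicted.
    print(count_overwrites(capacity=1000, t_max=1000))
    # Smaller capacity: older transitions are replaced.
    print(count_overwrites(capacity=500, t_max=1000))
```

With the capacity matched to the run length, the buffer behaves as if it were unbounded for that run, which is the effect the options above rely on.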

Note that pretrained models from the 1.3 release used a (slightly) incorrect network architecture. To use these, change the padding in the first convolutional layer from 0 to 1 (DeepMind uses "valid" (no) padding).
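
To see what the padding change does to the first layer's output, the standard convolution output-size formula can be evaluated for both settings. This is a sketch under assumptions: the 84x84 input, 8x8 kernel and stride of 4 are the values commonly used for the first DQN convolutional layer and are not stated in this README, and `conv_output_size` is a hypothetical helper.

```python
def conv_output_size(input_size, kernel_size, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


if __name__ == "__main__":
    # Assumed first-layer settings (not taken from this README).
    input_size, kernel_size, stride = 84, 8, 4
    for padding in (0, 1):
        size = conv_output_size(input_size, kernel_size, stride, padding)
        print(f"padding={padding}: output {size}x{size}")
```

Under these assumed values, both paddings give the same spatial size. That would suggest the weights load with either setting and only the computed features differ, so the padding should match the one the model was trained with.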


To install all dependencies with Anaconda, run conda env create -f environment.yml, then activate the environment with source activate rainbow.

Available Atari games can be found in the atari-py ROMs folder.



[1] Rainbow: Combining Improvements in Deep Reinforcement Learning
[2] Playing Atari with Deep Reinforcement Learning
[3] Deep Reinforcement Learning with Double Q-learning
[4] Prioritized Experience Replay
[5] Dueling Network Architectures for Deep Reinforcement Learning
[6] Reinforcement Learning: An Introduction
[7] A Distributional Perspective on Reinforcement Learning
[8] Noisy Networks for Exploration
[9] When to Use Parametric Models in Reinforcement Learning?