This folder contains an implementation of the DQN algorithm (Mnih et al., 2013; Mnih et al., 2015), with extra bells and whistles similar to Rainbow DQN (Hessel et al., 2017):
- Q-learning with neural network function approximation. The loss is given by the Huber loss applied to the temporal difference error.
- Target Q' network updated periodically (Mnih et al., 2015).
- N-step bootstrapping (Sutton & Barto, 2018).
- Double Q-learning (van Hasselt et al., 2015).
- Prioritized experience replay (Schaul et al., 2015).
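The first four items above combine into a single loss computation. As a minimal numpy sketch (not the implementation in this folder; function and argument names are illustrative): the n-step double Q-learning target selects the greedy action with the online network, evaluates it with the target network, and the Huber loss is applied to the resulting temporal difference error.

```python
import numpy as np

def huber(x, delta=1.0):
    # Quadratic for |x| <= delta, linear beyond; bounds the gradient magnitude.
    abs_x = np.abs(x)
    quadratic = np.minimum(abs_x, delta)
    linear = abs_x - quadratic
    return 0.5 * quadratic ** 2 + delta * linear

def nstep_return(rewards, gamma):
    # Discounted sum of an n-step reward sequence: sum_k gamma^k * r_{t+k}.
    discounts = gamma ** np.arange(len(rewards))
    return float(np.sum(discounts * np.asarray(rewards)))

def double_q_nstep_loss(q_tm1, a_tm1, r_t, discount_t, q_t_online, q_t_target):
    # q_tm1: online Q-values at the start of the n-step window, shape [num_actions].
    # r_t: discounted n-step return; discount_t: gamma**n (0 if the episode ended).
    # Double Q-learning: select argmax with the online net, evaluate with the target net.
    a_star = np.argmax(q_t_online)
    target = r_t + discount_t * q_t_target[a_star]
    td_error = target - q_tm1[a_tm1]
    return huber(td_error)
```

Decoupling action selection from evaluation is what removes the overestimation bias of vanilla Q-learning's max operator.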
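Prioritized experience replay samples transitions with probability proportional to their priority (typically |TD error|^alpha) and corrects the induced bias with importance-sampling weights. A minimal sketch of the proportional variant, assuming an O(N) buffer rather than the sum-tree used for efficiency in practice (class and method names are illustrative):

```python
import numpy as np

class ProportionalReplay:
    # Minimal proportional prioritized replay sketch (no sum-tree; O(N) sampling).
    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity, self.alpha, self.beta, self.eps = capacity, alpha, beta, eps
        self.storage, self.priorities = [], []

    def add(self, transition):
        # New transitions get the current max priority so they are sampled at least once.
        p = max(self.priorities, default=1.0)
        if len(self.storage) >= self.capacity:
            self.storage.pop(0)
            self.priorities.pop(0)
        self.storage.append(transition)
        self.priorities.append(p)

    def sample(self, batch_size, rng):
        # P(i) proportional to priority_i ** alpha.
        probs = np.asarray(self.priorities) ** self.alpha
        probs /= probs.sum()
        idx = rng.choice(len(self.storage), size=batch_size, p=probs)
        # Importance weights correct the non-uniform sampling; normalized to max 1.
        w = (len(self.storage) * probs[idx]) ** (-self.beta)
        w /= w.max()
        return idx, [self.storage[i] for i in idx], w

    def update_priorities(self, idx, td_errors):
        # After a learning step, refresh priorities with the new |TD errors|.
        for i, e in zip(idx, td_errors):
            self.priorities[i] = abs(e) + self.eps
```

The weights returned by `sample` multiply each transition's loss before the gradient step; `beta` is usually annealed towards 1 over training so the correction becomes exact late in learning.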