This is a JAX implementation of the paper: Hessel, M., Modayil, J., Van Hasselt, H., Schaul, T., Ostrovski, G., Dabney, W., Horgan, D., Piot, B., Azar, M. and Silver, D., 2018, April. Rainbow: Combining improvements in deep reinforcement learning. In Thirty-Second AAAI Conference on Artificial Intelligence.
Please feel free to open an issue to ask questions or to flag flaws and mistakes in the implementation.
If you find this useful, I would be grateful if you'd star ⭐ it :)
Roadmap:
- DQN (Deep Q-Network)
- DDQN (Double Deep Q-Network)
- Prioritized DDQN (Prioritized experience replay)
- Dueling DDQN
- Multi-step learning
- A3C (Asynchronous Advantage Actor Critic)
- Distributional DQN
- Noisy DQN
- Multi-