DQN for the Arcade Learning Environment (ALE)
An implementation of DQN for the Arcade Learning Environment that adheres to the latest methodologies for doing RL on the ALE. This code was used in our paper "Generalization and Regularization in DQN".
You can find dozens of implementations of DQN online... Why another?
- When iterating on DQN, or any deep reinforcement learning algorithm, the evaluation phase is often tightly coupled with the algorithm itself. This can make it hard to jump in and modify what you need
- Plenty of implementations haven't been tested on large scale experiments
- Some implementations have fragmented evaluation protocols (e.g., episode termination, summarizing performance, stochasticity)
Most importantly, this implementation adheres to the evaluation methodology outlined by Machado et al. in "Revising the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents".
This code was also used to generate results for the paper "Generalization and Regularization in DQN". Note that this repository has since been cleaned up to account for deprecations in TensorFlow. More information on reproducing the results from our paper can be found at JesseFarebro/dqn-generalization.
If you don't need something that's easy to hack on, I would highly recommend the Dopamine framework.
Key Changes in Methodology
- Use sticky actions by default (p = 0.25)
- Don't use loss of life as a terminal signal
- Evaluate performance over the last n episodes (usually n = 100), with no distinct evaluation scheme during training
- Do not use the max over the learning curve to report performance
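The two mechanical changes above can be sketched as follows. This is an illustrative sketch, not this repository's actual code: `StickyActionEnv` and the Gym-style `reset`/`step` interface are assumptions for the example.

```python
import random
from collections import deque


class StickyActionEnv:
    """Illustrative sticky-actions wrapper (Machado et al., 2018).

    With probability p the environment repeats the previously executed
    action instead of the one the agent just selected. Assumes a
    Gym-style env with reset()/step(); this is a hypothetical wrapper,
    not this repo's API.
    """

    def __init__(self, env, p=0.25):
        self.env = env
        self.p = p
        self.prev_action = 0  # no-op before the first step

    def reset(self):
        self.prev_action = 0
        return self.env.reset()

    def step(self, action):
        if random.random() < self.p:
            action = self.prev_action  # sticky: repeat the old action
        self.prev_action = action
        return self.env.step(action)


# Performance is summarized as the mean return over the last n = 100
# training episodes -- there is no separate evaluation phase.
episode_returns = deque(maxlen=100)
```

Because the last-100-episode average is computed online during training, there is no need to periodically freeze the agent for a distinct evaluation run, and no temptation to report the max over the learning curve.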
I use Poetry to manage dependencies and virtual environments. Your life will be easier if you download and install Poetry first.
1. Clone the repository with submodules.
2. Build the Arcade Learning Environment in the submodule. Instructions can be found in the repo.
3. Run `poetry install` to install dependencies.
4. Run `poetry shell` to launch a shell where you can run `main.py` and explore the hyperparameters.
Note: if you are using a GPU you'll want to run `poetry install -E tensorflow-gpu` in step 3 to switch to TensorFlow with GPU support.
There are many hyperparameters in deep RL, and they can be a little daunting. We adhere to most of the hyperparameters outlined in Machado et al.
More in-depth documentation for each hyperparameter can be found in docs/config.md.
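To give a feel for the settings involved, here is a sketch of commonly used DQN hyperparameters. The values below are the widely cited defaults from Mnih et al. (2015) and the sticky-action probability from Machado et al. (2018); they are illustrative only, and `epsilon_at` is a hypothetical helper. The authoritative values for this repository are in docs/config.md.

```python
# Illustrative defaults only -- see docs/config.md for the values
# actually used by this repository.
DQN_HYPERPARAMETERS = {
    "sticky_action_prob": 0.25,      # p for sticky actions (Machado et al.)
    "discount_factor": 0.99,         # gamma
    "replay_capacity": 1_000_000,    # transitions kept in replay memory
    "batch_size": 32,                # minibatch size per gradient step
    "target_update_period": 10_000,  # steps between target-network syncs
    "initial_epsilon": 1.0,          # epsilon-greedy exploration schedule
    "final_epsilon": 0.1,
    "epsilon_decay_steps": 1_000_000,
    "learning_rate": 0.00025,        # RMSProp learning rate (Mnih et al.)
}


def epsilon_at(step, hp=DQN_HYPERPARAMETERS):
    """Linearly annealed exploration rate at a given training step."""
    frac = min(step / hp["epsilon_decay_steps"], 1.0)
    return hp["initial_epsilon"] + frac * (
        hp["final_epsilon"] - hp["initial_epsilon"]
    )
```

Linear annealing of the exploration rate is the standard choice in ALE work: the agent explores uniformly at first and settles to a small residual epsilon once the schedule completes.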