This repository has been archived by the owner on Oct 7, 2024. It is now read-only.
I noticed you changed the optimizer and some hyper-parameters in DQN compared to those in the "Nature" paper. On my side, I can't reproduce the results with either of the two settings. Could you share a learning curve for "Breakout"? I have been struggling with hyper-parameter optimization for two months. Thanks.
This agent is intended to be a simple instantiation of the DQN algorithm (Q-learning + non-linear function approximation + experience replay), and isn't intended to reproduce the Nature Atari results. There are numerous subtleties related to interfacing with Atari (frame stacking, reward clipping, etc.) that can be tricky to get right. For agents that are set up to run on Atari, see Dopamine (github.com/google/dopamine) or OpenAI baselines (github.com/openai/baselines).
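To make two of the subtleties mentioned above concrete, here is a minimal sketch of reward clipping and frame stacking in plain Python. It is not the code from this repository, Dopamine, or baselines; the names `clip_reward` and `FrameStack` are hypothetical, and the default of 4 stacked frames is an assumption based on the usual Atari setup.

```python
from collections import deque


def clip_reward(reward):
    """Clip a raw game reward to -1, 0, or +1 according to its sign."""
    return (reward > 0) - (reward < 0)


class FrameStack:
    """Keep the most recent `num_frames` observations as one stacked state.

    Hypothetical helper for illustration; frames can be any object
    (for example, preprocessed screen images).
    """

    def __init__(self, num_frames=4):  # 4 is an assumed default
        self.frames = deque(maxlen=num_frames)

    def reset(self, first_frame):
        # Fill the stack with the first frame so the state has a
        # fixed length from the very first step of an episode.
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return tuple(self.frames)

    def step(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return tuple(self.frames)
```

In an agent loop, the stacked tuple returned by `reset` or `step` would be what gets fed to the Q-network, and `clip_reward` would be applied to each reward before it is stored in the replay buffer. Getting either step slightly wrong (for example, not resetting the stack between episodes) is one way results can drift from published numbers.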