In this notebook we solve the Pong environment using a version of a :doc:`DQN </examples/stubs/dqn>` agent, trained with a :class:`PrioritizedReplayBuffer <coax.experience_replay.PrioritizedReplayBuffer>` instead of the standard :class:`SimpleReplayBuffer <coax.experience_replay.SimpleReplayBuffer>`.
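To make the buffer swap concrete, here is a minimal sketch of proportional prioritized experience replay (Schaul et al., 2016) in plain NumPy. This is an illustration of the underlying idea only, not coax's actual (tree-based) implementation; the class name, constructor arguments, and methods below are hypothetical:

```python
import numpy as np


class ProportionalPER:
    """Illustrative proportional prioritized replay buffer (not coax's API)."""

    def __init__(self, capacity, alpha=0.6, beta=0.4):
        self.capacity = capacity
        self.alpha = alpha  # how strongly priorities skew the sampling distribution
        self.beta = beta    # strength of the importance-sampling correction
        self.storage = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # new transitions get the current max priority so they are sampled at least once
        max_prio = self.priorities.max() if self.storage else 1.0
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, rng=np.random):
        prios = self.priorities[:len(self.storage)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idx = rng.choice(len(self.storage), batch_size, p=probs)
        # importance-sampling weights correct for the non-uniform sampling
        weights = (len(self.storage) * probs[idx]) ** (-self.beta)
        weights /= weights.max()  # normalize so the largest weight is 1
        return idx, [self.storage[i] for i in idx], weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        # priority is |TD error|, plus eps so no transition gets probability zero
        self.priorities[idx] = np.abs(td_errors) + eps
```

In a DQN training loop, the agent would call ``update_priorities`` after each gradient step with the freshly computed TD errors, so that surprising transitions are replayed more often, while the returned ``weights`` rescale the loss to keep the updates unbiased.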
The notebook periodically generates GIFs so that we can monitor training progress. After a few hundred episodes, this is what you can expect:
dqn_per.py