This was the first open-source implementation of DeepMind's DQN paper. In addition, a crowd-based reward signal was collected, which you can use to train your model. It is available here:
All reinforcement learning is done in Python. In addition, solver.cpp was modified to support online observation of training data, with
Solver<Dtype>::Solve split into
OnlineForward to set the input of the memory data layer, determine the q-loss in
examples/dqn, then optionally backprop depending on whether we are training or just acting.
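
The forward / q-loss / optional-backprop flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration in plain Python and not the code in examples/dqn or solver.cpp: `TinyQNet`, `online_step`, the linear Q-function, and the learning rate and discount values are all hypothetical names and choices made for this example.

```python
# Illustrative sketch only: TinyQNet and online_step are hypothetical names,
# not part of this repository or of Caffe.


class TinyQNet:
    """A linear Q-function: Q(s, a) = sum_i w[a][i] * s[i]."""

    def __init__(self, n_features, n_actions, lr=0.1):
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.lr = lr  # assumed learning rate

    def forward(self, state):
        # One Q-value per action for the given state.
        return [sum(wi * si for wi, si in zip(row, state)) for row in self.w]


def online_step(net, state, action, reward, next_state, terminal,
                gamma=0.99, train=True):
    """Forward pass, Q-loss, and an optional gradient step."""
    # 1. Forward: feed the observed state in and read out the Q-values.
    q_values = net.forward(state)

    # 2. Q-loss: squared error against the one-step TD target.
    target = reward
    if not terminal:
        target += gamma * max(net.forward(next_state))
    td_error = q_values[action] - target
    loss = 0.5 * td_error ** 2

    # 3. Backprop only when training; when just acting, weights stay fixed.
    #    The target is treated as a constant (semi-gradient update).
    if train:
        row = net.w[action]
        for i, s in enumerate(state):
            row[i] -= net.lr * td_error * s

    return q_values, loss
```

Calling `online_step(..., train=False)` returns the Q-values and loss without touching the weights, which corresponds to the "just acting" case. With `train=True` the same call also applies one gradient step.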
To use the crowd-reward data, download from above and set the following in your environment:
The official improved DQN, built on Torch, was updated and released on Feb 25th.