RL4J - Fix for setTarget() (issue #8107) #8250
What changes were proposed in this pull request?
See issue #8107
After reading "Playing Atari with Deep Reinforcement Learning" (http://arxiv.org/abs/1312.5602) again, I still think there is a problem with the current implementation of standard DQN, but I don't think the target-network should be used to compute the Q-values of the current state.
I have also read Section 3, "Understanding Deep Q-Network", of "A Theoretical Analysis of Deep Q-Learning" (https://arxiv.org/pdf/1901.00137.pdf).
First, the target-network should be used to compute the TD target, y = r + γ · max_a' Q_target(s', a').
We can see that the target-network (Q_target) only evaluates the next state s', not the current state.
Second, the target-network is only used as a stable base to compute the loss function, and not "to obtain the labels" as @flyheroljg says. This means that the Q-Network itself should be used to compute Q(s, a) for the current state when building the loss.
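As a sketch of the intended split (class and method names here are illustrative, not RL4J's actual API): the target-network produces the TD target from the next state, while the online Q-network's own estimate for the current state enters the loss.

```java
import java.util.Arrays;

// Hypothetical sketch of the standard-DQN target computation described above;
// names are illustrative, not RL4J's actual API.
public class DqnTargetSketch {

    // TD target y = r + gamma * max_a' Q_target(s', a').
    // qTargetNext holds the TARGET network's outputs for the next state s'.
    static double tdTarget(double reward, boolean isTerminal,
                           double[] qTargetNext, double gamma) {
        if (isTerminal) {
            return reward;                       // no bootstrapping past a terminal state
        }
        double maxNext = Arrays.stream(qTargetNext).max().getAsDouble();
        return reward + gamma * maxNext;         // only the target network sees s'
    }

    // Squared TD error: (y - Q(s, a))^2, where qCurrent is the ONLINE
    // Q-network's estimate for the action actually taken in s.
    static double tdLoss(double qCurrent, double target) {
        double tdError = target - qCurrent;
        return tdError * tdError;
    }

    public static void main(String[] args) {
        double[] qTargetNext = {0.5, 1.0, 0.25}; // Q_target(s', .)
        double y = tdTarget(1.0, false, qTargetNext, 0.5);
        System.out.println(y);                   // 1.0 + 0.5 * 1.0 = 1.5
        System.out.println(tdLoss(2.0, y));      // (1.5 - 2.0)^2 = 0.25
    }
}
```

Keeping the target-network out of the current-state estimate is what makes it a "stable base": its parameters change only on periodic sync, so the regression target does not move with every gradient step.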
The above snippet is used for Double-DQN, but only the TD-target part (the term computed with the target-network) differs from standard DQN.
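For comparison, a hedged sketch of how the Double-DQN TD target differs (again with illustrative names, not RL4J's actual API): the online Q-network selects the argmax action for s', and the target-network evaluates it; everything else is identical to standard DQN.

```java
// Hypothetical sketch of the Double-DQN TD target;
// names are illustrative, not RL4J's actual API.
public class DoubleDqnTargetSketch {

    // Double-DQN: y = r + gamma * Q_target(s', argmax_a' Q_online(s', a')).
    // The ONLINE network picks the action; the TARGET network scores it.
    static double tdTarget(double reward, boolean isTerminal,
                           double[] qOnlineNext, double[] qTargetNext, double gamma) {
        if (isTerminal) {
            return reward;
        }
        int best = 0;
        for (int a = 1; a < qOnlineNext.length; a++) {
            if (qOnlineNext[a] > qOnlineNext[best]) {
                best = a;                          // action chosen by the online net
            }
        }
        return reward + gamma * qTargetNext[best]; // value taken from the target net
    }

    public static void main(String[] args) {
        double[] qOnlineNext = {0.25, 2.0};   // online net prefers action 1
        double[] qTargetNext = {4.0, 0.5};    // target net's value for action 1
        System.out.println(tdTarget(1.0, false, qOnlineNext, qTargetNext, 0.5));
        // 1.0 + 0.5 * 0.5 = 1.25; bootstrapping from the target net's own max
        // (standard DQN) would instead use 4.0
    }
}
```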
How was this patch tested?
With CartPole-v0 (re-written in Java because I couldn't get gym's version to work; I can add it to this PR if requested).