RL4J: Weighted Double Q Learning #5275
From @tom-adsfund on May 12, 2018 3:5
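Double Q-learning (DQL) greatly improved on Q-learning by countering the upward (overestimation) bias that comes from taking a max over noisy value estimates.
But DQL has its own downward (underestimation) bias, which is where Weighted Double Q-learning (WDQL) comes in.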
WDQL has recently been used with success in multi-agent cooperative reinforcement learning: https://arxiv.org/pdf/1802.08534.pdf
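
For reference, here is a rough tabular sketch of how the WDQL target could be computed, as I understand the weighted double estimator: the target mixes the single-estimator value and the double-estimator value with a weight `beta`. The class `WeightedDoubleQ`, its method names, and the constant `c` as a plain parameter are all hypothetical and not part of RL4J; `qU` and `qV` stand for the two estimators' action values at the next state.

```java
// Hypothetical helper, not an RL4J class: a tabular sketch of the WDQL target.
final class WeightedDoubleQ {
    private WeightedDoubleQ() {}

    // Index of the largest value (first one wins on ties).
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) best = i;
        }
        return best;
    }

    // Index of the smallest value (first one wins on ties).
    static int argMin(double[] values) {
        int worst = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] < values[worst]) worst = i;
        }
        return worst;
    }

    // Weight in [0, 1): the greedy and worst actions are picked with qU,
    // and their gap is measured with qV. Assumes c > 0 so that a zero gap
    // gives beta = 0 instead of 0/0.
    static double beta(double[] qU, double[] qV, double c) {
        int aStar = argMax(qU);
        int aLow = argMin(qU);
        double gap = Math.abs(qV[aStar] - qV[aLow]);
        return gap / (c + gap);
    }

    // Target used when updating qU: reward plus the discounted weighted
    // mix of qU and qV, both evaluated at the action that is greedy under qU.
    static double target(double reward, double gamma,
                         double[] qU, double[] qV, double c) {
        int aStar = argMax(qU);
        double b = beta(qU, qV, c);
        return reward + gamma * (b * qU[aStar] + (1.0 - b) * qV[aStar]);
    }
}
```

With `beta` near 1 this reduces to the ordinary Q-learning target, and with `beta` near 0 to the Double Q-learning target, so `c` controls where the estimate sits between the two biases. The roles of `qU` and `qV` would be swapped when updating the other estimator.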
Copied from original issue: deeplearning4j/rl4j#95