Hello! I am working to implement MADDPG in PyTorch based on the details of this TensorFlow implementation. I have followed the implementation to a tee, but when I remove the regularization on the policy logits, my Q values diverge. When I remove the same regularization term in your implementation, this does not occur. Did you experience this divergence issue? Was it a matter of tuning to fix, or does this indicate an issue with my implementation? Thank you.
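For reference, the regularization in question penalizes the squared magnitude of the raw policy logits inside the actor loss. Below is a minimal PyTorch sketch of that term; the network shapes, variable names, and the 1e-3 weight are illustrative assumptions mirroring the style of the original TensorFlow code, not an exact port.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical minimal actor/critic; dimensions are illustrative only.
obs_dim, n_actions, batch = 8, 4, 32
actor = nn.Linear(obs_dim, n_actions)          # outputs raw logits
critic = nn.Linear(obs_dim + n_actions, 1)     # Q(obs, action) -> scalar

obs = torch.randn(batch, obs_dim)
logits = actor(obs)                            # pre-softmax policy logits
action = F.gumbel_softmax(logits, hard=False)  # differentiable action sample

# Policy-gradient term: maximize Q, i.e. minimize -Q.
pg_loss = -critic(torch.cat([obs, action], dim=1)).mean()

# Logit regularization: mean squared logit, scaled by a small coefficient.
# This discourages the logits from growing without bound, which is the
# term whose removal the issue reports causing Q-value divergence.
p_reg = (logits ** 2).mean()
actor_loss = pg_loss + 1e-3 * p_reg
```

Without this penalty the logits can drift toward extreme values, saturating the (softmax) policy and feeding increasingly out-of-distribution actions to the critic, which is one plausible mechanism for the divergence described above.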