Hi
This is really good code for learning Reinforcement Learning.
I have two questions about network.py.
First, I think you meant to assert that len(value_hidden_sizes) != 0 and len(advantage_hidden_sizes) != 0.
Second, about the dueling part: in the code, the layers built from value_hidden_sizes come first, and their output is then passed on to the advantage layers. As I understand the related paper, though, the state value and the advantages are both computed from the same source observation; the two are then added together, and the mean advantage is subtracted.
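To make the dueling question concrete, here is a minimal sketch of the aggregation as I read it from the paper. It is plain Python, not the repository's network.py; the helper names (linear, dueling_q) and the toy weights are made up for illustration.

```python
# Hypothetical sketch of dueling aggregation; not the code in network.py.

def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def dueling_q(features, value_w, value_b, adv_w, adv_b):
    """Both streams read the SAME features; Q = V + (A - mean(A))."""
    value = linear(features, value_w, value_b)[0]   # scalar V(s)
    advantages = linear(features, adv_w, adv_b)     # A(s, a), one per action
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

# Toy example: 2 input features, 3 actions (weights chosen arbitrarily).
features = [1.0, 2.0]
q = dueling_q(
    features,
    value_w=[[0.5, 0.5]], value_b=[0.0],
    adv_w=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], adv_b=[0.0, 0.0, 0.0],
)
print(q)  # [0.5, 1.5, 2.5]
```

In this form the value stream and the advantage stream are parallel branches on the same input, and the mean of the resulting Q-values equals V(s). In the current code, as far as I can tell, the advantage layers instead take the output of the value layers as their input.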
Looking forward to your response.