Hi,
There are two loss terms in the actor agent: the advantage loss and the entropy loss. Can you tell me why you add the entropy loss? I see the entropy weight is decayed from 1 to 0.0001, but I don't understand why the entropy loss is needed.
thank you!
Liu
The entropy loss promotes exploration in RL. A large entropy means the action probability distribution is more spread out, so the agent tries different trajectories (hence more exploration). Decaying the entropy factor during training lets the agent converge its policy (i.e., become more and more certain about its action choices). You can refer to https://arxiv.org/pdf/1602.01783.pdf (Section 4, Asynchronous advantage actor-critic, entropy paragraph) for the principles behind the entropy loss. Hope this helps!
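To illustrate, here is a minimal sketch of how an entropy bonus is typically combined with the advantage (policy-gradient) loss. This is hypothetical PyTorch-style code, not the repository's actual implementation, and the function names and the linear decay schedule are assumptions for illustration only:

```python
# Sketch of an actor loss with an entropy bonus (hypothetical, PyTorch-style).
import torch

def actor_loss(logits, actions, advantages, entropy_weight):
    # Policy distribution over discrete actions from the actor's raw outputs.
    dist = torch.distributions.Categorical(logits=logits)

    # Advantage (policy-gradient) term: increase the log-probability of
    # actions that led to higher-than-expected returns.
    adv_loss = -(dist.log_prob(actions) * advantages).mean()

    # Entropy term: subtracting the entropy from the loss rewards a more
    # spread-out action distribution, which encourages exploration.
    entropy_loss = -dist.entropy().mean()

    return adv_loss + entropy_weight * entropy_loss

# The entropy weight is annealed over training (e.g., from 1 to 1e-4 here),
# so exploration fades and the policy becomes increasingly deterministic.
def entropy_weight_schedule(step, total_steps, start=1.0, end=1e-4):
    frac = min(step / total_steps, 1.0)
    return start + frac * (end - start)
```

With a large entropy weight early on, the entropy term dominates and keeps the policy stochastic; as the weight decays toward 0.0001, the advantage term dominates and the policy sharpens around the best-known actions.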