
a question about loss function #13

Open
CookieYo opened this issue Jun 28, 2020 · 1 comment

@CookieYo

Hi,
There are two parts to the loss in the actor agent: the advantage (adv) loss and the entropy loss. Can you tell me why you add the entropy loss? I know the entropy weight is decayed from 1 to 0.0001, but I do not understand why the entropy loss is needed.

thank you!
Liu

@hongzimao
Owner

Entropy loss is for promoting exploration in RL. A large entropy means the action probability distribution is more spread out, so the agent tries different trajectories (hence more exploration). Decaying the entropy factor during training lets the agent converge its policy (i.e., become more and more certain about its action choices). You can refer to https://arxiv.org/pdf/1602.01783.pdf (Section 4, Asynchronous advantage actor-critic, the entropy paragraph) for more on the principles behind the entropy loss. Hope this helps!
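
The idea can be written as a small loss sketch (this is not the repo's actual code; the function, the variable names, and the annealing schedule below are purely illustrative): the policy entropy is subtracted from the advantage-weighted policy-gradient loss, so minimizing the total loss also pushes the policy toward a more spread-out action distribution, and decaying the entropy weight makes that exploration pressure fade over training.

```python
# Minimal sketch (not this repo's code) of an entropy-regularized actor loss.
# `actor_loss`, `entropy_weight`, etc. are illustrative names, not the repo's API.
import numpy as np

def actor_loss(action_probs, chosen_actions, adv, entropy_weight):
    """Policy-gradient loss with an entropy bonus.

    action_probs:   (batch, num_actions) action probability distributions
    chosen_actions: (batch,) indices of the actions actually taken
    adv:            (batch,) advantage estimates
    entropy_weight: scalar decayed over training (e.g. 1 -> 0.0001)
    """
    eps = 1e-8
    chosen_p = action_probs[np.arange(len(chosen_actions)), chosen_actions]

    # Advantage-weighted policy-gradient term (to be minimized).
    adv_loss = -np.mean(np.log(chosen_p + eps) * adv)

    # Entropy of the action distribution; larger entropy = more spread out.
    entropy = -np.mean(np.sum(action_probs * np.log(action_probs + eps), axis=1))

    # Subtracting the weighted entropy rewards exploratory policies.
    return adv_loss - entropy_weight * entropy

# Example: the entropy weight is annealed as training progresses.
probs = np.array([[0.7, 0.2, 0.1], [0.4, 0.4, 0.2]])
actions = np.array([0, 1])
adv = np.array([1.5, -0.3])
for entropy_weight in (1.0, 0.1, 0.0001):
    print(entropy_weight, actor_loss(probs, actions, adv, entropy_weight))
```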
