Closed
Description
Hey Guys,
I'm really enjoying Brax! I have a question about entropy loss in the PPO code.
I have been logging the entropy loss, and for some training runs it becomes positive, which really confuses me, as it seems to suggest that the distribution's entropy is negative!
I have looked through the Brax code and have not managed to work out an explanation. I wonder if I have missed some constant being added somewhere, or if there is some other obvious explanation. I feel I must be missing something. Has anyone else experienced this?
Many thanks,
Ben
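One possible explanation, sketched under assumptions rather than read from the Brax source: for continuous action distributions the entropy is a differential entropy, and unlike discrete entropy it can be negative. For a 1-D Gaussian it equals 0.5 * log(2 * pi * e * sigma^2), which drops below zero once sigma falls under roughly 0.242. If the entropy loss is computed as a negative coefficient times the mean entropy (the usual PPO convention, assumed here and not confirmed for Brax), a narrow policy would then give a positive loss. The names `gaussian_entropy` and `entropy_cost` below are hypothetical and only illustrate the arithmetic.

```python
import math


def gaussian_entropy(sigma: float) -> float:
    """Differential entropy (in nats) of a 1-D Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


# Standard deviation at which the entropy crosses zero: 1 / sqrt(2 * pi * e).
threshold = 1.0 / math.sqrt(2.0 * math.pi * math.e)

# Hypothetical entropy coefficient; the real value depends on the training config.
entropy_cost = 1e-2

for sigma in (1.0, 0.5, threshold, 0.1):
    entropy = gaussian_entropy(sigma)
    # Assumed convention: loss = -coefficient * entropy, so a negative
    # entropy yields a positive entropy loss.
    entropy_loss = -entropy_cost * entropy
    print(f"sigma={sigma:.3f}  entropy={entropy:+.4f}  entropy_loss={entropy_loss:+.5f}")
```

If this is what is happening, a positive entropy loss would simply mean the policy's standard deviation has shrunk during training, not that a constant is missing. Any squashing applied to the actions (for example a tanh transform) would shift the entropy further, so the exact threshold in a real run may differ.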