Thanks for your great code!
I noticed that in the function get_kl(), you use the policy net to generate mean, log_std and std, then copy these three values and calculate the KL divergence between the originals and the copies. The value of that KL divergence is always zero, since both distributions are identical. Is this a bug or intended behavior?
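To make the question concrete, here is a minimal sketch of the pattern I mean. It is not the repository's code: the helper name gaussian_kl and the numbers are made up for illustration, and it uses plain Python floats instead of the policy net's tensors. It assumes the standard closed-form KL divergence between two diagonal Gaussians.

```python
import math

def gaussian_kl(mean0, log_std0, mean1, log_std1):
    """KL(N(mean0, std0^2) || N(mean1, std1^2)) for diagonal Gaussians,
    summed over action dimensions. Hypothetical helper, for illustration only."""
    total = 0.0
    for m0, ls0, m1, ls1 in zip(mean0, log_std0, mean1, log_std1):
        std0, std1 = math.exp(ls0), math.exp(ls1)
        total += ls1 - ls0 + (std0 ** 2 + (m0 - m1) ** 2) / (2.0 * std1 ** 2) - 0.5
    return total

# Made-up policy outputs for a single state (assumption, not real data).
mean = [0.3, -1.2]
log_std = [-0.5, 0.1]

# Case described above: the second distribution is a copy of the first.
kl_same = gaussian_kl(list(mean), list(log_std), mean, log_std)

# For contrast: shift the second mean slightly, so the two distributions differ.
shifted_mean = [m + 0.1 for m in mean]
kl_shifted = gaussian_kl(mean, log_std, shifted_mean, log_std)

print(kl_same, kl_shifted)
```

With copied parameters the value comes out as zero, and it only becomes positive once the second set of parameters differs from the first. The sketch only looks at the value of the expression, not at its derivatives with respect to the policy parameters.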