Rewards don't change while training #17
Comments
I also ran the SeqGAN code. The rewards in the SeqGAN experiment do change.
Because of the Bootstrapped Rescaled Activation trick.
@LeeJuly30 Could you explain further? I don't understand why the Bootstrapped Rescaled Activation trick causes this behavior. And how can the model even be trained with fixed rewards?
In this paper, if you look at the rescaling formula, you will find that the rescaled reward depends only on the batch size B and the reward's rank within the batch, so the expectation and variance of the rewards within a mini-batch won't change.
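This is easy to check numerically. A minimal sketch of rank-based rescaling, assuming the form sigma(delta * (0.5 - rank/B)) described in the LeakGAN paper (the value of delta and the descending-rank convention are assumptions here, not taken from this repo's code): raw rewards on wildly different scales yield rescaled batches whose values are always the same fixed set, merely permuted, so the batch mean and variance never move.

```python
import numpy as np

def rescale_rewards(rewards, delta=12.0):
    """Rank-based rescaling sketch: sigmoid(delta * (0.5 - rank/B)).

    rank is the reward's descending rank within the batch of size B,
    so the output depends only on B and the ranks, not on the raw scale.
    """
    rewards = np.asarray(rewards, dtype=float)
    B = len(rewards)
    # Descending rank: the largest raw reward gets rank 1.
    order = np.argsort(-rewards)
    ranks = np.empty(B)
    ranks[order] = np.arange(1, B + 1)
    return 1.0 / (1.0 + np.exp(-delta * (0.5 - ranks / B)))

rng = np.random.default_rng(0)
for scale in (1.0, 100.0):            # raw rewards on very different scales
    raw = rng.random(64) * scale
    r = rescale_rewards(raw)
    # Batch mean and std are identical across batches of the same size.
    print(scale, r.mean(), r.std())
```

Individual rescaled rewards still change from batch to batch (each sample's rank moves around); it is only the per-batch mean and variance that are pinned.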
@LeeJuly30 But the results above ran for several mini-batches.
The rewards will change; it is the expectation and variance of the rewards within a mini-batch that won't change.
@LeeJuly30 Thanks for the tips. I find the rewards do change, and the magnitude of the rewards before rescaling changes a lot.
When I ran the code, I tried to print the mean value of the rewards. Strangely, the mean of the rewards didn't change while training. The code snippet I used is here; I just added a print under line 285:
The output is here:
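For context, the diagnostic amounts to printing the batch mean right after the rewards are computed. A self-contained stand-in (the random array below is a placeholder; in the repo the rewards come from the rollout/discriminator, not random data):

```python
import numpy as np

# Placeholder for the rewards array computed during adversarial training.
rewards = np.random.default_rng(0).random((64, 20))

# The added diagnostic: print the batch mean of the rewards.
print('mean reward: %.6f' % np.mean(rewards))
```

If the rewards being printed are the post-rescaling ones, a constant mean is exactly what the rank-based rescaling predicts.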