
Why adding bias to the forget gate? #7

Closed
AzizCode92 opened this issue Oct 15, 2018 · 1 comment

Comments

@AzizCode92

new_c = c*tf.sigmoid(f+self.forget_bias) + tf.sigmoid(i)*g

The code implementation doesn't correspond exactly to the equation in the layer normalization paper.
I also have doubts about normalizing all the gates: for example, the forget gate can never be exactly zero due to the shift we add.
Wouldn't it be more logical to keep the gates as they are and only normalize the cell state?

Thank you
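For reference, here is a sketch of the layer-normalized LSTM equations from Ba et al. (2016), "Layer Normalization", which the question compares the code against. LN is applied to the recurrent and input pre-activations and to the cell state before the output gate; LN(·; α, β) denotes layer normalization with gain α and shift β:

```latex
% Layer-normalized LSTM recurrent step (Ba et al., 2016).
% f_t, i_t, o_t, g_t are the pre-activations of the gates and candidate.
\begin{align}
\begin{pmatrix} f_t \\ i_t \\ o_t \\ g_t \end{pmatrix}
  &= \mathrm{LN}(W_h h_{t-1}; \alpha_1, \beta_1)
   + \mathrm{LN}(W_x x_t; \alpha_2, \beta_2) + b \\
c_t &= \sigma(f_t) \odot c_{t-1} + \sigma(i_t) \odot \tanh(g_t) \\
h_t &= \sigma(o_t) \odot \tanh\!\big(\mathrm{LN}(c_t; \alpha_3, \beta_3)\big)
\end{align}
```

Note that these equations contain no explicit `forget_bias` term; that constant is the subject of the question.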

@AzizCode92
Author

Found the reason why.
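For readers landing here: the comment above does not state the reason, but the likely explanation is a standard trick rather than anything specific to this repository. Initializing the forget gate with a positive bias, commonly `forget_bias = 1.0` as in TensorFlow's `BasicLSTMCell`, keeps the gate close to 1 at the start of training so the cell state and its gradients are not immediately attenuated (Jozefowicz et al., 2015). A minimal sketch in plain Python:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# At initialization the weights are near zero, so the forget gate's
# pre-activation f is roughly 0 and sigmoid(f) = 0.5: half the cell
# state would decay at every timestep, starving long-range gradients.
f = 0.0
forget_bias = 1.0  # the constant questioned in this issue

print(sigmoid(f))                # 0.5   -> cell state halves each step
print(sigmoid(f + forget_bias))  # ~0.73 -> cell state is mostly
                                 #          carried forward early on
```

This also explains why the gate "will never be exactly zero": the bias deliberately shifts it toward remembering, and the network can still learn weights that drive the gate arbitrarily close to zero later in training.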
