Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZoneoutLSTMCell incorrect output #1313

Closed
albertz opened this issue Apr 14, 2023 · 2 comments
Closed

ZoneoutLSTMCell incorrect output #1313

albertz opened this issue Apr 14, 2023 · 2 comments
Labels
potential-new-behavior Discussions about RETURNN behaviour

Comments

@albertz
Copy link
Member

albertz commented Apr 14, 2023

Paper.

We currently have this code:

# Apply vanilla LSTM
output, new_state = self._cell(inputs, state, scope)

(prev_c, prev_h) = state
(new_c, new_h) = new_state

# apply zoneout
c = ...
h = ...

new_state = rnn_cell.LSTMStateTuple(c, h)

return output, new_state

output is the original output, and h the zoneout-transformed output.

But actually it should not return output but it should return h instead. At least this is my understanding of the paper.

Can someone confirm?

So, what do we do now? I think there are many existing setups using this. And just changing this would change the behavior and do sth different then, so it is not really compatible.

We could introduce a new option to switch between the incorrect and correct behavior. This flag default would use the correct behavior with a new behavior version.

@albertz albertz added the potential-new-behavior Discussions about RETURNN behaviour label Apr 14, 2023
@michelwi
Copy link
Collaborator

Can someone confirm?

If the h's are the same as the output's, then I agree that the current behavior is not the same as described in the paper. As described in the paper h should be the output which is a combination of prev_h and new_h according to a random binary mask.

@albertz
Copy link
Member Author

albertz commented Apr 19, 2023

I fixed this now. If you switch to behavior version 17, it will by default change to the new correct behavior. By staying at older behavior versions, nothing will change for you.

If you want to stay at your older behavior version, but want to see the effect of this fix, just explicitly set use_zoneout_output=True in the ZoneoutLSTM flags (unit_opts).

If you want to get the old incorrect behavior in a new behavior version, just explicitly set use_zoneout_output=False.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
potential-new-behavior Discussions about RETURNN behaviour
Projects
None yet
Development

No branches or pull requests

2 participants