Hi, first of all thanks for sharing your great work. It is really helpful.
In the seq2seq model definition part, you say: "The last token input into the decoder is the one before the [EOS] token - the [EOS] token is never input into the decoder."
However, in the code the decoder loops until the length of the target sequence is reached, without checking whether the current output is [EOS]. So even if [EOS] is predicted at some time step, decoding continues until the target length is reached. For example, if the ground truth is ["a", "b", "c", "d", "[EOS]"], the prediction could be ["x", "y", "[EOS]", "z", "t"] or ["x", "y", "[EOS]", "z", "[EOS]"] or ["x", "y", "k", "z", "t"]. For the same reason, the model can't be used for inference as written, because the forward method depends on the target length. Am I missing something?
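To make the question concrete, here is a rough sketch of the kind of inference loop I would have expected: greedy decoding that stops as soon as [EOS] is predicted, with a maximum length as a fallback instead of the target length. This is not the code from the tutorial. `greedy_decode`, `step_fn`, and `scripted_step` are hypothetical names, and `step_fn` stands in for one decoder step that returns the predicted token and the new hidden state.

```python
from typing import Any, Callable, List, Tuple

def greedy_decode(
    step_fn: Callable[[Any, Any], Tuple[Any, Any]],  # hypothetical single decoder step
    init_state: Any,
    sos_token: Any,
    eos_token: Any,
    max_len: int = 50,
) -> List[Any]:
    """Generate tokens one at a time, stopping at EOS or after max_len steps."""
    tokens: List[Any] = []
    prev, state = sos_token, init_state
    for _ in range(max_len):
        next_token, state = step_fn(prev, state)
        if next_token == eos_token:
            break  # stop here, so [EOS] is never fed back into the decoder
        tokens.append(next_token)
        prev = next_token
    return tokens

# Toy stand-in for a trained decoder: it replays the first example prediction
# from above, using the step index as its "hidden state".
def scripted_step(prev: str, state: int) -> Tuple[str, int]:
    outputs = ["x", "y", "[EOS]", "z", "t"]
    return outputs[state], state + 1

print(greedy_decode(scripted_step, 0, "[SOS]", "[EOS]"))  # ['x', 'y']
```

With a loop like this the output length no longer depends on the target, and anything the decoder would have produced after [EOS] ("z", "t") is simply never generated.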