
Notebook 1 <eos> problem. #197

Closed
yusufsali61 opened this issue Mar 23, 2023 · 2 comments

Comments

@yusufsali61

yusufsali61 commented Mar 23, 2023

Hi, first of all, thanks for sharing your great work. It is really helpful.

In the seq2seq model definition part, you say: "The last token input into the decoder is the one before the [EOS] token - the [EOS] token is never input into the decoder."
However, in the code the decoder loops until the length of the target sequence is reached; it never checks whether the current output is [EOS]. So even if we predict [EOS] at some time step, we keep decoding until the target length is reached. For example, if the ground truth is ["a", "b", "c", "d", "[EOS]"], the prediction could be ["x", "y", "[EOS]", "z", "t"], ["x", "y", "[EOS]", "z", "[EOS]"], or ["x", "y", "k", "z", "t"]. For the same reason, the forward method can't be used for inference, because it depends on the target length. Am I missing something?
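For reference, here is a minimal sketch of the kind of training-time forward pass being described. The variable names (`trg`, `teacher_forcing_ratio`, the LSTM-style `hidden, cell` state, and the `decoder.output_dim` attribute) are assumptions based on the notebook's conventions, not verbatim code:

```python
import random
import torch

def seq2seq_forward(encoder, decoder, src, trg, teacher_forcing_ratio=0.5):
    # trg: [trg_len, batch_size]; the number of decoding steps is fixed
    # by the target length, not by what the decoder predicts.
    trg_len, batch_size = trg.shape
    outputs = torch.zeros(trg_len, batch_size, decoder.output_dim)

    hidden, cell = encoder(src)
    input = trg[0, :]  # first decoder input is the <sos> token

    for t in range(1, trg_len):
        output, hidden, cell = decoder(input, hidden, cell)
        outputs[t] = output
        # Teacher forcing: sometimes feed the ground-truth next token,
        # otherwise feed the decoder's own prediction. Either way, the
        # loop never stops early on a predicted [EOS].
        teacher_force = random.random() < teacher_forcing_ratio
        top1 = output.argmax(1)
        input = trg[t] if teacher_force else top1

    return outputs
```

A predicted [EOS] only matters through the loss term at that position; at training time the loop length comes entirely from `trg`.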

@zjysteven

Same question here. Did you figure anything out, by any chance, @yusufsali61?

@NickleDave

Hi @yusufsali61 and @zjysteven, I had the same question.
You might find this implementation from @sooftware helpful:
https://github.com/sooftware/seq2seq/blob/646c3cd70f19c761a657cb7ea7a54be04934a58d/seq2seq/model/decoder.py#L95
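The linked decoder handles inference as a step-by-step loop that can stop on [EOS] instead of running for a fixed target length. A minimal greedy-decoding sketch in the same spirit (the `sos_idx`/`eos_idx` token ids, the decoder call signature, and the `max_len` safety cap are assumptions for illustration, not the linked code verbatim):

```python
import torch

def greedy_decode(decoder, hidden, cell, sos_idx, eos_idx, max_len=50):
    # Inference loop for a batch of 1: unlike the training forward pass,
    # it needs no target length and stops as soon as <eos> is predicted
    # (or when the max_len cap is hit).
    input = torch.tensor([sos_idx])
    tokens = []
    for _ in range(max_len):
        output, hidden, cell = decoder(input, hidden, cell)
        top1 = output.argmax(1)  # greedy: take the most likely token
        if top1.item() == eos_idx:
            break
        tokens.append(top1.item())
        input = top1  # feed the prediction back in as the next input
    return tokens
```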
