Notebook 1 <eos> problem. #197

yusufsali61 · 2023-03-23T13:37:11Z

Hi, first of all thanks for sharing your great work. It is really helpful.

In the seq2seq model definition part, you say: "The last token input into the decoder is the one before the [EOS] token - the [EOS] token is never input into the decoder."
However, in the code the decoder loops until the length of the target sequence is reached and never checks whether the current output is [EOS]. So even if [EOS] is predicted at some time step, decoding continues until the target length is reached. For example, if the ground truth is ["a", "b", "c", "d", "[EOS]"], our prediction could be ["x", "y", "[EOS]", "z", "t"] or ["x", "y", "[EOS]", "z", "[EOS]"] or ["x", "y", "k", "z", "t"]. For the same reason, the forward method can't be used as-is for inference, because it depends on the target length. Am I missing something?
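
To make the example predictions concrete: one common way to deal with tokens emitted after the first [EOS] is to cut the prediction there once decoding has finished. The sketch below is not taken from the notebook; `truncate_at_eos` and the `EOS` constant are hypothetical names, and tokens are plain strings for readability.

```python
from typing import List

EOS = "[EOS]"  # assumed end-of-sequence marker, matching the examples in this issue


def truncate_at_eos(tokens: List[str], eos: str = EOS) -> List[str]:
    """Return the tokens before the first EOS, or a copy of the whole list if none."""
    if eos in tokens:
        return tokens[: tokens.index(eos)]
    return list(tokens)


# The three example predictions from this issue.
print(truncate_at_eos(["x", "y", "[EOS]", "z", "t"]))
print(truncate_at_eos(["x", "y", "[EOS]", "z", "[EOS]"]))
print(truncate_at_eos(["x", "y", "k", "z", "t"]))
```

This only cleans up the output of a fixed-length loop; it does not remove the dependency on the target length.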


zjysteven · 2023-06-15T18:28:08Z

Same question here. Did you figure anything out, by any chance, @yusufsali61?

NickleDave · 2023-12-19T21:28:49Z

Hi @yusufsali61 and @zjysteven, I had the same question.
You might find this implementation from @sooftware helpful:
https://github.com/sooftware/seq2seq/blob/646c3cd70f19c761a657cb7ea7a54be04934a58d/seq2seq/model/decoder.py#L95
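
For reference, here is a minimal sketch of what an inference-time loop usually looks like: decode greedily, feed each prediction back in, and stop at the first EOS or at a length cap. This is not the code from the linked repository or from the notebook; `greedy_decode`, `step_fn`, and `toy_step` are hypothetical names, and `step_fn` stands in for one decoder step that returns the next token id and the updated state.

```python
from typing import Any, Callable, List, Tuple

# Hypothetical signature for one decoder step: (previous token id, state) -> (next token id, new state).
StepFn = Callable[[int, Any], Tuple[int, Any]]


def greedy_decode(step_fn: StepFn, state: Any, sos_id: int, eos_id: int,
                  max_len: int = 50) -> List[int]:
    """Generate token ids until EOS is produced or max_len steps have run."""
    tokens: List[int] = []
    prev = sos_id
    for _ in range(max_len):
        prev, state = step_fn(prev, state)
        if prev == eos_id:
            break  # stop at the first EOS; it is not added to the output
        tokens.append(prev)
    return tokens


def toy_step(prev: int, state: Any) -> Tuple[int, Any]:
    """Stand-in for a real decoder step: always emits the previous id plus one."""
    return prev + 1, state


# With sos_id=0 and eos_id=4 the toy decoder emits 1, 2, 3 and then EOS.
print(greedy_decode(toy_step, state=None, sos_id=0, eos_id=4))
```

The `max_len` cap is needed because nothing guarantees that a model will ever emit EOS.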

bentrevett closed this as completed Jan 20, 2024
