Empty prediction on CNN/DM with beam > 1 #457
Comments
Can you print your logs with -verbose as well?
Hm, I have run into a related problem in the past where one in every ~20 predictions was empty, even with beam size 1. Looking at the other top predictions, everything seems normal. I have not been able to replicate this error consistently yet, but a simple fix is to set the probability of EOS to -1e7 or so for the very first step. Let me know if you make progress figuring this bug out!
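Roughly, that workaround looks like the sketch below (this is not the actual OpenNMT-py beam code; `log_probs` and `EOS_ID` are assumed names):

```python
import torch

EOS_ID = 3  # id of the EOS token; adjust to your vocabulary

def mask_eos_first_step(log_probs: torch.Tensor, step: int) -> torch.Tensor:
    """Make EOS effectively impossible at the very first decoding step,
    so no beam can finish with an empty hypothesis.

    log_probs: (beam_size, vocab_size) log-probabilities for the current step.
    """
    if step == 0:
        log_probs[:, EOS_ID] = -1e7
    return log_probs
```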
Oh okay. Let me try adding that.
I just checked Abisee's work: she is indeed using a `min_length` option and discarding beams that are too short. Adding such an option would bring the implementation in line with her work (see the sketch below).
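The idea of discarding too-short finished beams boils down to something like this (just an illustrative sketch, not her actual code; all names are placeholders):

```python
def collect_finished(finished, hypothesis, step, min_length):
    """Keep a hypothesis that just emitted EOS only if it is long enough;
    shorter candidates are simply discarded so they can never win the beam."""
    if step >= min_length:
        finished.append(hypothesis)
    return finished
```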
@pltrdy Brilliant! |
@mataney Sorry, I was on vacation.
I trained a summarization model using @srush's setup (described here). During translation, using a beam size higher than one results in an empty prediction, while beam = 1 produces redundant output.
## Preprocessing
The problem may come from the dataset, which I processed again. I'm basically replacing `</s>` tokens in the target files with `</t>` (the `*.repeos.txt` files are those with `</t>` replacing `</s>`).
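The exact command is not shown here, but a minimal Python equivalent of that replacement would be (file names are just placeholders):

```python
from pathlib import Path

def replace_eos_tokens(src_path: str, dst_path: str) -> None:
    """Rewrite a target-side file, replacing every </s> token with </t>.
    The output plays the role of the *.repeos.txt files mentioned above."""
    text = Path(src_path).read_text(encoding="utf-8")
    Path(dst_path).write_text(text.replace("</s>", "</t>"), encoding="utf-8")

# e.g. replace_eos_tokens("train.txt.tgt", "train.txt.tgt.repeos.txt")
```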
## Training
## Translation
## Results
- with `beam_size = 1`: redundant, but not empty
- with `beam_size > 1`: empty; each beam produces an EOS (token_id = 3)
- with `n_best > 1`: interestingly, I find good sentences among the n-best candidates that are not THE best, but also some that contain redundancy

(See the sketch after this list for a quick way to measure how often the empty predictions occur.)
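A small helper one could use to quantify the empty predictions in a translate output file (just a sketch; `pred.txt` is a placeholder):

```python
def count_empty_predictions(pred_path: str) -> None:
    """Report how many lines of a prediction file are empty,
    i.e. cases where the beam emitted EOS immediately."""
    with open(pred_path, encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]
    empty = sum(1 for line in lines if not line.strip())
    print(f"{empty}/{len(lines)} predictions are empty")

# e.g. count_empty_predictions("pred.txt")
```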
This is not a trivial problem; I really don't know how this happens.
Any clues are welcome!