should mBART-large-en-ro have decoder_start_token_id by default? #6156
Comments
Hi @sshleifer, I'd like to contribute and help out here if still needed. My thinking is to remove transformers/src/transformers/generation_utils.py Lines 403 to 409 in 6028ed9
to:
I don't think that change will do anything, since decoder_start_token_id = 250020. What I would do is change the 250020 to a bos_token_id (0, I think) or a pad_token_id (1) and see what the BLEU score is.
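The experiment suggested above can be sketched as a small sweep. This is only an illustration, not code from the thread: `sweep_start_tokens`, `best_candidate`, and the `evaluate_bleu` callable are hypothetical names, and the token ids are the ones quoted in this discussion (0 for bos is itself stated as a guess; 2 is the eos id tried in the follow-up).

```python
from typing import Callable, Dict

# Candidate values for decoder_start_token_id, as quoted in this thread.
# bos=0 is a guess from the comment above; 250020 is the current config value.
CANDIDATES: Dict[str, int] = {
    "current": 250020,
    "bos_token_id": 0,
    "pad_token_id": 1,
    "eos_token_id": 2,
}


def sweep_start_tokens(
    evaluate_bleu: Callable[[int], float],
    candidates: Dict[str, int] = CANDIDATES,
) -> Dict[str, float]:
    """Call the (hypothetical) evaluation once per candidate start token."""
    return {name: evaluate_bleu(token_id) for name, token_id in candidates.items()}


def best_candidate(scores: Dict[str, float]) -> str:
    """Return the name of the candidate with the highest score."""
    return max(scores, key=scores.get)
```

Here `evaluate_bleu` would wrap whatever evaluation script you use, passing the candidate id as `decoder_start_token_id` to `generate`; that wiring is left out on purpose.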
Ah yes that makes sense. I tried those two and the eos_token_id and got the following results:
Super interesting, thanks for running that. It seems like I should change decoder_start_token_id in the mbart-large-en-ro config to 2. Do you have opinions on mbart-large-cc25?
No problem! Yes, I think configuring decoder_start_token_id to 2 is a good idea. Unfortunately, I'm getting the same issues you're getting with mbart-large-cc25 (the output is in English, not Romanian, and is missing the first word when I use bos_token_id or 250020, and is gibberish with eos/pad_token_id), and I don't understand why that's the case. I'll investigate and post any useful findings.
I think I fixed this another way in #6526
=> {'bleu': 26.81}
{'bleu': 11.57} (and takes 40 mins!)
In the original fairseq I get 26.83.
Gonna close this since the score is now basically the same as fairseq. Thanks for your help!
Hypothesis: since the argument prepend_bos is set to "False" in fairseq/examples/README.md, mbart-large-en-ro does not need decoder_start_token_id.
TODO:
... decoder_start_token_id. Setting it to None in the config might not be enough.
... generate.
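One way to see why setting decoder_start_token_id to None might not be enough: if the generation code falls back to bos_token_id when no start token is configured, the decoder still begins with a token. The sketch below models that fallback in plain Python. It is an assumption about the behaviour, not the library's actual code, and the function names are hypothetical.

```python
from typing import List, Optional


def resolve_decoder_start(
    decoder_start_token_id: Optional[int],
    bos_token_id: Optional[int],
) -> int:
    """Pick the first decoder token, assuming a fallback to bos_token_id."""
    # Assumed fallback: use bos when no explicit start token is configured.
    if decoder_start_token_id is None:
        decoder_start_token_id = bos_token_id
    if decoder_start_token_id is None:
        raise ValueError("decoder_start_token_id or bos_token_id must be defined")
    return decoder_start_token_id


def initial_decoder_input_ids(
    batch_size: int,
    decoder_start_token_id: Optional[int],
    bos_token_id: Optional[int],
) -> List[List[int]]:
    """Build one single-token decoder prefix per batch element."""
    start = resolve_decoder_start(decoder_start_token_id, bos_token_id)
    return [[start] for _ in range(batch_size)]
```

Under this assumption, `initial_decoder_input_ids(2, None, 0)` still seeds every sequence with bos (0), so truly starting with no prefix token would need a change in the generation code itself, not only in the config.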