fixed decoder_start_token_id for T5 #1552

Merged 3 commits into OpenNMT:master on Nov 17, 2023

Ehsan-Jahanbakhsh
Contributor

In most T5 models the pad token is the same as the decoder start token, so this wasn't a problem. But MADLAD-400 uses a decoder_start_token that differs from its pad_token, so it doesn't work. This fixes the problem.

@vince62s
Member

I don't think the config file ALWAYS has a decoder_start_token_id option set, so maybe you need to use the pad token as the default instead.
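A minimal sketch of the fallback suggested here, not the code merged in this PR. It assumes the model config has been loaded into a plain dict; the helper name `resolve_decoder_start_token_id` and the token id values in the usage note are hypothetical and chosen only for illustration.

```python
def resolve_decoder_start_token_id(config: dict) -> int:
    """Return the decoder start token id, defaulting to the pad token id.

    `config` is assumed to be a dict parsed from the model's config file.
    """
    start_id = config.get("decoder_start_token_id")
    # Compare against None explicitly: a start id of 0 is valid but falsy,
    # so `config.get(...) or pad` would wrongly discard it.
    if start_id is None:
        start_id = config["pad_token_id"]
    return start_id
```

With an illustrative config that omits the option, such as `{"pad_token_id": 0}`, the helper returns the pad id. With a config that sets both ids to different values, it returns the explicit `decoder_start_token_id`.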

@vince62s
Member

You need to run black to fix the Python style.

@vince62s vince62s merged commit c5f46a3 into OpenNMT:master Nov 17, 2023
17 checks passed
@vince62s
Member

If you get some interesting results with MADLAD-400, can you post something here: https://forum.opennmt.net/t/madlad-400-a-multilingual-and-document-level-large-audited-dataset-model/5487/2
Thanks.
