"AutoTokenizer.from_pretrained" does not work when loading a pretrained MarianTokenizer from a local directory #5040
Comments
I noticed that after saving the pretrained MarianTokenizer to "my_dir", the "source.spm" file and "target.spm" file are actually named as:
and
When I changed the file names back to "source.spm" and "target.spm", the error disappeared.
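The rename workaround described above can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names `missing_spm_files` and `restore_spm_names` are hypothetical, and the names the files were actually saved under are not assumed here, so the caller has to pass them in.

```python
from pathlib import Path

# File names the report says the loader expects to find in the directory.
EXPECTED_SPM_FILES = ("source.spm", "target.spm")


def missing_spm_files(directory):
    """Return the expected SentencePiece file names that are absent from `directory`."""
    root = Path(directory)
    return [name for name in EXPECTED_SPM_FILES if not (root / name).is_file()]


def restore_spm_names(directory, renames):
    """Rename files in `directory` using `renames` (current name -> expected name).

    `renames` is supplied by the caller (hypothetical input); nothing is guessed
    about how the files were named when they were saved.
    Returns the expected names that are still missing afterwards.
    """
    root = Path(directory)
    for current, expected in renames.items():
        src, dst = root / current, root / expected
        # Only rename when the source exists and the target would not be overwritten.
        if src.is_file() and not dst.exists():
            src.rename(dst)
    return missing_spm_files(root)
```

Calling `missing_spm_files("my_dir")` first shows whether the workaround is needed at all; an empty list means both files already have the expected names.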
I figured it out! The spm files are coming from the cache.
Thanks a lot... Will this fix be included in the next release?
Yes!
Same issue exists for
Please make a new issue with instructions to reproduce. Thanks!
Did you ever solve this for Albert models? @mittalsuraj18
If you also see the warning: "The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
🐛 Bug
Information
I want to save MarianConfig, MarianTokenizer, and MarianMTModel to a local directory ("my_dir") and then load them:
But the above code failed when loading the saved MarianTokenizer from "my_dir":
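The reporter's original snippet and traceback are not preserved in this copy of the thread. As a stand-in, here is a minimal sketch of the save-then-load round trip described above. It is not the reporter's code: the function name `save_and_reload` is hypothetical, the checkpoint name in the usage note is an assumption, and the sketch presumes `transformers` and `sentencepiece` are installed.

```python
from pathlib import Path


def save_and_reload(model_name, save_dir):
    """Save a Marian config, tokenizer and model to `save_dir`, then reload the tokenizer.

    Returns the sorted file names written to `save_dir` and the reloaded tokenizer.
    """
    # Imported inside the function so the sketch can be read and imported
    # even where transformers is not installed.
    from transformers import AutoTokenizer, MarianConfig, MarianMTModel, MarianTokenizer

    Path(save_dir).mkdir(parents=True, exist_ok=True)

    # Load the three pieces from the pretrained checkpoint.
    config = MarianConfig.from_pretrained(model_name)
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    # Save everything into the same local directory.
    config.save_pretrained(save_dir)
    tokenizer.save_pretrained(save_dir)
    model.save_pretrained(save_dir)

    # Record what was written, so the file names can be inspected if loading fails.
    saved_files = sorted(p.name for p in Path(save_dir).iterdir())

    # This is the step the report says failed for the saved MarianTokenizer.
    reloaded = AutoTokenizer.from_pretrained(save_dir)
    return saved_files, reloaded
```

A call might look like `save_and_reload("Helsinki-NLP/opus-mt-en-de", "my_dir")`, where the checkpoint name is only an example and not taken from the report. Comparing the returned file list with the names the loader expects ("source.spm" and "target.spm") is a quick way to check whether the naming problem discussed in the comments is present.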