Neither BertGenerationEncoder nor BertGenerationDecoder needs token_type_ids, but BertGenerationTokenizer returns it anyway. This causes a problem if you pass the tokenizer output directly to the model with **.
If this is intended and the user is expected to be aware of the behaviour, I think the documentation should be changed to say so.
Note: Another issue with BertGenerationTokenizer is that it requires the sentencepiece module. Would you prefer that users install it separately, or should it be included in the transformers dependencies?
sadakmed changed the title from "BertGenerationTokenizer provides unexpected value for BertGenerationModel" to "BertGenerationTokenizer provides an unexpected value for BertGenerationModel" on Feb 6, 2021.
You're right, there's no need for token type IDs in this tokenizer. The workaround is to remove token_type_ids from the tokenizer's model input names, as is done in the DistilBERT tokenizer.
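For reference, the DistilBERT tokenizer handles this with a `model_input_names` class attribute that lists only `input_ids` and `attention_mask`. Until BertGenerationTokenizer gets the same change, a user-side sketch is to drop any key the model's forward does not accept before unpacking with **. The helper name `filter_model_inputs`, the `fake_forward` function, and the sample `encoding` dict below are hypothetical stand-ins for illustration, not part of transformers.

```python
import inspect


def filter_model_inputs(encoding, forward_fn):
    """Keep only the entries of `encoding` that `forward_fn` accepts by name."""
    params = inspect.signature(forward_fn).parameters
    # If the callable takes **kwargs, every key is accepted, so keep them all.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(encoding)
    return {key: value for key, value in encoding.items() if key in params}


# Hypothetical stand-in for a model forward that has no token_type_ids argument.
def fake_forward(input_ids=None, attention_mask=None):
    return len(input_ids)


# Hypothetical tokenizer output that includes the unwanted key.
encoding = {
    "input_ids": [[2, 5, 7]],
    "token_type_ids": [[0, 0, 0]],
    "attention_mask": [[1, 1, 1]],
}

inputs = filter_model_inputs(encoding, fake_forward)
result = fake_forward(**inputs)  # no "unexpected keyword argument" error
```

With the real library, the same idea would be `model(**filter_model_inputs(tokenizer(text, return_tensors="pt"), model.forward))`. A simpler option, assuming you control the tokenizer call, is to pass `return_token_type_ids=False` when tokenizing so the key is never produced.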
Regarding the sentencepiece module: yes, it is necessary. It was previously in the transformers dependencies, and we removed it because it was causing compilation issues on some hardware. The error should be straightforward and mention that sentencepiece must be installed in order to use that tokenizer, so no problem there.
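If you would rather check for the dependency up front than wait for the tokenizer to raise, a minimal sketch using only the standard library is shown below. The helper names `is_module_available` and `require_module` are hypothetical; transformers performs its own check and prints its own message.

```python
import importlib.util


def is_module_available(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


def require_module(name, install_hint):
    """Raise ImportError with an install hint if `name` is missing."""
    if not is_module_available(name):
        raise ImportError(
            f"The '{name}' module is required but not installed. "
            f"Try: {install_hint}"
        )


if is_module_available("sentencepiece"):
    print("sentencepiece is installed")
else:
    print("sentencepiece is missing; install it with: pip install sentencepiece")
```

Calling `require_module("sentencepiece", "pip install sentencepiece")` before constructing the tokenizer turns a missing dependency into an early, explicit failure.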
transformers version: 4.2.2