Tokenizer fails to load when the albert model is selected #11

Open
ReverseRoy opened this issue Oct 9, 2023 · 0 comments

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BertTokenizer'.
The class this function is called from is 'AlbertTokenizer'.
Traceback (most recent call last):
  File "/home/efsz/localCode/Pytorch-NLU/test/tc/tet_tc_base_multi_label.py", line 73, in <module>
    lc.process()
  File "/home/efsz/localCode/Pytorch-NLU/pytorch_nlu/pytorch_textclassification/tcRun.py", line 32, in process
    self.corpus = Corpus(self.config, self.logger)
  File "/home/efsz/localCode/Pytorch-NLU/pytorch_nlu/pytorch_textclassification/tcData.py", line 25, in __init__
    self.tokenizer = self.load_tokenizer(self.config)
  File "/home/efsz/localCode/Pytorch-NLU/pytorch_nlu/pytorch_textclassification/tcData.py", line 206, in load_tokenizer
    tokenizer = PRETRAINED_MODEL_CLASSES[config.model_type][1].from_pretrained(config.pretrained_model_name_or_path)
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2017, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/transformers/models/albert/tokenization_albert.py", line 183, in __init__
    self.sp_model.Load(vocab_file)
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string

Other models seem to work fine, but loading the tokenizer for albert raises this error.
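
A possible reading of the log, not confirmed here: the warning says the checkpoint was saved with 'BertTokenizer', so the directory may contain a WordPiece `vocab.txt` and no SentencePiece `spiece.model`. In that case `AlbertTokenizer` would receive no vocab file and `sp_model.Load` would fail with `TypeError: not a string`. The sketch below picks the tokenizer class from the files actually present. The helper names `guess_tokenizer_class_name` and `load_tokenizer_for_checkpoint` are hypothetical and not part of Pytorch-NLU; the file names are the defaults that the two `transformers` tokenizer classes look for.

```python
import os

# Default vocabulary file names used by the two tokenizer families.
SENTENCEPIECE_FILE = "spiece.model"  # looked up by AlbertTokenizer
WORDPIECE_FILE = "vocab.txt"         # looked up by BertTokenizer


def guess_tokenizer_class_name(checkpoint_dir):
    """Hypothetical helper: name the tokenizer class that matches the files on disk."""
    files = set(os.listdir(checkpoint_dir))
    if SENTENCEPIECE_FILE in files:
        return "AlbertTokenizer"
    if WORDPIECE_FILE in files:
        return "BertTokenizer"
    raise FileNotFoundError(
        f"neither {SENTENCEPIECE_FILE} nor {WORDPIECE_FILE} found in {checkpoint_dir}"
    )


def load_tokenizer_for_checkpoint(checkpoint_dir):
    """Hypothetical helper: load the tokenizer with the class guessed above."""
    import transformers  # imported lazily so the file check works without it

    tokenizer_cls = getattr(transformers, guess_tokenizer_class_name(checkpoint_dir))
    return tokenizer_cls.from_pretrained(checkpoint_dir)
```

If the guess is 'BertTokenizer', the albert weights can still be loaded with the albert model class; only the tokenizer class would change. Whether this matches the checkpoint used in this report is an assumption that listing the checkpoint directory would confirm.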
