Tokenizer error after running the script generate_chatllama.py #8
Comments
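I have the same question and would appreciate help. |
Same error here. |
Subscribing to this issue, as I've hit the same problem. |
Same problem. How do I fix it? |
spm_model_file = '../ChatLLaMA-zh-7B/tokenizer.model': could this tokenizer model file be corrupted? |
Same error here. |
I tested it and did not hit this problem. Could you check your Sentencepiece version? Mine is 0.1.97. |
My Sentencepiece version is also 0.1.97. I just tried again and it still fails: |
Solved: re-download the model weight files. Git LFS must be installed before running git clone. |
After installing it, downloading the model weight files is very slow. Is there a better way? |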
Traceback (most recent call last):
  File "scripts/generate_chatllama.py", line 82, in <module>
    args.tokenizer = str2tokenizer[args.tokenizer](args)
  File "/home/mo/llama/TencentPretrain/tencentpretrain/utils/tokenizers.py", line 255, in __init__
    super().__init__(args, is_src)
  File "/home/mo/llama/TencentPretrain/tencentpretrain/utils/tokenizers.py", line 30, in __init__
    self.sp_model.Load(spm_model_path)
  File "/home/mo/miniconda3/envs/llm_env/lib/python3.8/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/home/mo/miniconda3/envs/llm_env/lib/python3.8/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
I got this error after running the script. Has anyone else run into this problem?
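One commenter reported that re-downloading the weights with Git LFS installed fixed the problem. A plausible reading, though the thread does not confirm it, is that tokenizer.model had been cloned as a small Git LFS pointer stub rather than the real file, so SentencePiece could not parse it. The following is a minimal sketch for checking that, using only the Python standard library. The helper names is_lfs_pointer and describe_tokenizer_file are hypothetical and not part of TencentPretrain; the path is the one quoted in the comments.

```python
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts like a Git LFS pointer stub."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def describe_tokenizer_file(path):
    """Summarize whether the file is missing, an LFS pointer, or something else."""
    p = Path(path)
    if not p.is_file():
        return f"{p}: file not found"
    size = p.stat().st_size
    if is_lfs_pointer(p):
        return f"{p}: Git LFS pointer ({size} bytes), real model not downloaded"
    return f"{p}: {size} bytes, not an LFS pointer"


if __name__ == "__main__":
    # Path taken from the comment in this thread; adjust to your checkout.
    print(describe_tokenizer_file("../ChatLLaMA-zh-7B/tokenizer.model"))
```

If the file turns out to be a pointer, installing Git LFS and fetching the weights again, as the commenter did, should replace it with the actual model. A file that is not a pointer can still be truncated or corrupted, so this check only rules out one cause.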