glm4 9b 1m startup error #58

Closed
1 of 2 tasks
brightzhu2020 opened this issue Jun 6, 2024 · 12 comments

brightzhu2020 commented Jun 6, 2024

System Info

win10
cuda 11.8
python 3.12
transformers 4.40
GPU A4000

Who can help?

Anyone who could help.

Information

  • The official example scripts
  • My own modified scripts

Reproduction

After launching the composite demo the page comes up, but clicking "All Tools" or any of the text-analysis buttons raises an error:

KeyError: '<|endoftext|>'
Traceback:
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 575, in _run_script
self._session_state.on_script_will_rerun(
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\streamlit\runtime\state\safe_session_state.py", line 65, in on_script_will_rerun
self._state.on_script_will_rerun(latest_widget_states)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\streamlit\runtime\state\session_state.py", line 517, in on_script_will_rerun
self._call_callbacks()
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\streamlit\runtime\state\session_state.py", line 530, in _call_callbacks
self._new_widget_state.call_callback(wid)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\streamlit\runtime\state\session_state.py", line 274, in call_callback
callback(*args, **kwargs)
File "C:\Users\XXX\glm4\GLM-4\composite_demo\src\main.py", line 123, in page_changed
st.session_state.client = build_client(Mode(new_page))
File "C:\Users\XXX\glm4\GLM-4\composite_demo\src\main.py", line 107, in build_client
return get_client(CHAT_MODEL_PATH, typ)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 165, in wrapper
return cached_func(*args, **kwargs)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 194, in call
return self._get_or_create_cached_value(args, kwargs)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 221, in _get_or_create_cached_value
return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 277, in _handle_cache_miss
computed_value = self._info.func(*func_args, **func_kwargs)
File "C:\Users\XXX\glm4\GLM-4\composite_demo\src\client.py", line 89, in get_client
return HFClient(model_path)
File "C:\Users\XXX\glm4\GLM-4\composite_demo\src\clients\hf.py", line 18, in init
self.tokenizer = AutoTokenizer.from_pretrained(
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 678, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\transformers\tokenization_utils_base.py", line 1825, in from_pretrained
return cls._from_pretrained(
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\transformers\tokenization_utils_base.py", line 2061, in _from_pretrained
added_tokens = tokenizer.sanitize_special_tokens()
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\transformers\tokenization_utils_base.py", line 856, in sanitize_special_tokens
return self.add_tokens(self.all_special_tokens_extended, special_tokens=True)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\transformers\tokenization_utils_base.py", line 999, in add_tokens
return self._add_tokens(new_tokens, special_tokens=special_tokens)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\transformers\tokenization_utils.py", line 421, in _add_tokens
and self.convert_tokens_to_ids(token) == self.convert_tokens_to_ids(self.unk_token)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\transformers\tokenization_utils.py", line 575, in convert_tokens_to_ids
return self._convert_token_to_id_with_added_voc(tokens)
File "C:\Users\XXX\anaconda3\envs\GLM4\lib\site-packages\transformers\tokenization_utils.py", line 588, in _convert_token_to_id_with_added_voc
return self._convert_token_to_id(token)
File "C:\Users\XXX/.cache\huggingface\modules\transformers_modules\glm-4-9b-chat-1m\tokenization_chatglm.py", line 95, in _convert_token_to_id
return self.mergeable_ranks[token]

Expected behavior

The demo should start and run normally.
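
For context, a minimal sketch of the call that fails (the public model ID is substituted for the local path seen in the traceback; trust_remote_code is required because the repository ships its own tokenization_chatglm.py):

```python
# Minimal repro sketch: the failure surfaces while loading the tokenizer,
# before any model weights are touched.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "THUDM/glm-4-9b-chat-1m",  # or a local checkout, as in the traceback
    trust_remote_code=True,    # the repo ships tokenization_chatglm.py
)
# With an incompatible transformers version this raises
# KeyError: '<|endoftext|>' from mergeable_ranks in the custom tokenizer.
```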

zRzRzRzRzRzRzR self-assigned this Jun 6, 2024
@JiaweiMorris

Running the multimodal demo code from the Hugging Face model page hits the same error:
return self.mergeable_ranks[token]
KeyError: '<|endoftext|>'

@bridgearchway

Same here: Win 10, CUDA 12.1, Python 3.10, 2× RTX 3090. I also ran into KeyError: '<|endoftext|>'. The problem seems to come from AutoTokenizer.from_pretrained(...).

@brightzhu2020 (Author)

There are no clear baseline environment recommendations, unfortunately.

@bridgearchway

I just fixed it. The required transformers version is 4.40.0, as pinned in requirements.txt under basic_demo. Once I updated to that version, it worked. Hopefully this helps you as well.
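
A quick sanity check along these lines (a sketch; it assumes the pin in basic_demo/requirements.txt is exactly 4.40.0):

```python
# Sketch: fail fast if the installed transformers version does not match
# the version pinned in basic_demo/requirements.txt.
import transformers

if transformers.__version__ != "4.40.0":
    raise RuntimeError(
        f"found transformers {transformers.__version__}; "
        "install the pinned version: pip install transformers==4.40.0"
    )
```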

@huaizhe2012 commented Jun 6, 2024

Same situation here: the tokenizer failed to load. Upgrading transformers to 4.40.0 did fix it.

@zRzRzRzRzRzRzR (Member)

Please install the dependencies strictly from the requirements file. On Windows you cannot install vLLM, so use the transformers backend.
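
For Windows users, a rough sketch of the transformers-backend load path (model path and dtype are assumptions; device_map="auto" needs the accelerate package):

```python
# Sketch: load GLM-4 with the plain transformers backend (no vLLM).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "THUDM/glm-4-9b-chat-1m"  # or your local checkpoint directory

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,  # assumes an Ampere-or-newer GPU
    device_map="auto",           # spreads weights across available GPUs
    trust_remote_code=True,
).eval()
```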

@brightzhu2020 (Author)

Would you please share your full environment details? I ran into other issues as well... CUDA, PyTorch, etc.?

Thanks!

@bridgearchway

Sure, my environment is:

win 10,
cuda 12.1,
python 3.10,
gpu 3090*2,
transformers==4.40.0,
torch==2.1.0

That's all for inference. By the way, I noticed that the README in basic_demo suggests "GPUs above A100, V100, 20 and older GPU architectures are not supported". I hope this helps.
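
A quick way to check which side of that line your card falls on (assumption: "Ampere or newer" corresponds to CUDA compute capability 8.0+, while Turing 20-series cards report 7.5):

```python
# Sketch: print the GPU's compute capability to compare against the
# Ampere-or-newer guidance quoted from the basic_demo README.
import torch

major, minor = torch.cuda.get_device_capability(0)
status = "Ampere or newer" if major >= 8 else "pre-Ampere (may be unsupported)"
print(f"GPU 0 compute capability: {major}.{minor} -> {status}")
```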

@brightzhu2020 (Author)

Thanks! So it has to be an Ampere GPU, not a Turing one.

@brightzhu2020 (Author)

Thank you very much. My environment was carried over from GLM-3, and the system-wide CUDA version differs as well. I still hope the team publishes general minimum requirements, or a configuration with good compatibility, so that a single local machine can run both GLM3-6B and GLM4-9B. Looking forward to the project getting better and more complete; awaiting your updates.

@zRzRzRzRzRzRzR (Member)

You could try our trans_cli_demo, though the environment still needs a fresh install; the current requirements no longer pull in vLLM by default. Note that with the transformers backend the usable inference length is very short: around 8K tokens already hits the 24 GB VRAM ceiling of a consumer card.
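
To stay under that ceiling with the transformers backend, the simplest lever is capping the generated length; a sketch continuing from the hypothetical model/tokenizer load above:

```python
# Sketch: keep prompt + generated tokens modest on a 24 GB consumer GPU
# (model and tokenizer as loaded in the earlier transformers-backend sketch).
inputs = tokenizer("Your prompt here", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)  # cap the output length
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```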

@brightzhu2020 (Author)

May I ask which motherboard and CPU you use? Can the two GPUs run at full performance, given the CPU's PCIe lane limit with two 3090s? Thanks!
