Cannot load the model: "No compiled kernel found" #6
Comments
Fix it as the warning suggests; it's probably an urllib3 or chardet version mismatch.
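If it helps, a quick way to confirm which versions requests actually sees (illustrative only; the fix is simply upgrading or pinning urllib3/chardet to the range your requests release supports):

```python
# Print the versions flagged by the RequestsDependencyWarning.
# The supported ranges depend on the installed requests release.
import requests, urllib3, chardet

print("requests:", requests.__version__)
print("urllib3:", urllib3.__version__)
print("chardet:", chardet.__version__)
```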
It doesn't seem to be a version problem; I already fixed the version warning. It looks like the kernel can't be found? My CPU is 4 cores with 16 GB of RAM.
This is a chatglm-6b-int4 problem; check the ChatGLM issues to see whether there's a fix. Or run it on CPU, with the parameters all set to cpu.
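For reference, a minimal sketch of the CPU path when loading the int4 model directly with Hugging Face transformers, following the ChatGLM README; ChatPDF wraps this in textgen, so there the change may amount to passing a cpu device (the exact parameter name in ChatPDF is an assumption). Note that the `Killed` line at the end of the log usually means the process was terminated for running out of memory, so CPU mode alone may not be enough on a 16 GB machine if other processes are holding RAM.

```python
from transformers import AutoModel, AutoTokenizer

model_id = "THUDM/chatglm-6b-int4"

# trust_remote_code is required because ChatGLM ships its own modeling code.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# .float() is the CPU deployment path from the ChatGLM README:
# the model runs in fp32 on CPU instead of using the CUDA kernels.
model = AutoModel.from_pretrained(model_id, trust_remote_code=True).float()
model = model.eval()

# Simple smoke test that the model loads and responds.
response, history = model.chat(tokenizer, "Hello", history=[])
print(response)
```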
I'm running it under WSL; loading the model always fails and then the program stops.
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.16) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
2023-05-28 21:02:00.484 | INFO | __main__:<module>:16 - CONTENT_DIR: /mnt/c/Users/MEIP-users/Desktop/ChatPDF-main/ChatPDF-main/content
Running on local URL: http://0.0.0.0:7860
To create a public link, set share=True in launch()
2023-05-28 21:02:25.002 | DEBUG | text2vec.sentence_model:__init__:74 - Use device: cuda
2023-05-28 21:02:28.328 | DEBUG | textgen.chatglm.chatglm_model:__init__:94 - Device: cuda
No compiled kernel found.
Compiling kernels : /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.c -shared -o /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.so
Load kernel : /home/meip/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4/02a065cf2797029c036a02cac30f1da1a9bc49a3/quantization_kernels_parallel.so
Setting CPU quantization kernel threads to 4
Using quantization cache
Applying quantization to glm layers
Killed