
[BUG] Model loading hangs during startup on Linux #2191

Closed
WangXBruc opened this issue Nov 27, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@WangXBruc

Problem Description
Startup on Linux hangs at the model-loading stage.
When starting the API service with python3 startup.py --all-api, it stays stuck at "Loading the model ['chatglm3-6b'] on worker 855b18b5 ..".
The same code starts fine on macOS.
Please take a look.

Environment Information
OS: Linux-5.10.84-004.centos7.x86_64-x86_64-with-glibc2.32
Python version: 3.8.10 (default, Nov 6 2023, 19:45:31)
Project version: v0.2.7
langchain version: 0.0.340, fastchat version: 0.2.32

Text splitter in use: ChineseRecursiveTextSplitter
LLM models being started: ['chatglm3-6b', 'chatglm2-6b', 'zhipu-api', 'openai-api'] @ cuda
{'device': 'cuda',
'host': '0.0.0.0',
'infer_turbo': False,
'model_path': '/home/admin/chatglm3-6b',
'port': 20002}
{'device': 'cuda',
'host': '0.0.0.0',
'infer_turbo': False,
'model_path': '/home/admin/chatglm2-6b',
'port': 20002}
{'api_key': '',
'device': 'auto',
'host': '0.0.0.0',
'infer_turbo': False,
'online_api': True,
'port': 21001,
'provider': 'ChatGLMWorker',
'version': 'chatglm_turbo',
'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
{'api_base_url': 'https://api.openai.com/v1',
'api_key': '',
'device': 'auto',
'host': '0.0.0.0',
'infer_turbo': False,
'model_name': 'gpt-35-turbo',
'online_api': True,
'openai_proxy': '',
'port': 20002}
Embeddings model in use: m3e-base @ cuda
==============================Langchain-Chatchat Configuration==============================

2023-11-27 18:58:44,830 - startup.py[line:647] - INFO: Starting services:
2023-11-27 18:58:44,830 - startup.py[line:648] - INFO: To view llm_api logs, go to /home/admin/LangChain-Chatchat/logs
2023-11-27 18:58:53 | INFO | model_worker | Register to controller
2023-11-27 18:58:53 | ERROR | stderr | INFO: Started server process [107018]
2023-11-27 18:58:53 | ERROR | stderr | INFO: Waiting for application startup.
2023-11-27 18:58:53 | ERROR | stderr | INFO: Application startup complete.
2023-11-27 18:58:53 | ERROR | stderr | INFO: Uvicorn running on http://0.0.0.0:20000 (Press CTRL+C to quit)
2023-11-27 18:58:55 | INFO | model_worker | Loading the model ['chatglm3-6b'] on worker 855b18b5 ...
2023-11-27 18:58:55 | INFO | model_worker | Loading the model ['chatglm2-6b'] on worker 7f4a31da ...
Loading checkpoint shards: 0%| | 0/7 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/7 [00:00<?, ?it/s]
Loading checkpoint shards: 14%|███████████████████ | 1/7 [00:25<02:33, 25.53s/it]
Loading checkpoint shards: 14%|███████████████████ | 1/7 [00:25<02:34, 25.67s/it]
Loading checkpoint shards: 29%|██████████████████████████████████████ | 2/7 [00:52<02:12, 26.46s/it]
Loading checkpoint shards: 29%|██████████████████████████████████████ | 2/7 [00:53<02:13, 26.67s/it]
Loading checkpoint shards: 43%|█████████████████████████████████████████████████████████ | 3/7 [01:19<01:45, 26.44s/it]
Loading checkpoint shards: 43%|█████████████████████████████████████████████████████████ | 3/7 [01:19<01:46, 26.59s/it]
Loading checkpoint shards: 57%|████████████████████████████████████████████████████████████████████████████ | 4/7 [01:43<01:17, 25.74s/it]
Loading checkpoint shards: 57%|████████████████████████████████████████████████████████████████████████████ | 4/7 [01:44<01:17, 25.93s/it]
Loading checkpoint shards: 71%|███████████████████████████████████████████████████████████████████████████████████████████████ | 5/7 [02:10<00:52, 26.09s/it]
Loading checkpoint shards: 71%|███████████████████████████████████████████████████████████████████████████████████████████████ | 5/7 [02:11<00:52, 26.29s/it]
Loading checkpoint shards: 86%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 6/7 [02:36<00:26, 26.12s/it]
Loading checkpoint shards: 86%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 6/7 [02:38<00:26, 26.44s/it]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [02:53<00:00, 22.80s/it]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [02:53<00:00, 24.78s/it]
2023-11-27 19:01:48 | ERROR | stderr |
2023-11-27 19:01:53 | INFO | model_worker | Register to controller
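
In the configuration dump above, both local workers (chatglm3-6b and chatglm2-6b) are printed with 'port': 20002. If the running configuration really assigns the same port to both local model workers, the second one cannot bind it, which would match a startup that stalls after the checkpoints finish loading. Below is a minimal sketch of per-model port overrides in configs/server_config.py; the FSCHAT_MODEL_WORKERS key follows Langchain-Chatchat 0.2.x conventions and the values are assumptions, not taken from this issue:

```python
# configs/server_config.py -- sketch only, assuming the 0.2.x layout
FSCHAT_MODEL_WORKERS = {
    "default": {
        "host": "0.0.0.0",
        "port": 20002,        # default port used by local model workers
        "device": "cuda",
        "infer_turbo": False,
    },
    # Hypothetical override so the second local model does not collide
    # with chatglm3-6b on the same port.
    "chatglm2-6b": {
        "port": 20003,
    },
}
```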

WangXBruc added the bug (Something isn't working) label on Nov 27, 2023
@WangXBruc
Author

pytorch version: 2.1.0+cu121
cuda version: 12.0

Still stuck at the model-loading stage. Any help would be appreciated!
I have already switched the Python version to 3.10.0 and it still doesn't work.
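
A quick way to confirm that this PyTorch build actually sees the GPU before starting the workers (plain torch calls, not specific to this project):

```python
import torch

# Verify the CUDA build of PyTorch and GPU visibility.
print(torch.__version__)             # e.g. 2.1.0+cu121
print(torch.cuda.is_available())     # should print True
print(torch.cuda.device_count())     # number of visible GPUs
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```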

@zRzRzRzRzRzRzR
Collaborator

Try updating to the latest dev branch.
Then check whether GPU memory and system RAM are maxed out.
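
One way to run that check from Python while the workers are loading (equivalent to watching nvidia-smi and free -h; plain torch calls, nothing project-specific):

```python
import torch

# Report free vs. total GPU memory; if "free" collapses while two local
# models load, the hang is most likely memory pressure.
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        free_b, total_b = torch.cuda.mem_get_info(i)   # bytes
        print(f"GPU {i}: {free_b / 1e9:.1f} GB free of {total_b / 1e9:.1f} GB")
```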

@xiaoxiao9992

> Try updating to the latest dev branch. Then check whether GPU memory and system RAM are maxed out.

GPU memory and RAM are not maxed out. It was probably because other services were already running at the same time (or had been started earlier); after stopping those services it runs. But... is there really no way to run them together?
(three screenshots attached)

@WangXBruc
Author

Starting two local models at the same time would not work; switching to starting just one does.
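
A sketch of what that workaround looks like in configs/model_config.py; the LLM_MODELS name follows Langchain-Chatchat 0.2.7 and the exact list here is an assumption, not copied from this thread:

```python
# configs/model_config.py -- sketch only
# Keep a single local model; the online API workers load no local weights,
# so they can stay in the list.
LLM_MODELS = ["chatglm3-6b", "zhipu-api", "openai-api"]
```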
