[BUG] Chat and knowledge-base API requests get no response and stay pending #379

Open
2 tasks done
honins opened this issue Jun 4, 2024 · 1 comment

honins commented Jun 4, 2024

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • I have searched FAQ

Current Behavior

As shown in the screenshot, chat requests stay pending indefinitely.

image

Expected Behavior

Fix the bug.

Environment

- OS: Ubuntu 22.04
- NVIDIA Driver: 525.105.17
- CUDA: Cuda compilation tools, release 12.1, V12.1.105
- docker: 25.03
- docker-compose: 24.0.5
- NVIDIA GPU: RTX 4080 SUPER
- NVIDIA GPU Memory: 16GB

QAnything logs

Startup log

Enter the parameter size of the LLM you are using (e.g. 1.8B/3B/7B): 3B
model_size=3B
GPUID1=0, GPUID2=0, device_id=0
llm_api is set to [local]
device_id is set to [0]
runtime_backend is set to [hf]
model_name is set to [MiniChat-2-3B]
conv_template is set to [minichat]
tensor_parallel is set to [1]
gpu_memory_utilization is set to [0.81]
Do you want to use the previous ip: localhost? (yes/no) Press Enter to default to yes:
Running under native Linux
[+] Running 5/6
 ⠼ Network qanything_milvus_mysql_local  Created  1.5s
 ✔ Container milvus-minio-local          Started  0.6s
 ✔ Container mysql-container-local       Started  0.8s
 ✔ Container milvus-etcd-local           Started  0.8s
 ✔ Container milvus-standalone-local     Started  0.9s
 ✔ Container qanything-container-local   Started  1.2s
qanything-container-local  | 
qanything-container-local  | =============================
qanything-container-local  | == Triton Inference Server ==
qanything-container-local  | =============================
qanything-container-local  | 
qanything-container-local  | NVIDIA Release 23.05 (build 61161506)
qanything-container-local  | Triton Server Version 2.34.0
qanything-container-local  | 
qanything-container-local  | Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
qanything-container-local  | 
qanything-container-local  | Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
qanything-container-local  | 
qanything-container-local  | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
qanything-container-local  | By pulling and using the container, you accept the terms and conditions of this license:
qanything-container-local  | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
qanything-container-local  | 
qanything-container-local  | llm_api is set to [local]
qanything-container-local  | device_id is set to [0]
qanything-container-local  | runtime_backend is set to [hf]
qanything-container-local  | model_name is set to [MiniChat-2-3B]
qanything-container-local  | conv_template is set to [minichat]
qanything-container-local  | tensor_parallel is set to [1]
qanything-container-local  | gpu_memory_utilization is set to [0.81]
qanything-container-local  | checksum 3299f3701e32a952d9d5769897bbb4b5
qanything-container-local  | default_checksum 3299f3701e32a952d9d5769897bbb4b5
qanything-container-local  | 
qanything-container-local  | [notice] A new release of pip is available: 23.3.2 -> 24.0
qanything-container-local  | [notice] To update, run: python3 -m pip install --upgrade pip
qanything-container-local  | GPU ID: 0, 0
qanything-container-local  | GPU1 Model: NVIDIA GeForce RTX 4080 SUPER
qanything-container-local  | Compute Capability: 8.9
qanything-container-local  | OCR_USE_GPU=True because 8.9 >= 7.5
qanything-container-local  | ====================================================
qanything-container-local  | ******************** Important notice ********************
qanything-container-local  | ====================================================
qanything-container-local  | 
qanything-container-local  | Your current GPU memory is 16376 MiB; deploying an LLM of 7B or smaller is recommended
qanything-container-local  | The token limit defaults to 4096
qanything-container-local  | The triton server for embedding and reranker will start on 0 GPUs
qanything-container-local  | Executing hf runtime_backend
qanything-container-local  | The rerank service is ready! (2/8)
qanything-container-local  | The ocr service is ready! (3/8)
qanything-container-local  | Waiting for the backend service to start...
qanything-container-local  | Waiting for the backend service to start...
qanything-container-local  | The qanything backend service is ready! (4/8)
qanything-container-local  | Dependencies related to npm are obtained. (5/8)
qanything-container-local  | The front_end/dist folder already exists, no need to build the front end again.(6/8)
qanything-container-local  | Waiting for the front-end service to start...
qanything-container-local  | 
qanything-container-local  | > ai-demo@1.0.1 serve
qanything-container-local  | > vite preview --port 5052
qanything-container-local  | 
qanything-container-local  | The CJS build of Vite's Node API is deprecated. See https://vitejs.dev/guide/troubleshooting.html#vite-cjs-node-api-deprecated for more details.
qanything-container-local  |   ➜  Local:   http://localhost:5052/qanything
qanything-container-local  |   ➜  Network: http://172.20.0.6:5052/qanything
qanything-container-local  | The front-end service is ready!...(7/8)
qanything-container-local  | I0604 01:29:55.535924 129 grpc_server.cc:377] Thread started for CommonHandler
qanything-container-local  | I0604 01:29:55.535953 129 infer_handler.cc:629] New request handler for ModelInferHandler, 0
qanything-container-local  | I0604 01:29:55.535959 129 infer_handler.h:1025] Thread started for ModelInferHandler
qanything-container-local  | I0604 01:29:55.535986 129 infer_handler.cc:629] New request handler for ModelInferHandler, 0
qanything-container-local  | I0604 01:29:55.535991 129 infer_handler.h:1025] Thread started for ModelInferHandler
qanything-container-local  | I0604 01:29:55.536022 129 stream_infer_handler.cc:122] New request handler for ModelStreamInferHandler, 0
qanything-container-local  | I0604 01:29:55.536026 129 infer_handler.h:1025] Thread started for ModelStreamInferHandler
qanything-container-local  | I0604 01:29:55.536028 129 grpc_server.cc:2450] Started GRPCInferenceService at 0.0.0.0:9001
qanything-container-local  | I0604 01:29:55.536133 129 http_server.cc:3555] Started HTTPService at 0.0.0.0:9000
qanything-container-local  | I0604 01:29:55.576918 129 http_server.cc:185] Started Metrics Service at 0.0.0.0:9002
qanything-container-local  | I0604 01:30:10.435466 129 http_server.cc:3449] HTTP request: 0 /v2/health/ready
qanything-container-local  | The embedding and rerank service is ready!. (7.5/8)
qanything-container-local  | You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
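qanything-container-local  | 2024-06-04 09:29:55 | ERROR | stderr | 
  0%|          | 0/1 [00:00<?, ?it/s]
qanything-container-local  | 2024-06-04 09:30:10 | ERROR | stderr | 
100%|██████████| 1/1 [00:14<00:00, 14.91s/it]
qanything-container-local  | 2024-06-04 09:30:10 | ERROR | stderr | 
100%|██████████| 1/1 [00:14<00:00, 14.91s/it]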
qanything-container-local  | 2024-06-04 09:30:10 | ERROR | stderr | 
qanything-container-local  | 2024-06-04 09:30:10 | INFO | model_worker | Register to controller
qanything-container-local  | 2024-06-04 09:30:10 | ERROR | stderr | INFO:     Started server process [144]
qanything-container-local  | 2024-06-04 09:30:10 | ERROR | stderr | INFO:     Waiting for application startup.
qanything-container-local  | 2024-06-04 09:30:10 | ERROR | stderr | INFO:     Application startup complete.
qanything-container-local  | 2024-06-04 09:30:10 | ERROR | stderr | INFO:     Uvicorn running on http://0.0.0.0:7801 (Press CTRL+C to quit)
qanything-container-local  |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
qanything-container-local  |                                  Dload  Upload   Total   Spent    Left  Speed
100    28  100    28    0     0  49557      0 --:--:-- --:--:-- --:--:-- 28000
qanything-container-local  | The llm service is ready!, now you can use the qanything service. (8/8)
qanything-container-local  | Checking the log files for error messages...
qanything-container-local  | No explicit error messages were detected in /workspace/qanything_local/logs/debug_logs/rerank_server.log. Please inspect /workspace/qanything_local/logs/debug_logs/rerank_server.log manually for more information.
qanything-container-local  | No explicit error messages were detected in /workspace/qanything_local/logs/debug_logs/ocr_server.log. Please inspect /workspace/qanything_local/logs/debug_logs/ocr_server.log manually for more information.
qanything-container-local  | No explicit error messages were detected in /workspace/qanything_local/logs/debug_logs/sanic_api.log. Please inspect /workspace/qanything_local/logs/debug_logs/sanic_api.log manually for more information.
qanything-container-local  | Time elapsed: 17 seconds.
qanything-container-local  | Please visit the front-end service at [http://localhost:5052/qanything/] to conduct Q&A.
qanything-container-local  | If the front end reports an error, press F12 in the browser for more error details.

sanic_api.log

INFO:debug_logger:history: [] 
INFO:debug_logger:question: 测试
INFO:debug_logger:kb_ids: ['KB2027ce29d6384fc5a6bb4a7bfc59c9ff']
INFO:debug_logger:user_id: zzp
INFO:debug_logger:check_kb_exist [('KB2027ce29d6384fc5a6bb4a7bfc59c9ff',)]
INFO:debug_logger:collection zzp exists
INFO:debug_logger:partitions: ['KB2027ce29d6384fc5a6bb4a7bfc59c9ff']
INFO:debug_logger:list_docs zzp
INFO:debug_logger:kb_id: KBb9a5b4d0871c48bb8338df488ddd2054
INFO:debug_logger:list_docs zzp
INFO:debug_logger:kb_id: KB2be1fe6448384b81bd8872821917dfe0
INFO:debug_logger:list_docs zzp

debug.log

2024-06-04 09:47:45,729 - [PID: 886][Sanic-Server-3-0] - [Function: local_doc_chat] - INFO - rerank True
2024-06-04 09:47:45,729 - [PID: 886][Sanic-Server-3-0] - [Function: local_doc_chat] - INFO - history: [] 
2024-06-04 09:47:45,729 - [PID: 886][Sanic-Server-3-0] - [Function: local_doc_chat] - INFO - question: 测试
2024-06-04 09:47:45,729 - [PID: 886][Sanic-Server-3-0] - [Function: local_doc_chat] - INFO - kb_ids: ['KB2027ce29d6384fc5a6bb4a7bfc59c9ff']
2024-06-04 09:47:45,730 - [PID: 886][Sanic-Server-3-0] - [Function: local_doc_chat] - INFO - user_id: zzp
2024-06-04 09:47:45,731 - [PID: 886][Sanic-Server-3-0] - [Function: check_kb_exist] - INFO - check_kb_exist [('KB2027ce29d6384fc5a6bb4a7bfc59c9ff',)]
2024-06-04 09:47:45,755 - [PID: 886][Sanic-Server-3-0] - [Function: init] - INFO - collection zzp exists
2024-06-04 09:47:45,757 - [PID: 886][Sanic-Server-3-0] - [Function: init] - INFO - partitions: ['KB2027ce29d6384fc5a6bb4a7bfc59c9ff']
2024-06-04 09:48:27,009 - [PID: 884][Sanic-Server-1-0] - [Function: list_docs] - INFO - list_docs zzp
2024-06-04 09:48:27,010 - [PID: 884][Sanic-Server-1-0] - [Function: list_docs] - INFO - kb_id: KBb9a5b4d0871c48bb8338df488ddd2054
2024-06-04 09:48:27,108 - [PID: 884][Sanic-Server-1-0] - [Function: list_docs] - INFO - list_docs zzp
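In the debug.log excerpt above, the `local_doc_chat` request handled by `Sanic-Server-3-0` logs nothing after the `partitions` line at 09:47:45, which may indicate where the request stalls. The following is a minimal sketch for finding the last entry each worker logged. The line pattern is inferred only from the excerpt in this issue, and `last_entry_per_worker` is a hypothetical helper name, not part of QAnything.

```python
import re
from datetime import datetime

# Pattern inferred from the debug.log excerpt in this issue (an assumption,
# not a documented QAnything log format).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"\[PID: (?P<pid>\d+)\]\[(?P<worker>[^\]]+)\] - "
    r"\[Function: (?P<func>[^\]]+)\] - (?P<level>\w+) - (?P<msg>.*)$"
)


def last_entry_per_worker(lines):
    """Return {worker: (timestamp, function, message)} for each worker's final log line."""
    last = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not follow the assumed format
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f")
        last[m["worker"]] = (ts, m["func"], m["msg"])
    return last


# Two lines copied from the excerpt above.
sample = [
    "2024-06-04 09:47:45,757 - [PID: 886][Sanic-Server-3-0] - [Function: init] - INFO - partitions: ['KB2027ce29d6384fc5a6bb4a7bfc59c9ff']",
    "2024-06-04 09:48:27,108 - [PID: 884][Sanic-Server-1-0] - [Function: list_docs] - INFO - list_docs zzp",
]

for worker, (ts, func, msg) in last_entry_per_worker(sample).items():
    print(worker, ts.isoformat(), func, msg)
```

A worker whose last entry is old compared with the others is a candidate for the step that never returned. This only narrows down where to look; it does not explain the cause.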

model-worker.log

2024-06-04 09:29:54 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=7801, worker_address='http://0.0.0.0:7801', controller_address='http://0.0.0.0:7800', model_path='/model_repos/CustomLLM/MiniChat-2-3B', revision='main', device='cuda', gpus='0', num_gpus=1, max_gpu_memory=None, dtype='bfloat16', load_8bit=True, cpu_offloading=False, gptq_ckpt=None, gptq_wbits=16, gptq_groupsize=-1, gptq_act_order=False, awq_ckpt=None, awq_wbits=16, awq_groupsize=-1, enable_exllama=False, exllama_max_seq_len=4096, exllama_gpu_split=None, exllama_cache_8bit=False, enable_xft=False, xft_max_seq_len=4096, xft_dtype=None, model_names=None, conv_template='minichat', embed_in_truncate=False, limit_worker_concurrency=5, stream_interval=2, no_register=False, seed=None, debug=False, ssl=False)
2024-06-04 09:29:54 | INFO | model_worker | Loading the model ['MiniChat-2-3B'] on worker 3f33aa48 ...
2024-06-04 09:29:55 | ERROR | stderr | 
  0%|          | 0/1 [00:00<?, ?it/s]
2024-06-04 09:30:10 | ERROR | stderr | 
100%|██████████| 1/1 [00:14<00:00, 14.91s/it]
2024-06-04 09:30:10 | ERROR | stderr | 
100%|██████████| 1/1 [00:14<00:00, 14.91s/it]
2024-06-04 09:30:10 | ERROR | stderr | 

Steps To Reproduce

No response

Anything else?

No response

fire717 commented Jun 27, 2024

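I see the same thing right after a reboot when the server has just started: chat and knowledge-base creation both keep spinning. It works after waiting a while longer, so some services are probably slow to respond right after startup.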
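If the cause is slow startup, one way to avoid sending requests too early is to poll a health endpoint before using the service. The following is a minimal sketch using only the Python standard library. `wait_until_ready` is a hypothetical helper, and the example URL assumes the Triton HTTP port 9000 from the startup log is reachable from where the script runs, which depends on how the container ports are published. Note that the startup log reported every service as ready while requests still stayed pending, so a passing check on one service does not guarantee the whole stack is responsive.

```python
import time
import urllib.error
import urllib.request


def _http_ok(url):
    """Return True if a GET on `url` answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url, timeout_s=300.0, interval_s=2.0, probe=None):
    """Poll `url` until `probe(url)` is truthy or `timeout_s` elapses.

    Returns True once the probe succeeds, False if the deadline passes first.
    `probe` defaults to a plain HTTP 200 check and can be replaced for testing.
    """
    probe = probe or _http_ok
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() + interval_s > deadline:
            return False
        time.sleep(interval_s)


# Example call (assumed URL; /v2/health/ready is the path seen in the startup log):
# wait_until_ready("http://localhost:9000/v2/health/ready", timeout_s=600)
```

The probe is injectable so the waiting logic can be exercised without a running server; the same helper could be pointed at any other endpoint that signals readiness in a given deployment.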
