Please enter the size of the LLM you are using, in billions of parameters (e.g. 1.8B/3B/7B): 3B
model_size=3B
GPUID1=0, GPUID2=0, device_id=0
llm_api is set to [local]
device_id is set to [0]
runtime_backend is set to [hf]
model_name is set to [MiniChat-2-3B]
conv_template is set to [minichat]
tensor_parallel is set to [1]
gpu_memory_utilization is set to [0.81]
Do you want to use the previous ip: localhost? (yes/no) Press Enter to default to yes:
Running under native Linux
[+] Running 5/6
⠼ Network qanything_milvus_mysql_local Created 1.5s
✔ Container milvus-minio-local Started 0.6s
✔ Container mysql-container-local Started 0.8s
✔ Container milvus-etcd-local Started 0.8s
✔ Container milvus-standalone-local Started 0.9s
✔ Container qanything-container-local Started 1.2s
qanything-container-local |
qanything-container-local | =============================
qanything-container-local | == Triton Inference Server ==
qanything-container-local | =============================
qanything-container-local |
qanything-container-local | NVIDIA Release 23.05 (build 61161506)
qanything-container-local | Triton Server Version 2.34.0
qanything-container-local |
qanything-container-local | Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
qanything-container-local |
qanything-container-local | Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
qanything-container-local |
qanything-container-local | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
qanything-container-local | By pulling and using the container, you accept the terms and conditions of this license:
qanything-container-local | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
qanything-container-local |
qanything-container-local | llm_api is set to [local]
qanything-container-local | device_id is set to [0]
qanything-container-local | runtime_backend is set to [hf]
qanything-container-local | model_name is set to [MiniChat-2-3B]
qanything-container-local | conv_template is set to [minichat]
qanything-container-local | tensor_parallel is set to [1]
qanything-container-local | gpu_memory_utilization is set to [0.81]
qanything-container-local | checksum 3299f3701e32a952d9d5769897bbb4b5
qanything-container-local | default_checksum 3299f3701e32a952d9d5769897bbb4b5
qanything-container-local |
qanything-container-local | [notice] A new release of pip is available: 23.3.2 -> 24.0
qanything-container-local | [notice] To update, run: python3 -m pip install --upgrade pip
qanything-container-local | GPU ID: 0, 0
qanything-container-local | GPU1 Model: NVIDIA GeForce RTX 4080 SUPER
qanything-container-local | Compute Capability: 8.9
qanything-container-local | OCR_USE_GPU=True because 8.9 >= 7.5
qanything-container-local | ====================================================
qanything-container-local | ******************** Important Notice ********************
qanything-container-local | ====================================================
qanything-container-local |
qanything-container-local | Your current GPU memory is 16376 MiB; deploying an LLM of 7B or smaller is recommended
qanything-container-local | The token limit is set to 4096 by default
qanything-container-local | The triton server for embedding and reranker will start on 0 GPUs
qanything-container-local | Executing hf runtime_backend
qanything-container-local | The rerank service is ready! (2/8)
qanything-container-local | The ocr service is ready! (3/8)
qanything-container-local | Waiting for the backend service to start...
qanything-container-local | Waiting for the backend service to start...
qanything-container-local | The qanything backend service is ready! (4/8)
qanything-container-local | Dependencies related to npm are obtained. (5/8)
qanything-container-local | The front_end/dist folder already exists, no need to build the front end again.(6/8)
qanything-container-local | Waiting for the front-end service to start...
qanything-container-local |
qanything-container-local | > ai-demo@1.0.1 serve
qanything-container-local | > vite preview --port 5052
qanything-container-local |
qanything-container-local | The CJS build of Vite's Node API is deprecated. See https://vitejs.dev/guide/troubleshooting.html#vite-cjs-node-api-deprecated for more details.
qanything-container-local | ➜ Local: http://localhost:5052/qanything
qanything-container-local | ➜ Network: http://172.20.0.6:5052/qanything
qanything-container-local | The front-end service is ready!...(7/8)
qanything-container-local | I0604 01:29:55.535924 129 grpc_server.cc:377] Thread started for CommonHandler
qanything-container-local | I0604 01:29:55.535953 129 infer_handler.cc:629] New request handler for ModelInferHandler, 0
qanything-container-local | I0604 01:29:55.535959 129 infer_handler.h:1025] Thread started for ModelInferHandler
qanything-container-local | I0604 01:29:55.535986 129 infer_handler.cc:629] New request handler for ModelInferHandler, 0
qanything-container-local | I0604 01:29:55.535991 129 infer_handler.h:1025] Thread started for ModelInferHandler
qanything-container-local | I0604 01:29:55.536022 129 stream_infer_handler.cc:122] New request handler for ModelStreamInferHandler, 0
qanything-container-local | I0604 01:29:55.536026 129 infer_handler.h:1025] Thread started for ModelStreamInferHandler
qanything-container-local | I0604 01:29:55.536028 129 grpc_server.cc:2450] Started GRPCInferenceService at 0.0.0.0:9001
qanything-container-local | I0604 01:29:55.536133 129 http_server.cc:3555] Started HTTPService at 0.0.0.0:9000
qanything-container-local | I0604 01:29:55.576918 129 http_server.cc:185] Started Metrics Service at 0.0.0.0:9002
qanything-container-local | I0604 01:30:10.435466 129 http_server.cc:3449] HTTP request: 0 /v2/health/ready
qanything-container-local | The embedding and rerank service is ready!. (7.5/8)
qanything-container-local | You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
qanything-container-local | 2024-06-04 09:29:55 | ERROR | stderr | 0%| | 0/1 [00:00<?, ?it/s]
qanything-container-local | 2024-06-04 09:30:10 | ERROR | stderr | 100%|██████████| 1/1 [00:14<00:00, 14.91s/it]
qanything-container-local | 2024-06-04 09:30:10 | ERROR | stderr |
qanything-container-local | 2024-06-04 09:30:10 | INFO | model_worker | Register to controller
qanything-container-local | 2024-06-04 09:30:10 | ERROR | stderr | INFO: Started server process [144]
qanything-container-local | 2024-06-04 09:30:10 | ERROR | stderr | INFO: Waiting for application startup.
qanything-container-local | 2024-06-04 09:30:10 | ERROR | stderr | INFO: Application startup complete.
qanything-container-local | 2024-06-04 09:30:10 | ERROR | stderr | INFO: Uvicorn running on http://0.0.0.0:7801 (Press CTRL+C to quit)
qanything-container-local | % Total % Received % Xferd Average Speed Time Time Time Current
qanything-container-local | Dload Upload Total Spent Left Speed
100 28 100 28 0 0 49557 0 --:--:-- --:--:-- --:--:-- 28000
qanything-container-local | The llm service is ready!, now you can use the qanything service. (8/8)
qanything-container-local | Checking the log files for error messages...
qanything-container-local | No explicit error message was detected in /workspace/qanything_local/logs/debug_logs/rerank_server.log. Please inspect /workspace/qanything_local/logs/debug_logs/rerank_server.log manually for more information.
qanything-container-local | No explicit error message was detected in /workspace/qanything_local/logs/debug_logs/ocr_server.log. Please inspect /workspace/qanything_local/logs/debug_logs/ocr_server.log manually for more information.
qanything-container-local | No explicit error message was detected in /workspace/qanything_local/logs/debug_logs/sanic_api.log. Please inspect /workspace/qanything_local/logs/debug_logs/sanic_api.log manually for more information.
qanything-container-local | Time elapsed: 17 seconds.
qanything-container-local | Please visit the front-end service at [http://localhost:5052/qanything/] to conduct Q&A.
qanything-container-local | If the front end reports an error, press F12 in the browser to get more error details.
Is there an existing issue / discussion for this?
Is there an existing answer for this in FAQ?
Current Behavior
As shown in the screenshot, the chat request stays pending indefinitely during a conversation.
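To narrow down which service is stalling, a quick reachability check against the ports printed in the startup log may help. The sketch below is only a diagnostic aid and not part of QAnything: the names `port_is_open` and `probe` and the service labels are made up for illustration, and only the port numbers (5052, 7801, 9000, 9001, 9002) come from the log. Ports other than 5052 may not be published to the host, in which case the check would need to run inside the container.

```python
import socket

# Port numbers are taken from the startup log; the labels are informal
# descriptions chosen for this sketch, not official service names.
SERVICES = {
    "front end (vite preview)": 5052,
    "LLM model worker (uvicorn)": 7801,
    "Triton HTTP": 9000,
    "Triton gRPC": 9001,
    "Triton metrics": 9002,
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe(host: str = "localhost", services: dict = SERVICES) -> dict:
    """Map each service label to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, ok in probe().items():
        print(f"{'OK  ' if ok else 'DOWN'} {name}")
```

An open port only shows that something is listening; a service can accept connections and still hang on a request, so a `DOWN` result is more informative than an `OK` one.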
Expected Behavior
Fix the bug.
Environment
QAnything logs
Startup log
sanic_api.log
debug.log
model-worker.log
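When reading these logs, note that the startup log tags ordinary uvicorn `INFO:` output as `ERROR | stderr`, so searching for `ERROR` alone gives false positives. Below is a minimal sketch of a filter that skips those wrapped lines, assuming the `| ERROR | stderr |` layout seen in the startup log. The keyword list and the function name `real_errors` are illustrative assumptions, not QAnything tooling.

```python
import re

# Matches the "| ERROR | stderr | <payload>" layout seen in the startup log,
# where <payload> is whatever the process wrote to stderr (often plain INFO).
STDERR_WRAPPER = re.compile(r"\|\s*ERROR\s*\|\s*stderr\s*\|\s*(.*)$")

# Keywords that usually indicate a genuine failure. This list is an
# assumption for the sketch and can be extended as needed.
SUSPICIOUS = re.compile(r"Traceback|Exception|\bERROR\b|\bError\b|out of memory")


def real_errors(lines):
    """Return lines that look like genuine errors, ignoring stderr-wrapped noise."""
    hits = []
    for line in lines:
        match = STDERR_WRAPPER.search(line)
        # For wrapped lines, judge only the payload so the wrapper's own
        # "ERROR" tag does not count as a hit.
        payload = match.group(1) if match else line
        if SUSPICIOUS.search(payload):
            hits.append(line.rstrip("\n"))
    return hits
```

For example, `real_errors(open(path, encoding="utf-8", errors="replace"))` could be run with `path` set to `/workspace/qanything_local/logs/debug_logs/sanic_api.log`, the location printed in the startup log.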
Steps To Reproduce
No response
Anything else?
No response