[BUG] Running a local model with the Python version on Linux fails with “The client socket has failed to connect to [::ffff:0.0.188.125]:59771 (errno: 22 - Invalid argument).” #381

Open
2 tasks done
hingkan opened this issue Jun 4, 2024 · 0 comments


hingkan commented Jun 4, 2024

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • I have searched FAQ

Current Behavior

After creating a clean Python virtual environment and installing the required packages, I ran: bash scripts/run_for_7B_in_Linux_or_WSL.sh
Startup fails with the socket connection error shown in the logs.

Expected Behavior

The service starts successfully, and the front-end page can be reached in a browser or the API can be used.

Environment

- OS: CentOS Linux 8
- NVIDIA Driver: 550.54.14
- CUDA: 12.1
- docker: 25.0.4
- docker-compose: 2.24.7
- NVIDIA GPU: NVIDIA GeForce RTX 3090
- NVIDIA GPU Memory: 24GB

QAnything logs

即将启动后端服务,启动成功后请复制[http://0.0.0.0:8777/qanything/]到浏览器进行测试。
运行qanything-server的命令是:
CUDA_VISIBLE_DEVICES=0 python3 -m qanything_kernel.qanything_server.sanic_api --host 0.0.0.0 --port 8777 --model_size 7B
LOCAL DATA PATH: /home//mnt/workspace/QAnything/QANY_DB/content
LOCAL_RERANK_REPO: netease-youdao/bce-reranker-base_v1
LOCAL_EMBED_REPO: netease-youdao/bce-embedding-base_v1
<Logger debug_logger (INFO)> <Logger qa_logger (INFO)>
2024-06-04 10:13:07,353 - modelscope - INFO - PyTorch version 2.1.2 Found.
2024-06-04 10:13:07,354 - modelscope - INFO - Loading ast index from /home//.cache/modelscope/ast_indexer
2024-06-04 10:13:07,389 - modelscope - INFO - Loading done! Current index file version is 1.13.0, with md5 6bd8910d8bf93d5739004f525912b700 and a total number of 972 components indexed
use_cpu: False
use_openai_api: False
onnxruntime-gpu 1.17.1 已经安装。
vllm 0.2.7 已经安装。
functions:[{'name': 'duckduckgo_search', 'description': 'duckduckgo_search(query: str) - Search infomation on internet. Useful for when the context can not answer the question. Input should be a search query.', 'parameters': {'type': 'object', 'properties': {'query': {'description': 'search query', 'type': 'string'}}, 'required': ['query']}}]
[2024-06-04 10:13:19 +0800] [730396] [WARNING] Sanic is running in PRODUCTION mode. Consider using '--debug' or '--dev' while actively developing your application.
[2024-06-04 10:13:19 +0800] [730396] [INFO] Sanic Extensions:
[2024-06-04 10:13:19 +0800] [730396] [INFO] > injection [0 dependencies; 0 constants]
[2024-06-04 10:13:19 +0800] [730396] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-06-04 10:13:19 +0800] [730396] [INFO] > http
[2024-06-04 10:13:19 +0800] [730396] [INFO] > templating [jinja2==3.1.4]
INFO 06-04 10:13:19 llm_engine.py:70] Initializing an LLM engine with config: model='/home//mnt/workspace/QAnything/assets/custom_models/netease-youdao/Qwen-7B-QAnything', tokenizer='/home//mnt/workspace/QAnything/assets/custom_models/netease-youdao/Qwen-7B-QAnything', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=8192, download_dir=None, load_format=auto, tensor_parallel_size=1, quantization=None, enforce_eager=False, seed=0)
[W socket.cpp:663] [c10d] The client socket has failed to connect to [::ffff:0.0.188.125]:59771 (errno: 22 - Invalid argument).
[W socket.cpp:663] [c10d] The client socket has failed to connect to 0.0.188.125:59771 (errno: 22 - Invalid argument).
[E socket.cpp:719] [c10d] The client socket has failed to connect to any network address of (0.0.188.125, 59771).
[2024-06-04 10:13:20 +0800] [730396] [ERROR] Experienced exception while trying to serve
Traceback (most recent call last):
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/mixins/startup.py", line 958, in serve_single
    worker_serve(monitor_publisher=None, **kwargs)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/worker/serve.py", line 143, in worker_serve
    raise e
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/worker/serve.py", line 117, in worker_serve
    return _serve_http_1(
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/server/runners.py", line 223, in _serve_http_1
    loop.run_until_complete(app._server_event("init", "before"))
  File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/app.py", line 1764, in _server_event
    await self.dispatch(
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 208, in dispatch
    return await dispatch
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 183, in _dispatch
    raise e
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 167, in _dispatch
    retval = await maybe_coroutine
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/app.py", line 1315, in _listener
    await maybe_coro
  File "/home//mnt/workspace/QAnything/qanything_kernel/qanything_server/sanic_api.py", line 199, in init_local_doc_qa
    local_doc_qa.init_cfg(args=args)
  File "/home//mnt/workspace/QAnything/qanything_kernel/core/local_doc_qa.py", line 61, in init_cfg
    self.llm: OpenAICustomLLM = OpenAICustomLLM(args)
  File "/home//mnt/workspace/QAnything/qanything_kernel/connector/llm/llm_for_fastchat.py", line 40, in __init__
    self.engine = AsyncLLMEngine.from_engine_args(engine_args)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 500, in from_engine_args
    engine = cls(parallel_config.worker_use_ray,
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 273, in __init__
    self.engine = self._init_engine(*args, **kwargs)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 318, in _init_engine
    return engine_class(*args, **kwargs)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 111, in __init__
    self._init_workers()
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 145, in _init_workers
    self._run_workers("init_model")
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 795, in _run_workers
    driver_worker_output = getattr(self.driver_worker,
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/worker/worker.py", line 74, in init_model
    _init_distributed_environment(self.parallel_config, self.rank,
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/worker/worker.py", line 212, in _init_distributed_environment
    torch.distributed.init_process_group(
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/c10d_logger.py", line 74, in wrapper
    func_return = func(*args, **kwargs)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1141, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/rendezvous.py", line 196, in _tcp_rendezvous_handler
    store = _create_c10d_store(result.hostname, result.port, rank, world_size, timeout)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/rendezvous.py", line 172, in _create_c10d_store
    return TCPStore(
RuntimeError: The client socket has failed to connect to any network address of (0.0.188.125, 59771). The client socket has failed to connect to 0.0.188.125:59771 (errno: 22 - Invalid argument).
[2024-06-04 10:13:20 +0800] [730396] [INFO] Server Stopped
Traceback (most recent call last):
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home//mnt/workspace/QAnything/qanything_kernel/qanything_server/sanic_api.py", line 240, in <module>
    app.run(host=args.host, port=args.port, single_process=True, access_log=False)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/mixins/startup.py", line 215, in run
    serve(primary=self) # type: ignore
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/mixins/startup.py", line 958, in serve_single
    worker_serve(monitor_publisher=None, **kwargs)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/worker/serve.py", line 143, in worker_serve
    raise e
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/worker/serve.py", line 117, in worker_serve
    return _serve_http_1(
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/server/runners.py", line 223, in _serve_http_1
    loop.run_until_complete(app._server_event("init", "before"))
  File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/app.py", line 1764, in _server_event
    await self.dispatch(
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 208, in dispatch
    return await dispatch
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 183, in _dispatch
    raise e
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/signals.py", line 167, in _dispatch
    retval = await maybe_coroutine
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/sanic/app.py", line 1315, in _listener
    await maybe_coro
  File "/home//mnt/workspace/QAnything/qanything_kernel/qanything_server/sanic_api.py", line 199, in init_local_doc_qa
    local_doc_qa.init_cfg(args=args)
  File "/home//mnt/workspace/QAnything/qanything_kernel/core/local_doc_qa.py", line 61, in init_cfg
    self.llm: OpenAICustomLLM = OpenAICustomLLM(args)
  File "/home//mnt/workspace/QAnything/qanything_kernel/connector/llm/llm_for_fastchat.py", line 40, in __init__
    self.engine = AsyncLLMEngine.from_engine_args(engine_args)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 500, in from_engine_args
    engine = cls(parallel_config.worker_use_ray,
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 273, in __init__
    self.engine = self._init_engine(*args, **kwargs)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 318, in _init_engine
    return engine_class(*args, **kwargs)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 111, in __init__
    self._init_workers()
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 145, in _init_workers
    self._run_workers("init_model")
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 795, in _run_workers
    driver_worker_output = getattr(self.driver_worker,
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/worker/worker.py", line 74, in init_model
    _init_distributed_environment(self.parallel_config, self.rank,
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/vllm/worker/worker.py", line 212, in _init_distributed_environment
    torch.distributed.init_process_group(
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/c10d_logger.py", line 74, in wrapper
    func_return = func(*args, **kwargs)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 1141, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/rendezvous.py", line 196, in _tcp_rendezvous_handler
    store = _create_c10d_store(result.hostname, result.port, rank, world_size, timeout)
  File "/home//anaconda3/envs/qanything-python/lib/python3.10/site-packages/torch/distributed/rendezvous.py", line 172, in _create_c10d_store
    return TCPStore(
RuntimeError: The client socket has failed to connect to any network address of (0.0.188.125, 59771). The client socket has failed to connect to 0.0.188.125:59771 (errno: 22 - Invalid argument).
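
A reading aid for the address in the error above: 0.0.188.125 is what the integer 48253 looks like when it is interpreted as an IPv4 address, and it lies in 0.0.0.0/8, the block reserved for "this network", which is not a normal connection target. The sketch below uses only the Python standard library to show both facts. It does not establish where the value comes from; that a bare number ended up where a hostname was expected is only a guess. The helper name `describe_ipv4` is made up for this illustration.

```python
import ipaddress

# 0.0.0.0/8 is reserved for "this network" (RFC 1122); it is not a normal destination.
THIS_NETWORK = ipaddress.ip_network("0.0.0.0/8")


def describe_ipv4(text):
    """Hypothetical helper: return (integer value, True if the address is in 0.0.0.0/8)."""
    addr = ipaddress.IPv4Address(text)
    return int(addr), addr in THIS_NETWORK


if __name__ == "__main__":
    # The address from the c10d error message.
    value, in_this_network = describe_ipv4("0.0.188.125")
    print(value, in_this_network)
    # Going the other way: an integer read as an IPv4 address.
    print(ipaddress.IPv4Address(48253))
```

If the rendezvous host is indeed being filled with a number rather than a name or address, that would point at how the init address is built or resolved on this machine, which may be worth checking before anything GPU-related.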

Steps To Reproduce

  1. Set up the environment:
    conda create -n qanything-python python=3.10
    conda activate qanything-python
    git clone -b qanything-python https://github.com/netease-youdao/QAnything.git
    cd QAnything
    pip install -r requirements.txt

  2. Download the QAnything PDF parsing models from “ https://www.modelscope.cn/models/netease-youdao/QAnything-pdf-parser/summary ” and place them in the required location;

  3. Run either of the following commands (both produce the same error):
    bash scripts/run_for_3B_in_Linux_or_WSL.sh
    bash scripts/run_for_7B_in_Linux_or_WSL.sh

Anything else?

No response
