
[Bug]: Starting a model with vllm 0.8.2 / torch 2.6.0 fails: CRITICAL 04-02 10:00:15 [core_client.py:269] Got fatal signal from worker processes, shutting down. See stack trace above for root cause issue. (Killed) #15918

Closed as not planned
@TZJ12

Description


Your current environment

PyTorch version: 2.6.0+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A

OS: CentOS Linux 7 (Core) (x86_64)
GCC version: (GCC) 12.2.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.17

Python version: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0] (64-bit runtime)
Python platform: Linux-3.10.0-1160.el7.x86_64-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 12.2.140

🐛 Describe the bug

I launched the qwen2.5-32n-int4 model with vLLM 0.8.2 and torch 2.6.0. The launch command is:
python -m vllm.entrypoints.openai.api_server \
    --served-model-name qwen2.5-32n-int4 \
    --model qwen2.5-32n-int4 \
    --tensor-parallel-size 2 \
    --port 8019 \
    --dtype float16 \
    --enforce-eager \
    --trust-remote-code \
    --gpu-memory-utilization 0.7 \
    --max-model-len 3200
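
For reference, the same configuration can be exercised outside the API server with vLLM's offline LLM class (a minimal sketch; the model path is the one from the command above and the remaining arguments mirror the CLI flags one-to-one):

```python
# Offline reproduction of the same configuration via vLLM's LLM class.
# A sketch: the model path is taken from the command above; all other
# arguments mirror the CLI flags.
from vllm import LLM, SamplingParams

llm = LLM(
    model="qwen2.5-32n-int4",
    tensor_parallel_size=2,
    dtype="float16",
    enforce_eager=True,
    trust_remote_code=True,
    gpu_memory_utilization=0.7,
    max_model_len=3200,
)
print(llm.generate(["ping"], SamplingParams(max_tokens=8)))
```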

The error is:
(VllmWorker rank=0 pid=10003) INFO 04-02 10:00:12 [backends.py:415] Using cache directory: /root/.cache/vllm/torch_compile_cache/68e5addbf5/rank_0_0 for vLLM's torch.compile
(VllmWorker rank=0 pid=10003) INFO 04-02 10:00:13 [backends.py:425] Dynamo bytecode transform time: 18.08 s
(VllmWorker rank=1 pid=10014) INFO 04-02 10:00:13 [backends.py:415] Using cache directory: /root/.cache/vllm/torch_compile_cache/68e5addbf5/rank_1_0 for vLLM's torch.compile
(VllmWorker rank=1 pid=10014) INFO 04-02 10:00:13 [backends.py:425] Dynamo bytecode transform time: 18.53 s
gcc: fatal error: cannot execute ‘cc1’: execvp: No such file or directory
compilation terminated.
ERROR 04-02 10:00:15 [core.py:343] EngineCore hit an exception: Traceback (most recent call last):
ERROR 04-02 10:00:15 [core.py:343] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 335, in run_engine_core
ERROR 04-02 10:00:15 [core.py:343] engine_core = EngineCoreProc(*args, **kwargs)
ERROR 04-02 10:00:15 [core.py:343] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 290, in __init__
ERROR 04-02 10:00:15 [core.py:343] super().__init__(vllm_config, executor_class, log_stats)
ERROR 04-02 10:00:15 [core.py:343] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 63, in __init__
ERROR 04-02 10:00:15 [core.py:343] num_gpu_blocks, num_cpu_blocks = self._initialize_kv_caches(
ERROR 04-02 10:00:15 [core.py:343] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 122, in _initialize_kv_caches
ERROR 04-02 10:00:15 [core.py:343] available_gpu_memory = self.model_executor.determine_available_memory()
ERROR 04-02 10:00:15 [core.py:343] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/executor/abstract.py", line 66, in determine_available_memory
ERROR 04-02 10:00:15 [core.py:343] output = self.collective_rpc("determine_available_memory")
ERROR 04-02 10:00:15 [core.py:343] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/executor/multiproc_executor.py", line 134, in collective_rpc
ERROR 04-02 10:00:15 [core.py:343] raise e
ERROR 04-02 10:00:15 [core.py:343] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/executor/multiproc_executor.py", line 118, in collective_rpc
ERROR 04-02 10:00:15 [core.py:343] status, result = w.worker_response_mq.dequeue(
ERROR 04-02 10:00:15 [core.py:343] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/distributed/device_communicators/shm_broadcast.py", line 471, in dequeue
ERROR 04-02 10:00:15 [core.py:343] obj = pickle.loads(buf[1:])
ERROR 04-02 10:00:15 [core.py:343] TypeError: BackendCompilerFailed.__init__() missing 1 required positional argument: 'inner_exception'
ERROR 04-02 10:00:15 [core.py:343]
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] WorkerProc hit an exception: %s
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] Traceback (most recent call last):
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/executor/multiproc_executor.py", line 372, in worker_busy_loop
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] output = func(*args, **kwargs)
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] return func(*args, **kwargs)
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py", line 157, in determine_available_memory
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] self.model_runner.profile_run()
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1499, in profile_run
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] hidden_states = self._dummy_run(self.max_num_tokens)
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] return func(*args, **kwargs)
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1336, in _dummy_run
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] hidden_states = model(
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] return self._call_impl(*args, **kwargs)
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] return forward_call(*args, **kwargs)
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/model_executor/models/qwen2.py", line 462, in forward
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] hidden_states = self.model(input_ids, positions, intermediate_tensors,
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/vllm/compilation/decorators.py", line 238, in __call__
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] output = self.compiled_callable(*args, **kwargs)
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 574, in _fn
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] return fn(*args, **kwargs)
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1380, in __call__
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] return self._torchdynamo_orig_callable(
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 547, in __call__
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] return _compile(
(VllmWorker rank=0 pid=10003) ERROR 04-02 10:00:15 [multiproc_executor.py:379] File "/root/anaconda3/envs/vllm/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 986, in _compile
CRITICAL 04-02 10:00:15 [core_client.py:269] Got fatal signal from worker processes, shutting down. See stack trace above for root cause issue.
Killed
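
The root cause appears to be the `gcc: fatal error: cannot execute 'cc1'` line above: the C toolchain that torch.compile/Inductor shells out to is broken on this host, and the resulting `BackendCompilerFailed` raised in the worker then fails to unpickle on the client side of vLLM's worker queue, which is why only the misleading `TypeError` about `inner_exception` surfaces. A minimal sketch, assuming the same conda environment, to confirm the toolchain is at fault independently of vLLM:

```python
# Sanity check outside vLLM: if gcc/cc1 is broken, this tiny compiled
# function should fail with the same BackendCompilerFailed from Inductor.
import torch

@torch.compile(backend="inductor")
def f(x):
    return torch.relu(x) + 1

try:
    device = "cuda" if torch.cuda.is_available() else "cpu"
    f(torch.randn(8, device=device))
    print("inductor compile OK")
except Exception as e:
    # Expect torch._dynamo.exc.BackendCompilerFailed wrapping the cc1 error.
    print(type(e).__name__, e)
```

If this reproduces the cc1 error, installing a complete gcc toolchain (for example `yum install gcc` on CentOS 7) or otherwise making sure the gcc on PATH can locate its cc1 backend typically resolves the crash.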

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
