Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unable to run vllm model deployment #6464

Open
riyajatar37003 opened this issue Jul 16, 2024 · 15 comments
Open

unable to run vllm model deployment #6464

riyajatar37003 opened this issue Jul 16, 2024 · 15 comments
Labels
bug Something isn't working

Comments

@riyajatar37003
Copy link

Your current environment

Failed to import from vllm._C with ImportError("/usr/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_C.abi3.so)")

INFO 07-16 09:29:50 custom_cache_manager.py:17] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
(VllmWorkerProcess pid=658) INFO 07-16 09:29:52 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
(VllmWorkerProcess pid=656) INFO 07-16 09:29:52 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
(VllmWorkerProcess pid=657) INFO 07-16 09:29:53 multiproc_worker_utils.py:215] Worker ready; awaiting tasks
INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=656) INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=658) INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2
INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5
(VllmWorkerProcess pid=657) INFO 07-16 09:29:53 utils.py:737] Found nccl from library libnccl.so.2
(VllmWorkerProcess pid=656) INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5
(VllmWorkerProcess pid=658) INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5
(VllmWorkerProcess pid=657) INFO 07-16 09:29:53 pynccl.py:63] vLLM is using nccl==2.20.5
(VllmWorkerProcess pid=658) INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB
(VllmWorkerProcess pid=656) INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB
INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB
(VllmWorkerProcess pid=657) INFO 07-16 09:31:44 model_runner.py:266] Loading model weights took 21.7573 GB
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm'
ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm'
ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm.
ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ .
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm.
ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm'
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 _custom_ops.py:42] Error in calling custom op rms_norm: '_OpNamespace' '_C' object has no attribute 'rms_norm'
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ .
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm.
ERROR 07-16 09:31:45 _custom_ops.py:42] Possibly you have built or installed an obsolete version of vllm.
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ .
ERROR 07-16 09:31:45 _custom_ops.py:42] Please try a clean build and install of vllm,or remove old built files such as vllm/cpython.so and build/ .
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks: '_OpNamespace' '_C' object has no attribute 'rms_norm', Traceback (most recent call last):
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks: '_OpNamespace' '_C' object has no attribute 'rms_norm', Traceback (most recent call last):
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] output = executor(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] Exception in worker VllmWorkerProcess while processing method determine_num_available_blocks: '_OpNamespace' '_C' object has no attribute 'rms_norm', Traceback (most recent call last):
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] output = executor(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_worker_utils.py", line 223, in _run_worker_process
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] output = executor(*args, **kwargs)
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.model_runner.profile_run()
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs)
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.model_runner.profile_run()
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs)
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.model_runner.profile_run()
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs)
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.execute_model(model_input, kv_caches, intermediate_tensors)
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs)
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.execute_model(model_input, kv_caches, intermediate_tensors)
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs)
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] self.execute_model(model_input, kv_caches, intermediate_tensors)
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs)
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_or_intermediate_states = model_executable(
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return func(*args, **kwargs)
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_or_intermediate_states = model_executable(
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_or_intermediate_states = model_executable(
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.model(input_ids, positions, kv_caches,
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.model(input_ids, positions, kv_caches,
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.model(input_ids, positions, kv_caches,
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states, residual = layer(positions, hidden_states,
(VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states, residual = layer(positions, hidden_states,
(VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states, residual = layer(positions, hidden_states,
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.input_layernorm(hidden_states)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.input_layernorm(hidden_states)
(VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] hidden_states = self.input_layernorm(hidden_states)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
(VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
(VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._call_impl(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
(VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return forward_call(*args, **kwargs)
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward
(VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._forward_method(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._forward_method(*args, **kwargs)
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return self._forward_method(*args, **kwargs)
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ops.rms_norm(
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ops.rms_norm(
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] ops.rms_norm(
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise e
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise e
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise e
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return fn(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return fn(*args, **kwargs)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] return fn(*args, **kwargs)
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=657) (VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] torch.ops._C.rms_norm(out, input, weight, epsilon)
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] torch.ops._C.rms_norm(out, input, weight, epsilon)
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] torch.ops._C.rms_norm(out, input, weight, epsilon)
(VllmWorkerProcess pid=658) (VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise AttributeError(
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise AttributeError(
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr
(VllmWorkerProcess pid=656) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm'
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm'
(VllmWorkerProcess pid=658) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] raise AttributeError(
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226]
(VllmWorkerProcess pid=656) (VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226]
ERROR 07-16 09:31:45 multiproc_worker_utils.py:226] AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm'
(VllmWorkerProcess pid=657) ERROR 07-16 09:31:45 multiproc_worker_utils.py:226]
[rank0]: Traceback (most recent call last):
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/runpy.py", line 196, in _run_module_as_main
[rank0]: return _run_code(code, main_globals, None,
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/runpy.py", line 86, in _run_code
[rank0]: exec(code, run_globals)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 282, in
[rank0]: run_server(args)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 224, in run_server
[rank0]: if llm_engine is not None else AsyncLLMEngine.from_engine_args(
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 444, in from_engine_args
[rank0]: engine = cls(
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 373, in init
[rank0]: self.engine = self._init_engine(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 520, in _init_engine
[rank0]: return engine_class(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 263, in init
[rank0]: self._initialize_kv_caches()
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 362, in _initialize_kv_caches
[rank0]: self.model_executor.determine_num_available_blocks())
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/distributed_gpu_executor.py", line 38, in determine_num_available_blocks
[rank0]: num_blocks = self._run_workers("determine_num_available_blocks", )
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/executor/multiproc_gpu_executor.py", line 135, in _run_workers
[rank0]: driver_worker_output = driver_worker_method(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/worker.py", line 179, in determine_num_available_blocks
[rank0]: self.model_runner.profile_run()
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 923, in profile_run
[rank0]: self.execute_model(model_input, kv_caches, intermediate_tensors)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 1341, in execute_model
[rank0]: hidden_or_intermediate_states = model_executable(
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 349, in forward
[rank0]: hidden_states = self.model(input_ids, positions, kv_caches,
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 277, in forward
[rank0]: hidden_states, residual = layer(positions, hidden_states,
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/models/mixtral.py", line 219, in forward
[rank0]: hidden_states = self.input_layernorm(hidden_states)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/custom_op.py", line 13, in forward
[rank0]: return self._forward_method(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/model_executor/layers/layernorm.py", line 62, in forward_cuda
[rank0]: ops.rms_norm(
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 43, in wrapper
[rank0]: raise e
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 34, in wrapper
[rank0]: return fn(*args, **kwargs)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/vllm/_custom_ops.py", line 158, in rms_norm
[rank0]: torch.ops._C.rms_norm(out, input, weight, epsilon)
[rank0]: File "/tmp/.conda/envs/vllm_env/lib/python3.10/site-packages/torch/_ops.py", line 921, in getattr
[rank0]: raise AttributeError(
[rank0]: AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm'
ERROR 07-16 09:31:46 multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 658 died, exit code: -15
INFO 07-16 09:31:46 multiproc_worker_utils.py:123] Killing local vLLM worker processes
/tmp/.conda/envs/vllm_env/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

🐛 Describe the bug

tried to install using
pip install vllm

@riyajatar37003 riyajatar37003 added the bug Something isn't working label Jul 16, 2024
@yumaofan
Copy link

The same error when running any model. Install VLLM via pip directly.

@riyajatar37003
Copy link
Author

did the same only

@wheresmyhair
Copy link

+1, pip install vllm==0.5.0 solves the issue, not sure about other versions.

@riyajatar37003
Copy link
Author

i am trying graphrag with vllm deployed model
but i am getting this error

ERROR 07-16 12:08:18 api_server.py:247] Error in applying chat template from request: Conversation roles must alternate user/assistant/user/assistant/...

@JaheimLee
Copy link

JaheimLee commented Jul 16, 2024

Same issue. And vllm 0.5.1 works well.

@Rogersiy
Copy link

Same issue. And vllm 0.5.1 works well.

Thannnnnnk u

@WMeng1
Copy link

WMeng1 commented Jul 18, 2024

Same issue.

@vlsav
Copy link

vlsav commented Jul 18, 2024

@rzes
Copy link

rzes commented Jul 18, 2024

+1, pip install vllm==0.5.0 solves the issue, not sure about other versions.

works well!!

@zichaow
Copy link

zichaow commented Jul 19, 2024

+1, pip install vllm==0.5.0 solves the issue, not sure about other versions.

encounter the same issue and can confirm this works for me, too

@AlexBlack2202
Copy link

hello , any one find any solution about this problem?

@vlsav
Copy link

vlsav commented Jul 22, 2024

hello , any one find any solution about this problem?

#6464 (comment)

@heya5
Copy link

heya5 commented Jul 22, 2024

Delete the directory named "vllm" resolves my issue. I find the method from this comment #1814 (comment)

@lonngxiang
Copy link

0.5.4 same error

@DreamerZhang11
Copy link

why the source build have so many problem, i meet the same error.. Have it fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests