[Bug]: RuntimeError: CUDA error: no kernel image is available for execution on the device #5547
Comments
You have mixed CUDA versions. Please follow https://docs.vllm.ai/en/latest/getting_started/installation.html for a clean installation.
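The mixed-version diagnosis can be seen directly in the environment report below: PyTorch was built against CUDA 12.1 (`CUDA used to build PyTorch`), while the system CUDA runtime is 11.7.99. A minimal sketch of that check, with hypothetical helper names (this is not vLLM or PyTorch code, just an illustration of the comparison):

```python
# Hypothetical helpers illustrating the mixed-CUDA diagnosis: the PyTorch
# build CUDA version and the system CUDA runtime version disagree.
def cuda_major(version: str) -> int:
    """Return the major component of a CUDA version string like '12.1'."""
    return int(version.split(".")[0])

def versions_mixed(build_cuda: str, runtime_cuda: str) -> bool:
    """True when the build CUDA and runtime CUDA differ in major version."""
    return cuda_major(build_cuda) != cuda_major(runtime_cuda)

# Values taken from the environment report in this issue:
print(versions_mixed("12.1", "11.7.99"))  # → True: the versions are mixed
```

With PyTorch installed, the build-side value comes from `torch.version.cuda`; the runtime side is what `nvcc --version` or the collected environment report shows.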
I followed the installation process but still get the same issue. This is the new conda environment I have:
And this is the error I have:
I tried to compile with
Looks like the error comes from PyTorch. Can you try:

```python
import torch
a = torch.ones((5,)).cuda()
print(a.sum().item())
```
Your current environment
PyTorch version: 2.3.0
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.29.5
Libc version: glibc-2.35
Python version: 3.11.9 (main, Apr 19 2024, 16:48:06) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-107-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.7.99
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration:
GPU 0: Tesla V100-PCIE-32GB
Nvidia driver version: 535.171.04
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.5.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.5.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] ctransformers==0.2.27
[pip3] numpy==1.26.4
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] torch==2.3.0
[pip3] torchaudio==2.3.1+cu121
[pip3] torchmetrics==1.1.2
[pip3] torchvision==0.18.0
[pip3] transformers==4.41.2
[pip3] triton==2.3.0
[conda] blas 1.0 mkl
[conda] ctransformers 0.2.27 pypi_0 pypi
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] libjpeg-turbo 2.0.0 h9bf148f_0 pytorch
[conda] mkl 2023.1.0 h213fc3f_46344
[conda] mkl-fft 1.3.8 pypi_0 pypi
[conda] mkl-random 1.2.4 pypi_0 pypi
[conda] mkl-service 2.4.0 pypi_0 pypi
[conda] mkl_fft 1.3.8 py311h5eee18b_0
[conda] mkl_random 1.2.4 py311hdb19cb5_0
[conda] numpy 1.26.4 pypi_0 pypi
[conda] numpy-base 1.26.4 py311hf175353_0
[conda] nvidia-nccl-cu12 2.20.5 pypi_0 pypi
[conda] pytorch 2.3.0 py3.11_cuda12.1_cudnn8.9.2_0 pytorch
[conda] pytorch-cuda 12.1 ha16c6d3_5 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torch 2.3.0 pypi_0 pypi
[conda] torchaudio 2.3.0 pypi_0 pypi
[conda] torchmetrics 1.1.2 pypi_0 pypi
[conda] torchtriton 2.3.0 py311 pytorch
[conda] torchvision 0.18.0 pypi_0 pypi
[conda] transformers 4.41.2 pypi_0 pypi
[conda] triton 2.3.0 pypi_0 pypi
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.5.0
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
```
        GPU0  GPU1  GPU2  GPU3  CPU Affinity  NUMA Affinity  GPU NUMA ID
GPU0    X     SYS   SYS   SYS   0-9,20-29     0              N/A
GPU1    SYS   X     NODE  NODE  10-19,30-39   1              N/A
GPU2    SYS   NODE  X     NODE  10-19,30-39   1              N/A
GPU3    SYS   NODE  NODE  X     10-19,30-39   1              N/A
```
🐛 Describe the bug
This is the message I get:
And this is my code for vLLM:
I tried reinstalling vLLM and other libraries, but I still get this issue. Does anyone know what's wrong? Please ask for further info if needed. Thanks a lot in advance.
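For context on the error itself: "no kernel image is available for execution on the device" typically means the installed binary was not compiled for the GPU's compute capability (a Tesla V100, as in the report above, is sm_70). A minimal, self-contained sketch of that check follows; the helper name and the arch list are illustrative assumptions, not vLLM's actual build configuration:

```python
# Hypothetical sketch: "no kernel image is available" fires when the GPU's
# compute capability is absent from the architectures a binary was built for.
def arch_supported(capability: tuple, compiled_archs: set) -> bool:
    """Check whether a capability tuple like (7, 0) matches a compiled sm_XY arch."""
    major, minor = capability
    return f"sm_{major}{minor}" in compiled_archs

# A Tesla V100 reports capability (7, 0), i.e. sm_70.  If a wheel was built
# only for newer architectures, the kernel image for sm_70 is simply missing:
wheel_archs = {"sm_80", "sm_86", "sm_90"}   # illustrative arch list, not vLLM's
print(arch_supported((7, 0), wheel_archs))  # → False: no kernel image for sm_70
```

On a machine with PyTorch installed, the real capability tuple comes from `torch.cuda.get_device_capability()`.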