
Tesla P40 -> ERROR: Your GPU does not support Int8 Matmul #118

Description

@gururise

After pulling the latest from main, still getting errors on a TESLA P40, when using load_in_8bit with Huggingface.
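For reference, here is a minimal sketch of the loading code that triggers it (the model name is a placeholder, not the exact script I ran):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-7b1"  # placeholder; any causal LM loaded in 8-bit hits the same path

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # let accelerate place the weights on the GPU
    load_in_8bit=True,   # routes linear layers through bitsandbytes LLM.int8
)

inputs = tokenizer("Hello", return_tensors="pt").to(0)
output = model.generate(**inputs, max_new_tokens=20)  # fails in igemmlt on the P40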

=============================================
ERROR: Your GPU does not support Int8 Matmul!
=============================================

python: /home/gene/dockerx/temp/bitsandbytes/csrc/ops.cu:408: int igemmlt(cublasLtHandle_t, int, int, int, const int8_t*, const int8_t*, void*, float*, int, int, int) [with int FORMATB = 3; int DTYPE_OUT = 32; int SCALE_ROWS = 0; cublasLtHandle_t = cublasLtContext*; int8_t = signed char]: Assertion `false' failed.

Built from source: make cuda11x_nomatmul

Confirmed that libbitsandbytes_cuda118_nocublaslt.so is being loaded:

CUDA SETUP: CUDA runtime path found: /opt/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 6.1
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /home/gene/dockerx/bloom/llmvenv/lib/python3.10/site-packages/bitsandbytes-0.36.0.post2-py3.10.egg/bitsandbytes/libbitsandbytes_cuda118_nocublaslt.so...

nvcc --version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

The GPU has CUDA compute capability 6.1.
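For anyone checking their own card, the compute capability can be read straight from PyTorch:

import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"Compute capability: {major}.{minor}")  # prints "Compute capability: 6.1" on a Tesla P40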

UPDATE: I found your note indicating that LLM.int8 is currently only supported on compute capability 7.5+, and that support for GPUs with lower compute capabilities will be added at a future date.
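Until that lands, a workaround sketch is to gate 8-bit loading on the detected capability and fall back to fp16 (same placeholder model name as above):

import torch
from transformers import AutoModelForCausalLM

major, minor = torch.cuda.get_device_capability(0)
int8_ok = (major, minor) >= (7, 5)  # Int8 matmul needs compute capability 7.5+

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",            # placeholder model name
    device_map="auto",
    load_in_8bit=int8_ok,              # 8-bit only on supported GPUs
    torch_dtype=None if int8_ok else torch.float16,  # fp16 fallback on the P40
)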
