After pulling the latest from main, I'm still getting errors on a Tesla P40 when using load_in_8bit with Hugging Face:
```
=============================================
ERROR: Your GPU does not support Int8 Matmul!
=============================================
python: /home/gene/dockerx/temp/bitsandbytes/csrc/ops.cu:408: int igemmlt(cublasLtHandle_t, int, int, int, const int8_t*, const int8_t*, void*, float*, int, int, int) [with int FORMATB = 3; int DTYPE_OUT = 32; int SCALE_ROWS = 0; cublasLtHandle_t = cublasLtContext*; int8_t = signed char]: Assertion `false' failed.
```
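For reference, here's roughly how I'm loading the model; this is a minimal sketch, and the model name below is just a placeholder, not the exact checkpoint I'm using:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal sketch of the failing path: loading a causal LM with
# load_in_8bit=True routes linear-layer matmuls through
# bitsandbytes' LLM.int8(). Requires accelerate to be installed.
# "bigscience/bloom-560m" is a placeholder model name.
model_name = "bigscience/bloom-560m"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # place weights on the available GPU
    load_in_8bit=True,   # triggers the Int8 matmul path that asserts
)
```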
Built from source with `make cuda11x_nomatmul`; confirmed that `libbitsandbytes_cuda118_nocublaslt.so` is being loaded:
```
CUDA SETUP: CUDA runtime path found: /opt/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 6.1
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /home/gene/dockerx/bloom/llmvenv/lib/python3.10/site-packages/bitsandbytes-0.36.0.post2-py3.10.egg/bitsandbytes/libbitsandbytes_cuda118_nocublaslt.so...
```
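For what it's worth, those CUDA SETUP lines are printed when the package is imported, so confirming which binary gets loaded is just:

```python
# bitsandbytes prints its CUDA SETUP diagnostics at import time,
# including the full path of the .so it loads; that's how the
# _nocublaslt binary above was confirmed.
import bitsandbytes  # noqa: F401
```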
`nvcc --version`:

```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
```
The GPU has CUDA compute capability 6.1.
UPDATE: I found your note indicating that LLM.int8() is currently only supported on compute capability 7.5+, and that support for GPUs with lower compute capabilities will be added at a future date.
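For anyone else hitting this, here's a quick way to check whether a card clears that threshold (a minimal sketch, assuming a standard PyTorch CUDA install):

```python
import torch

# LLM.int8() needs compute capability >= 7.5 per the note above,
# so anything below that (e.g. the P40's 6.1) will hit the assertion.
major, minor = torch.cuda.get_device_capability(0)
print(f"Compute capability: {major}.{minor}")
print("Int8 matmul supported:", (major, minor) >= (7, 5))
```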