not working since commit 31f04dc bitsandbytes problem #614

Closed
Marc899 opened this issue Mar 28, 2023 · 16 comments
Labels
bug (Something isn't working), stale

Comments

@Marc899

Marc899 commented Mar 28, 2023

Describe the bug

Starting with commit 31f04dc, I am getting a lot of CUDA errors related to bitsandbytes when running start-webui.bat

RuntimeError:
CUDA Setup failed despite GPU being available. Inspect the CUDA SETUP outputs above to fix your environment!
If you cannot find any issues and suspect a bug, please open an issue with detals about your environment:
https://github.com/TimDettmers/bitsandbytes/issues

Reverting to 966168b makes it run again.

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

Update to the latest version and run start-webui.bat

Screenshot

No response

Logs

RuntimeError:
        CUDA Setup failed despite GPU being available. Inspect the CUDA SETUP outputs above to fix your environment!
        If you cannot find any issues and suspect a bug, please open an issue with detals about your environment:
        https://github.com/TimDettmers/bitsandbytes/issues

System Info

Windows 11 64bit
RTX 3090
Marc899 added the bug label Mar 28, 2023
@oivio

oivio commented Mar 28, 2023

I can confirm the same issue on my side
Windows 10
RTX4080
CUDA 11.7.0_516.01

@ARandomUserFromGithub

Same. I get:

Starting the web UI...

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues

A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('A')}
warn(msg)
A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning: A:\TGWU---\installer_files\env did not contain libcudart.so as expected! Searching further paths...
warn(msg)
A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('/A'), WindowsPath('file'), WindowsPath('/TGWU---/installer_files/env/etc/xml/catalog')}
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('/usr/local/cuda/lib64')}
warn(msg)
CUDA SETUP: WARNING! libcuda.so not found! Do you have a CUDA driver installed? If you are on a cluster, make sure you are on a CUDA machine!
A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning: WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
warn(msg)
A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning: WARNING: No GPU detected! Check your CUDA paths. Proceeding to load CPU-only library...
warn(msg)
CUDA SETUP: Loading binary A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
argument of type 'WindowsPath' is not iterable
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: WARNING! libcuda.so not found! Do you have a CUDA driver installed? If you are on a cluster, make sure you are on a CUDA machine!
CUDA SETUP: Loading binary A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
argument of type 'WindowsPath' is not iterable
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
CUDA SETUP: WARNING! libcuda.so not found! Do you have a CUDA driver installed? If you are on a cluster, make sure you are on a CUDA machine!
CUDA SETUP: Loading binary A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
argument of type 'WindowsPath' is not iterable
CUDA SETUP: Problem: The main issue seems to be that the main CUDA library was not detected.
CUDA SETUP: Solution 1): Your paths are probably not up-to-date. You can update them via: sudo ldconfig.
CUDA SETUP: Solution 2): If you do not have sudo rights, you can do the following:
CUDA SETUP: Solution 2a): Find the cuda library via: find / -name libcuda.so 2>/dev/null
CUDA SETUP: Solution 2b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_2a
CUDA SETUP: Solution 2c): For a permanent solution add the export from 2b into your .bashrc file, located at ~/.bashrc
Traceback (most recent call last):
File "A:\TGWU---\text-generation-webui\server.py", line 13, in
from modules import chat, shared, training, ui
File "A:\TGWU---\text-generation-webui\modules\training.py", line 11, in
from peft import (LoraConfig, get_peft_model, get_peft_model_state_dict,
File "A:\TGWU---\installer_files\env\lib\site-packages\peft_init_.py", line 22, in
from .mapping import MODEL_TYPE_TO_PEFT_MODEL_MAPPING, PEFT_TYPE_TO_CONFIG_MAPPING, get_peft_config, get_peft_model
File "A:\TGWU---\installer_files\env\lib\site-packages\peft\mapping.py", line 16, in
from .peft_model import (
File "A:\TGWU---\installer_files\env\lib\site-packages\peft\peft_model.py", line 31, in
from .tuners import LoraModel, PrefixEncoder, PromptEmbedding, PromptEncoder
File "A:\TGWU---\installer_files\env\lib\site-packages\peft\tuners_init_.py", line 20, in
from .lora import LoraConfig, LoraModel
File "A:\TGWU---\installer_files\env\lib\site-packages\peft\tuners\lora.py", line 36, in
import bitsandbytes as bnb
File "A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes_init_.py", line 7, in
from .autograd.functions import (
File "A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\autograd_init
.py", line 1, in
from ._functions import undo_layout, get_inverse_transform_indices
File "A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\autograd_functions.py", line 9, in
import bitsandbytes.functional as F
File "A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\functional.py", line 17, in
from .cextension import COMPILED_WITH_CUDA, lib
File "A:\TGWU---\installer_files\env\lib\site-packages\bitsandbytes\cextension.py", line 22, in
raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Inspect the CUDA SETUP outputs above to fix your environment!
If you cannot find any issues and suspect a bug, please open an issue with detals about your environment:
https://github.com/TimDettmers/bitsandbytes/issues
Press any key to continue . . .

@RJSprod

RJSprod commented Mar 28, 2023

I'm facing the exact same issue on Windows 11 with a 3090. The libcuda and libcudart files it's looking for don't seem to exist on my system.

@bmoconno
Contributor

I was having the same issue. Once I found this thread and saw that everyone hitting it seemed to be on Windows, I figured that was probably the culprit. The steps below fixed it for me (a sketch of the resulting edits follows the list):

From how_to_install_llama_8bit_and_4bit:

  1. Download libbitsandbytes_cuda116.dll and put it in C:\Users\xxx\miniconda3\envs\textgen\lib\site-packages\bitsandbytes\
  2. In \bitsandbytes\cuda_setup\main.py search for: if not torch.cuda.is_available(): return 'libsbitsandbytes_cpu.so', None, None, None, None and replace with: if torch.cuda.is_available(): return 'libbitsandbytes_cuda116.dll', None, None, None, None
  3. In \bitsandbytes\cuda_setup\main.py search for this twice: self.lib = ct.cdll.LoadLibrary(binary_path) and replace with: self.lib = ct.cdll.LoadLibrary(str(binary_path))
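
For reference, a rough sketch of what the edited spots in bitsandbytes\cuda_setup\main.py look like after steps 2 and 3. It shows only the changed fragments (surrounding code and exact line numbers vary between bitsandbytes versions), not a drop-in file:

    # Sketch of the patched fragments in bitsandbytes\cuda_setup\main.py

    # Step 2: return the prebuilt Windows CUDA DLL instead of falling back to the CPU library.
    # Original: if not torch.cuda.is_available(): return 'libsbitsandbytes_cpu.so', None, None, None, None
    if torch.cuda.is_available(): return 'libbitsandbytes_cuda116.dll', None, None, None, None

    # Step 3 (the line appears twice in the file; change both occurrences):
    # pass a plain string to ctypes rather than a WindowsPath object.
    # Original: self.lib = ct.cdll.LoadLibrary(binary_path)
    self.lib = ct.cdll.LoadLibrary(str(binary_path))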

After re-doing those steps, I ran into another issue that I believe was caused by #615: it complained that kernel_switch_threshold was not a valid argument while trying to use the llama 30b 128 model. To fix this I modified the modules\GPTQ_loader.py file as follows (a rough sketch of the result is included below):

  • re-add import llama line under sys.path.insert(0, str(Path("repositories/GPTQ-for-LLaMa")))
  • replace load_quant = _load_quant with load_quant = llama.load_quant
  • replace model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold) with model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize)

It's possible you won't need to modify the modules\GPTQ_loader.py, so try to load your model before making those changes.
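
Roughly, those three edits to modules\GPTQ_loader.py amount to the following sketch (exact surrounding code and line positions may differ in your checkout, so treat it as a guide rather than a verbatim patch):

    # Sketch of the edited fragments in modules\GPTQ_loader.py

    sys.path.insert(0, str(Path("repositories/GPTQ-for-LLaMa")))
    import llama  # re-added: call into the old GPTQ-for-LLaMa module directly

    # ... later in the file ...
    load_quant = llama.load_quant  # instead of: load_quant = _load_quant

    # ... and in load_quantized(), drop kernel_switch_threshold, which the older code does not accept:
    model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize)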

oobabooga mentioned this issue Mar 29, 2023
@oobabooga
Owner

8-bit should work more reliably with the new one-click installer

https://github.com/oobabooga/text-generation-webui#one-click-installers

@hdkiller

hdkiller commented Mar 29, 2023

I had a similar issue on Linux, probably caused by #615, since if I revert the changes as @bmoconno mentioned, it loads llama.

/home/hdkiller/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes-0.37.2-py3.10.egg/bitsandbytes/cuda_setup/main.py:136: UserWarning: /home/hdkiller/miniconda3/envs/textgen did not contain libcudart.so as expected! Searching further paths...
  warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 121
CUDA SETUP: Loading binary /home/hdkiller/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes-0.37.2-py3.10.egg/bitsandbytes/libbitsandbytes_cuda121.so...
Loading llama-7b-hf...
Found models/llama-7b-4bit.safetensors
Traceback (most recent call last):
  File "/home/hdkiller/text-generation-webui/server.py", line 273, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "/home/hdkiller/text-generation-webui/modules/models.py", line 101, in load_model
    model = load_quantized(model_name)
  File "/home/hdkiller/text-generation-webui/modules/GPTQ_loader.py", line 113, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
  File "/home/hdkiller/text-generation-webui/modules/GPTQ_loader.py", line 36, in _load_quant
    make_quant(model, layers, wbits, groupsize, faster=faster_kernel, kernel_switch_threshold=kernel_switch_threshold)
TypeError: make_quant() got an unexpected keyword argument 'kernel_switch_threshold'

So I had to re-install GPTQ-for-LLaMa in ./repositories and then it works.

@Azeirah

Azeirah commented Mar 29, 2023

I have a similar error @hdkiller,

Loading llama-7b-hf...
CUDA extension not installed.
Found models/llama-7b-4bit.pt
Traceback (most recent call last):
  File "/home/lb/Downloads/LLaMA/text-generation-webui/server.py", line 273, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "/home/lb/Downloads/LLaMA/text-generation-webui/modules/models.py", line 101, in load_model
    model = load_quantized(model_name)
  File "/home/lb/Downloads/LLaMA/text-generation-webui/modules/GPTQ_loader.py", line 113, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
  File "/home/lb/Downloads/LLaMA/text-generation-webui/modules/GPTQ_loader.py", line 36, in _load_quant
    make_quant(model, layers, wbits, groupsize, faster=faster_kernel, kernel_switch_threshold=kernel_switch_threshold)
TypeError: make_quant() got an unexpected keyword argument 'faster'

How did you "reinstall" gptq-for-llama? I did

cd repositories/GPTQ-for-LLaMa
git pull
pip install -r requirements.txt

Still getting the same error with python server.py --model_type llama --wbits 4 --groupsize 128

Edit:

It did work after removing the GPTQ-for-LLama directory and literally performing a new git clone and pip install. No idea why.

@remghoost

remghoost commented Mar 29, 2023

So I had to re-install GPTQ-for-LLaMa in ./repositories and then it works.

It did work after removing the GPTQ-for-LLama directory and literally performing a new git clone and pip install. No idea why.

This worked for me as well.

Seems like a fairly common occurrence. Happens every few commits.
Nice to know this fix works.
I've had this problem before (a few weeks ago) and literally spent days trying to fix it.

Might make myself a quick script to automate this fix in the future, haha.

edit - hmm. I thought it did, but maybe it didn't....?

edit2 - Okay, so it says that it won't use my GPU, yet my GPU clock speed still spikes when I generate text and nvidia-smi shows that my VRAM is populated with the model. So maybe it's just lying....? I'm using the ozcur/alpaca-native-4bit. It's definitely using my GPU though.
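
(If you want to verify what PyTorch itself sees, independent of the webui's startup messages, a minimal check like the following can be run inside the same environment the webui uses, e.g. the installer's conda env; nvidia-smi remains the better tool for watching actual VRAM use.)

    import torch

    # Quick sanity check for whether torch can see the GPU at all.
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:", torch.cuda.get_device_name(0))
        print("Torch compiled against CUDA:", torch.version.cuda)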

@davidliudev

davidliudev commented Mar 30, 2023

Same issue for me too under Windows 11. Tried removing the GPTQ folder, re-pulling, and reinstalling, but it is not working. Had to temporarily revert to 966168b.

@StefanDanielSchwarz
Contributor

StefanDanielSchwarz commented Mar 30, 2023

Same here, fresh WSL install, got the "TypeError: make_quant() got an unexpected keyword argument 'faster'" message when trying to load ozcur's alpaca-native-4bit.

@oobabooga
Owner

It's now necessary to clone the GPTQ-for-LLaMa repository with

git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git -b cuda

The default branch in that repository has been changed to one that breaks backward compatibility.

This has been updated in the one-click installer, which must be re-downloaded manually (just the install.bat script) oobabooga/one-click-installers@85e4ec6

@StefanDanielSchwarz
Contributor

Excellent, that fixes it! 👍 Glad to be able to use the latest version of your text-generation-webui again (and special thanks for merging my PR ❤).

@hdkiller

hdkiller commented Apr 1, 2023

Seems like something is going on in that cuda branch of GPTQ-for-LLaMa.

I had to revert to this commit.

git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git -b cuda
cd GPTQ-for-LLaMa
git reset --hard b820805
python setup_cuda.py install

The commit which removed a parameter from the function definition of make_quant, causing the error @Azeirah had, is f1af89a (see the reply below).
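
(A simplified, hypothetical illustration of that kind of break; these are stand-in signatures, not the real GPTQ-for-LLaMa code.)

    # Stand-ins for the old and new make_quant signatures (illustration only)
    def make_quant_old(model, layers, wbits, groupsize, faster=False, kernel_switch_threshold=128):
        pass  # the older cuda branch accepted these keywords

    def make_quant_new(model, layers, wbits, groupsize):
        pass  # newer revisions dropped them

    # The webui still passes the removed keywords, so against the new signature the call fails:
    try:
        make_quant_new(None, None, 4, 128, faster=False, kernel_switch_threshold=128)
    except TypeError as e:
        print(e)  # e.g. make_quant_new() got an unexpected keyword argument 'faster'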

This way I am now able to CPU-offload llama:

python3 server.py --listen --wbits 4 --groupsize 128 --pre_layer 30 --model llama-7b-4bit-128g

@StefanDanielSchwarz
Contributor

StefanDanielSchwarz commented Apr 1, 2023

Confirming that - both the problem and the workaround. Thanks @hdkiller for figuring out the commit that broke compatibility (f1af89a).

Here's what my WSL console reported before I reverted:

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: ~/miniconda3/envs/textgen/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary ~/miniconda3/envs/textgen/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
Loading ozcur_alpaca-native-4bit...
Traceback (most recent call last):
  File "~/text-generation-webui/server.py", line 275, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "~/text-generation-webui/modules/models.py", line 102, in load_model
    model = load_quantized(model_name)
  File "~/text-generation-webui/modules/GPTQ_loader.py", line 114, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
  File "~/text-generation-webui/modules/GPTQ_loader.py", line 36, in _load_quant
    make_quant(model, layers, wbits, groupsize, faster=faster_kernel, kernel_switch_threshold=kernel_switch_threshold)
TypeError: make_quant() got an unexpected keyword argument 'faster'

The last working commit is 608f3ba. Reverting to that made text-generation-webui work again:

cd repositories/GPTQ-for-LLaMa

git reset --hard 608f3ba71e40596c75f8864d73506eaf57323c6e

pip install -r requirements.txt
python setup_cuda.py install
cd ../..

@oobabooga
Owner

Please use my fork of GPTQ-for-LLaMa. It corresponds to commit a6f363e3f93b9fb5c26064b5ac7ed58d22e3f773 in the cuda branch.

# activate the conda environment
conda activate textgen

# remove the existing GPTQ-for-LLaMa
cd text-generation-webui/repositories
rm -rf GPTQ-for-LLaMa
pip uninstall quant-cuda

# reinstall
git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda
cd GPTQ-for-LLaMa
python setup_cuda.py install

I will keep using this until qwopqwop's branch stabilizes. Upstream changes will not be supported. This works with @USBhost's torrents for llama that are linked here.

github-actions bot added the stale label Nov 26, 2023

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
