
4-bit: "Allocator: not enough memory: you tried to allocate 35389440 bytes." #429

Closed
patrickmros opened this issue Mar 19, 2023 · 12 comments

@patrickmros

I followed the guide 4bit LLaMA Setup for Windows and it worked. One time.

The next time I tried to start it, I got:

Loading llama-13b...
Loading model ...
Traceback (most recent call last):
  File "J:\LLaMA\text-generation-webui\server.py", line 236, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "J:\LLaMA\text-generation-webui\modules\models.py", line 100, in load_model
    model = load_quantized(model_name)
  File "J:\LLaMA\text-generation-webui\modules\GPTQ_loader.py", line 55, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), shared.args.gptq_bits)
  File "J:\LLaMA\text-generation-webui\repositories\GPTQ-for-LLaMa\llama.py", line 245, in load_quant
    model.load_state_dict(torch.load(checkpoint))
  File "C:\Users\patri\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\patri\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1131, in _load
    result = unpickler.load()
  File "C:\Users\patri\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1101, in persistent_load
    load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\Users\patri\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1079, in load_tensor
    storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage).storage().untyped()
RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 35389440 bytes.

Well, I'm quite sure I have plenty of room for 35 megabytes in my VRAM, RAM, and on my hard disk... Where exactly is it trying to allocate 35 MB?

@BarfingLemurs

The final line indicates you are out of DRAM (regular RAM) while loading the model. Have you tried using a pagefile?
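A quick way to confirm whether RAM or swap is really exhausted at load time is psutil, which already appears in the environment listed further down. A minimal sketch, using only documented psutil calls:

```python
import psutil

# Snapshot physical RAM and swap/pagefile usage. The allocator error
# above fires when virtual memory (RAM + pagefile) runs out, so both
# numbers matter, not just free VRAM or disk space.
vm = psutil.virtual_memory()
sw = psutil.swap_memory()

print(f"RAM:  {vm.available / 2**30:.1f} GiB free of {vm.total / 2**30:.1f} GiB")
print(f"Swap: {(sw.total - sw.used) / 2**30:.1f} GiB free of {sw.total / 2**30:.1f} GiB")
```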

@patrickmros
Author

No, I haven't used a pagefile because I didn't need it. The 8-bit version works without one; does the 4-bit version need more DRAM?

@BarfingLemurs

Well, it only loads your model into RAM initially, then moves it into VRAM.
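That staging step is visible in the traceback above: `load_quant` calls `torch.load(checkpoint)` with no `map_location`, so the whole state dict is deserialized into CPU RAM before anything reaches the GPU. A hedged sketch of one possible workaround using torch's standard `map_location` argument (untested here; the checkpoint path is a placeholder, and the GPU needs enough free VRAM to hold the weights):

```python
import torch

# Default behaviour: every tensor is materialized in CPU RAM first,
# and only moved to the GPU after loading finishes.
state_dict = torch.load("llama-13b-4bit.pt")

# Possible workaround: deserialize tensors straight onto the GPU,
# skipping the CPU staging copy.
state_dict = torch.load("llama-13b-4bit.pt", map_location="cuda:0")
```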

@bartman081523

> No, I haven't used a pagefile because I didn't need it. The 8-bit version works without one; does the 4-bit version need more DRAM?

Please create a pagefile/swap space. I had the same problem until I created one.

@wywywywy
Contributor

Has anyone tried converting the 4-bit weights to safetensors and loading them directly to the GPU?
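For reference, a minimal sketch of such a conversion with the `safetensors` package (version 0.3.0 appears in the environment listed below); the file names are placeholders. The conversion itself still needs enough RAM to hold the state dict once, but later loads avoid pickle's CPU staging:

```python
import torch
from safetensors.torch import save_file, load_file

# One-time conversion: read the pickled checkpoint, write safetensors.
# (save_file rejects tensors that share storage; such checkpoints
# need extra handling before saving.)
state_dict = torch.load("llama-13b-4bit.pt", map_location="cpu")
save_file(state_dict, "llama-13b-4bit.safetensors")

# Subsequent loads can target the GPU directly.
state_dict = load_file("llama-13b-4bit.safetensors", device="cuda:0")
```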

@patrickmros
Author

patrickmros commented Mar 19, 2023

That's weird. I checked again, and I do have a swap file on my second hard disk. There are 100 GB of free space on that disk, and the swap file size is automatically managed by Windows.

@HCBlackFox

> python server.py --share --model oasst-sft-1-pythia-12b --cpu --load-in-8bit
Loading oasst-sft-1-pythia-12b...
Loading checkpoint shards:  33%|████████████████████████████████████████████████▎                                                                                                | 1/3 [07:52<15:44, 472.28s/it]
Traceback (most recent call last):
  File "D:\Study\MCU\C Learning\text-generation-webui\server.py", line 236, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Study\MCU\C Learning\text-generation-webui\modules\models.py", line 157, in load_model
    model = AutoModelForCausalLM.from_pretrained(checkpoint, **params)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\site-packages\transformers\models\auto\auto_factory.py", line 471, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\site-packages\transformers\modeling_utils.py", line 2646, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\site-packages\transformers\modeling_utils.py", line 2969, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\site-packages\transformers\modeling_utils.py", line 640, in _load_state_dict_into_meta_model
    param = param.to(dtype)
            ^^^^^^^^^^^^^^^
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 419430400 bytes.

I have this error too. I have 16 GB of RAM and a 32 GB swap file. Any ideas?
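One hedged suggestion for this case: the traceback fails inside transformers while converting shards with `param.to(dtype)`, and `from_pretrained` accepts `low_cpu_mem_usage=True` (it needs the accelerate package, which is in the environment below), which lowers the peak CPU RAM during loading. A minimal sketch under those assumptions, with the model path mirroring the command above:

```python
from transformers import AutoModelForCausalLM

# low_cpu_mem_usage=True streams tensors into the model one at a time
# instead of building a second full copy of the state dict in RAM,
# which lowers the peak allocation that the error above reports.
model = AutoModelForCausalLM.from_pretrained(
    "models/oasst-sft-1-pythia-12b",  # placeholder local path
    low_cpu_mem_usage=True,
)
```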

@bartman081523

bartman081523 commented Mar 19, 2023

@patrickmros
Can you execute this command and post the output?
conda list -p "C:\Users\patri\miniconda3\envs\textgen\"

This is only an idea, but since you are running CPU generation, can you temporarily uninstall CUDA and try again?
I also don't know whether overwriting bitsandbytes_cpu with the CUDA version is such a good idea if you strictly want to run CPU generation. Also make sure that you followed the CPU setup guide (or the auto-install script).

@patrickmros
Author

patrickmros commented Mar 19, 2023

> @patrickmros Can you execute this command and post the output? conda list -p "C:\Users\patri\miniconda3\envs\textgen\"
>
> This is only an idea, but since you are running CPU generation, can you temporarily uninstall CUDA and try again? I also don't know whether overwriting bitsandbytes_cpu with the CUDA version is such a good idea if you strictly want to run CPU generation. Also make sure that you followed the CPU setup guide (or the auto-install script).

Wait, there is something wrong here. I don't want to run CPU generation!

This is the output I get from conda list -p "C:\Users\patri\miniconda3\envs\textgen":

# packages in environment at C:\Users\patri\miniconda3\envs\textgen:

Name Version Build Channel

7zip 19.00 h2d74725_2 conda-forge
accelerate 0.17.1 pypi_0 pypi
aiofiles 23.1.0 pypi_0 pypi
aiohttp 3.8.4 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
altair 4.2.2 pypi_0 pypi
anyio 3.6.2 pypi_0 pypi
async-timeout 4.0.2 pypi_0 pypi
attrs 22.2.0 pypi_0 pypi
bitsandbytes 0.37.1 pypi_0 pypi
blas 1.0 mkl
brotlipy 0.7.0 py310h2bbff1b_1002
bzip2 1.0.8 he774522_0
ca-certificates 2022.12.7 h5b45459_0 conda-forge
certifi 2022.12.7 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py310h2bbff1b_3
charset-normalizer 2.0.4 pyhd3eb1b0_0
click 8.1.3 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
contourpy 1.0.7 pypi_0 pypi
cryptography 39.0.1 py310h21b164f_0
cuda-cccl 12.1.55 0 nvidia
cuda-cudart 11.7.99 0 nvidia
cuda-cudart-dev 11.7.99 0 nvidia
cuda-cupti 11.7.101 0 nvidia
cuda-libraries 11.7.1 0 nvidia
cuda-libraries-dev 11.7.1 0 nvidia
cuda-nvrtc 11.7.99 0 nvidia
cuda-nvrtc-dev 11.7.99 0 nvidia
cuda-nvtx 11.7.91 0 nvidia
cuda-runtime 11.7.1 0 nvidia
cudatoolkit-dev 11.7.0 hab45a8e_5 conda-forge
curl 7.88.1 h2bbff1b_0
cycler 0.11.0 pypi_0 pypi
entrypoints 0.4 pypi_0 pypi
fastapi 0.94.1 pypi_0 pypi
ffmpy 0.3.0 pypi_0 pypi
filelock 3.10.0 pypi_0 pypi
flexgen 0.1.7 pypi_0 pypi
flit-core 3.6.0 pyhd3eb1b0_0
fonttools 4.39.2 pypi_0 pypi
freetype 2.12.1 ha860e81_0
frozenlist 1.3.3 pypi_0 pypi
fsspec 2023.3.0 pypi_0 pypi
giflib 5.2.1 h8cc25b3_3
git 2.34.1 haa95532_0
gradio 3.18.0 pypi_0 pypi
h11 0.14.0 pypi_0 pypi
httpcore 0.16.3 pypi_0 pypi
httpx 0.23.3 pypi_0 pypi
huggingface-hub 0.13.2 pypi_0 pypi
idna 3.4 py310haa95532_0
intel-openmp 2021.4.0 haa95532_3556
jinja2 3.1.2 pypi_0 pypi
jpeg 9e h2bbff1b_1
jsonschema 4.17.3 pypi_0 pypi
kiwisolver 1.4.4 pypi_0 pypi
lerc 3.0 hd77b12b_0
libcublas 11.10.3.66 0 nvidia
libcublas-dev 11.10.3.66 0 nvidia
libcufft 10.7.2.124 0 nvidia
libcufft-dev 10.7.2.124 0 nvidia
libcurand 10.3.2.56 0 nvidia
libcurand-dev 10.3.2.56 0 nvidia
libcurl 7.88.1 h86230a5_0
libcusolver 11.4.0.1 0 nvidia
libcusolver-dev 11.4.0.1 0 nvidia
libcusparse 11.7.4.91 0 nvidia
libcusparse-dev 11.7.4.91 0 nvidia
libdeflate 1.17 h2bbff1b_0
libffi 3.4.2 hd77b12b_6
libnpp 11.7.4.75 0 nvidia
libnpp-dev 11.7.4.75 0 nvidia
libnvjpeg 11.8.0.2 0 nvidia
libnvjpeg-dev 11.8.0.2 0 nvidia
libpng 1.6.39 h8cc25b3_0
libssh2 1.10.0 h680486a_2 conda-forge
libtiff 4.5.0 h6c2663c_2
libuv 1.44.2 h2bbff1b_0
libwebp 1.2.4 hbc33d0d_1
libwebp-base 1.2.4 h2bbff1b_1
linkify-it-py 2.0.0 pypi_0 pypi
lz4-c 1.9.4 h2bbff1b_0
markdown 3.4.1 pypi_0 pypi
markdown-it-py 2.2.0 pypi_0 pypi
markupsafe 2.1.2 pypi_0 pypi
matplotlib 3.7.1 pypi_0 pypi
mdit-py-plugins 0.3.5 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mkl 2021.4.0 haa95532_640
mkl-service 2.4.0 py310h2bbff1b_0
mkl_fft 1.3.1 py310ha0764ea_0
mkl_random 1.2.2 py310h4ed8f06_0
multidict 6.0.4 pypi_0 pypi
numpy 1.23.5 py310h60c9a35_0
numpy-base 1.23.5 py310h04254f7_0
openssl 1.1.1t h2bbff1b_0
orjson 3.8.7 pypi_0 pypi
packaging 23.0 pypi_0 pypi
pandas 1.5.3 pypi_0 pypi
peft 0.2.0 pypi_0 pypi
pillow 9.4.0 py310hd77b12b_0
pip 23.0.1 py310haa95532_0
psutil 5.9.4 pypi_0 pypi
pulp 2.7.0 pypi_0 pypi
pycparser 2.21 pyhd3eb1b0_0
pycryptodome 3.17 pypi_0 pypi
pydantic 1.10.6 pypi_0 pypi
pydub 0.25.1 pypi_0 pypi
pyopenssl 23.0.0 py310haa95532_0
pyparsing 3.0.9 pypi_0 pypi
pyrsistent 0.19.3 pypi_0 pypi
pysocks 1.7.1 py310haa95532_0
python 3.10.9 h966fe2a_2
python-dateutil 2.8.2 pypi_0 pypi
python-multipart 0.0.6 pypi_0 pypi
pytorch 1.13.1 py3.10_cuda11.7_cudnn8_0 pytorch
pytorch-cuda 11.7 h16d0643_3 pytorch
pytorch-mutex 1.0 cuda pytorch
pytz 2022.7.1 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
quant-cuda 0.0.0 pypi_0 pypi
regex 2022.10.31 pypi_0 pypi
requests 2.28.1 py310haa95532_1
rfc3986 1.5.0 pypi_0 pypi
safetensors 0.3.0 pypi_0 pypi
sentencepiece 0.1.97 pypi_0 pypi
setuptools 65.6.3 py310haa95532_0
six 1.16.0 pyhd3eb1b0_1
sniffio 1.3.0 pypi_0 pypi
sqlite 3.41.1 h2bbff1b_0
starlette 0.26.1 pypi_0 pypi
tk 8.6.12 h2bbff1b_0
tokenizers 0.13.2 pypi_0 pypi
toolz 0.12.0 pypi_0 pypi
torchaudio 0.13.1 pypi_0 pypi
torchvision 0.14.1 pypi_0 pypi
tqdm 4.65.0 pypi_0 pypi
transformers 4.28.0.dev0 pypi_0 pypi
typing_extensions 4.4.0 py310haa95532_0
tzdata 2022g h04d1e81_0
uc-micro-py 1.0.1 pypi_0 pypi
urllib3 1.26.14 py310haa95532_0
uvicorn 0.21.1 pypi_0 pypi
vc 14.2 h21ff451_1
vs2015_runtime 14.27.29016 h5e58377_2
websockets 10.4 pypi_0 pypi
wheel 0.38.4 py310haa95532_0
win_inet_pton 1.1.0 py310haa95532_0
wincertstore 0.2 py310haa95532_2
xz 5.2.10 h8cc25b3_1
yarl 1.8.2 pypi_0 pypi
zlib 1.2.13 h8cc25b3_0
zstd 1.5.2 h19a0ad4_0

@bartman081523

bartman081523 commented Mar 19, 2023

> This is the output I get from conda list -p "C:\Users\patri\miniconda3\envs\textgen":

Your env looks alright, as far as I can tell.

> Wait, there is something wrong here. I don't want to run CPU generation!

Sorry, I was mixing this up with @HCBlackFox, and on your side the CPU allocator was failing as well.
But you have the same error. Can you try to delete bitsandbytes_cpu and reinstall bitsandbytes?
I am at the end of my knowledge here, sorry, but it is not clear to me whether bitsandbytes or CUDA is functioning properly on your side.
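A quick sanity check for the CUDA half of that question, a minimal sketch using only standard torch calls (it does not exercise bitsandbytes itself):

```python
import torch

# If this prints False, the PyTorch build cannot see CUDA and
# generation falls back to the CPU path and its allocator.
print("CUDA available:", torch.cuda.is_available())

if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    # Tiny round-trip allocation to confirm the driver/runtime work.
    x = torch.ones(1024, device="cuda")
    print("Test tensor on:", x.device)
```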

@whitepapercg

> > This is the output I get from conda list -p "C:\Users\patri\miniconda3\envs\textgen":
>
> Your env looks alright, as far as I can tell.
>
> > Wait, there is something wrong here. I don't want to run CPU generation!
>
> Sorry, I was mixing this up with @HCBlackFox, and on your side the CPU allocator was failing as well. But you have the same error. Can you try to delete bitsandbytes_cpu and reinstall bitsandbytes? I am at the end of my knowledge here, sorry, but it is not clear to me whether bitsandbytes or CUDA is functioning properly on your side.

Thanks, I removed libbitsandbytes_cpu.so from the package directory and the error went away (I got the following error, lol).

@github-actions github-actions bot added the stale label Apr 20, 2023
@github-actions

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.
