4-bit: "Allocator: not enough memory: you tried to allocate 35389440 bytes." #429

patrickmros · 2023-03-19T13:15:43Z

I followed the guide 4bit LLaMA Setup for Windows and it worked. One time.

The next time i tried to start it, i get

Loading llama-13b... Loading model ... Traceback (most recent call last): File "J:\LLaMA\text-generation-webui\server.py", line 236, in <module> shared.model, shared.tokenizer = load_model(shared.model_name) File "J:\LLaMA\text-generation-webui\modules\models.py", line 100, in load_model model = load_quantized(model_name) File "J:\LLaMA\text-generation-webui\modules\GPTQ_loader.py", line 55, in load_quantized model = load_quant(str(path_to_model), str(pt_path), shared.args.gptq_bits) File "J:\LLaMA\text-generation-webui\repositories\GPTQ-for-LLaMa\llama.py", line 245, in load_quant model.load_state_dict(torch.load(checkpoint)) File "C:\Users\patri\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 789, in load return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args) File "C:\Users\patri\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1131, in _load result = unpickler.load() File "C:\Users\patri\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1101, in persistent_load load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location)) File "C:\Users\patri\miniconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1079, in load_tensor storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage).storage().untyped() RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 35389440 bytes.

Well, i'm quite sure i have plenty of 35 megabytes free in my vram, ram and on my hard disk... Where exactly does it try to allocate 35 MB ?

The text was updated successfully, but these errors were encountered:

BarfingLemurs · 2023-03-19T13:19:54Z

The final line indicates you are out of DRAM (regular ram)to load the model, have you tried using a pagefile?

patrickmros · 2023-03-19T13:23:18Z

No I haven't used a pagefile, because I didn't need it. The 8-bit version works without a pagefile, does the 4-bit need more DRAM?

BarfingLemurs · 2023-03-19T13:36:18Z

well, It only loads your model into ram initially, then loads it into vram.

bartman081523 · 2023-03-19T14:02:43Z

No I haven't used a pagefile, because I didn't need it. The 8-bit version works without a pagefile, does the 4-bit need more DRAM?

Please create a page file/swap space. I had the same problem, until i created a swap space.

wywywywy · 2023-03-19T14:18:07Z

Has anyone tried converting the 4bit weights to safetensors and load directly to GPU?

patrickmros · 2023-03-19T14:21:50Z

That's weird. I checked again and I do have a swap file on my second hard disk. There's 100GB of memory free space and the swap file size is automatically managed by Windows.

HCBlackFox · 2023-03-19T15:02:18Z

> python server.py --share --model oasst-sft-1-pythia-12b --cpu --load-in-8bit
Loading oasst-sft-1-pythia-12b...
Loading checkpoint shards:  33%|████████████████████████████████████████████████▎                                                                                                | 1/3 [07:52<15:44, 472.28s/it]
Traceback (most recent call last):
  File "D:\Study\MCU\C Learning\text-generation-webui\server.py", line 236, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Study\MCU\C Learning\text-generation-webui\modules\models.py", line 157, in load_model
    model = AutoModelForCausalLM.from_pretrained(checkpoint, **params)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\site-packages\transformers\models\auto\auto_factory.py", line 471, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\site-packages\transformers\modeling_utils.py", line 2646, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\site-packages\transformers\modeling_utils.py", line 2969, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
                                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Python\Lib\site-packages\transformers\modeling_utils.py", line 640, in _load_state_dict_into_meta_model
    param = param.to(dtype)
            ^^^^^^^^^^^^^^^
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 419430400 bytes.

I have this error too. I have 16 Gb of ram and 32 Gb swap file. Any ideas?

bartman081523 · 2023-03-19T15:32:22Z

@patrickmros
Can you execute this command and post the output?
conda list -p "C:\Users\patri\miniconda3\envs\textgen\"

This is only an idea but since you are running cpu generation, can you temporarily uninstall Cuda and try again?
I also dont know whether overwriting bitsandbytes_cpu with the cuda version is such a good idea, if you strictly want to run cpu generation. Also make sure that you followed the cpu setup guide (or auto install script).

patrickmros · 2023-03-19T15:42:12Z

@patrickmros Can you execute this command and give the output? conda list -p "C:\Users\patri\miniconda3\envs\textgen\"

This is only an idea but since you are running cpu generation, can you temporarily uninstall Cuda and try again? I also dont know whether overwriting bitsandbytes_cpu with the cuda version is such a good idea, if you strictly want to run cpu generation. Also make sure that you followed the cpu setup guide (or auto install script).

Wait - there is something wrong here. I don't want to run cpu generation!

This is the output I get from conda list -p "C:\Users\patri\miniconda3\envs\textgen":

`# packages in environment at C:\Users\patri\miniconda3\envs\textgen:

bartman081523 · 2023-03-19T16:03:10Z

This is the output I get from conda list -p "C:\Users\patri\miniconda3\envs\textgen":

Your env looks alright, as far as i can tell.

Wait - there is something wrong here. I don't want to run cpu generation!

Sorry, I was mixing this up with @HCBlackFox and on your side, the cpu allocator was failing.
But you have the same error. Can you try to delete bitsandbytes_cpu and reinstall bitsandbytes?
I am at the end of my knowledge here, sorry. But for me it is not clear, whether on your side bitsandbytes or cuda is functioning properly.

whitepapercg · 2023-03-21T12:48:51Z

This is the output I get from conda list -p "C:\Users\patri\miniconda3\envs\textgen":

Your env looks alright, as far as i can tell.

Wait - there is something wrong here. I don't want to run cpu generation!

Sorry, I was mixing this up with @HCBlackFox and on your side, the cpu allocator was failing. But you have the same error. Can you try to delete bitsandbytes_cpu and reinstall bitsandbytes? I am at the end of my knowledge here, sorry. But for me it is not clear, whether on your side bitsandbytes or cuda is functioning properly.

Thanks, I removed libbitsandbytescpu.so from the package directory and the error went away (I got the following error lul)

github-actions · 2023-04-20T23:16:34Z

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.

github-actions bot added the stale label Apr 20, 2023

github-actions bot closed this as completed Apr 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

4-bit: "Allocator: not enough memory: you tried to allocate 35389440 bytes." #429

4-bit: "Allocator: not enough memory: you tried to allocate 35389440 bytes." #429

patrickmros commented Mar 19, 2023

BarfingLemurs commented Mar 19, 2023

patrickmros commented Mar 19, 2023

BarfingLemurs commented Mar 19, 2023

bartman081523 commented Mar 19, 2023

wywywywy commented Mar 19, 2023

patrickmros commented Mar 19, 2023 •

edited

HCBlackFox commented Mar 19, 2023

bartman081523 commented Mar 19, 2023 •

edited

patrickmros commented Mar 19, 2023 •

edited

bartman081523 commented Mar 19, 2023 •

edited

whitepapercg commented Mar 21, 2023

github-actions bot commented Apr 20, 2023

4-bit: "Allocator: not enough memory: you tried to allocate 35389440 bytes." #429

4-bit: "Allocator: not enough memory: you tried to allocate 35389440 bytes." #429

Comments

patrickmros commented Mar 19, 2023

BarfingLemurs commented Mar 19, 2023

patrickmros commented Mar 19, 2023

BarfingLemurs commented Mar 19, 2023

bartman081523 commented Mar 19, 2023

wywywywy commented Mar 19, 2023

patrickmros commented Mar 19, 2023 • edited

HCBlackFox commented Mar 19, 2023

bartman081523 commented Mar 19, 2023 • edited

patrickmros commented Mar 19, 2023 • edited

Name Version Build Channel

bartman081523 commented Mar 19, 2023 • edited

whitepapercg commented Mar 21, 2023

github-actions bot commented Apr 20, 2023

patrickmros commented Mar 19, 2023 •

edited

bartman081523 commented Mar 19, 2023 •

edited

patrickmros commented Mar 19, 2023 •

edited

bartman081523 commented Mar 19, 2023 •

edited