malloc error "Unable to allocate" on 8GB RAM Mac #63

Closed
davidjoffe opened this issue Dec 8, 2023 · 9 comments

davidjoffe commented Dec 8, 2023

I kept encountering the error below while trying the stable diffusion sample in mlx-examples on an 8GB M2 Mac mini. After some investigation (detailed here: ml-explore/mlx-examples#21) I found that changing one line of code in MetalAllocator::MetalAllocator() in mlx/backend/metal/allocator.cpp to a much higher limit seems to fix the problem (even this 1.5 factor may be a bit conservative for low-RAM Macs):

block_limit_(1.5 * device_->recommendedMaxWorkingSetSize()) {}
https://github.com/davidjoffe/mlx/blob/main/mlx/backend/metal/allocator.cpp

I made a fork with this change and built from source to test.
I'd like to submit a pull request. This change should help low-RAM Macs such as 8GB machines, though it effectively just allows MLX to use swap instead of failing. That is arguably better than failing outright, but in the long run this behavior may need further refinement, such as giving users more control over whether and how swap is used, or at least emitting a warning.
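
MLX has since grown runtime knobs for exactly this kind of control. A minimal sketch, assuming the `mx.metal.set_memory_limit` and `mx.metal.set_cache_limit` APIs of later releases (the limit values here are arbitrary examples, not recommendations):

import mlx.core as mx

# With relaxed=True, allocations past the limit are allowed to proceed
# (potentially hitting swap) instead of raising; with relaxed=False,
# exceeding the limit raises an error.
mx.metal.set_memory_limit(12 * 1024**3, relaxed=True)

# Cap how much freed memory MLX keeps cached for reuse.
mx.metal.set_cache_limit(4 * 1024**3)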

(foo) david@Davids-Mac-mini stable_diffusion % python txt2image.py "A photo of an astronaut riding a horse on Mars." --n_images 1 --n_rows 1
/Users/david/mlx/foo/lib/python3.9/site-packages/urllib3/__init__.py:34: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
100%| [00:00<?, ?it/s]libc++abi: terminating due to uncaught exception of type std::runtime_error: [malloc_or_wait] Unable to allocate 134217728 bytes.
zsh: abort      python txt2image.py "A photo of an astronaut riding a horse on Mars."  1  1
(foo) david@Davids-Mac-mini stable_diffusion % /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

saminatorkash commented Dec 9, 2023

I completely agree that we should be able to use as much swap memory as we please.
Same here, on an 8GB Mac. I haven't tried your solution of raising the 1.5 factor in the allocator yet; I will play with it and report back.

python txt2image.py "A photo of an astronaut riding a horse on Mars." --n_images 1 --n_rows 2
diffusion_pytorch_model.safetensors: 100%|█| 3.46G/3.46G [09:10<00:00, 6.29MB/s]
text_encoder/config.json: 100%|█████████████████| 613/613 [00:00<00:00, 906kB/s]
model.safetensors: 100%|███████████████████| 1.36G/1.36G [04:41<00:00, 4.83MB/s]
vae/config.json: 100%|██████████████████████████| 553/553 [00:00<00:00, 947kB/s]
diffusion_pytorch_model.safetensors: 100%|███| 335M/335M [00:57<00:00, 5.81MB/s]
tokenizer/vocab.json: 100%|████████████████| 1.06M/1.06M [00:00<00:00, 1.18MB/s]
tokenizer/merges.txt: 100%|███████████████████| 525k/525k [00:00<00:00, 882kB/s]
100%|███████████████████████████████████████████| 50/50 [05:19<00:00,  6.39s/it]
  0%|                                                                                                                                 | 0/1 [00:00<?, ?it/s]libc++abi: terminating due to uncaught exception of type std::runtime_error: [malloc_or_wait] Unable to allocate 134217728 bytes.
Abort trap: 6
UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

swamyg commented Dec 12, 2023

libc++abi: terminating due to uncaught exception of type std::runtime_error:
[malloc_or_wait] Unable to allocate 100237312 bytes.

I'm running into this error on my M3 Max with 36GB of RAM when trying to run lora.py from mlx-examples to fine-tune the Mistral-7B model.

100237312 bytes (~96MB) doesn't seem like much; I'm not sure why it's failing.

awni commented Dec 12, 2023

See ml-explore/mlx-examples#70 for some ideas on how to reduce LoRA memory consumption until we have quantization.
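
The usual levers there are a smaller batch size and fewer adapted layers. A hypothetical invocation, assuming the --batch-size and --lora-layers flags of the mlx-examples lora.py of that era (check python lora.py --help for the exact names):

python lora.py --model mistralai/Mistral-7B-v0.1 --train --batch-size 1 --lora-layers 4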

s-smits commented Apr 24, 2024

[Screenshot 2024-04-24 at 09:21:32] I also got a VRAM error when loading Phi-3, even though ~2GB more than the 9.8GB shown in the terminal was available. Is it possible to set the VRAM limit to `max_available_at_initiating` or something like that, so that other applications only take up swap?

awni commented Apr 24, 2024

There is a maximum size you can allocate into a single buffer (which is a machine-specific property). I think it is less than 9.8 GB for you.

Either way, the fact that you are trying to put 9GB into a single buffer is not a good sign. What are you running to get that? Is it training or generation?
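
Newer MLX versions expose these machine-specific ceilings directly. A minimal sketch, assuming the `mx.metal.device_info()` API of later releases (the keys shown are assumptions based on its documented output):

import mlx.core as mx

info = mx.metal.device_info()
# Machine-specific values reported by the Metal backend.
print("max single-buffer size:", info["max_buffer_length"])
print("recommended working set:", info["max_recommended_working_set_size"])
print("total unified memory:", info["memory_size"])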

s-smits commented Apr 24, 2024

It is a 16GB M1 Air; do you happen to know a ballpark for the limit? Or does it depend dynamically on other processes?
I was running mlx_lm.utils load and generate on Phi-3-128k-mlx with ~6k context (when I run it again it says 12.2GB is needed); is it limited to only 8GB of VRAM? With PyTorch I am able to run 14GB models without much of a speed loss (with around 4-5GB of swap, off the top of my head).
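
For context, the load-and-generate path being described looks roughly like this; a minimal sketch, assuming the mlx_lm load/generate API, with a hypothetical model id standing in for Phi-3-128k-mlx:

from mlx_lm import load, generate

# Hypothetical repo id for illustration; substitute the actual Phi-3-128k-mlx model.
model, tokenizer = load("mlx-community/Phi-3-mini-128k-instruct-4bit")
response = generate(model, tokenizer, prompt="Hello", max_tokens=128, verbose=True)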

awni commented Apr 25, 2024

> It is a 16GB M1 Air; do you happen to know a ballpark for the limit?

I don't know, but you could try running this until it breaks:

import mlx.core as mx

# Disable the buffer cache so each allocation hits the allocator directly.
mx.metal.set_cache_limit(0)
for i in range(100):
    print(f"{i} GB")
    # A (2**30, i) boolean array is i GiB in a single buffer.
    a = mx.zeros((2**30, i), mx.bool_)
    mx.eval(a)  # force the allocation to actually happen
    del a

awni commented Apr 25, 2024

I'm going to close this issue as I'm not sure why it's still open. Feel free to open a new issue if you are still having problems with memory allocation.

awni closed this as completed Apr 25, 2024

s-smits commented Apr 25, 2024

air@MacBook-Air-van-Air test-repo % /opt/homebrew/bin/python3.10 /Users/air/Repositories/test-repo/test4.py
0 GB
1 GB
2 GB
3 GB
4 GB
5 GB
6 GB
7 GB
8 GB
9 GB
libc++abi: terminating due to uncaught exception of type std::runtime_error: [malloc_or_wait] Unable to allocate 9663676416 bytes.
zsh: abort      /opt/homebrew/bin/python3.10
Just an FYI: the probe fails at i = 9 (9663676416 bytes = 9 GiB), so the single-buffer limit here falls somewhere between 8 and 9 GiB. No need for me to open a new issue, thank you.
