
CUDA out of memory. Tried to allocate error #82

Closed
daggs1 opened this issue May 13, 2024 · 2 comments

daggs1 commented May 13, 2024

Greetings,

I'm trying to run the gradio_app demo as described in the readme, and I'm getting this error:

$ python gradio_app.py
/home/worker/zero123plus/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
text_encoder/model.safetensors not found
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:03<00:00, 2.43it/s]
Traceback (most recent call last):
File "/home/worker/zero123plus/gradio_app.py", line 204, in <module>
fire.Fire(run_demo)
File "/home/worker/zero123plus/lib/python3.10/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/worker/zero123plus/lib/python3.10/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/worker/zero123plus/lib/python3.10/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/worker/zero123plus/gradio_app.py", line 137, in run_demo
pipeline.to(f'cuda:{_GPU_ID}')
File "/home/worker/zero123plus/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 727, in to
module.to(torch_device, torch_dtype)
File "/home/worker/zero123plus/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1878, in to
return super().to(*args, **kwargs)
File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1173, in to
return self._apply(convert)
File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
module._apply(fn)
File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
module._apply(fn)
File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 779, in _apply
module._apply(fn)
[Previous line repeated 3 more times]
File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 804, in _apply
param_applied = fn(param)
File "/home/worker/zero123plus/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1159, in convert
return t.to(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU

Any idea what is wrong? I ran the setup as described in the readme file.


daggs1 commented May 14, 2024

I understand now: your small example requires 5 GB of VRAM and my GPU has only 4 GB. A shame. Is there any way to reduce memory consumption?

eliphatfs (Collaborator) commented

Nowadays we do have more techniques to reduce inference-time memory usage, including model offloading, autotuned compilation, quantization, and possibly others. We do not have the code for these ready yet, but you are welcome to contribute. The first usually incurs no time overhead, while compilation takes some time before generation starts. Quantization would need some extra code and more tuning.
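
For reference, a minimal sketch of the model-offloading route with diffusers (hypothetical, not code from this repo: it assumes a diffusers version recent enough to provide enable_model_cpu_offload on the pipeline, accelerate installed, and the model IDs from the readme, which may differ from what gradio_app.py loads):

```python
# Rough sketch of model offloading, not the project's own code.
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "sudo-ai/zero123plus-v1.2",  # checkpoint per the readme (assumption)
    custom_pipeline="sudo-ai/zero123plus-pipeline",
    torch_dtype=torch.float16,
)

# Instead of pipeline.to(f'cuda:{_GPU_ID}'), keep the weights in CPU RAM and
# move each sub-model (text encoder, UNet, VAE) to the GPU only while it runs.
# Peak VRAM drops to roughly the largest single component, at a small speed cost.
pipeline.enable_model_cpu_offload()

# If that still does not fit in 4 GB, enable_sequential_cpu_offload() offloads
# at the submodule level: much lower VRAM, but noticeably slower.
```

With offloading enabled you would skip the pipeline.to(f'cuda:{_GPU_ID}') call in gradio_app.py, since offloading manages device placement itself.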
