cuda.OutOfMemoryError: CUDA out of memory #99

junzhoupro · 2024-06-13T13:38:37Z

Dear author, thanks for your work!
I'm running the training on my own computer and hit an out-of-memory error.
I'm using an RTX 4090.

Training fails with this error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 23.64 GiB total capacity; 22.20 GiB already allocated; 70.75 MiB free; 22.38 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Is there anything I can do to train on this machine? My config:

batch_size = 1 #16
logger_freq = 1000
learning_rate = 1e-5
sd_locked = True #False
only_mid_control = True #False
n_gpus = 1
accumulate_grad_batches=1
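
The error message suggests setting max_split_size_mb when reserved memory is much larger than allocated memory. A minimal sketch of how one might check that gap and set the option is below. The helper name `memory_gap_gib` and the value 128 are illustrative assumptions, not something from this thread or a recommended setting.

```python
import os
import re


def memory_gap_gib(message: str) -> float:
    """Return reserved minus allocated memory (GiB) parsed from an OOM message.

    Hypothetical helper for illustration; it only understands messages that
    report both figures in GiB, like the one quoted in this issue.
    """
    allocated = float(re.search(r"([\d.]+) GiB already allocated", message).group(1))
    reserved = float(re.search(r"([\d.]+) GiB reserved", message).group(1))
    return reserved - allocated


# Figures copied from the error reported above.
msg = (
    "Tried to allocate 50.00 MiB (GPU 0; 23.64 GiB total capacity; "
    "22.20 GiB already allocated; 70.75 MiB free; "
    "22.38 GiB reserved in total by PyTorch)"
)
gap = memory_gap_gib(msg)
print(f"reserved - allocated = {gap:.2f} GiB")  # prints 0.18 GiB for this message

# The allocator option is read from this environment variable. Set it before
# the process makes its first CUDA allocation (before importing torch is the
# safe choice). 128 is an arbitrary example value.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")
```

For this report the gap is only about 0.18 GiB, so fragmentation is probably not the main cause. The allocator option may help a little, but the model is essentially filling the card.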

XavierCHEN34 · 2024-06-14T07:30:52Z

You could try the "ddp_sharded" strategy, which uses less memory:
trainer = pl.Trainer(gpus=1, strategy="ddp_sharded", precision=16, accelerator="gpu", callbacks=[logger], progress_bar_refresh_rate=1)
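
The suggestion above can be combined with the posted config roughly as follows. This is a sketch, assuming an older PyTorch Lightning 1.x release that still accepts `gpus` and `strategy="ddp_sharded"` (which, as far as I know, also needs the fairscale package installed). `build_trainer_kwargs` and `make_trainer` are hypothetical helper names. `progress_bar_refresh_rate` is left out because it only affects display, not memory.

```python
# Values copied from the configuration posted in the question.
n_gpus = 1
accumulate_grad_batches = 1


def build_trainer_kwargs(logger_callback=None):
    """Collect the Trainer arguments suggested in this thread into one dict."""
    kwargs = dict(
        gpus=n_gpus,
        strategy="ddp_sharded",  # sharded training, as suggested above
        precision=16,            # 16-bit mixed precision
        accelerator="gpu",
        accumulate_grad_batches=accumulate_grad_batches,
    )
    if logger_callback is not None:
        kwargs["callbacks"] = [logger_callback]
    return kwargs


def make_trainer(logger_callback=None):
    """Build the Trainer; the import is deferred so the dict can be inspected
    on a machine without Lightning installed."""
    import pytorch_lightning as pl  # assumption: Lightning 1.x API

    return pl.Trainer(**build_trainer_kwargs(logger_callback))


print(build_trainer_kwargs())
```

Keeping the arguments in one dict makes it easy to try a single change at a time (for example, only `precision=16`) and see which one actually brings the run under the memory limit.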
