Memory Management and PYTORCH_CUDA_ALLOC_CONF #124
Dreambooth revision is 4684f0f. I am getting the following. It seems to be asking me to manage the memory, as reserved is greater than allocated, and says to refer to documentation on memory management, but as yet I can't find that.

```
CUDA SETUP: Loading binary C:\Users\DieselMEDIA\SuperStableDiffusion\stable-diffusion-webui\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll...
Steps: 0%| | 0/1200 [00:00<?, ?it/s]
Exception while training: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 12.00 GiB total capacity; 9.67 GiB already allocated; 0 bytes free; 10.37 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Traceback (most recent call last):
Cleanup Complete.
Steps: 0%| | 0/1200 [00:16<?, ?it/s]
Memory output: {'VRAM cleared.': '0.0/0.0GB', 'Training completed, reloading SD Model.': '0.0/7.7GB'}
```

Windows 10 Pro
The error message is "RuntimeError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 12.00 GiB total capacity; 9.67 GiB already allocated; 0 bytes free; 10.37 GiB reserved in total by PyTorch)". Basically, the process tried to acquire more VRAM than is available. Since 10.37 + 1 GiB would still be less than 12, it seems you could reduce the base VRAM usage on your system somehow (closing all programs except the browser). Maybe some of the settings I used to run with a 10GB 3080 will work for you: #84 (comment)
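As the error text hints, when reserved memory greatly exceeds allocated memory, fragmentation in PyTorch's caching allocator is a likely culprit, and `max_split_size_mb` is the knob it suggests. A minimal sketch of setting it from Python (the `512` value here is only an illustrative starting point, not a recommendation from this thread):

```python
import os

# The caching allocator reads PYTORCH_CUDA_ALLOC_CONF once, before the
# first CUDA allocation, so it must be set before torch is imported
# (or exported in the shell/.bat file that launches the webui).
# max_split_size_mb caps the size of cached blocks the allocator will
# split; smaller values can reduce fragmentation-driven OOMs at some
# performance cost.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

On Windows, the equivalent before launching `webui-user.bat` would be `set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512` in the same console session.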
It seemed to work for one short session, but the results were terrible. From stuff I've read, am I right that PyTorch is reserving the memory, which is causing the error? From today:

```
C:\Users\DieselMEDIA\SuperStableDiffusion\stable-diffusion-webui>git pull
C:\Users\DieselMEDIA\SuperStableDiffusion\stable-diffusion-webui>pause
stderr: Traceback (most recent call last):
Launching Web UI with arguments: --disable-safe-unpickle
To create a public link, set
Loaded model.
===================================BUG REPORT===================================
```
Sorry for the delay in response. It looks like you're using a 12GB GPU, which should work on Windows if you set all of the optimizations. I've added a wizard button to the "Advanced" settings that will attempt to set the proper params for your machine so you can train. Give it a go and see if that helps. I've linked this to the main Optimization thread; happy to continue the discussion there to help get you training.
Have you read the Readme?
Have you completely restarted the stable-diffusion-webUI, not just reloaded the UI?
Have you updated Dreambooth to the latest revision?
Please find the following lines in the console (After "Installing Requirements for WebUI") and paste them below. If you don't see these lines in the console, then update Dreambooth
```
Dreambooth revision is 3b3a8002da3a780c276934e8bcf308aa822f1e22
Diffusers version is 0.8.0.dev0.
Torch version is 1.12.1+cu116.
Torch vision version is 0.13.1+cu116.
```
Describe the bug
A clear and concise description of what the bug is.
Provide logs
If a crash has occurred, please provide the entire stack trace from the log, including the last few log messages before the crash occurred.
Environment
What OS?
If Windows - WSL or native?
What GPU are you using?