
Constant out of memory errors on 12GB RTX 3060 no matter what settings I use #43

Closed
Akis-M opened this issue Nov 8, 2022 · 19 comments

Akis-M commented Nov 8, 2022

Used every single "VRAM saving" setting there is: 8-bit Adam, don't cache latents, gradient checkpointing, fp16 mixed precision, etc. I even dropped the training resolution to abysmally low values like 384 just to see if it would work. Same out of memory errors.

Isn't this supposed to be working with 12GB cards?
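
For anyone comparing notes, those checkboxes map to fairly standard knobs in a diffusers-style training loop. A minimal sketch of what they correspond to (illustrative only, assuming the usual diffusers/bitsandbytes APIs; `compute_loss` is a hypothetical helper, not the extension's code):

```python
import torch
import bitsandbytes as bnb
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
).to("cuda")

# "Gradient checkpointing": recompute activations during backward to save VRAM.
unet.enable_gradient_checkpointing()

# "8bit adam": keep optimizer state in 8-bit instead of fp32.
optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=5e-6)

# "fp16 mixed precision": autocast the forward pass, scale the loss for backward.
scaler = torch.cuda.amp.GradScaler()

def train_step(batch):
    with torch.cuda.amp.autocast(dtype=torch.float16):
        loss = compute_loss(unet, batch)  # hypothetical helper, not real extension code
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    optimizer.zero_grad(set_to_none=True)
```

Even with all of these on, full DreamBooth training tends to sit close to the 12GB limit, which is why the text-encoder option discussed below matters.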

@rabidcopy

Uncheck train text encoder if you haven't tried that.

@d8ahazard (Owner)

I'd also suggest unchecking "train text encoder"; it uses a big chunk of VRAM. You can also set "save checkpoint every" and "save image every" to 0 to ensure those don't try to use any VRAM.
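
Unchecking it roughly corresponds to freezing the text encoder and optimizing only the UNet, so no gradients or optimizer state are held for CLIP. A hedged sketch of that difference (not the extension's actual code; the model repo is just the standard SD 1.5 one):

```python
import torch
from diffusers import UNet2DConditionModel
from transformers import CLIPTextModel

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)
text_encoder = CLIPTextModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="text_encoder"
)

# "Train text encoder" unchecked: freeze CLIP so it only does inference.
text_encoder.requires_grad_(False)
text_encoder.eval()

# Only the UNet parameters go to the optimizer. With the box checked, the parameter
# list would be itertools.chain(unet.parameters(), text_encoder.parameters()),
# and those extra gradients plus optimizer state are what eat the additional VRAM.
optimizer = torch.optim.AdamW(unet.parameters(), lr=5e-6)
```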

Akis-M (Author) commented Nov 8, 2022

@d8ahazard @rabidcopy Doesn't even manage to get to the Out of Memory part if I uncheck the train text encoder option. Instead throws out:
```
Traceback (most recent call last):
  File "C:\StableDif\stable-diffusion-webui\modules\ui.py", line 185, in f
    res = list(func(*args, **kwargs))
  File "C:\StableDif\stable-diffusion-webui\webui.py", line 54, in f
    res = func(*args, **kwargs)
  File "C:\StableDif\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\dreambooth.py", line 265, in start_training
    trained_steps = main(config)
  File "C:\StableDif\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 769, in main
    encoder_hidden_states = text_encoder(batch["input_ids"])[0]
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 722, in forward
    return self.text_model(
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 643, in forward
    encoder_outputs = self.encoder(
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 574, in forward
    layer_outputs = encoder_layer(
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 317, in forward
    hidden_states, attn_weights = self.self_attn(
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\StableDif\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 257, in forward
    attn_output = torch.bmm(attn_probs, value_states)
RuntimeError: expected scalar type Half but found Float
```
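
The `RuntimeError` at the bottom is a plain dtype mismatch: by the time the CLIP attention reaches `torch.bmm`, one operand is fp16 ("Half") and the other fp32 ("Float"). A minimal reproduction of the same failure and the usual fixes (illustrative only, assumes a CUDA device; not the extension's code):

```python
import torch

a = torch.randn(1, 4, 4, device="cuda", dtype=torch.float16)  # "Half"
b = torch.randn(1, 4, 4, device="cuda", dtype=torch.float32)  # "Float"

try:
    torch.bmm(a, b)
except RuntimeError as err:
    print(err)  # expected scalar type Half but found Float

# Fix 1: make the dtypes agree explicitly.
out = torch.bmm(a, b.half())

# Fix 2: run the op under autocast so PyTorch casts both operands consistently.
with torch.cuda.amp.autocast(dtype=torch.float16):
    out = torch.bmm(a, b)
print(out.dtype)  # torch.float16
```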

zark119 commented Nov 8, 2022

same card, same problem

rasamaya commented Nov 8, 2022

Same problem with a 080ti 11GB, tried everything. For that matter, CPU doesn't work either.

entmike commented Nov 8, 2022

> Doesn't even manage to get to the Out of Memory part if I uncheck the train text encoder option. Instead throws out: [...] RuntimeError: expected scalar type Half but found Float

See here #37 for a temp fix.

@cholinkol

Same problem; when I uncheck the train text encoder option, nothing changes, it's still the same.

AbyszOne commented Nov 8, 2022

I haven't used it yet, but with this GUI https://github.com/smy20011/dreambooth-gui it works well for the 3060, even with text encoding ON.

mykeehu commented Nov 9, 2022

I get this error on an RTX 3060 12 GB, right after the ckpt file is saved:

```
Exception saving preview: tensors used as indices must be long, byte or bool tensors
Pipeline cleared...
Allocated: 3.7GB
Reserved: 4.5GB

Traceback (most recent call last):
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\modules\ui.py", line 185, in f
    res = list(func(*args, **kwargs))
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\webui.py", line 54, in f
    res = func(*args, **kwargs)
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\dreambooth.py", line 265, in start_training
    trained_steps = main(config)
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 791, in main
    optimizer.step()
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\venv\lib\site-packages\accelerate\optimizer.py", line 134, in step
    self.scaler.step(self.optimizer, closure)
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\amp\grad_scaler.py", line 334, in step
    self.unscale_(optimizer)
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\amp\grad_scaler.py", line 279, in unscale_
    optimizer_state["found_inf_per_device"] = self.unscale_grads(optimizer, inv_scale, found_inf, False)
```

Setting "Generate a preview image every N steps" to 0 didn't help. :(

@d8ahazard Could this latest patch have broken something?
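
"Tensors used as indices must be long, byte or bool tensors" is a standard PyTorch indexing error: somewhere in the preview-saving path a float-typed tensor is being used to index another tensor. A minimal sketch of the failure and the fix (illustrative only, not the extension's code):

```python
import torch

images = torch.randn(4, 3, 64, 64)
idx = torch.tensor([0.0, 2.0])   # float-typed indices

# images[idx] raises: tensors used as indices must be long, byte or bool tensors
picked = images[idx.long()]      # casting the index tensor fixes it
print(picked.shape)              # torch.Size([2, 3, 64, 64])
```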

Jonseed commented Nov 9, 2022

I have an RTX 3060 12GB, and I'm getting the same OOM and other errors...

Jonseed commented Nov 9, 2022

After making the file edit noted in #37 to delete "dtype=weight_dtype", restarting the server, unchecking don't cache latents, unchecking train text encoder, switching mixed precision to fp16, setting generate preview to a really high number, and setting save checkpoint to the same number as my training steps, it's finally training! First time I've been able to Dreambooth train locally on my 3060 12GB.

Will take about a half hour to train 2000 steps.
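
For anyone trying to reproduce this, the #37 edit amounts to not casting the frozen text encoder to the mixed-precision dtype. A hedged before/after sketch of what that kind of change looks like in a diffusers-style train_dreambooth.py (`text_encoder`, `accelerator`, and `weight_dtype` are the training script's own names; the exact line may differ from the extension's, so see #37 itself for the actual diff):

```python
# Before: the frozen text encoder is cast to weight_dtype (fp16 under mixed precision),
# which can clash with fp32 tensors elsewhere and produce
# "expected scalar type Half but found Float".
text_encoder.to(accelerator.device, dtype=weight_dtype)

# After (the temporary workaround): move it to the device but leave it in fp32.
# This costs a little more VRAM but keeps the dtypes consistent.
text_encoder.to(accelerator.device)
```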

Jonseed commented Nov 9, 2022

Unfortunately I got a different error at the end of training, after reaching the final training step, about not being able to parse config.json, and unable to connect to huggingface.co... anyone know what that might be about?

mykeehu commented Nov 9, 2022

I also got an error when generating the image:

```
Exception saving preview: tensors used as indices must be long, byte or bool tensors
Pipeline cleared...
Allocated: 3.7GB
Reserved: 4.5GB

Error completing request
Arguments: ('myprompt', 'H:\Stable-Diffusion-Automatic\Dreambooth\destfront', 'H:\Stable-Diffusion-Automatic\stable-diffusion-webui\models\dreambooth\myprompt\classifiers', 'myprompt', '', '', '', 1.0, 7.5, 40.0, 0, 512, False, True, 1, 1, 12, 1500, 1, True, 5e-06, False, 'constant', 0, True, 0.9, 0.999, 0.01, 1e-08, 1, 100, 500, 'fp16', False, "[{'instance_prompt': 'myprompt', 'class_prompt': '', 'instance_data_dir': 'H:\\Stable-Diffusion-Automatic\\Dreambooth\\destfront', 'class_data_dir': 'H:\\Stable-Diffusion-Automatic\\stable-diffusion-webui\\models\\dreambooth\\myprompt\\classifiers'}]", False, True, True) {}
Traceback (most recent call last):
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\modules\ui.py", line 185, in f
    res = list(func(*args, **kwargs))
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\webui.py", line 54, in f
    res = func(*args, **kwargs)
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\dreambooth.py", line 265, in start_training
    trained_steps = main(config)
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 791, in main
    save_dir = args.output_dir
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\venv\lib\site-packages\accelerate\optimizer.py", line 134, in step
    self.scaler.step(self.optimizer, closure)
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\amp\grad_scaler.py", line 334, in step
    self.unscale_(optimizer)
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\amp\grad_scaler.py", line 279, in unscale_
    optimizer_state["found_inf_per_device"] = self.unscale_grads(optimizer, inv_scale, found_inf, False)
  File "H:\Stable-Diffusion-Automatic\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\amp\grad_scaler.py", line 207, in unscale_grads
    raise ValueError("Attempting to unscale FP16 gradients.")
ValueError: Attempting to unscale FP16 gradients.
```
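
"Attempting to unscale FP16 gradients" comes from torch.cuda.amp.GradScaler: it refuses to step when the gradients themselves are fp16, which happens when the trainable parameters were cast to half instead of being kept in fp32 and letting autocast do the half-precision math. A minimal reproduction and the working pattern (illustrative only, assumes a CUDA device; not the extension's code):

```python
import torch

scaler = torch.cuda.amp.GradScaler()
x = torch.randn(4, 8, device="cuda")

# Broken pattern: parameters cast to fp16, so their gradients are fp16 too.
bad_model = torch.nn.Linear(8, 8).cuda().half()
bad_opt = torch.optim.SGD(bad_model.parameters(), lr=1e-3)
with torch.cuda.amp.autocast():
    loss = bad_model(x).float().mean()
scaler.scale(loss).backward()
try:
    scaler.step(bad_opt)
except ValueError as err:
    print(err)  # Attempting to unscale FP16 gradients.

# Working pattern: keep parameters in fp32 and let autocast handle fp16 compute.
good_model = torch.nn.Linear(8, 8).cuda()
good_opt = torch.optim.SGD(good_model.parameters(), lr=1e-3)
with torch.cuda.amp.autocast():
    loss = good_model(x).mean()
scaler.scale(loss).backward()
scaler.step(good_opt)
scaler.update()
```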

Slug-Cat commented Nov 9, 2022

> Unfortunately I got a different error at the end of training, after reaching the final training step, about not being able to parse config.json, and unable to connect to huggingface.co... anyone know what that might be about?

I got the same. Training starts successfully after the tweaks, but if it pauses or stops for anything (including generating sample images) it crashes:
```
OSError: We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it looks like E:\stable-diffusion-webui\models\dreambooth\Stolas\working is not the path to a directory containing a config.json file.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
```
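
That OSError means the reload at the end of training fell back to looking the model up on the Hub because the expected files weren't found in the working directory. Two things worth checking (the path below comes straight from the error; the layout assumed here is the standard diffusers folder format, which may differ from what the extension writes):

```python
import os

# Does the working directory actually contain a diffusers-format model?
working = r"E:\stable-diffusion-webui\models\dreambooth\Stolas\working"
expected = [
    "model_index.json",
    os.path.join("text_encoder", "config.json"),
    os.path.join("unet", "config.json"),
    os.path.join("vae", "config.json"),
]
for rel in expected:
    path = os.path.join(working, rel)
    print(path, "exists" if os.path.exists(path) else "MISSING")

# If the files are all there and only the Hub lookup is failing, the Hugging Face
# libraries can be told to stay offline (set these before they are imported):
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
```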

@xXAdonesXx

> After making the file edit noted in #37 to delete "dtype=weight_dtype", restarting the server, unchecking don't cache latents, unchecking train text encoder, switching mixed precision to fp16, setting generate preview to a really high number, and setting save checkpoint to the same number as my training steps, it's finally training! First time I've been able to Dreambooth train locally on my 3060 12GB.
>
> Will take about a half hour to train 2000 steps.

+ check 8-bit Adam

Slug-Cat commented Nov 9, 2022

8bit adam is checked, issue persists

@tbrebant

According to this page AUTOMATIC1111/stable-diffusion-webui#4436 there is a way to unload the VAE and save 2+ GB, but I did not manage to find the setting.
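
If "cache latents" is enabled, the VAE is only needed to encode the dataset up front (and to decode previews), so the saving presumably comes from parking it outside VRAM during the actual training loop. A hedged sketch of that general idea, not the webui setting being asked about:

```python
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="vae")
vae.to("cuda")

# ... encode and cache the training latents here ...

# Park the VAE on the CPU for the training loop and release its VRAM.
vae.to("cpu")
torch.cuda.empty_cache()

# Bring it back only when it is needed again, e.g. to decode preview images.
vae.to("cuda")
```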

Jonseed commented Nov 10, 2022

I want to report that I just completed my first successful training on 3060 12GB... commit c1702f1. So it can be done!

@d8ahazard (Owner)

> I want to report that I just completed my first successful training on 3060 12GB... commit c1702f1. So it can be done!

Could you be so kind as to document your settings here:

#77

I'd like to create a central place for folks to discuss tuning and setup, and I think your success story might be a good starting point. :D
