DreamBooth Stable Diffusion training now possible in 10 GB VRAM, and it runs about 2 times faster. #35
Comments
Very cool. Doing what I can for 16 GB too.
I'm running into issues with it finding the GPUs, I think. 4x A10G. I'll post code tomorrow.
Wow, using the 8-bit Adam optimizer from bitsandbytes along with xformers reduces the memory usage to 12.5 GB.
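For anyone wanting to try that optimizer swap, a minimal sketch of what it looks like (the model and learning rate here are stand-ins for the real training script; the fallback path is my addition):

```python
import torch

# Stand-in model; in the actual training script this would be the UNet's parameters.
model = torch.nn.Linear(8, 8)

try:
    import bitsandbytes as bnb
    # 8-bit AdamW keeps optimizer state in 8 bits, cutting its memory
    # footprint roughly 4x compared to fp32 optimizer state.
    optimizer_class = bnb.optim.AdamW8bit
except ImportError:
    # Fall back to standard AdamW when bitsandbytes isn't installed.
    optimizer_class = torch.optim.AdamW

optimizer = optimizer_class(model.parameters(), lr=5e-6)
```

The rest of the training loop is unchanged; only the optimizer class is swapped.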
There is no such file. Edit: issue resolved.
Do you have a donation link? I don't have much, but you are doing great work.
Hey, thanks. No donation link, haha. Good to hear you liked it. It has been quite fun to work on.
@ShivamShrirao I've been trying to run your notebook on Runpod with PyTorch and an A5000, but I'm getting an error during pip install: "Building wheel for xformers (setup.py) ... error". I'd also love to donate if I can get this working.
@Daniel-Kelvich How did you fix this?
@pdjohntony What error are you facing? If it's a 404, it may be due to not being authenticated with the huggingface CLI.
@ShivamShrirao I managed to get your dreambooth example working, but it's been running for 2 hours now on an A5000. Since that's taking so long, I spun up another instance on Vast with 2 A5000s, but now I'm getting the 404. It shouldn't be an auth issue with huggingface, as I logged in on the CLI and it appeared to download the model for a while before getting this 404 error.
Great work! I managed to run it in a Google Colab. I was just wondering, how do I get checkpoint files that I can use later on from the model files that are stored? I could only find the
@roar-emaus These are the diffusers version of the weights. I have added an inference example in the colab showing how to use them with diffusers. For other repos you will need to convert them.
Thank you! Will test it tomorrow :)
Finally got it to work. How can we reuse the model in a standard Stable Diffusion colab, @ShivamShrirao? I have used the inference example, but how do I save my model? I haven't even been able to find what folder it's in, lol. Any info on how to convert it into a ckpt? Great work!!
I haven't figured out yet how to convert to a single ckpt for use in other repos. Currently the whole folder is your model; you can save the whole folder until someone figures it out. This script needs to be reversed: https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py
@ShivamShrirao If I'm reading things right, 8-bit AdamW should be a drop-in replacement, and the modified CrossAttention class seems like it should be able to replace the one in ldm/modules/attention.py in this repository. Sadly I can't test it myself, because bitsandbytes has a C extension that uses CUDA and I'm on AMD.
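As a rough illustration of that drop-in idea, here is a sketch of an attention function that uses the xformers kernel when available (the xformers call and tensor layout are my assumptions from its documentation, not the exact code from the fork; the fallback is plain attention):

```python
import torch

def efficient_attention(q, k, v):
    """Use xformers' memory-efficient kernel when available, else plain attention.

    q, k, v: (batch, seq_len, dim) tensors. The xformers signature here is an
    assumption based on its docs; a real CrossAttention replacement would call
    this from its forward() after projecting q, k, v.
    """
    try:
        import xformers.ops
        # Never materializes the full (seq, seq) attention matrix.
        return xformers.ops.memory_efficient_attention(q, k, v)
    except ImportError:
        # Standard attention fallback (materializes the full matrix).
        scale = q.shape[-1] ** -0.5
        attn = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
        return attn @ v

q = torch.randn(1, 16, 32)
k = torch.randn(1, 16, 32)
v = torch.randn(1, 16, 32)
out = efficient_attention(q, k, v)
```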
Successfully trained one model, but on my second training run I'm getting an error. @ShivamShrirao Steps: 2% 18/1000 [00:56<45:45, 2.80s/it, loss=0.536, lr=5e-6] Traceback (most recent call last):
Very nice progress! Digging in more now.
@pdjohntony Try updating the transformers library.
@ShivamShrirao I'm assuming you mean only the items in the imv folder make up the ckpt file; I deleted my colab and only saved those items to Google Drive.
In the colab there are no f-strings; they should be there, right? Cheers.
@binarymind Not required here, because it executes as a shell command.
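For context: in a Colab/IPython cell, expressions in `{}` inside a `!` shell command are interpolated from Python variables by IPython itself before the line reaches the shell, so no f-string prefix is needed. A hypothetical cell (variable and flag names are illustrative):

```python
# Colab/IPython cell; IPython substitutes {MODEL_NAME} and {OUTPUT_DIR}
# into the shell command itself, so no f-string is required.
MODEL_NAME = "CompVis/stable-diffusion-v1-4"
OUTPUT_DIR = "/content/model"

!accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path={MODEL_NAME} \
  --output_dir={OUTPUT_DIR}
```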
Ok, thanks! During this cell I got the following result:
My nvidia-smi output is the following:
I also tried to do the
Lol, I fixed my problem by removing the f-strings I added... sorry. Edit: ah, nope, it was not that; I launched the notebook again on a new repo and the problem appeared again. Looking into it.
I'm hoping for a (fingers crossed, not too distant) future version of this that can run within the limits of a 3080. That will put it within reach of many more people, including myself. Keep up the great work!!
I'm not having any success. Trying to use a V100 on colab.
@JoeMcGuire You will need to compile xformers yourself; the current wheels only support the T4 GPU.
There are precompiled xformers wheels for the P100 in this colab; how do I incorporate those into dreambooth? It would cover Colab Pro under installing xformers.
Yeah, right now it's kind of unusable in webuis, and most people are on webuis; huggingface loves their bins. Also, the default of 600 steps is pretty bad; not sure why that's the default? It should be at least 2000.
Any chance to run it on a 12 GB RTX 3060?
@Blucknote Hopefully pretty soon. I have gotten the GPU usage down to 11.187 GB, but there are a few bugs due to which the model output quality isn't good right now, even at higher precision. Will update once the quality gets better.
Can we get a link to the json or a description of that?
Getting "The following values were not passed to". SOLVED with: pip install --upgrade transformers
I've tried both having local directories that match and making sure there are zero that match. So close. Would appreciate any help anybody has to offer.
Hello, I have trained on an RTX 2060 with a stable consumption of 10.8 GB of VRAM and at an amazing speed, between 5 and 10 minutes! These are the details of my configuration:
I obtain very good results.
@konimaki2022 Can you share your notebook?
@guumaster Sorry, I haven't created a notebook in Google Colab yet; I run it on my local computer with Ubuntu 20.04, no cloud.
I think Ubuntu is the key, because on Windows we have to redirect the CUDA drivers to invoke Adam correctly; it has cost two straight days of work. Close, hopefully.
I've learned a lot, and I think a more stable and universal Windows local solution is close.
Awesome!! I assume this won't work with a 10 GB GPU still, due to other apps using it. If anyone knows of a way to get it working with that, such as utilising shared memory (not worrying about a decrease in performance), that would be fantastic!! If not, I look forward to future progress!
@TheChapster It might work on Linux, where you can have no other applications running on the GPU, or it might need just a few modifications. I don't have a 10 GB GPU to test it, so I can't confirm.
Can we get a row or two in the table with all optimizations on except for
@hopibel Check the last row.
Ah, missed it somehow. Dang, looks too close to 16 GB to fit.
With This is using
Now you can convert diffusers weights to ckpt, thanks to https://gist.github.com/jachiam/8a5c0b607e38fcc585168b90c686eb05. I have updated it in my colab.
Can you push it? Thanks.
Like andrae293, I too would like to see you push this so it's available :)
@Jarfeh This repo seems abandoned. Use ShivamShrirao's diffusers fork instead; it includes all the optimizations discussed here and some new ones.
I tried to run the Google Colab. I have an RTX 3060 12 GB but it doesn't work: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 11.75 GiB total capacity; 8.06 GiB already allocated; 1.95 GiB free; 8.12 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
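If anyone wants to try the max_split_size_mb suggestion from that error message, a minimal sketch (128 here is an arbitrary example value, not a recommendation):

```python
import os

# Must be set before PyTorch initializes CUDA, i.e. before the first
# CUDA allocation (safest: set it before `import torch`).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

The same setting can also be exported as an environment variable before launching the training script.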
Hey, so I managed to run Stable Diffusion DreamBooth training in just 17.7 GB of GPU usage by replacing the attention with the memory-efficient flash attention from xformers. Along with using far less memory, it also runs about 2 times faster. So it's possible to train SD on 24 GB GPUs now. Tested on an Nvidia A10G; it took 15-20 mins to train. I hope it's helpful.
Code in my fork: https://github.com/ShivamShrirao/diffusers/blob/main/examples/dreambooth/
Can even train with a batch size of 2.
With some more tweaks it might be possible to train even on 16 GB GPUs.
And it works. Outputs: Me in Fortnite
huggingface/diffusers#554 (comment)
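To illustrate why memory-efficient attention cuts VRAM usage: it processes queries in chunks, so the full (seq, seq) attention matrix is never materialized at once. A toy NumPy sketch of the idea (this is my illustration of the principle, not the actual xformers kernel):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(q, k, v):
    # Materializes the whole (seq, seq) attention matrix at once.
    scale = q.shape[-1] ** -0.5
    return softmax(q @ k.T * scale) @ v

def chunked_attention(q, k, v, chunk=4):
    # Processes queries chunk by chunk: peak intermediate size is
    # (chunk, seq) instead of (seq, seq). Softmax is row-independent,
    # so the result is identical to full attention.
    scale = q.shape[-1] ** -0.5
    out = np.empty((q.shape[0], v.shape[1]))
    for i in range(0, q.shape[0], chunk):
        scores = q[i:i + chunk] @ k.T * scale  # (chunk, seq)
        out[i:i + chunk] = softmax(scores) @ v
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 32)) for _ in range(3))
a = full_attention(q, k, v)
b = chunked_attention(q, k, v)
```

Smaller chunks trade a little speed for a lower peak; the flash-attention kernels do this tiling on-chip, which is why they are faster as well as lighter.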