Replies: 9 comments 21 replies
-
But actually I am not trying to teach the model a specific subject or style. What I am trying to achieve is to make the model better at some general concepts, like how to draw feet properly or how to place two people in the same figure. I have prepared ~3k samples.
-
Where do I execute this command?
In the base folder: C:\Users\ZeroCool22\Desktop\Auto Or in the DB folder: C:\Users\ZeroCool22\Desktop\Auto\extensions\sd_dreambooth_extension?
-
I'm getting the following error when Dreambooth tries to save a checkpoint, more specifically the sample images:
-
Hi, can this training method use multiple models at once, like fast mode in Google Colab?
-
If anyone needs class images / regularization images, I've created a ton of datasets here for people to use with DreamBooth training: https://huggingface.co/datasets/ProGamerGov/StableDiffusion-v1-5-Regularization-Images
-
Almost crying: `Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity;`
-
Do not forget to enable and increase the Windows swap file to ~32GB to avoid CUDA_ERROR_OUT_OF_MEMORY. I've been fiddling with this issue for a very long time!!!
-
For the Dataset Directory field under the Concepts tab, what is the format of the directory path? Something like:
1. `C:\_CODE_Github\stable-diffusion-webui\__inputs\images_and_captions`
2. `__inputs\images_and_captions`
3. `/__inputs/images_and_captions`
I'm on Windows 11.
-
Security concerns
This method involves using DLLs from a third-party repo, which is inherently not safe. I am still figuring out how to build these DLLs myself, but since I know nothing about Windows development or machine learning pipelines and tools in the first place, it may be difficult for me to do.
Steps
0. Preparations
Clean up your VRAM
It is reported to work with only 8GB of VRAM, but that was not my experience. My 3080 Ti struggles even with its 12GB of VRAM: it works in the end, but uses up all 12GB, so free up as much VRAM for training as possible. Close all other programs, unload the VAE (easily done in Settings with the latest webui), and, if possible, connect your monitor to your integrated Intel graphics and run the Windows desktop on it. That last step saved several hundred MB for me.
Prepare a large disk
As with training textual inversion and hypernetworks, you will want to save a checkpoint every so many steps, to verify results, prevent overfitting, or resume from an earlier point with a smaller learning rate. Unlike textual inversion or hypernetworks, each checkpoint is 2GB at half precision, or 4GB at full precision. Be prepared if you choose to train 10,000 steps and save a 2GB file every 500 steps!
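A quick back-of-the-envelope estimate of the disk usage for the example above (half precision assumed):

```python
# Estimate total disk usage for periodically saved DreamBooth checkpoints.
total_steps = 10_000
save_every = 500          # save a checkpoint every N steps
gb_per_checkpoint = 2     # half precision; use 4 for full precision

checkpoints = total_steps // save_every
total_gb = checkpoints * gb_per_checkpoint
print(f"{checkpoints} checkpoints, ~{total_gb} GB of disk")
# → 20 checkpoints, ~40 GB of disk
```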
Prepare your data
You can use Train -> Preprocess images in the webui to easily crop images and create flipped copies. Just remember that this version of DreamBooth does not read tags from a separate txt file and expects every file in the folder to be an image, so do not tag your images. You might be able to use a concept JSON file instead, though I am still figuring out how.
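Since the extension expects every file in the dataset folder to be an image, it can help to scan for stray tag files before training. A minimal sketch (the function name and extension list are my own assumptions, not part of the extension):

```python
from pathlib import Path

# Extensions treated as images; adjust to match your dataset.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}

def stray_files(dataset_dir: str) -> list[str]:
    """Return names of files that are not images and would confuse training."""
    return sorted(p.name for p in Path(dataset_dir).iterdir()
                  if p.is_file() and p.suffix.lower() not in IMAGE_EXTS)
```

Running it on your dataset directory and deleting anything it reports (e.g. leftover `.txt` caption files) avoids the extension choking mid-run.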
1. Installation
First, install the awesome sd_dreambooth_extension by d8ahazard. You need to restart the webui after installation so it can install its new requirements. If you are on Linux, that's it.
If you use Windows, however, you need an extra step, because the 8-bit AdamW optimizer from bitsandbytes is not officially supported on Windows. I think the latest version of sd_dreambooth_extension now packs pre-compiled Windows DLLs from https://github.com/bmaltais/kohya_ss, but only for cu116, and by default the webui installs pytorch+cu113 for you. So you need to manually upgrade:
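Whichever upgrade route you take, you can sanity-check which CUDA build of PyTorch is active by inspecting `torch.__version__`. A minimal sketch of the string parsing (the helper name is mine; in a live environment you would pass `torch.__version__` instead of a literal):

```python
# Sketch: extract the CUDA build tag from a PyTorch version string,
# e.g. "1.12.1+cu116" -> "cu116".
def cuda_build(torch_version: str) -> str:
    """Return the build suffix, e.g. 'cu113' or 'cu116', or 'cpu' if none."""
    return torch_version.split("+", 1)[1] if "+" in torch_version else "cpu"

print(cuda_build("1.12.1+cu113"))  # → cu113 (the default webui install)
print(cuda_build("1.12.1+cu116"))  # → cu116 (what the bundled DLLs expect)
```

If the tag does not read `cu116`, the pre-compiled bitsandbytes DLLs will not match your install.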
2. Already Fixed
~~A little code fixup~~
The DreamBooth plugin should now work for you if you are using SD v1.4 or v1.5. However, if you are using NAI, when you try to create a model you may encounter d8ahazard/sd_dreambooth_extension#6 and get the following error:
RuntimeError: Error(s) in loading state_dict for CLIPTextModel:
Missing key(s) in state_dict: "text_model.embeddings.position_ids", ...
Unexpected key(s) in state_dict: "embeddings.position_ids", ...
If so, until d8ahazard addresses this issue, you need to edit dreambooth\conversion.py line 715 from
`text_model_dict[key[len("cond_stage_model.transformer."):]] = checkpoint[key]`
to
`text_model_dict["text_model." + key[len("cond_stage_model.transformer."):]] = checkpoint[key]`
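The one-line edit above amounts to remapping state_dict keys: strip the `cond_stage_model.transformer.` prefix and prepend `text_model.` so the keys match what CLIPTextModel expects. A standalone sketch of that transformation (the helper name is mine):

```python
def remap_clip_keys(checkpoint: dict) -> dict:
    """Rename NAI-style CLIP keys to the layout CLIPTextModel expects."""
    prefix = "cond_stage_model.transformer."
    text_model_dict = {}
    for key, value in checkpoint.items():
        if key.startswith(prefix):
            # Strip the old prefix and add the "text_model." one.
            text_model_dict["text_model." + key[len(prefix):]] = value
    return text_model_dict

example = {"cond_stage_model.transformer.embeddings.position_ids": 0}
print(remap_clip_keys(example))
# → {'text_model.embeddings.position_ids': 0}
```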
Then restart the webui. After that you can create and train your model.
3. Training settings
You only need to fill in Instance prompt and Dataset directory; I think the rest of the defaults are good enough. Then scroll down and expand Advanced; it is important to change the following:
Then, happy training! I get a speed of around 3 it/s, which is identical to training textual inversion and hypernetworks. You can tinker with the step count and other settings, as I am also still experimenting.
Some quick results
This is a bear girl drawn by me, and I call her Mishka. It is the only image of this original character. I created a flipped copy of her, so the input is a total of 2 samples. I did not use classification images.
I started from animefull-final-pruned from NAI, and after 500 steps the model can already remember her well:
compared to the thousands of steps required by TI. And I think its results are much better than vanilla TI's, because when training with only 2 samples, vanilla TI always gives me a lot of periodic noise in the background, and it is very difficult to change the composition. On the other hand, #2945 can give similarly good results, but requires ~30k steps.
However, it seems 500 steps is already too many: the model is so overfitted that it can no longer create a girl who doesn't look like Mishka.
Possible optimization - Done
After switching to the diffusers fork by ShivamShrirao, xformers is automatically enabled, reducing memory usage from 12GB to 11.5GB for me. Not much, but it helps leave a safe margin.
I found in dreambooth/dreambooth.py line 198 that xformers is unloaded before training, similar to the behavior before TI and hypernetwork training. However, in the latest webui it is possible to keep the xformers optimization during TI, which allows TI on 6GB cards, and after the xformers attention block fix the results are no longer degraded. So maybe it is possible to keep the xformers optimization during DreamBooth training as well, to save some extra VRAM? I tested keeping xformers, and the Mishka results are still good, but further experiments are required.