Replies: 9 comments 21 replies
-
But actually I am not trying to teach the model a specific subject or style. What I am trying to achieve is to make the model better at some general concepts, like how to draw feet properly or how to place two people in the same figure. I have prepared ~3k samples.
-
Where do I execute this command?
In the base folder: C:\Users\ZeroCool22\Desktop\Auto Or in the DB folder: C:\Users\ZeroCool22\Desktop\Auto\extensions\sd_dreambooth_extension?
-
I'm getting the following error when Dreambooth tries to save a checkpoint, more specifically the sample images:
-
Hi, can this training method use multiple models at once, like fast mode in Google Colab?
-
If anyone needs class images / regularization images, I've created a ton of datasets here for people to use with DreamBooth training: https://huggingface.co/datasets/ProGamerGov/StableDiffusion-v1-5-Regularization-Images
-
Almost crying: `Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity;`
-
Do not forget to enable and increase the Windows swap file to ~32GB to avoid CUDA_ERROR_OUT_OF_MEMORY. I've been fiddling with this issue for a very long time!!!
-
For the Dataset Directory field under the Concepts tab, what is the format of the directory path? Something like:
1. `C:\_CODE_Github\stable-diffusion-webui\__inputs\images_and_captions`
2. `__inputs\images_and_captions`
3. `/__inputs/images_and_captions`
I'm on Windows 11.
-
Security concerns
This method involves using DLLs from a third-party repo, which is inherently not safe. I am still figuring out how to build these DLLs myself, but since I know nothing about Windows development or machine learning pipelines and tools in the first place, it may be difficult for me to do.
Steps
0. Preparations
Clean up your VRAM
It is reported to work with only 8GB of VRAM, but that was not my experience. My 3080 Ti struggles even with its 12GB of VRAM: it works in the end, but uses up all 12GB, so free up as much VRAM for training as possible. Close all other programs, unload the VAE (easily done in Settings with the latest webui), and, if possible, connect your monitor to your integrated Intel graphics and run the Windows desktop on it. That last step saved several hundred MB for me.
Prepare a large disk
As with training textual inversion and hypernetworks, you will want to save a checkpoint every so many steps, to verify results, prevent overfitting, or resume from an earlier point with a smaller learning rate. Unlike textual inversion or hypernetworks, each checkpoint is 2GB at half precision, or 4GB at full precision. Be prepared if you choose to train 10,000 steps and save a 2GB file every 500 steps!
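A quick back-of-the-envelope estimate of the disk usage for the example above (half precision assumed):

```python
# Estimate total disk usage for periodically saved DreamBooth checkpoints.
total_steps = 10_000
save_every = 500          # save a checkpoint every N steps
gb_per_checkpoint = 2     # half precision; use 4 for full precision

checkpoints = total_steps // save_every
total_gb = checkpoints * gb_per_checkpoint
print(f"{checkpoints} checkpoints, ~{total_gb} GB of disk")
# → 20 checkpoints, ~40 GB of disk
```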
Prepare your data
You can use Train -> Preprocess images in the webui to easily crop images and create flipped copies. Just remember that this version of DreamBooth does not read tags from a separate txt file and expects every file in the folder to be an image, so do not tag your images. You might be able to use a concept JSON file instead, though I am still figuring out how.
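Since the extension expects every file in the dataset folder to be an image, it can help to scan for stray tag files before training. A minimal sketch (the function name and extension list are my own assumptions, not part of the extension):

```python
from pathlib import Path

# Extensions treated as images; adjust to match your dataset.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}

def stray_files(dataset_dir: str) -> list[str]:
    """Return names of files that are not images and would confuse training."""
    return sorted(p.name for p in Path(dataset_dir).iterdir()
                  if p.is_file() and p.suffix.lower() not in IMAGE_EXTS)
```

Running it on your dataset directory and deleting anything it reports (e.g. leftover `.txt` caption files) avoids the extension choking mid-run.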
1. Installation
First, install the awesome sd_dreambooth_extension by d8ahazard. You need to restart the webui after installation so it can install its new requirements. If you are on Linux, that's it.
If you use Windows, however, you need an extra step, because the 8-bit AdamW optimizer from bitsandbytes is not officially supported on Windows. I think the latest version of sd_dreambooth_extension now packs pre-compiled Windows DLLs from https://github.com/bmaltais/kohya_ss, but only for cu116, and by default the webui installs pytorch+cu113 for you. So you need to manually upgrade:
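Whichever upgrade route you take, you can sanity-check which CUDA build of PyTorch is active by inspecting `torch.__version__`. A minimal sketch of the string parsing (the helper name is mine; in a live environment you would pass `torch.__version__` instead of a literal):

```python
# Sketch: extract the CUDA build tag from a PyTorch version string,
# e.g. "1.12.1+cu116" -> "cu116".
def cuda_build(torch_version: str) -> str:
    """Return the build suffix, e.g. 'cu113' or 'cu116', or 'cpu' if none."""
    return torch_version.split("+", 1)[1] if "+" in torch_version else "cpu"

print(cuda_build("1.12.1+cu113"))  # → cu113 (the default webui install)
print(cuda_build("1.12.1+cu116"))  # → cu116 (what the bundled DLLs expect)
```

If the tag does not read `cu116`, the pre-compiled bitsandbytes DLLs will not match your install.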
2. Already Fixed
~~A little code fixup~~
The DreamBooth plugin should now work for you if you are using SD v1.4 or v1.5. However, if you are using NAI, when you try to create a model you may encounter d8ahazard/sd_dreambooth_extension#6 and get the following error:
RuntimeError: Error(s) in loading state_dict for CLIPTextModel:
Missing key(s) in state_dict: "text_model.embeddings.position_ids", ...
Unexpected key(s) in state_dict: "embeddings.position_ids", ...
If so, until d8ahazard addresses this issue, you need to edit dreambooth\conversion.py line 715 from
`text_model_dict[key[len("cond_stage_model.transformer."):]] = checkpoint[key]`
to
`text_model_dict["text_model." + key[len("cond_stage_model.transformer."):]] = checkpoint[key]`
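The one-line edit above amounts to remapping state_dict keys: strip the `cond_stage_model.transformer.` prefix and prepend `text_model.` so the keys match what CLIPTextModel expects. A standalone sketch of that transformation (the helper name is mine):

```python
def remap_clip_keys(checkpoint: dict) -> dict:
    """Rename NAI-style CLIP keys to the layout CLIPTextModel expects."""
    prefix = "cond_stage_model.transformer."
    text_model_dict = {}
    for key, value in checkpoint.items():
        if key.startswith(prefix):
            # Strip the old prefix and add the "text_model." one.
            text_model_dict["text_model." + key[len(prefix):]] = value
    return text_model_dict

example = {"cond_stage_model.transformer.embeddings.position_ids": 0}
print(remap_clip_keys(example))
# → {'text_model.embeddings.position_ids': 0}
```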
Then restart the webui. After that you can create and train your model.
3. Training settings
You only need to fill in Instance prompt and Dataset directory; I think the rest of the defaults are good enough. Then scroll down and expand Advanced; it is important to change the following:
Then, happy training! I get a speed of around 3 it/s, which is identical to training textual inversion and hypernetworks. You can tinker with the step count and other settings, as I am also still experimenting.
Some quick results
This is a bear girl drawn by me, and I call her Mishka. It is the only image of this original character. I created a flipped copy of her, so the input is a total of 2 samples. I did not use classification images.
I started from animefull-final-pruned from NAI, and after 500 steps the model can already remember her well:
compared to the thousands of steps required by TI. And I think its results are much better than vanilla TI's, because when training with only 2 samples, vanilla TI always gives me a lot of periodic noise in the background, and it is very difficult to change the composition. On the other hand, #2945 can give similarly good results, but requires ~30k steps.
However, it seems 500 steps is already too many: the model is so overfitted that it can no longer create a girl who doesn't look like Mishka.
Possible optimization - Done
After switching to the diffusers fork by ShivamShrirao, xformers is automatically enabled, reducing memory usage from 12GB to 11.5GB for me. Not much, but it helps leave a safe margin.
I found in dreambooth/dreambooth.py line 198 that xformers is unloaded before training, similar to the behavior before TI and hypernetwork training. However, in the latest webui it is possible to keep the xformers optimization during TI, which allows TI on 6GB cards, and after the xformers attention block fix the results are no longer degraded. So maybe it is possible to keep the xformers optimization during DreamBooth training as well, to save some extra VRAM? I tested keeping xformers, and the Mishka results are still good, but further experiments are required.