
Add training example for DreamBooth. #554

Merged
merged 35 commits into from
Sep 27, 2022

Conversation

Victarry
Contributor

Add DreamBooth training example.

One question is how to specify the identifier [V] of input prompt to bind with the concept of subject.
The original paper says using random sampling of rare-tokens to generate the identifier. Should we include this logic in the training script?
Currently I use sks as in https://github.com/XavierXiao/Dreambooth-Stable-Diffusion.
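For readers following along: the identifier is just a rare token string spliced into the training prompts. A minimal sketch of how the instance and class prompts are typically formed for DreamBooth (the function name here is illustrative, not the script's actual API):

```python
def build_prompts(identifier: str, class_name: str):
    """Build the DreamBooth prompt pair for a given identifier and class.

    The instance prompt binds the rare identifier to the subject; the class
    prompt (no identifier) is used to generate prior-preservation images.
    """
    instance_prompt = f"a photo of {identifier} {class_name}"
    class_prompt = f"a photo of {class_name}"
    return instance_prompt, class_prompt

# With the "sks" identifier used in this PR:
print(build_prompts("sks", "dog"))
# ('a photo of sks dog', 'a photo of dog')
```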

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Sep 18, 2022

The documentation is not available anymore as the PR was closed or merged.

@ghpkishore

ghpkishore commented Sep 19, 2022

Hey @Victarry, I had asked the author of the paper and he replied: "The special token we create is different from Gal et al. — we create a rare token and then finetune the model instead of the text embedding."

Regarding incorporating that logic into this script, my guess is that it's not required. This is the response @patrickvonplaten gave when I asked a similar question:

According to our philosophy: https://github.com/huggingface/diffusers/tree/main/examples#-diffusers-examples we don't want to provide "one-script-fits-it-all" examples, but rather relatively simple scripts that one can easily tweak. In your case I'd highly recommend going into the example code to make the model trainable as well.

Contributor

@patil-suraj left a comment


Very cool, @Victarry!
Great start — the script is looking good, but we need to address a few things before merging.
I left some comments below. More specifically:

  • We need to handle the class image generation in multi-gpu setting. I can help with this.
  • Wrap the text_encoder and vae in torch.no_grad as we don't train them.
  • Check if concatenating the batch for prior preservation loss causes issues in low memory GPUs. I can help here.
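The second bullet can be sketched like this (toy stand-ins for the frozen models; the real script wraps the actual CLIP text encoder and VAE):

```python
import torch
import torch.nn as nn

# Toy stand-ins for the frozen text_encoder / vae and the trainable unet.
text_encoder = nn.Linear(8, 8)
vae = nn.Linear(8, 8)
trainable_unet = nn.Linear(8, 8)

x = torch.randn(2, 8)

# Frozen models run under no_grad: no activations are kept for backward,
# which saves memory and guarantees their weights receive no gradients.
with torch.no_grad():
    emb = text_encoder(x)
    latents = vae(x)

# Only the trainable model runs with grad tracking enabled.
out = trainable_unet(emb + latents)
out.sum().backward()

assert text_encoder.weight.grad is None       # frozen: no gradient
assert trainable_unet.weight.grad is not None  # trainable: gradient flows
```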

Apart from this, we can add a helper script to do rare token detection, that will be useful. I will look into it.

Let me know if you have any questions, thanks a lot!
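On the third bullet, prior preservation concatenates the instance and class batches into one forward pass, then splits the prediction to weight the two losses separately. A toy sketch of that loss computation (random tensors stand in for the UNet's noise predictions; not the script's exact code):

```python
import torch
import torch.nn.functional as F

prior_loss_weight = 1.0  # matches the --prior_loss_weight value used in this thread

# Stand-ins for the UNet's predictions on a concatenated batch:
# first half = instance images, second half = class (prior) images.
model_pred = torch.randn(4, 3, 8, 8)
target = torch.randn(4, 3, 8, 8)

# Split the concatenated batch back into its two halves.
pred_instance, pred_prior = model_pred.chunk(2, dim=0)
target_instance, target_prior = target.chunk(2, dim=0)

# The instance loss binds the identifier to the subject; the prior loss
# keeps the class distribution from drifting (the paper's
# prior-preservation term).
instance_loss = F.mse_loss(pred_instance, target_instance)
prior_loss = F.mse_loss(pred_prior, target_prior)
loss = instance_loss + prior_loss_weight * prior_loss
```

Concatenating the two halves keeps it to one forward pass per step, at the cost of a larger effective batch — which is exactly why it can be an issue on low-memory GPUs.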

Contributor

@patil-suraj left a comment


Very cool, the PR looks good! Will run it on both single and multi-gpu to verify and then it should be good to merge. Thanks a lot for working on this.

@patil-suraj patil-suraj merged commit 3b747de into huggingface:main Sep 27, 2022
@ShivamShrirao

ShivamShrirao commented Sep 27, 2022

Wow, using the 8-bit Adam optimizer from bitsandbytes along with xformers reduces the memory usage to 12.5 GB.
Colab: https://colab.research.google.com/github/ShivamShrirao/diffusers/blob/main/examples/dreambooth/DreamBooth_Stable_Diffusion.ipynb
Code: https://github.com/ShivamShrirao/diffusers/blob/main/examples/dreambooth/
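The optimizer swap is essentially a one-line change in the training script. A hedged sketch of the selection logic (bitsandbytes' `AdamW8bit` is a drop-in for torch's `AdamW`; the fallback keeps the script usable without the library, and the helper name here is illustrative):

```python
import torch

def create_optimizer(params, lr: float = 5e-6, use_8bit_adam: bool = True):
    """Return bitsandbytes' 8-bit AdamW when requested and available,
    falling back to the standard torch AdamW otherwise."""
    if use_8bit_adam:
        try:
            import bitsandbytes as bnb
            return bnb.optim.AdamW8bit(params, lr=lr)
        except ImportError:
            print("bitsandbytes not installed; falling back to torch AdamW")
    return torch.optim.AdamW(params, lr=lr)

model = torch.nn.Linear(4, 4)
opt = create_optimizer(model.parameters())
```

The 8-bit optimizer stores its state (momentum/variance) in quantized form, which is where most of the memory saving comes from.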

@chavinlo
Contributor

chavinlo commented Sep 27, 2022

Wow, using the 8-bit Adam optimizer from bitsandbytes along with xformers reduces the memory usage to 12.5 GB. Code: https://github.com/ShivamShrirao/diffusers/blob/main/examples/dreambooth/

I can confirm it even runs on the Colab free tier, T4 GPU.
nvidia-smi (before training): [screenshot]
training, note the peak in VRAM: [screenshot]

Edit: it failed with an AttributeError on enable_gradient_checkpointing (likely due to the diffusers version it gets executed with), but at least training works:

Traceback (most recent call last):
  File "train_dreambooth.py", line 606, in <module>
    main()
  File "train_dreambooth.py", line 408, in main
    unet.enable_gradient_checkpointing()
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1208, in __getattr__
    type(self).__name__, name))
AttributeError: 'UNet2DConditionModel' object has no attribute 'enable_gradient_checkpointing'
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/usr/local/lib/python3.7/dist-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.7/dist-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'train_dreambooth.py', '--pretrained_model_name_or_path=CompVis/stable-diffusion-v1-4', '--use_auth_token', '--instance_data_dir=dog', '--class_data_dir=dog', '--output_dir=model', '--with_prior_preservation', '--prior_loss_weight=1.0', '--instance_prompt=a photo of sks dog', '--class_prompt=a photo of dog', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=2', '--gradient_checkpointing', '--use_8bit_adam', '--learning_rate=5e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--num_class_images=200', '--max_train_steps=800']' returned non-zero exit status 1.
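The AttributeError above just means the installed diffusers predates gradient-checkpointing support on UNet2DConditionModel; upgrading diffusers is the real fix. A defensive guard avoids the hard crash, sketched here with a stub object standing in for the model:

```python
class StubUNet:
    """Stand-in for a UNet2DConditionModel from an older diffusers release
    that lacks enable_gradient_checkpointing()."""
    pass

unet = StubUNet()

# Guard the call so the script warns instead of crashing with AttributeError
# when the method is missing on the installed diffusers version.
if hasattr(unet, "enable_gradient_checkpointing"):
    unet.enable_gradient_checkpointing()
else:
    print("enable_gradient_checkpointing unavailable; please upgrade diffusers")
```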

@Thomas-MMJ

Hmm, I'd have thought you'd mention my pull request on this...

ShivamShrirao#1

@ShivamShrirao

@Thomas-MMJ oh hey, sorry I didn't see your pull request. I had it in mind to try it, just needed to sleep lol.

@vakker

vakker commented Sep 27, 2022

This is really great; I played around with it a bit.
However, it produces quite low-quality results with the default settings.
E.g. I tried to reproduce the results from the paper with this dog:
[reference photo of the dog]

I used all 5 reference images from here.

The output for "A sleeping sks dog" looks like:
[montage of samples]

What made it work well for you?

Edit: better image montage

@rjadr

rjadr commented Sep 28, 2022

Same here. Basically, only "A photo of sks" delivers reasonable results. Any attempt to enhance quality, e.g. "A photo of sks, trending on artstation", already changes the semantics so much that sks is unidentifiable.

@n00mkrad

Same here. Basically, only "A photo of sks" delivers reasonable results. Any attempt to enhance quality, e.g. "A photo of sks, trending on artstation", already changes the semantics so much that sks is unidentifiable.

Same problem, my trained subject does not appear in 80% of results if I alter the prompt just a tiny bit.

@vakker

vakker commented Sep 28, 2022

For the record, the correct prompt should be A photo of sks dog.
For me, the results for that are (20 random samples; the black pics are flagged as NSFW):

[montage of samples]

@1blackbar

Sounds like overfitting, which DreamBooth was made to combat.

@patil-suraj
Contributor

Hi! If you see any issue in the script, please open an issue; for general discussions like this, feel free to join the Discord: https://discord.gg/G7tWnz98XR

@jslegers

jslegers commented Sep 29, 2022

Same problem, my trained subject does not appear in 80% of results if I alter the prompt just a tiny bit.

Same here as well...

I uploaded between 50 and 80 pics of myself, with 800 training steps.

A very basic prompt like photo of sks guy or Detailed portrait of sks guy produces somewhat reasonable results. It's kind of hit-and-miss, with some results looking a lot like me, some not at all, and most somewhere in between. When I use anything beyond such basic prompts, however, the results don't look even remotely like me a single time.

It seems there's some missing link between the CompVis/stable-diffusion-v1-4 model that was used to do the training and the model that was produced...

@patil-suraj
Contributor

Hi @jslegers and @n00mkrad, could you please open an issue? Would be happy to take a look.

@jslegers

jslegers commented Oct 2, 2022

Hi @jslegers and @n00mkrad, could you please open an issue? Would be happy to take a look.

The issue may just be a poor choice of parameters.

Are you aware of any best practices regarding the number of training steps, the number of class images generated, the choice of class name & class prompt, the choice of concept name, etc.? I suspect my issues are more a matter of this than an issue with the actual code...

@Duemellon

Are there any webui for Dreambooth yet? I mean, to use it to train.

@n00mkrad

n00mkrad commented Oct 2, 2022

Are there any webui for Dreambooth yet? I mean, to use it to train.

Why would you want a GUI for training?

@jd-3d

jd-3d commented Oct 2, 2022

Are there any webui for Dreambooth yet? I mean, to use it to train.

Why would you want a GUI for training?

I certainly would love one, as I hate using the command prompt and there are a lot of steps to using DreamBooth. The GUI could even handle resizing of input photos to make things easier, and it could manage all the custom trained models.

@jhsu888

jhsu888 commented Oct 2, 2022

Just wanted to say I've had a chance to try Shivam's Colab notebook, and it worked great! I took 7 photos of myself around the house, from different angles and in different lighting; I even changed my shirt. I was really surprised how fast the training was, ~10 min on a V100.

For anyone having trouble, I would suggest changing the token to your own name, or something that evokes what your subject is; I simply used my "firstnamelastname" as the token. And don't forget to change the name of the destination folder as well. I think some of the problems people are having come from using the default "sks", which is actually a term for a type of rifle.

I had tried some initial tests using another set of random letters as the token, because I thought I would want something totally unique, but I feel like using my name actually gave the model more context to draw from other faces associated with my name and fill in the blanks.

I also didn't use the class at all, even though JoePenna's version says to use it, and I thought my results were very strong. Everything else was default for me.

My big request is having a .ckpt output so I can use the model in other notebooks like Deforum and WarpFusion. I've heard rumor it's being worked on, so I just want to add my support for the idea.

Another bonus would be a pruning function to compress the model further so it takes up less storage space; I have no idea if something like that is already being implemented. JoePenna's version can be compressed to 2 GB, but I've heard his notebook takes more like ~1 hr to train. It would be great to have the best of both worlds.

Thanks for developing this, it's pretty amazing!

@jslegers

jslegers commented Oct 3, 2022

My big request is having a .ckpt output so I can use the model in other notebooks like Deforum and WarpFusion. I've heard rumor it's being worked on, so I just want to add my support for the idea.

The first converter scripts are popping up already. See AUTOMATIC1111/stable-diffusion-webui#1429 (comment)

I haven't tried any yet, but they look promising...

@jhsu888

jhsu888 commented Oct 3, 2022

My big request is having a .ckpt output so I can use the model in other notebooks like Deforum and WarpFusion. I've heard rumor it's being worked on, so I just want to add my support for the idea.

The first converter scripts are popping up already. See AUTOMATIC1111/stable-diffusion-webui#1429 (comment)

I haven't tried any yet, but they look promising...

Awesome, I'll check them out. Thanks!

@Duemellon

Are there any webui for Dreambooth yet? I mean, to use it to train.

Why would you want a GUI for training?

#1 - Command-line entries are archaic. We moved past those for the typical user in the late '80s. Syntax errors, bad prompt "grammar", etc., just inhibit wide use.

#2 - There are a lot of features that can be automated that way. As mentioned by jd-3d, there are image conversions, data validations, runtime estimations, and batching you can do that way.

#3 - The more people who can use this, the more gets created, the quicker everything gets cataloged, and the sooner it's all at your fingertips.

It's a win all around.

@affableroots

To users claiming bad results: I wonder if the DreamBooth example is actually flawed.
I wrote up some findings in #712; maybe someone brighter than me can comment?

@jslegers

jslegers commented Oct 4, 2022

I've been testing ShivamShrirao's fork for several days now, with my own Google Colab notebook to add some sprinkles on top.

For the initial training I just used my full name "johnslegers" as the concept and "man" as the class. Then I tried to retrain the output model with different input pics, different class pics & different prompt settings, to test whether it's possible to finetune an already finetuned model further for the same concept.

Using this strategy, I've managed to generate some pretty decent renders of my younger self...

Some of the renders generated

[images]

Actual photos of me used as input for the training process

[images]

@patil-suraj
Contributor

Thanks a lot for sharing this @jslegers !

@vakker

vakker commented Oct 18, 2022

There's this site for running dreambooth: http://fine-tune-sd.com/

prathikr pushed a commit to prathikr/diffusers that referenced this pull request Oct 26, 2022
* Add training example for DreamBooth.

* Fix bugs.

* Update readme and default hyperparameters.

* Reformatting code with black.

* Update for multi-gpu trianing.

* Apply suggestions from code review

* improgve sampling

* fix autocast

* improve sampling more

* fix saving

* actuallu fix saving

* fix saving

* improve dataset

* fix collate fun

* fix collate_fn

* fix collate fn

* fix key name

* fix dataset

* fix collate fn

* concat batch in collate fn

* add grad ckpt

* add option for 8bit adam

* do two forward passes for prior preservation

* Revert "do two forward passes for prior preservation"

This reverts commit 661ca46.

* add option for prior_loss_weight

* add option for clip grad norm

* add more comments

* update readme

* update readme

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add docstr for dataset

* update the saving logic

* Update examples/dreambooth/README.md

* remove unused imports

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
@rmac85

rmac85 commented Nov 25, 2022

The inference cell on this Colab is broken:

/usr/local/lib/python3.7/dist-packages/diffusers/utils/deprecation_utils.py:35: FutureWarning: The configuration file of this scheduler: DDIMScheduler {
"_class_name": "DDIMScheduler",
"_diffusers_version": "0.9.0.dev0",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"clip_sample": false,
"num_train_timesteps": 1000,
"prediction_type": "epsilon",
"set_alpha_to_one": false,
"steps_offset": 0,
"trained_betas": null
}
is outdated. steps_offset should be set to 1 instead of 0. Please make sure to update the config accordingly as leaving steps_offset might led to incorrect results in future versions. If you have downloaded this checkpoint from the Hugging Face Hub, it would be very nice if you could open a Pull request for the scheduler/scheduler_config.json file
warnings.warn(warning + message, FutureWarning)
/usr/local/lib/python3.7/dist-packages/diffusers/utils/deprecation_utils.py:35: FutureWarning: The configuration file of the unet has set the default sample_size to smaller than 64 which seems highly unlikely .If you're checkpoint is a fine-tuned version of any of the following:

  • CompVis/stable-diffusion-v1-4
  • CompVis/stable-diffusion-v1-3
  • CompVis/stable-diffusion-v1-2
  • CompVis/stable-diffusion-v1-1
  • runwayml/stable-diffusion-v1-5
  • runwayml/stable-diffusion-inpainting
    you should change 'sample_size' to 64 in the configuration file. Please make sure to update the config accordingly as leaving sample_size=32 in the config might lead to incorrect results in future versions. If you have downloaded this checkpoint from the Hugging Face Hub, it would be very nice if you could open a Pull request for the unet/config.json file
    warnings.warn(warning + message, FutureWarning)

AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>
      7
      8 scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False)
----> 9 pipe = StableDiffusionPipeline.from_pretrained(model_path, scheduler=scheduler, safety_checker=None, torch_dtype=torch.float16).to("cuda")
     10
     11 g_cuda = None

2 frames
/usr/local/lib/python3.7/dist-packages/diffusers/pipeline_utils.py in register_modules(self, **kwargs)
    147                 register_dict = {name: (None, None)}
    148             else:
--> 149                 library = module.__module__.split(".")[0]
    150
    151                 # check if the module is a pipeline module

AttributeError: 'list' object has no attribute '__module__'

It was working fine yesterday. I have not yet tried training a new model, but I assume generating samples may cause the same issues.

@patrickvonplaten
Contributor

Hey @rmac85,

Could you please open a new issue? Note that this is a merged PR so we won't look into comments here anymore. If you open a new issue we're more than happy to take a look!


@abhinavsrepository left a comment


DreamBooth is a deep learning-based tool that can be used to personalize existing text-to-image models. It works by fine-tuning a text-to-image model on a few images of a specific subject. This allows the model to learn the unique characteristics of the subject and generate more personalized and realistic images of it.
