Dreambooth support #995

Closed · Any-Winter-4079 opened this issue Oct 8, 2022 · 13 comments
Labels: dreambooth, enhancement

Comments

@Any-Winter-4079 (Contributor)

This issue is to discuss Dreambooth support: whether to fully integrate it in this repo (training and inference), or to train via a third party (for example, Colab) and do inference in this repo.
Discussion and comparison with regular Textual Inversion is also encouraged.

Any-Winter-4079 added the enhancement label on Oct 8, 2022
@hipsterusername (Member)

+1

Will also note there have been discussions of making it easy to generate (or import) new concepts from the WebUI. This should support both textual inversion & dreambooth, and plans include having a "library" of these for ongoing use.

I think, given the purpose and intent of this repo, full integration should be the aim.

@Any-Winter-4079 (Contributor, Author)

Interesting comparison:
https://www.reddit.com/r/StableDiffusion/comments/xjlv19/comparison_of_dreambooth_and_textual_inversion/

What I've noticed:

Textual inversion:

- Excels at style transfer, e.g. "elephant in the style of Marsey".
- May benefit from more images: my run with 74 images performed better than the one with 3.
- Best results (both in terms of style transfer and character preservation) at ~25,000 steps.

DreamBooth (model download):

- Far, far better for my use case. The character is more editable and the composition improves. It doesn't match the art style quite as well, though.
- 3 images worked better than 72.
- Works extremely well with cross-attention prompt2prompt (the "img2img alternative test" script in automatic1111's UI).
- 1,000 steps (~30 min on an A6000) is sufficient for good results.
- Worth mentioning: it's usable with deforum for animations.

Combining the two doesn't seem to work, unfortunately. The next step might be either to directly fine-tune the network itself and apply one of these techniques afterwards, or possibly to train the classifier.

@Any-Winter-4079 (Contributor, Author)

My best success at Textual Inversion was with 3-5 images, so I'll try with many more.
Also, it seems Dreambooth may better preserve the character.
Results from post above:
Dreambooth
[screenshot]
Textual Inversion
[screenshot]

@tildebyte (Contributor)

If you're me (or like me), and you're wondering about the difference between the two: AFAIU (big caveat), TI can't do things like taking a fine-tuned (Dreambooth) model which is capable of creating these

[image]

and creating this

[image]

which is just utterly amazing.

@Any-Winter-4079 (Contributor, Author)

@tildebyte I tried this Colab (the one in the reddit post you shared), but on the free tier you have to pass --use_8bit_adam and forgo full precision, which seemingly affects quality (I trained with images of myself and it performed worse than TI).

It was a few days ago, so I may try again, just to see if they've introduced any other improvements.
Did you have any success with that Colab?

@tildebyte (Contributor) commented Oct 9, 2022

Colab doesn't have a free tier anymore (AFAIU); I haven't touched it since the change.

Oh, nvm. There's still a free tier, but they've added a "pay-as-you-go" tier.

TL;DR - I haven't done anything in Colab since SD dropped 😁

@Any-Winter-4079 (Contributor, Author)

I think there is potential in TI, but I want to get Dreambooth to work and compare them.

@tildebyte Here are some TI results, for comparison with what you refer to in #995 (comment)
Original images: #517 (comment). I used 5 of the 7 images to train.
After training TI with this repo (changing to num_vectors_per_token: 6 in v1-inference.yaml), these are some of the results I can achieve.
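For anyone wanting to reproduce this, the training run looked roughly like the textual-inversion command from this repo's docs at the time (a sketch; the run name and data path below are placeholders):

```bash
# Sketch of a textual-inversion training run with this repo's main.py
# (per the repo's textual-inversion docs at the time; the run name -n
# and --data_root path are placeholders).
# Note: num_vectors_per_token in the config must match what you use at
# inference, as mentioned above.
python main.py \
  --base ./configs/stable-diffusion/v1-finetune.yaml \
  -t \
  --actual_resume ./models/ldm/stable-diffusion-v1/model.ckpt \
  -n my_character \
  --gpus 0, \
  --data_root ./path/to/training_images
```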

Workflow 1

Use txt2img playing with prompt weighting (N)

txt2img
"a painting of * :N on the beach in the style of van gogh" -s 50 -S 989419747 -W 512 -H 512 -C 7.5 -A k_lms for several N values.
[image]

Workflow 2

Obtain a concept with txt2img and swap it with * using img2img playing with different -f (strength) values

txt2img
"close-up of a man low poly" -s 50 -S 3319463269 -W 512 -H 512 -C 7.5 -A k_lms
[screenshot]
img2img
"close-up of * low poly" -I outputs/preflight/000310.3319463269.png -S3319463269 for several -f values
[image]

txt2img
"Funky pop african man face figurine, product studio shot, on a white background, diffused lighting, centered" -s 50 -S 3231549968 -W 512 -H 512 -C 7.5 -A k_lms
[screenshot]
inspired by https://publicprompts.art/funky-pop/
img2img
"Funky pop * face figurine, product studio shot, on a white background, diffused lighting, centered" -I outputs/preflight/000384.3231549968.png -S3231549968 for several -f values
[image]
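If you want to sweep -f values without retyping the prompt, one option (assuming the CLI's --from_file batch mode; paths and seed reuse the first example above) is a sketch like:

```bash
# Sketch: generate a prompt file sweeping img2img strength (-f) values,
# then feed it to the CLI in batch mode. Assumes dream.py's --from_file
# option; the path and seed come from the example above.
for f in 0.3 0.4 0.5 0.6 0.7; do
  echo "\"close-up of * low poly\" -I outputs/preflight/000310.3319463269.png -S 3319463269 -f ${f}"
done > strength_sweep.txt
python scripts/dream.py --from_file strength_sweep.txt
```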

(By the way, the training images in #517 (comment) were personally created by the author of the comment, and the character will appear in some project -which I find pretty cool- so you can use them to test TI, but don't share them as a training set.)

I've also managed to train a couple of other concepts (hamburger, dog cartoon, and myself), but the results weren't as good. For people, I wonder if applying some 'beautify' filter to the training set would help, to make the face look closer to a 3D character's -very smooth, so the model doesn't have to learn patterns inside the face (e.g. cheek colors, smiles, wrinkles) and can focus on learning the overall face shape.

@Any-Winter-4079 (Contributor, Author)

All in all, I tend to prefer the "txt2img then img2img" workflow.

@Any-Winter-4079 (Contributor, Author) commented Oct 9, 2022

@tildebyte But yeah, the word out there seems to be that Dreambooth is better or easier to use. For example, this is a quick attempt (using txt2img && img2img) at something akin to your pictures above, and you can see how the style is starting to get lost and colors start to appear.
[image]
(To be fair, specifying a male face in txt2img would probably have helped.)

It may still be possible to do these things with Textual Inversion, but it may require a more complex workflow, whereas Dreambooth may be easier to use.

Still, if someone has had success with Dreambooth on free Colab, you are encouraged to share it here! The more info we have (which Colab, obviously, but also number of training images, lighting, closeness...), the better for us to implement it in the repo and document it.

Update:
Quick example with a male face (same problem):
[image]

@bbecausereasonss

Dreambooth is INCREDIBLE. No contest.

@Any-Winter-4079 (Contributor, Author) commented Oct 17, 2022

Update: I've tried https://colab.research.google.com/github/ShivamShrirao/diffusers/blob/main/examples/dreambooth/DreamBooth_Stable_Diffusion.ipynb a bunch of times and I can't get it to work well. It does run, and the result has some resemblance, but that's all.

The last attempt was with 50 training images and 150 regularization images, using --max_train_steps=2000, plus --use_8bit_adam and --gradient_checkpointing to fit the free Colab.
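For reference, the invocation was roughly along these lines (a sketch of the diffusers train_dreambooth.py call that the notebook wraps; the model name, prompts, and directories below are placeholders):

```bash
# Sketch of the DreamBooth training call (the ShivamShrirao notebook wraps
# diffusers' train_dreambooth.py). Model name, prompts, and paths are
# placeholders; the step count and memory-saving flags are the ones
# mentioned above for the free Colab.
# instance_data_dir: the 50 training images; class_data_dir: the 150 reg images.
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
  --instance_data_dir="./instance_images" \
  --class_data_dir="./reg_images" \
  --output_dir="./dreambooth_out" \
  --instance_prompt="photo of sks person" \
  --class_prompt="photo of a person" \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --num_class_images=150 \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --max_train_steps=2000 \
  --use_8bit_adam \
  --gradient_checkpointing \
  --mixed_precision="fp16"
```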

Millu closed this as not planned on Nov 9, 2023