Add training example for DreamBooth. #554
Hey @victarray, I had asked the author of the paper and he replied: "The special token we create is different from Gal et al. - we create a rare token and then finetune the model instead of the text embedding." Regarding incorporating the logic into this script, my guess is that it is not required. This is the response @patrickvonplaten gave when I asked a similar query.
Very cool @Victarry!
Great start, the script is looking good, but we need to address a few things before merging. Left some comments below. More specifically:
- We need to handle the class image generation in multi-gpu setting. I can help with this.
- Wrap the `text_encoder` and `vae` in `torch.no_grad` as we don't train them.
- Check if concatenating the batch for prior preservation loss causes issues on low-memory GPUs. I can help here.
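The freezing suggested above can be sketched roughly like this. This is only an illustration: `text_encoder` and `vae` stand in for the real CLIP text encoder and VAE from the training script, replaced here with tiny dummy modules so the snippet is self-contained.

```python
import torch
import torch.nn as nn

def freeze(module: nn.Module) -> nn.Module:
    """Disable gradient tracking so the module is never updated."""
    module.requires_grad_(False)
    module.eval()
    return module

# Dummy stand-ins for the real CLIP text encoder and VAE.
text_encoder = freeze(nn.Linear(8, 8))
vae = freeze(nn.Linear(8, 8))

# At train time, run the frozen parts under torch.no_grad() so no
# activation graph is kept for backpropagation.
with torch.no_grad():
    hidden = text_encoder(torch.randn(2, 8))
    latents = vae(torch.randn(2, 8))
```

Only the UNet's parameters would then be handed to the optimizer.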
Apart from this, we can add a helper script to do rare token detection, that will be useful. I will look into it.
Let me know if you have any questions, thanks a lot!
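One possible shape for the rare-token helper mentioned above, sketched with a made-up token-frequency table. The table and threshold are assumptions for illustration; a real helper would presumably derive frequencies from the tokenizer's vocabulary statistics.

```python
import random

# Hypothetical token -> corpus-frequency table; a real helper would
# derive these counts from the tokenizer's vocabulary statistics.
token_freq = {
    "dog": 9_000_000,
    "house": 4_000_000,
    "car": 6_000_000,
    "sks": 12,
    "zwx": 3,
    "qlr": 1,
}

def rare_tokens(freq, max_count=100, k=2, seed=0):
    """Sample k tokens whose corpus frequency is below max_count."""
    candidates = [tok for tok, count in freq.items() if count < max_count]
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))
```

The sampled tokens would then serve as candidate identifiers in the instance prompt.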
Very cool, the PR looks good! Will run it on both single and multi-gpu to verify and then it should be good to merge. Thanks a lot for working on this.
Wow, using the 8-bit Adam optimizer from bitsandbytes along with xformers reduces the memory usage to 12.5 GB.
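For reference, one way a script can opt into the 8-bit optimizer with a graceful fallback when bitsandbytes is not installed. This is a sketch, not the training script's exact wiring, and the tiny `Linear` model is a stand-in for the UNet.

```python
import torch

# Prefer bitsandbytes' 8-bit AdamW, which stores optimizer state in
# 8 bits and cuts memory use; fall back to the standard optimizer
# when the library is not installed.
try:
    import bitsandbytes as bnb
    optimizer_cls = bnb.optim.AdamW8bit
except ImportError:
    optimizer_cls = torch.optim.AdamW

model = torch.nn.Linear(4, 4)  # stand-in for the UNet being trained
optimizer = optimizer_cls(model.parameters(), lr=5e-6)
```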
I can confirm it even runs on the Colab free tier, T4 GPU. Edit: it failed on saving the checkpoint, likely due to how it gets executed, but at least training works:
Hmm, I'd have thought you'd mention my pull request on this...
@Thomas-MMJ oh hey, sorry I didn't see your pull request. I had it in mind to try it, just needed to sleep lol.
This is really great, I played around a bit with it. I used all 5 reference images from here. The output for "A sleeping sks dog" looks like this: What made it work well for you? Edit: better image montage
Same here. Basically only
Same problem, my trained subject does not appear in 80% of results if I alter the prompt just a tiny bit.
Sounds like overfitting, which DreamBooth was made to combat.
Hi! If you see any issue in the script then please open an issue, and for general discussions like this feel free to join the Discord https://discord.gg/G7tWnz98XR.
Same here as well... I tried to upload between 50 and 80 pics of myself, with 800 training steps. A very basic prompt like It seems there's some missing link between the
The issue may just be related to a poor choice of parameters. Are you aware of any best practices regarding the number of learning steps, number of class images generated, choice of class name & class prompt, choice of concept name, etc.? I suspect my issues are more a matter of this than an issue with the actual code...
Are there any web UIs for DreamBooth yet? I mean, to use it to train.
Why would you want a GUI for training?
I certainly would love one, as I hate using the command prompt and there are a lot of steps to using DreamBooth. The GUI could even handle resizing of input photos to make things easier. And it could manage all the custom trained models.
Just wanted to say I've had a chance to try Shivam's colab notebook, and it worked great! I took 7 photos of myself around the house from different angles and in different lighting. I even changed my shirts. I was really surprised how fast the training was, ~10 min on a V100.

For anyone having trouble, I would suggest changing the token to your own name, or something that evokes what your subject is; I simply used my "firstnamelastname" as a token. And don't forget to change the name of the destination folder as well. I think some of the problems people might be having come from using the default "sks", which is actually a term for a type of rifle. I had tried some initial tests using another set of random letters as the token, because I thought I would want something totally unique, but I feel like using my name actually gave the model more context, drawing from other faces associated with my name to help fill in the blanks. I also didn't use the class at all, even though JoePenna's version says to use it, and I thought my results were very strong. Everything else was default for me.

My big request is having a .ckpt output so I can use it in other notebooks like Deforum and Warpfusion. I've heard rumor it's being worked on, so I just want to add my support for the idea. Another bonus would be to include a pruning function to compress the model further and take up less storage space; I have no idea if something is already being implemented. JoePenna's version is able to be compressed to 2 GB, but I've heard his notebook takes more like ~1 hr to train. It would be great to have the best of both worlds. Thanks for developing this, it's pretty amazing!
The first converter scripts are popping up already. See AUTOMATIC1111/stable-diffusion-webui#1429 (comment). I haven't tried any yet, but they look promising...
Awesome, I'll check them out. Thanks!
#1 - Command-line entries are archaic. We moved past those for the typical user in the late 80s. Syntax errors, bad prompt "grammar", etc. just inhibit wide use.
#2 - There are a lot of features that can be automated that way. As mentioned by jd-3d, there are image conversions, data validations, runtime estimations, and batching you can do that way.
#3 - The more people who can use this... it's a win, all around.
To users claiming bad results, I wonder if the dreambooth example is actually flawed,
I've been testing ShivamShrirao's fork for several days now, with my own Google Colab notebook to add some sprinkles on top. For initial learning I just used my full name "johnslegers" as a concept and "man" as a class. Then I tried to retrain the output model with different input pics, different class pics & different prompt settings to test whether it's possible to finetune a finetuned model further for the same concept. Using this strategy, I've managed to generate some pretty decent renders of my younger self...
Some of the renders generated. Actual photos of me used as input for the training process.
Thanks a lot for sharing this @jslegers!
There's this site for running DreamBooth: http://fine-tune-sd.com/
* Add training example for DreamBooth.
* Fix bugs.
* Update readme and default hyperparameters.
* Reformatting code with black.
* Update for multi-gpu trianing.
* Apply suggestions from code review
* improgve sampling
* fix autocast
* improve sampling more
* fix saving
* actuallu fix saving
* fix saving
* improve dataset
* fix collate fun
* fix collate_fn
* fix collate fn
* fix key name
* fix dataset
* fix collate fn
* concat batch in collate fn
* add grad ckpt
* add option for 8bit adam
* do two forward passes for prior preservation
* Revert "do two forward passes for prior preservation" (reverts commit 661ca46)
* add option for prior_loss_weight
* add option for clip grad norm
* add more comments
* update readme
* update readme
* Apply suggestions from code review (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* add docstr for dataset
* update the saving logic
* Update examples/dreambooth/README.md
* remove unused imports

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
The inference cell on this colab is broken:
/usr/local/lib/python3.7/dist-packages/diffusers/utils/deprecation_utils.py:35: FutureWarning: The configuration file of this scheduler: DDIMScheduler {
AttributeError                            Traceback (most recent call last)
2 frames
AttributeError: 'list' object has no attribute 'module'
It was working fine yesterday. I have not yet tried training a new model, but I assume generating samples may cause the same issues.
Hey @rmac85, could you please open a new issue? Note that this is a merged PR, so we won't look into comments here anymore. If you open a new issue we're more than happy to take a look!
DreamBooth is a deep learning-based tool that can be used to personalize existing text-to-image models. It works by fine-tuning a text-to-image model on a few images of a specific subject. This allows the model to learn the unique characteristics of the subject and generate more personalized and realistic images of it.
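The prior-preservation objective discussed throughout this thread pairs an instance loss with a weighted class-prior loss. Below is a toy sketch with made-up tensors; the function name is hypothetical, though `prior_loss_weight` mirrors the option added in this PR.

```python
import torch
import torch.nn.functional as F

def dreambooth_loss(noise_pred, noise, prior_pred, prior_noise,
                    prior_loss_weight=1.0):
    """Instance reconstruction loss plus weighted prior-preservation loss."""
    instance_loss = F.mse_loss(noise_pred, noise)
    prior_loss = F.mse_loss(prior_pred, prior_noise)
    return instance_loss + prior_loss_weight * prior_loss

# Toy tensors standing in for the UNet's noise predictions and targets.
loss = dreambooth_loss(torch.zeros(2, 4), torch.zeros(2, 4),
                       torch.ones(2, 4), torch.zeros(2, 4),
                       prior_loss_weight=0.5)
```

The prior term penalizes drift on generated class images, which is what counters the overfitting several commenters describe.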
Add DreamBooth training example.
One question is how to specify the identifier `[V]` of the input prompt to bind with the concept of the subject. The original paper uses random sampling of rare tokens to generate the identifier. Should we include this logic in the training script?
Currently I use `sks` as in https://github.com/XavierXiao/Dreambooth-Stable-Diffusion.