
Conversation

patil-suraj
Contributor

@patil-suraj patil-suraj commented Sep 5, 2022

This PR adds fine-tuning script for StableDiffusion

@patil-suraj patil-suraj changed the title from "[WIP] stabel diffusion finetuning" to "[WIP] stable diffusion fine-tuning" Sep 5, 2022
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Sep 5, 2022

The documentation is not available anymore as the PR was closed or merged.

@minimaxir

Is it a good idea to fine-tune the entire unet? It seems like there is a risk of catastrophic forgetting, particularly if the input data is small and/or uniform. (This was an issue when I fine-tuned ruDALL-E on Pokemon: training for too long made it ignore any specificity in the prompts.)

It might be worth it to allow parametric freezing of embeddings and/or early layers of the unet as CLI args down the line, depending on how well this finetuning works.
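
For anyone who wants to experiment with that before a CLI flag exists, here is a minimal sketch of manual freezing against diffusers' UNet2DConditionModel. The submodule names (conv_in, time_embedding, down_blocks) match the current diffusers implementation, and num_frozen_blocks is just an illustrative value, not an argument of this script:

import torch
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="unet"
)

# Freeze the input conv, the time embedding and the first few down blocks;
# only the remaining parameters will receive gradients during fine-tuning.
num_frozen_blocks = 2  # illustrative value
for module in (unet.conv_in, unet.time_embedding, *unet.down_blocks[:num_frozen_blocks]):
    module.requires_grad_(False)

# Build the optimizer only from the parameters that are still trainable.
trainable_params = [p for p in unet.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-5)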

@patil-suraj
Contributor Author

patil-suraj commented Sep 7, 2022

Hey @minimaxir! That's a good point. We are doing some initial runs now, and we could indeed allow parametric freezing of certain layers if we run into these issues.
Also, some community members have tried fine-tuning the full model and observed that the model can still retain its ‘general’ knowledge. cc @justinpinkney

@justinpinkney

@minimaxir After fine-tuning for too long, catastrophic forgetting definitely does happen, but it takes long enough that you can represent the target domain well before forgetting everything. I also think borrowing some of the training tricks from DreamBooth is probably a good way to try to retain the generality.
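
For context, the main DreamBooth trick alluded to here is prior preservation: in addition to the target-domain images, each batch also contains "class" images sampled from the original, un-finetuned model, and the loss on those regularizes the network against forgetting. A rough sketch of the combined loss, assuming the batch is stacked as [instance examples, class examples] (the function and argument names are illustrative, not code from this PR):

import torch
import torch.nn.functional as F

def prior_preservation_loss(model_pred, target, prior_weight=1.0):
    # Split the batch back into the instance half and the prior ("class") half.
    instance_pred, prior_pred = torch.chunk(model_pred, 2, dim=0)
    instance_target, prior_target = torch.chunk(target, 2, dim=0)

    instance_loss = F.mse_loss(instance_pred.float(), instance_target.float(), reduction="mean")
    prior_loss = F.mse_loss(prior_pred.float(), prior_target.float(), reduction="mean")
    return instance_loss + prior_weight * prior_loss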

@ezhang7423

Hi, could we get a sample command that we could use to try this script out?

@minimaxir

@justinpinkney thanks for the info! Will be curious to see how things pan out (and of course thanks to the well-commented code it's pretty easy for the user to manually freeze layers if needed)

@ghost

ghost commented Sep 11, 2022

I am eager to do some fine-tuning, but I guess this is still non-functional/in early stages? I am only able to produce black images, even after bypassing the safety filters. I will have to do a lot of catching up before I can meaningfully contribute, but I'll keep an eye on progress here!

if global_step >= args.max_train_steps:
    break

# Create the pipeline using the trained modules and save it.

Does the default configuration save a checkpoint on any type of interval in case of interruptions?

I looked through the script and I don't see any config for it, but I'm not familiar with the accelerate training library.

Contributor Author


It's not added yet, but we will add support for it in a follow-up PR.
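
Until that follow-up lands, a rough sketch of what interval checkpointing could look like with accelerate; checkpointing_steps and the call site are assumptions, not something this script currently defines:

import os
from accelerate import Accelerator

def maybe_save_checkpoint(accelerator: Accelerator, output_dir: str,
                          global_step: int, checkpointing_steps: int = 500):
    # Save the full training state (models, optimizer, RNG) every N optimizer steps.
    if global_step % checkpointing_steps != 0:
        return
    if accelerator.is_main_process:
        save_path = os.path.join(output_dir, f"checkpoint-{global_step}")
        accelerator.save_state(save_path)

Called right after the optimizer step / global_step increment, this would let an interrupted run resume later via accelerator.load_state(save_path).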

@patrickvonplaten
Contributor

This is good to merge for me. @anton-l @pcuenca do you want to take a final look?

Member

@pcuenca pcuenca left a comment


Looks good to me, I just have some questions about the types of datasets we support.

@patil-suraj patil-suraj merged commit 66a5279 into main Oct 11, 2022
@patil-suraj patil-suraj deleted the finetune-txt2img branch October 11, 2022 17:03
prathikr pushed a commit to prathikr/diffusers that referenced this pull request Oct 26, 2022
* begin text2image script

* loading the datasets, preprocessing & transforms

* handle input features correctly

* add gradient checkpointing support

* fix output names

* run unet in train mode not text encoder

* use no_grad instead of freezing params

* default max steps None

* pad to longest

* don't pad when tokenizing

* fix encode on multi gpu

* fix stupid bug

* add random flip

* add ema

* fix ema

* put ema on cpu

* improve EMA model

* contiguous_format

* don't wrap vae and text encoder in accelerate

* remove no_grad

* use randn_like

* fix resize

* improve few things

* log epoch loss

* set log level

* don't log each step

* remove max_length from collate

* style

* add report_to option

* make scale_lr false by default

* add grad clipping

* add an option to use 8bit adam

* fix logging in multi-gpu, log every step

* more comments

* remove eval for now

* address review comments

* add requirements file

* begin readme

* begin readme

* fix typo

* fix push to hub

* populate readme

* update readme

* remove use_auth_token from the script

* address some review comments

* better mixed precision support

* remove redundant to

* create ema model early

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* better description for train_data_dir

* add diffusers in requirements

* update dataset_name_mapping

* update readme

* add inference example

Co-authored-by: anton-l <anton@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
@j-min

j-min commented Nov 30, 2022

Hi, thanks for adding the fine-tuning script!
Could you please advise how I should edit the script to jointly fine-tune both the unet and the text encoder? I'm not familiar with the accelerate package.
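
Not an official answer, but roughly the changes involved are: keep the text encoder in train mode instead of freezing it, give its parameters to the optimizer, and let accelerator.prepare wrap it alongside the unet so gradients sync across GPUs. A minimal sketch, where unet, text_encoder, args, accelerator, train_dataloader and lr_scheduler are the objects the script already builds:

import itertools
import torch

# Train the text encoder together with the unet instead of freezing it.
unet.train()
text_encoder.train()

optimizer = torch.optim.AdamW(
    itertools.chain(unet.parameters(), text_encoder.parameters()),
    lr=args.learning_rate,
)

# Let accelerate wrap both models so they are handled correctly on multi-GPU.
unet, text_encoder, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
    unet, text_encoder, optimizer, train_dataloader, lr_scheduler
)

You would also need to remove any no_grad / requires_grad_(False) the script applies to the text encoder; a lower learning rate for the text encoder is often used as well.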

@tengshaofeng

tengshaofeng commented Feb 6, 2023

@patil-suraj @HuggingFaceDocBuilderDev @minimaxir @justinpinkney @ezhang7423, thanks for your ideas; I have learned a lot from them. I still have a question: how many steps should I fine-tune the stable diffusion model on 10,000 images? Should I watch the loss and stop training once it stops decreasing? But I don't think a smaller loss necessarily means better results.
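
One practical answer (general advice, not specific to this script): the denoising MSE is noisy and correlates only loosely with sample quality, so rather than watching the loss it is common to periodically generate images from a few fixed validation prompts and stop when the in-domain results stop improving or the generic ones start degrading. A hedged sketch of that check, with a placeholder checkpoint path:

import torch
from diffusers import StableDiffusionPipeline

# Load the pipeline saved from the current fine-tuning checkpoint.
pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/saved-checkpoint", torch_dtype=torch.float16
).to("cuda")

validation_prompts = [
    "a photo of a cat",               # generic prompt: checks general knowledge is retained
    "a prompt from your own domain",  # in-domain prompt: checks the new style/content is learned
]

for i, prompt in enumerate(validation_prompts):
    image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
    image.save(f"validation_{i}.png")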

@aihao2000
Contributor

@patil-suraj I'm not sure why model.train() needs to be called at the beginning of each epoch.
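
A general PyTorch note rather than anything specific to this script: model.train() only switches layers such as dropout and normalization into training behaviour, and it matters whenever something in the loop (validation, EMA evaluation, sample generation) has called .eval() in the meantime; calling it again at the top of each epoch is a cheap way to guarantee the model is back in the right mode. A tiny illustration:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))

for epoch in range(2):
    model.train()                           # make sure dropout is active for training
    out_train = model(torch.randn(4, 8))    # dropout randomly zeroes activations here

    model.eval()                            # e.g. switching to eval mode for validation/sampling
    with torch.no_grad():
        out_eval = model(torch.randn(4, 8)) # dropout is a no-op in eval mode
    # Without model.train() at the top of the loop, the next epoch would
    # silently keep running with dropout disabled.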
