
[WIP][Examples] Support LoRA DreamBooth training with SD XL#3896

Closed
sayakpaul wants to merge 58 commits into main from dreambooth/sd-xl-2

Conversation

@sayakpaul
Member

@sayakpaul sayakpaul commented Jun 29, 2023

What does this PR do?

This PR is for me to gather feedback on the structure and the modifications to accommodate the DreamBooth LoRA training with SD XL.

Keep in mind

This PR adds an example to show how to conduct DreamBooth LoRA training with SDXL. Builds on top of #3859.

While reviewing the PR, please restrict yourself to the train_dreambooth_lora_sd_xl.py script.

Others

  • The script should be more or less ready as a good first draft.
  • No text encoder training support yet, to keep the modifications reasonably sized.
  • I have not run make style && make quality so as not to touch the other files.
  • There are many use_auth_token=True calls in the script for obvious reasons.
  • No documentation or test cases yet; I need to gather good enough results and findings first. But getting the script to work took more time than I expected, so I think it would be good to have some 👀 from the get-go.

I am currently training with the following command:

export MODEL_NAME="diffusers/stable-diffusion-xl-base-0.9"
export INSTANCE_DIR="dog"
export CLASS_DIR="dog-class"
export OUTPUT_DIR="lora-trained-xl"

accelerate launch train_dreambooth_lora_sd_xl.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-5 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=100 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=50 \
  --seed="0" \
  --push_to_hub

The dog dataset was downloaded using the following code:

from huggingface_hub import snapshot_download

local_dir = "./dog"
snapshot_download(
    "diffusers/dog-example",
    local_dir=local_dir, repo_type="dataset",
    ignore_patterns=".gitattributes",
)

The training artifacts are available here: https://huggingface.co/diffusers/lora-trained-xl (private, only visible to the diffusers team members for now).

@sayakpaul sayakpaul requested review from patrickvonplaten and pcuenca and removed request for patrickvonplaten June 29, 2023 09:09
def compute_embeddings(prompt, text_encoders, tokenizers):
    original_size = (args.resolution, args.resolution)
    target_size = (args.resolution, args.resolution)
    crops_coords_top_left = (0, 0)
Contributor


We should pass in the crop coords as an optional value. Forcing this to (0, 0) while also forcing a square resolution and cropping the input data goes against the results of the SDXL paper and the purpose of these conditioning inputs.
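A minimal sketch of the suggestion, as a standalone helper (the function name and signature are hypothetical, not the actual script's API): the SDXL micro-conditioning tuple takes the caller's crop coordinates when supplied and only falls back to (0, 0) otherwise.

```python
def make_add_time_ids(original_size, target_size, crops_coords_top_left=None):
    """Build the SDXL micro-conditioning tuple
    (orig_h, orig_w, crop_top, crop_left, target_h, target_w).

    crops_coords_top_left is optional: callers that actually crop their
    input data can pass the real coordinates instead of a hard-coded (0, 0).
    """
    if crops_coords_top_left is None:
        crops_coords_top_left = (0, 0)  # fallback only, not forced
    return (*original_size, *crops_coords_top_left, *target_size)
```

With this shape, a dataloader that random-crops could thread the true top-left offsets through to the conditioning inputs, e.g. `make_add_time_ids((1024, 1024), (1024, 1024), (128, 64))`.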

@bghira
Contributor

bghira commented Jul 4, 2023

I would like these examples to begin using a common module for shared functions. The code duplication is pretty high, and the maintenance cost of updating all of these scripts just grows over time, now that each new model seems to get its own examples with possibly four different scripts.

If we use a common module, I would like it to implement the data bucketing used in the original technical report: scale the smaller side to a fixed value, then use a two-decimal rounded aspect ratio (e.g. 1.78 for 16:9) to put the images into buckets.

It gets tricky, because the original model was trained on 256x256, 512x512, and 1024-based multi-aspect data.

If we had multiple dataloaders for SDXL (one for each base resolution) and a common module to handle this, then the general finetune, LoRA, and TI scripts could all benefit from that.

I implemented aspect bucketing in the diffusers finetuning script for SD 2.1 and can share whatever lessons I learned from that.
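The bucketing scheme described above could be sketched roughly like this (all names are hypothetical, not from any diffusers script; this is just the "scale the smaller side, round the aspect ratio to two decimals" idea in isolation):

```python
from collections import defaultdict

def bucket_key(width, height, ndigits=2):
    """Bucket key: aspect ratio rounded to two decimals (e.g. 1.78 for 16:9)."""
    return round(width / height, ndigits)

def scaled_size(width, height, base=1024):
    """Scale so the shorter side equals `base`, preserving aspect ratio."""
    if width <= height:
        return base, round(height * base / width)
    return round(width * base / height), base

def build_buckets(sizes, base=1024):
    """Group (width, height) pairs into aspect-ratio buckets of scaled sizes."""
    buckets = defaultdict(list)
    for w, h in sizes:
        buckets[bucket_key(w, h)].append(scaled_size(w, h, base))
    return dict(buckets)
```

A real implementation would also snap the scaled sizes to multiples the VAE/UNet accept and run one such bucketer per base resolution (256, 512, 1024), but the grouping logic stays the same.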

@sayakpaul
Member Author

Hey @bghira, thanks for sharing your insights!

i would like for these examples to begin using a common module for shared functions, the code duplication is pretty high and the maintenance cost of updating all of these scripts just grows over time, now that each new model seems to get its own examples with possibly 4 different scripts.

Please note that we purposefully don't do this, as stated in our doc. We also don't want to maintain too many examples here. If you check, the DreamBooth example we currently have in main only caters to SD and IF. We decided to add a separate one for SDXL based on its potential impact. So, unless something significantly more impactful comes up, we likely won't be adding anything new.

As for aspect-ratio bucketing, we purposefully don't want to add it because it introduces complexity we want to avoid, in keeping with the goals of these officially maintained examples: to provide not just working examples but also readability and simplicity baked into them. But we'd be more than happy to welcome a PR from you adding it as a community example, and happy for it to be linked from our official docs :)

@sayakpaul
Member Author

Closing in favor of #4016.

@sayakpaul sayakpaul closed this Jul 10, 2023