
[LoRA] Add LoRA training script #1884

Merged: 29 commits into main, Jan 18, 2023

Conversation

patrickvonplaten
Contributor

@patrickvonplaten patrickvonplaten commented Jan 2, 2023

Update:

Training seems to work fine -> see some results here (after 4 minutes of training on an A100): https://wandb.ai/patrickvonplaten/stable_diffusion_lora/reports/LoRA-training-results--VmlldzozMzI4MTI3?accessToken=d7x29esww3nvbrilo18hyto784w4oep721jiqgophgzdhztytwko1stcscp38gld

Possible API:

The premise of LoRA is to add a small set of extra weights to the model and train only those, so that fine-tuning yields very small, portable weight files.

Therefore it is important to add a new "LoRA weights loading API", which is currently implemented as follows:

#!/usr/bin/env python3
import torch

from diffusers import StableDiffusionPipeline

# Load the base Stable Diffusion pipeline in fp16
pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
# Load only the small LoRA attention-processor weights from the Hub
pipeline.unet.load_attn_procs("patrickvonplaten/lora")
pipeline.to("cuda")

prompt = "A photo of sks dog in a bucket"

images = pipeline(prompt, num_images_per_prompt=4).images

for i, image in enumerate(images):
    image.save(f"/home/patrick_huggingface_co/images/dog_{i}.png")

The idea is the following: during training, only the LoRA layers are saved, which for the default rank=4 amount to only around 3 MB: https://huggingface.co/patrickvonplaten/lora/blob/main/pytorch_attn_procs.bin
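For completeness, the saving side during training is just the mirror of the loading call above. A minimal sketch, assuming a save_attn_procs counterpart to load_attn_procs (treat the exact helper name and the output directory as assumptions for illustration):

# `unet` here is the UNet whose LoRA attention processors were just trained;
# persist only those processors (~3 MB at rank=4), not the full model
unet.save_attn_procs("lora-weights")

# they can later be loaded into a fresh pipeline from a local path or a Hub repo id
pipeline.unet.load_attn_procs("lora-weights")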

Those weights can then be downloaded easily from the Hub via a novel load_lora loading function as implemented here:
https://github.com/huggingface/diffusers/pull/1884/files#r1069869084

Co-authors:
Co-authored-by: https://github.com/cloneofsimo - the first to come up with the idea of using LoRA for Stable Diffusion, in the popular "lora" repo: https://github.com/cloneofsimo/lora

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jan 2, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten patrickvonplaten changed the title [Lora] first upload [WIP, Lora] Add lora training script Jan 2, 2023
@patrickvonplaten patrickvonplaten changed the title [WIP, Lora] Add lora training script [WIP, LoRA] Add lora training script Jan 2, 2023
@jrd-rocks

jrd-rocks commented Jan 3, 2023

This is great, but should it be part of diffusers? Why not have this as an external library? Maybe this is more of a meta-comment, but imho there is no need for diffusers to be everything; it should be the base that other libraries can build on. To me this seems easier both for the contributors/maintainers of the "advanced" libraries and for diffusers itself, as there's bound to be a difference in development speed/cadence between these new and shiny methods and the core, and it won't be pleasant to have to update an amalgamation of code just because one of the new shiny embedded libraries advances. The cloneofsimo/lora repo works very well with diffusers; wouldn't it be better to do all LoRA-related development there (or, if this implementation is incompatible, in a new repo, so that there are two LoRA-flavored libraries, which I think is preferable to putting everything in diffusers)?

@cloneofsimo
Contributor

I would say it is an honor to have my project serve as inspiration for official Hugging Face source code, but I do have a feeling that we are reinventing the wheel here...

@patrickvonplaten
Contributor Author

Hey @jrd-rocks and @cloneofsimo,

Thanks for your comments - it's super nice to see that other repositories such as https://github.com/cloneofsimo/lora are using diffusers.

@cloneofsimo, would it be ok if we credit you as one of the authors of this script and link to your GitHub repo? (Or would you maybe like to help with this PR so that you become an author by commit?)

This example script will have a couple of differences:

  • 1.) We will use the new "set cross attention processor" method so that one can load LoRA checkpoints directly onto a pipeline created with from_pretrained(...) (see the sketch below)
  • 2.) We won't add all the features (such as hacking CLIP and the feed-forward layers) to begin with

=> We intend this script to be a long-term maintained example of how to use LoRA; we're happy to refer to yours as "the" LoRA training script if you'd like :-)
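(For illustration, a minimal sketch of wiring LoRA processors into the UNet, simplified from the training script in this PR; the class name LoRACrossAttnProcessor and its signature reflect the API added here, but treat the details as assumptions rather than a definitive implementation:)

from diffusers import UNet2DConditionModel
from diffusers.models.cross_attention import LoRACrossAttnProcessor

unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")

# Build one LoRA processor per attention block; self-attention blocks (attn1) get no
# cross-attention dim. The hidden-size bookkeeping mirrors the training script.
lora_attn_procs = {}
for name in unet.attn_processors.keys():
    cross_attention_dim = None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    if name.startswith("mid_block"):
        hidden_size = unet.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(unet.config.block_out_channels))[block_id]
    else:  # down_blocks
        block_id = int(name[len("down_blocks.")])
        hidden_size = unet.config.block_out_channels[block_id]
    lora_attn_procs[name] = LoRACrossAttnProcessor(hidden_size=hidden_size, cross_attention_dim=cross_attention_dim)

unet.set_attn_processor(lora_attn_procs)  # only these processors are trained and saved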

@patrickvonplaten
Contributor Author

Actually, the main reason we opened this PR was because the community asked for it here: #1715

@brian6091

brian6091 commented Jan 4, 2023

Hi @patrickvonplaten

Super cool to see this development. However, I'm wondering why it was necessary to create new CrossAttention classes for LoRA? I can't figure out how this differs from how people have been applying @cloneofsimo 's repo. In case you don't want to pollute the PR, I've posted the question in another discussion here: cloneofsimo/lora#107

Thanks for all your efforts and any insight(s) you can offer!

@patrickvonplaten
Contributor Author

Hey @brian6091,

The CrossAttention mechanism was not (just) introduced for LoRA; its main use is to be able to tweak attention weights at runtime, as explained in: #1639
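(To illustrate what "tweaking attention weights at runtime" can look like, here is a minimal sketch of a custom attention processor that rescales the attention probabilities. It follows the processor pattern this API introduces; the method names on `attn` are assumptions based on the CrossAttention module, so treat this as a sketch rather than a definitive implementation:)

import torch

class ScaledAttnProcessor:
    # rescales the attention probabilities by a constant factor at inference time
    def __init__(self, scale=1.0):
        self.scale = scale

    def __call__(self, attn, hidden_states, encoder_hidden_states=None, attention_mask=None):
        query = attn.to_q(hidden_states)
        encoder_hidden_states = hidden_states if encoder_hidden_states is None else encoder_hidden_states
        key = attn.to_k(encoder_hidden_states)
        value = attn.to_v(encoder_hidden_states)

        query = attn.head_to_batch_dim(query)
        key = attn.head_to_batch_dim(key)
        value = attn.head_to_batch_dim(value)

        attention_probs = attn.get_attention_scores(query, key, attention_mask)
        attention_probs = attention_probs * self.scale  # the runtime tweak

        hidden_states = torch.bmm(attention_probs, value)
        hidden_states = attn.batch_to_head_dim(hidden_states)

        # output projection + dropout
        hidden_states = attn.to_out[0](hidden_states)
        hidden_states = attn.to_out[1](hidden_states)
        return hidden_states

# e.g. swap it into every attention block of the UNet:
# unet.set_attn_processor({name: ScaledAttnProcessor(0.5) for name in unet.attn_processors})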

@brian6091

Hey @brian6091,

The CrossAttention mechanism was not (just) introduced for LoRA; its main use is to be able to tweak attention weights at runtime, as explained in: #1639

Thanks for the context @patrickvonplaten, I better understand the design decision now.

@cloneofsimo
Contributor

Hi @patrickvonplaten, thank you for your kind explanations! I would love it if you reference it like that. Thanks for the hard work!

optimizer_class = torch.optim.AdamW

# Optimizer creation
params_to_optimize = itertools.chain(*[v.parameters() for v in unet.attn_processors.values()])
Member

Shouldn't this also contain the text encoder parameters if train_text_encoder is set to True?
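(For context, a minimal sketch of what including the text-encoder parameters could look like; the train_text_encoder flag and the text_encoder variable are assumptions for illustration, not features of this script:)

import itertools

# gather the trainable LoRA attention-processor parameters from the UNet
unet_lora_params = [proc.parameters() for proc in unet.attn_processors.values()]

if args.train_text_encoder:  # hypothetical flag, not in this script
    params_to_optimize = itertools.chain(*unet_lora_params, text_encoder.parameters())
else:
    params_to_optimize = itertools.chain(*unet_lora_params)

optimizer = optimizer_class(params_to_optimize, lr=args.learning_rate)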

pipeline = pipeline.to(accelerator.device)
generator = torch.Generator(device=accelerator.device).manual_seed(args.seed)
pipeline.set_progress_bar_config(disable=True)
sample_dir = "/home/patrick_huggingface_co/lora-tryout/samples"
Member

You have probably noted it already, but this needs to be more generic, I guess.

Comment on lines +880 to +881
for tracker in accelerator.trackers:
if tracker.name == "wandb":
Member

A cleaner way might be to

if wandb.run is not None:
   ...

Member

@sayakpaul This comes from a code snippet where I did different things depending on whether the tracker was tensorboard, wandb, or something else (there can be several trackers enabled). But yes, if we are only considering the wandb case we could maybe simplify it.
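(A minimal sketch of that dispatch pattern, roughly what the script does; the logged dict key and captions are illustrative:)

import wandb

for tracker in accelerator.trackers:
    if tracker.name == "wandb":
        # log the validation images with their prompt as caption
        tracker.log({"validation": [wandb.Image(img, caption=f"{i}: {args.validation_prompt}")
                                    for i, img in enumerate(images)]})
    # other trackers (e.g. tensorboard) could be handled in extra branches here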

Member

@patrickvonplaten the report does not include prompts for the logged images, did you prepare it from a previous run?

@patrickvonplaten patrickvonplaten changed the title [WIP, LoRA] Add lora training script [LoRA] Add LoRA training script Jan 17, 2023
@patrickvonplaten
Contributor Author

@pcuenca @patil-suraj @sayakpaul - I think this is ready for a final review :-)

@sayakpaul sayakpaul self-requested a review January 17, 2023 16:48
Contributor

@patil-suraj patil-suraj left a comment

Looks great to me, I like the loaders API! I just have two questions:

  1. Why is the cast to weight_dtype needed?
  2. Does generation work when mixed-precision training is enabled?

if global_step >= args.max_train_steps:
break

if args.validation_prompt is not None and epoch % 10 == 0:
Contributor

Should we let the user control how often to generate, rather than hardcoding the value here?

Contributor Author

Yes, good point; adding validation_epochs.
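(A minimal sketch of what that could look like; the flag name matches --validation_epochs used in the README example later in this thread, while the default value and help text are assumptions:)

parser.add_argument(
    "--validation_epochs",
    type=int,
    default=50,  # assumed default; see the actual script for the real one
    help="Run validation (generate sample images) every X epochs.",
)

# ... inside the training loop ...
if args.validation_prompt is not None and epoch % args.validation_epochs == 0:
    # run inference and log the images
    ...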

# run inference
generator = torch.Generator(device=accelerator.device).manual_seed(args.seed)
prompt = args.num_validation_images * [args.validation_prompt]
images = pipeline(prompt, num_inference_steps=25, generator=generator).images
Contributor

Maybe use autocast here to do generation in fp16. Have you verified this with mixed precision?
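(A sketch of the autocast variant, assuming the surrounding generation code stays as in the excerpt above:)

# run the denoising loop under fp16 autocast
with torch.autocast("cuda"):
    images = pipeline(prompt, num_inference_steps=25, generator=generator).images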

Member

@pcuenca pcuenca left a comment

Impressive work!

**___Note: When using LoRA we can use a much higher learning rate compared to vanilla dreambooth. Here we
use *1e-4* instead of the usual *2e-6*.___**
Member

👍 Perfect


logger = logging.get_logger(__name__)


ATTN_WEIGHT_NAME = "pytorch_attn_procs.bin"
Contributor Author

Actually, maybe it would make more sense to make the naming more general here:

Suggested change:
- ATTN_WEIGHT_NAME = "pytorch_attn_procs.bin"
+ ATTN_WEIGHT_NAME = "embeddings.bin"

So that multiple loaders could be applied to the same file? cc @pcuenca @patil-suraj

Member

I thought about it, but wasn't sure. Another idea would be to make it more specific and descriptive, like lora_embeddings.bin and then use different names for others. Not sure what would be easiest to deal with going forward.

Contributor

Not sure embeddings is a good name, since these aren't embeddings. I agree with Pedro; maybe make it specific to the processors, for example lora_layers.bin or lora_weights.bin.

Contributor Author
@patrickvonplaten patrickvonplaten Jan 18, 2023

Hmm, not a big fan of making it super specific, in case we want to expand the functionality to more "adapter" layers in the future.

E.g. if someone wants to use both LoRA and textual inversion, it'd be nicer to have everything in one file, no?
=> Going for adapter_weights.bin now, ok for you?

Contributor Author

Alright, @patil-suraj convinced me to go for a LoRA-weights-specific name; this means in the longer run:

  • We have different files for different parts of the pipeline
  • The user will call multiple loading methods for different parts of the pipeline, e.g.:
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("...")
pipe.unet.load_attn_procs("...")
pipe.load_text_embeddings("...")

But I think that's fine!

Member

@sayakpaul sayakpaul left a comment

This is going to be an enabler. I am telling you!

patrickvonplaten and others added 4 commits January 18, 2023 15:11
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
@patrickvonplaten patrickvonplaten merged commit ed616bd into main Jan 18, 2023
@patrickvonplaten
Contributor Author

patrickvonplaten commented Jan 18, 2023

Update

When running the script in mixed precision, training uses about 6.5 GB of GPU RAM, but memory goes up to 14 GB during inference; one can simply generate fewer samples during inference or run the pipeline in a loop.
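(A sketch of the loop variant; generating one image at a time keeps peak inference memory low. Purely illustrative:)

# generate the validation images one at a time instead of num_images_per_prompt=4
images = []
for _ in range(4):
    images.extend(pipeline(prompt, num_images_per_prompt=1).images)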

@jtoy

jtoy commented Jan 24, 2023

14 GB for inference!?!?! I do think this should be part of diffusers.

@jtoy

jtoy commented Jan 24, 2023

wandb is a requirement in the current code; that seems like a bug...

@pcuenca
Member

pcuenca commented Jan 24, 2023

@jtoy I could complete a fine-tuning run using a 2080 Ti (11 GB of RAM) :) And yes, I agree that wandb should not be in requirements.txt, will open a PR.
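(A sketch of the kind of guard such a PR could use so wandb is only needed when requested; the report_to argument and error message are assumptions for illustration:)

try:
    import wandb
except ImportError:
    wandb = None

if args.report_to == "wandb" and wandb is None:
    raise ImportError("Install wandb to use it for logging during training.")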

@jtoy

jtoy commented Jan 24, 2023

@pcuenca what args did you use? I used my Titan X 1080 with 12 GB and it dies with OOM. I used the example in the README:

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=50 \
  --seed="0"

@pcuenca
Member

pcuenca commented Jan 24, 2023

@jtoy Sorry for the confusion, I was referring to a complete fine-tuning using the train_text_to_image_lora.py script, not Dreambooth. I haven't verified memory consumption on Dreambooth yet.

@FBehrad

FBehrad commented Jan 31, 2023

[quoting the PR description above]

Thank you for adding LoRA.
You said the default rank is 4, but when I was checking LoRACrossAttnProcessor, I saw that the rank is not used at all.
So how can I change the rank?

@asadm
Contributor

asadm commented Feb 1, 2023

@FBehrad I sent a fix for this in #2191
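(For context, a minimal sketch of how a rank parameter typically feeds into a LoRA layer: it sets the width of the down/up projections and hence the size of the saved weights. Illustrative only; see #2191 for the actual change in diffusers:)

import torch.nn as nn

class LoRALinearLayer(nn.Module):
    # low-rank adapter: out = up(down(x)); rank controls the adapter's parameter count
    def __init__(self, in_features, out_features, rank=4):
        super().__init__()
        self.down = nn.Linear(in_features, rank, bias=False)
        self.up = nn.Linear(rank, out_features, bias=False)
        nn.init.normal_(self.down.weight, std=1 / rank)
        nn.init.zeros_(self.up.weight)

    def forward(self, hidden_states):
        return self.up(self.down(hidden_states))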

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* [Lora] first upload

* add first lora version

* upload

* more

* first training

* up

* correct

* improve

* finish loaders and inference

* up

* up

* fix more

* up

* finish more

* finish more

* up

* up

* change year

* revert year change

* Change lines

* Add cloneofsimo as co-author.

Co-authored-by: Simo Ryu <cloneofsimo@gmail.com>

* finish

* fix docs

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* upload

* finish

Co-authored-by: Simo Ryu <cloneofsimo@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>