[Feat] Support SDXL Kohya-style LoRA #4287

Merged
merged 99 commits into main from feat/sdxl-lora-1
Jul 28, 2023

Conversation

sayakpaul
Member

@sayakpaul sayakpaul commented Jul 26, 2023

Introduces support for loading SDXL Kohya-style LoRAs in diffusers through load_lora_weights().

Currently, it doesn't work. Opening this PR so that we can discuss it further (as discussed with @patrickvonplaten internally).

If we try to do:

from diffusers import DiffusionPipeline
import torch 

base_model_id = "stabilityai/stable-diffusion-xl-base-0.9"
pipeline = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16).to("cuda:0")
pipeline.load_lora_weights(".", weight_name="Kamepan.safetensors")

prompt = "anime screencap, glint, drawing, best quality, light smile, shy, a full body of a girl wearing wedding dress in the middle of the forest beneath the trees, fireflies, big eyes, 2d, cute, anime girl, waifu, cel shading, magical girl, vivid colors, (outline:1.1), manga anime artstyle, masterpiece, offical wallpaper, glint <lora:kame_sdxl_v2:1>"
negative_prompt = "(deformed, bad quality, sketch, depth of field, blurry:1.1), grainy, bad anatomy, bad perspective, old, ugly, realistic, cartoon, disney, bad propotions"
generator = torch.manual_seed(2947883060)
num_inference_steps = 30
guidance_scale = 7

image = pipeline(
    prompt=prompt, negative_prompt=negative_prompt, num_inference_steps=num_inference_steps,
    generator=generator, guidance_scale=guidance_scale
).images[0]
image.save("Kamepan.png")

(The checkpoint was downloaded from https://civitai.com/models/22279?modelVersionId=118556).

This will lead to the following problem:

Traceback (most recent call last):
  File "/home/sayak/test_sdxl_lora.py", line 6, in <module>
    pipeline.load_lora_weights(".", weight_name="Kamepan.safetensors")
  File "/home/sayak/diffusers/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 857, in load_lora_weights
    self.load_lora_into_unet(state_dict, network_alphas=network_alphas, unet=self.unet)
  File "/home/sayak/diffusers/src/diffusers/loaders.py", line 1064, in load_lora_into_unet
    unet.load_attn_procs(unet_lora_state_dict, network_alphas=network_alphas)
  File "/home/sayak/diffusers/src/diffusers/loaders.py", line 433, in load_attn_procs
    raise ValueError(
ValueError: None does not seem to be in the correct format expected by LoRA or Custom Diffusion training.

And rightfully so. Let's see why.

I investigated a bit, and indeed there are some unexpected entries here. If we do:

from safetensors.torch import load_file

sd_lora_state_dict = load_file("Kamepan.safetensors")

print("Just loaded.\n")
for k in sd_lora_state_dict:
    new_k = k.replace("lora_unet_", "")
    new_k = new_k.replace("lora_te_", "")
    if "lora" not in new_k:
        print(new_k)

It prints keys like these:

input_blocks_4_1_proj_in.alpha
input_blocks_4_1_proj_out.alpha
input_blocks_4_1_transformer_blocks_0_attn1_to_k.alpha
input_blocks_4_1_transformer_blocks_0_attn1_to_out_0.alpha
input_blocks_4_1_transformer_blocks_0_attn1_to_q.alpha
input_blocks_4_1_transformer_blocks_0_attn1_to_v.alpha
input_blocks_4_1_transformer_blocks_0_attn2_to_k.alpha
input_blocks_4_1_transformer_blocks_0_attn2_to_out_0.alpha
input_blocks_4_1_transformer_blocks_0_attn2_to_q.alpha
input_blocks_4_1_transformer_blocks_0_attn2_to_v.alpha
input_blocks_4_1_transformer_blocks_0_ff_net_0_proj.alpha
input_blocks_4_1_transformer_blocks_0_ff_net_2.alpha
input_blocks_4_1_transformer_blocks_1_attn1_to_k.alpha
input_blocks_4_1_transformer_blocks_1_attn1_to_out_0.alpha
input_blocks_4_1_transformer_blocks_1_attn1_to_q.alpha
input_blocks_4_1_transformer_blocks_1_attn1_to_v.alpha
input_blocks_4_1_transformer_blocks_1_attn2_to_k.alpha
input_blocks_4_1_transformer_blocks_1_attn2_to_out_0.alpha
input_blocks_4_1_transformer_blocks_1_attn2_to_q.alpha
input_blocks_4_1_transformer_blocks_1_attn2_to_v.alpha
input_blocks_4_1_transformer_blocks_1_ff_net_0_proj.alpha
input_blocks_4_1_transformer_blocks_1_ff_net_2.alpha
input_blocks_5_1_proj_in.alpha
...

We need to figure out how to deal with these new keys, since we currently assume the lora identifier is always present in the state dict that we finally populate into the diffusers modules:

is_lora = all("lora" in k for k in state_dict.keys())
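
A minimal sketch of one way to separate these alpha entries so the remaining keys satisfy the check above (this is just an illustration, not the PR's actual implementation; it assumes the same Kamepan.safetensors file):

from safetensors.torch import load_file

state_dict = load_file("Kamepan.safetensors")

# Pull the scalar ".alpha" entries out into their own mapping so that every
# remaining key contains the "lora" identifier and passes the is_lora check.
network_alphas = {k: v.item() for k, v in state_dict.items() if k.endswith(".alpha")}
lora_weights = {k: v for k, v in state_dict.items() if not k.endswith(".alpha")}

assert all("lora" in k for k in lora_weights)
print(f"{len(network_alphas)} alpha entries, {len(lora_weights)} LoRA weight entries")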

There are three checkpoints worth checking out:

All of these LoRAs have multiple network_alpha values for which I have added support in the PR.
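
For context on what these alpha values do: in Kohya-style checkpoints each module typically stores a scalar alpha, and that module's LoRA update is scaled by alpha / rank. A rough illustration reusing a key from the listing above (the exact key layout follows the usual Kohya convention and is assumed here):

from safetensors.torch import load_file

state_dict = load_file("Kamepan.safetensors")

module = "lora_unet_input_blocks_4_1_proj_in"
down = state_dict[f"{module}.lora_down.weight"]
alpha = state_dict[f"{module}.alpha"].item()

rank = down.shape[0]   # rank is the output dim of lora_down
scale = alpha / rank   # multiplies (lora_up @ lora_down) when applied to the base weight
print(module, "rank:", rank, "alpha:", alpha, "scale:", scale)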

A couple of things also need to be discussed:

  • How do we handle state dict munging in the _convert_kohya_lora_to_diffusers() method for the second text encoder if it's present? We probably shouldn't override this in the SDXL pipeline source code, to avoid code duplication. I suggest we make the necessary changes in _convert_kohya_lora_to_diffusers() of LoraLoaderMixin.
  • Support for hada and skip-connection LoRAs should be revisited in a future PR.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jul 26, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten
Contributor

Related: #4286

@isidentical
Contributor

How do we handle state dict munging in the _convert_kohya_lora_to_diffusers() method for the second text encoder if it's present? We probably shouldn't override this in the SDXL pipeline source code to avoid duplication of code. I suggest we do the changes necessary in _convert_kohya_lora_to_diffusers() of LoraLoaderMixin.

I was also working on a parallel PR (sorry, didn't see this one 😄). For this, I concluded the easiest way was to simply group lora_te1_ into the same group as lora_te_ (so te_state_dict) and create a separate state dict for lora_te2_, which gets merged into the final text-encoder state dict with the correct prefix (te_state_dict adds the TEXT_ENCODER prefix, which is for TE1; I just modified the code to do the same for te2_state_dict with TEXT_ENCODER_2 and merged the final result).

The LoRA loading logic in the SDXL pipeline already handled the rest, so I agree this is the simplest way.
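
A rough sketch of the grouping described above (names are illustrative, not the actual diffusers implementation):

def group_kohya_sdxl_state_dict(state_dict):
    """Split a Kohya SDXL LoRA state dict by prefix: UNet, text encoder 1, text encoder 2."""
    unet_sd, te1_sd, te2_sd = {}, {}, {}
    for key, value in state_dict.items():
        if key.startswith("lora_unet_"):
            unet_sd[key] = value
        elif key.startswith(("lora_te_", "lora_te1_")):
            # lora_te1_ is folded into the same group as the plain lora_te_ prefix
            te1_sd[key] = value
        elif key.startswith("lora_te2_"):
            # later remapped with the TEXT_ENCODER_2 prefix and merged into the final dict
            te2_sd[key] = value
    return unet_sd, te1_sd, te2_sd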

@patrickvonplaten
Contributor

I'm currently testing with the official SD-XL 1.0 LoRA: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_offset_example-lora_1.0.safetensors

I've added some general SGM<>Diffusers block-structure renaming, as already noted here.

Also refactored some code to make it easier to use/read (I think).

I'm using the following code snippet for testing:

from diffusers import DiffusionPipeline
import torch
pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16)
pipe.load_lora_weights("./sd_xl_offset_example-lora_1.0.safetensors")
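
A hypothetical continuation of the snippet above, just to confirm the offset LoRA loads and actually influences the output (prompt and seed are arbitrary):

pipe.to("cuda")
generator = torch.manual_seed(0)
image = pipe(
    "a photo of an astronaut riding a horse on the moon",
    num_inference_steps=30,
    generator=generator,
).images[0]
image.save("offset_lora_test.png")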

@sayakpaul
Member Author

@isidentical I will be offline for some time today. But let's maybe chat internally a bit and figure it out so that we can collaborate effectively and ship this. WDYT?

@patrickvonplaten thanks a lot for chiming in.

@isidentical
Contributor

I will be offline for some time today. But let's maybe chat internally a bit and figure it out so that we can collaborate effectively and ship this. WDYT?

Would love that! I sent you an internal message about the details for the chat.

@patrickvonplaten patrickvonplaten merged commit 4a4cdd6 into main Jul 28, 2023
10 checks passed
@patrickvonplaten patrickvonplaten deleted the feat/sdxl-lora-1 branch July 28, 2023 17:50
@BitPhinix

You guys rock!

sayakpaul added a commit that referenced this pull request Jul 28, 2023
* sdxl lora changes.

* better name replacement.

* better replacement.

* debugging

* debugging

* debugging

* debugging

* debugging

* remove print.

* print state dict keys.

* print

* distingisuih better

* debuggable.

* fxi: tyests

* fix: arg from training script.

* access from class.

* run style

* debug

* save intermediate

* some simplifications for SDXL LoRA

* styling

* unet config is not needed in diffusers format.

* fix: dynamic SGM block mapping for SDXL kohya loras (#4322)

* Use lora compatible layers for linear proj_in/proj_out (#4323)

* improve condition for using the sgm_diffusers mapping

* informative comment.

* load compatible keys and embedding layer maaping.

* Get SDXL 1.0 example lora to load

* simplify

* specif ranks and hidden sizes.

* better handling of k rank and hidden

* debug

* debug

* debug

* debug

* debug

* fix: alpha keys

* add check for handling LoRAAttnAddedKVProcessor

* sanity comment

* modifications for text encoder SDXL

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* denugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* up

* up

* up

* up

* up

* up

* unneeded comments.

* unneeded comments.

* kwargs for the other attention processors.

* kwargs for the other attention processors.

* debugging

* debugging

* debugging

* debugging

* improve

* debugging

* debugging

* more print

* Fix alphas

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* clean up

* clean up.

* debugging

* fix: text

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Batuhan Taskaya <batuhan@python.org>
orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023
orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023
orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023
@LWprogramming

I'm a bit confused by the q, k, and v LoRA layer shapes here. It seems like you can manually set q_hidden_size and v_hidden_size (but not k_hidden_size?), but the code assumes that the resulting dimensions equal hidden_size anyway (e.g. here for the query, and here for the key and value), because you need to be able to add things directly onto attn.to_q/k/v, which isn't possible if the dimensions don't match.

You can show this by trying the following:

import torch
from diffusers.models.attention_processor import Attention, LoRAAttnProcessor

d_q = 6
d_attn = 16
n_heads = 2
d_head = d_attn // n_heads
rank = 4
attn = Attention(query_dim=d_q, heads=n_heads, dim_head=d_head) # so attn.to_q is linear from 6 to 16
lora_processor = LoRAAttnProcessor(hidden_size=d_attn, rank=rank, q_hidden_size=d_q)
# but q_hidden_size is 6, so lora_processor.to_q_lora is functionally 6 to 6 and you will fail to add them together

# Create some placeholder data
batch = 4
channels = d_q
height = 5
width = 5
hidden_states = torch.randn(batch, channels, height, width)
# Call the LoRA processor on the placeholder data
processed_data = lora_processor(attn, hidden_states) # crashes

You can fix the immediate error here by changing to_q_lora to a q_hidden_size -> hidden_size LoRALinearLayer, but then you run into similar problems with k and v. I might be missing some detail with the shapes, and given that it works for the specific SDXL network this looks like it's meant for, it probably happens to be OK, but I wanted to check anyway in case there is something either I or the code is missing.
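
For what it's worth, a pure-PyTorch illustration of the shape change being suggested here (this is not the diffusers code, just the shapes):

import torch
import torch.nn as nn

q_hidden_size, hidden_size, rank = 6, 16, 4

# The query LoRA should map q_hidden_size -> hidden_size so its output can be
# added onto attn.to_q(hidden_states), whose output dimension is hidden_size.
to_q_lora_down = nn.Linear(q_hidden_size, rank, bias=False)
to_q_lora_up = nn.Linear(rank, hidden_size, bias=False)

x = torch.randn(2, q_hidden_size)
base_q = nn.Linear(q_hidden_size, hidden_size, bias=False)
out = base_q(x) + to_q_lora_up(to_q_lora_down(x))  # shapes now line up: (2, hidden_size)
print(out.shape)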

@gonzalojaimovitch

gonzalojaimovitch commented Aug 23, 2023

Hello there! I am a little bit confused reading the documentation. Sorry in advance if this is mentioned anywhere, I just couldn't find it.

In the documentation about using LoRAs trained with Kohya for SDXL, this is the example provided:

from diffusers import DiffusionPipeline
import torch

base_model_id = "stabilityai/stable-diffusion-xl-base-0.9"
pipeline = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16).to("cuda")
pipeline.load_lora_weights(".", weight_name="Kamepan.safetensors")

prompt = "anime screencap, glint, drawing, best quality, light smile, shy, a full body of a girl wearing wedding dress in the middle of the forest beneath the trees, fireflies, big eyes, 2d, cute, anime girl, waifu, cel shading, magical girl, vivid colors, (outline:1.1), manga anime artstyle, masterpiece, offical wallpaper, glint <lora:kame_sdxl_v2:1>"
negative_prompt = "(deformed, bad quality, sketch, depth of field, blurry:1.1), grainy, bad anatomy, bad perspective, old, ugly, realistic, cartoon, disney, bad propotions"
generator = torch.manual_seed(2947883060)
num_inference_steps = 30
guidance_scale = 7

image = pipeline(
    prompt=prompt, negative_prompt=negative_prompt, num_inference_steps=num_inference_steps,
    generator=generator, guidance_scale=guidance_scale
).images[0]
image.save("Kamepan.png")

As you can see, the prompt uses the Automatic1111-style weighting syntax. However, to my understanding, the official Diffusers documentation on prompt weighting says this must be done with the Compel library. A similar thing happens with the LoRA mention at the end of the prompt, "lora:kame_sdxl_v2:1"; I understand this is not how it is done with Diffusers.

Could you please confirm whether or not this prompt is correct for Diffusers?

It would be great to have a part of the documentation where the differences between the A1111 and Diffusers syntaxes are explained. I see some confusion around this topic, and such a section would help avoid performing operations such as prompt weighting in a way that has no real effect when using Diffusers rather than A1111.

@sayakpaul
Member Author

We copy-pasted the prompt from Civitai. The weighting bits in the prompt don't influence the generation quality in diffusers.

@gonzalojaimovitch

Thank you very much, @sayakpaul! Does that mean that diffusers also understands this type of weighting?

@sayakpaul
Member Author

It doesn't. We don't support prompt weighting as a part of the library.
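
For readers who do want A1111-style emphasis with diffusers, the docs point to the external Compel library. A rough sketch for an SDXL pipeline, based on Compel's documented SDXL usage (exact argument names may differ across versions):

import torch
from compel import Compel, ReturnedEmbeddingsType
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

compel = Compel(
    tokenizer=[pipe.tokenizer, pipe.tokenizer_2],
    text_encoder=[pipe.text_encoder, pipe.text_encoder_2],
    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
    requires_pooled=[False, True],
)

# "++" upweights a token in Compel's syntax (roughly analogous to A1111's parentheses)
conditioning, pooled = compel("a forest, fireflies++, anime style")
image = pipe(
    prompt_embeds=conditioning, pooled_prompt_embeds=pooled, num_inference_steps=30
).images[0]
image.save("compel_weighted.png")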

@gonzalojaimovitch

Thanks a lot. That was really helpful!

@linnanwang

@patrickvonplaten Hey, thanks for the great work. I found that many new Civitai LoRA models based on SDXL 1.0 are not supported in diffusers, and it would be much appreciated if you could make diffusers compatible with them. Thanks.

@sayakpaul
Member Author

Feel free to post links to the LoRAs that are not compatible, along with fully reproducible code snippets. At the moment, we natively support Kohya-style LoRAs.

@linnanwang

Ahh I see, thanks for the quick reply.
Let's take this one as an example:
https://civitai.com/models/113488/library-bookshelf

Many of these LoRAs fail with a "ValueError: Checkpoint not supported" error in the loaders.

@sayakpaul
Member Author

Then that might be the reason. Kohya is one of the most popular libraries to have provided support for LoRA, so we prioritized that.

The formats used by the different LoRA trainers are quite scattered at the moment, and it's very hard to centralize that effort.

@sayakpaul
Member Author

Also in the linked LoRA, I see the following:

(screenshot of the Civitai model page showing the listed base model)

The base model doesn't seem to be SDXL here IIUC.

@linnanwang

@sayakpaul Ah yeah, you're correct. After paying more attention to these settings, it already works with a few LoRA models on Civitai. Thanks for the great work and the detailed explanations!

@linnanwang

@sayakpaul another quick question: I saw Civitai has some amazing checkpoints (yes, SDXL checkpoints), and I'm not sure whether there is a conversion tool that makes those Civitai checkpoints compatible with diffusers? Thanks.
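
One option that exists in diffusers for single-file (A1111/Civitai-style) checkpoints, not an answer given in this thread and version-dependent, is from_single_file. A rough sketch:

import torch
from diffusers import StableDiffusionXLPipeline

# Load a single-file SDXL checkpoint directly into a diffusers pipeline.
pipe = StableDiffusionXLPipeline.from_single_file(
    "path/to/downloaded_sdxl_checkpoint.safetensors", torch_dtype=torch.float16
).to("cuda")

# Optionally save it in the diffusers multi-folder format for reuse.
pipe.save_pretrained("converted-sdxl-checkpoint")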

@MaxTran96 MaxTran96 mentioned this pull request Aug 25, 2023
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024