[Feat] Support SDXL Kohya-style LoRA #4287
Conversation
The documentation is not available anymore as the PR was closed or merged.
Related: #4286
I was also working on a parallel PR (sorry, didn't see this one 😄) where I concluded the easiest way was to simply group them. The LoRA loading logic in the SDXL pipeline already handled the rest, so I agree this is the simplest way.
I'm currently testing with the official SD-XL 1.0 LoRA: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_offset_example-lora_1.0.safetensors

I've added some general block-structure SGM<>Diffusers renaming as already noticed here. Also refactored some code to make it easier to use/read (I think). I'm using the following code snippet for testing:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.load_lora_weights("./sd_xl_offset_example-lora_1.0.safetensors")
```
@isidentical I will be offline for some time today. But let's maybe chat internally a bit and figure it out so that we can collaborate effectively and ship this. WDYT? @patrickvonplaten thanks a lot for chiming in.
Would love that! I sent you an internal message about the details for the chat.
You guys rock!
* sdxl lora changes. * better name replacement. * better replacement. * debugging * debugging * debugging * debugging * debugging * remove print. * print state dict keys. * print * distingisuih better * debuggable. * fxi: tyests * fix: arg from training script. * access from class. * run style * debug * save intermediate * some simplifications for SDXL LoRA * styling * unet config is not needed in diffusers format. * fix: dynamic SGM block mapping for SDXL kohya loras (#4322) * Use lora compatible layers for linear proj_in/proj_out (#4323) * improve condition for using the sgm_diffusers mapping * informative comment. * load compatible keys and embedding layer maaping. * Get SDXL 1.0 example lora to load * simplify * specif ranks and hidden sizes. * better handling of k rank and hidden * debug * debug * debug * debug * debug * fix: alpha keys * add check for handling LoRAAttnAddedKVProcessor * sanity comment * modifications for text encoder SDXL * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * denugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * debugging * up * up * up * up * up * up * unneeded comments. * unneeded comments. * kwargs for the other attention processors. * kwargs for the other attention processors. * debugging * debugging * debugging * debugging * improve * debugging * debugging * more print * Fix alphas * debugging * debugging * debugging * debugging * debugging * debugging * clean up * clean up. * debugging * fix: text --------- Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Batuhan Taskaya <batuhan@python.org>
I'm a bit confused by the q, k, and v LoRA layer shapes here. It seems like you can manually set … You can show this by trying the following:
You can fix the immediate error here by changing …
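The snippets this reviewer refers to aren't preserved in this excerpt. Purely as an illustration, here is one way the q/k/v LoRA shapes could be inspected; the attribute names (`attn_processors`, `to_q_lora`, `.down`) follow the diffusers LoRA attention processors of that era and should be treated as assumptions.

```python
# Hedged sketch (not the reviewer's original snippet): inspect the per-layer
# LoRA projection shapes of an SDXL UNet after loading a Kohya-style LoRA.
for name, proc in pipe.unet.attn_processors.items():
    if hasattr(proc, "to_q_lora"):
        print(
            name,
            tuple(proc.to_q_lora.down.weight.shape),
            tuple(proc.to_k_lora.down.weight.shape),
            tuple(proc.to_v_lora.down.weight.shape),
        )
```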
Hello there! I am a little bit confused reading the documentation. Sorry in advance if this is mentioned anywhere, I just couldn't find it. In the documentation about using a LoRA trained with Kohya for SDXL, this is the example provided:
As you can see, the prompt uses the Automatic1111 weighting style. However, to my understanding, the official Diffusers documentation for prompt weighting says this must be done with the Compel library. A similar thing happens with the LoRA mention at the end of the prompt ("lora:kame_sdxl_v2:1"); I understand this is not the way of doing this with Diffusers. Could you please confirm whether this prompt is correct for Diffusers? It would be great to have a part of the documentation where the differences in syntax between A1111 and Diffusers are explained. I see some confusion around this topic, and such a section would help people avoid performing operations such as prompt weighting in a way that isn't actually effective when using Diffusers rather than A1111.
We copy-pasted the prompt from Civitai. The weighting bits in the prompt don't influence the generation quality in diffusers.
Thank you very much! @sayakpaul Does that mean that diffusers also understands this type of weighting?
It doesn't. We don't support prompt weighting as a part of the library.
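For context (not part of the thread): prompt weighting in diffusers is typically done through the external Compel library mentioned above. A minimal SDXL-flavoured sketch follows; the exact Compel API usage shown is an assumption, not something confirmed in this conversation.

```python
# Hedged sketch of Compel-based prompt weighting for SDXL (API details assumed).
import torch
from compel import Compel, ReturnedEmbeddingsType
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

compel = Compel(
    tokenizer=[pipe.tokenizer, pipe.tokenizer_2],
    text_encoder=[pipe.text_encoder, pipe.text_encoder_2],
    returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
    requires_pooled=[False, True],
)

# "++" up-weights a token in Compel syntax, unlike A1111's "(word:1.2)" style.
conditioning, pooled = compel("a photo of a cat++ wearing a tiny hat")
image = pipe(prompt_embeds=conditioning, pooled_prompt_embeds=pooled).images[0]
```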
Thanks a lot. That was really helpful!
@patrickvonplaten Hey, thanks for the great work. I found that many new Civitai LoRA models based on SDXL 1.0 are not supported in diffusers, and it would be much appreciated if you could make diffusers compatible with them. Thanks.
Feel free to post links to the LoRAs that are not compatible, along with fully reproducible code snippets. At the moment, we natively support Kohya-style LoRAs.
Ahh I see, thanks for the quick reply. Many of the LoRAs come with a "ValueError: Checkpoint not supported" error in loaders.
Then that might be the reason. Kohya is one of the most popular libraries to have provided support for LoRA, so we prioritized that. The format of the different LoRA trainers is quite scattered at the moment, and it's very hard to centralize that effort.
@sayakpaul Ah yeah, you're correct. After paying more attention to these settings, it already works with a few LoRA models on Civitai. Thanks for the great work and the detailed explanations!
@sayakpaul another quick question: I saw Civitai has some amazing checkpoints (yeah, SDXL checkpoints), and I'm not sure if there is a conversion tool that makes the Civitai checkpoints compatible with diffusers? Thanks.
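This question isn't answered in the excerpt. As one hedged possibility (an assumption, not a confirmed answer from the maintainers): recent diffusers releases expose single-file loading for SDXL checkpoints, which might look roughly like the following; the local filename is a placeholder.

```python
# Hedged sketch: loading a single-file (A1111/Civitai-style) SDXL checkpoint.
# The checkpoint filename below is a placeholder.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_single_file(
    "./some_civitai_sdxl_checkpoint.safetensors", torch_dtype=torch.float16
)
```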
Introduces support for loading SDXL Kohya-style LoRAs in `diffusers` through `load_lora_weights()`.
Currently, it doesn't work. Opening this PR so that we can discuss it further (as discussed with @patrickvonplaten internally).
If we try to do:
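The original code snippet is not preserved in this excerpt. A minimal sketch of the kind of call being described follows; the local checkpoint filename is assumed from the "lora:kame_sdxl_v2" mention earlier in the thread.

```python
# Hedged reconstruction (filename assumed): loading a Kohya-style SDXL LoRA
# downloaded from Civitai into the SDXL base pipeline.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.load_lora_weights("./kame_sdxl_v2.safetensors")
```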
(The checkpoint was downloaded from https://civitai.com/models/22279?modelVersionId=118556).
This will lead to the following problem:
And rightfully so. Let's see why.
I investigated a bit and indeed there are some entrants here. If we do:
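The exact snippet isn't preserved in this excerpt; something along these lines would surface the problematic entries (the filename is assumed, as above):

```python
# Hedged sketch: list the state-dict keys that don't carry the expected
# "lora" identifier (checkpoint filename assumed).
from safetensors.torch import load_file

state_dict = load_file("./kame_sdxl_v2.safetensors")
for key in state_dict:
    if "lora" not in key:
        print(key)
```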
It will print some keys like so:
We need to figure out how to deal with these new block(s), as we assume the `lora` identifier will always be present in the state dict we finally populate into the `diffusers` modules.
`diffusers/src/diffusers/loaders.py`, line 325 at b3e5cd6
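Continuing from the key-listing sketch above, here is a rough illustration of that assumption; this is not the code at the referenced line.

```python
# Rough illustration, not the actual diffusers source: only keys containing
# the "lora" identifier are picked up, so keys from the new SGM-style blocks
# would be missed or trigger errors downstream.
unet_lora_keys = [k for k in state_dict if "lora" in k]
unknown_keys = [k for k in state_dict if "lora" not in k]
print(len(unet_lora_keys), len(unknown_keys))
```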
There are three checkpoints worth checking out:
All of these LoRAs have multiple `network_alpha` values, for which I have added support in this PR.
A couple of things also need to be discussed:
- Should we also apply the `_convert_kohya_lora_to_diffusers()` method for the second text encoder if it's present? We probably shouldn't override this in the SDXL pipeline source code, to avoid duplication of code. I suggest we make the necessary changes in `_convert_kohya_lora_to_diffusers()` of `LoraLoaderMixin` (a rough sketch follows after this list).
- `hada` and skip-connection LoRAs should be revisited in a future PR.
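As a rough sketch of the first point: Kohya SDXL checkpoints typically prefix text-encoder keys with `lora_te1_` and `lora_te2_`, so the conversion could split the state dict before handling each part. The split below is an illustration, not the implementation that landed.

```python
# Hedged sketch: split a Kohya SDXL state dict into its UNet and two
# text-encoder portions before converting each one to the diffusers format.
te1_state_dict = {k: v for k, v in state_dict.items() if k.startswith("lora_te1_")}
te2_state_dict = {k: v for k, v in state_dict.items() if k.startswith("lora_te2_")}
unet_state_dict = {k: v for k, v in state_dict.items() if k.startswith("lora_unet_")}
print(len(unet_state_dict), len(te1_state_dict), len(te2_state_dict))
```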