
Initial setup of lora support #2712

Closed · wants to merge 44 commits

Conversation

@felorhik commented Feb 18, 2023

I wanted to get LoRA support going, so I decided to take a crack at it.

It uses the same syntax as https://github.com/AUTOMATIC1111/stable-diffusion-webui, like so:

prompt: main prompt, <lora:name_of_lora:1>

Right now the weight does nothing; it is always 1.
Also, the LoRA must be trained in diffusers format: https://huggingface.co/docs/diffusers/training/lora

This also requires the base diffusers version to be raised from 0.11 to 0.13.

It should be able to support multiple LoRAs, but I have not tested that deeply. This is far from ready, but it might be useful for anyone wanting to experiment with LoRA in InvokeAI.
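As a minimal sketch (the regex and helper name here are illustrative, not the PR's actual parser), tags in this syntax could be pulled out of a prompt like so:

```python
import re

# Matches A1111-style tags such as <lora:my_lora:0.8>; the weight is optional.
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::([\d.]+))?>")

def extract_lora_tags(prompt: str):
    """Return the prompt with tags stripped, plus (name, weight) pairs."""
    loras = [(name, float(weight or 1.0)) for name, weight in LORA_TAG.findall(prompt)]
    return LORA_TAG.sub("", prompt).strip(), loras

# extract_lora_tags("main prompt, <lora:name_of_lora:1>")
# -> ("main prompt,", [("name_of_lora", 1.0)])
```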

@blessedcoolant (Collaborator)

Finding a LoRA model online that is in the diffusers format is turning out to be harder than I expected. Most of them are shared as safetensors. I think I'll train my own LoRA, save it as diffusers, and then try this PR out.

@ChrGriffin

Getting LoRA support going is great, but as you point out, the vast majority are in safetensors format -- might be better to aim to support that up front?

@felorhik (Author)

I'm working on getting safetensors to work; however, there is some complexity to it.

https://github.com/cloneofsimo/lora uses a format compatible with diffusers and can be used to load safetensors-based LoRA files, provided some patching is done on the pipeline. There are guides in the repo that I have been experimenting with: https://github.com/cloneofsimo/lora/blob/71c8c1dba595d77d0eabdf9c278630168e5a8ce1/scripts/run_inference.ipynb

Most LoRAs have been trained with https://github.com/kohya-ss/sd-scripts, which uses its own format for the keys. I have been attempting to convert them and save in diffusers format, but no luck yet.

Given that, I am leaning more towards a conversion step for the kohya-scripts files rather than supporting them natively, though knowing which method was used to train may be difficult. I am still experimenting, but overall diffusers can make a 3 MB LoRA file vs. 150 MB with the kohya method, and I have not seen a difference in quality either.

For training as it currently stands, I have had success loading LoRAs made with the following diffusers training scripts:
https://github.com/huggingface/diffusers/blob/b2c1e0d6d4ffbd93fc0c381e5b9cdf316ca4f99f/examples/dreambooth/train_dreambooth_lora.py
https://github.com/huggingface/diffusers/blob/b2c1e0d6d4ffbd93fc0c381e5b9cdf316ca4f99f/examples/text_to_image/train_text_to_image_lora.py
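Since telling the training method apart is difficult, one hedged heuristic is to inspect the key names inside the file: kohya-ss sd-scripts files use lora_unet_* / lora_te_* key prefixes, while diffusers-style files do not (a sketch only; conventions other than kohya's vary by tool):

```python
from safetensors.torch import load_file

def looks_like_kohya(path: str) -> bool:
    """kohya-ss sd-scripts keys look like lora_unet_..._lora_down.weight,
    lora_te_..., plus per-module .alpha entries."""
    keys = load_file(path).keys()
    return any(k.startswith(("lora_unet_", "lora_te_")) for k in keys)
```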

@blessedcoolant (Collaborator)

Is there a script somewhere that allows us to convert LoRA models from safetensors (or whatever format) to diffusers? If there is one, maybe we could integrate it and load through that?

@felorhik (Author)

Added an adjustment which should be able to load safetensors files made by https://github.com/cloneofsimo/lora, though I am still testing / debugging it.

pip install git+https://github.com/cloneofsimo/lora.git is needed as a dependency for it, though.

@ChrGriffin commented Feb 19, 2023

I have no idea if any of this information will be useful to you, but maybe it'll at least inspire some fresh Google terms!

I've been using the HuggingFace Diffusers package, specifically the LoRA Dreambooth example (the same one you linked) with basically zero changes, to train my LoRAs. Unfortunately, the package produces a .bin file that is not compatible with A1111 (changing the extension isn't enough). A little Googling did bring me to this thread with the script by ignacfetser at the bottom.

It's hardly the most... uh... structurally sound solution, but I can confirm that, for now at least, it does work.

Of course, this isn't the exact issue you're encountering, but I figured I'd drop it here in case any of the information is helpful.

@felorhik (Author)

@ChrGriffin It did help; it gave me a good direction for a conversion script.

The script saves in diffusers format, and the result can be loaded after running:

python ./scripts/convert_lora.py --lora_file=path/to/lora_file_name.safetensors

Although I have not tested it yet, it should work with ckpt too:

python ./scripts/convert_lora.py --lora_file=path/to/lora_file_name.ckpt

It should save to ./models/lora/lora_file_name and be usable via <lora:lora_file_name:1>.

I'm not seeing great results out of it yet, but it is loading into the pipeline at least.

@blessedcoolant (Collaborator)

If the conversion time is short, I think we can effectively do a one-time conversion of the safetensors/ckpt model to diffusers. That might be more ideal because, in the long run, we want to move to diffusers. That way, we can avoid installing the original repo as a dependency and have it work fully through the diffusers pipeline.
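Such a one-time conversion could be memoized by checking the output directory before converting. A rough sketch, where convert_lora_to_diffusers stands in for whatever converter ends up being used (a hypothetical name, not an existing script):

```python
from pathlib import Path

def ensure_diffusers_lora(src: Path, cache_dir: Path) -> Path:
    """Convert once, then reuse the cached diffusers copy on later runs."""
    dst = cache_dir / src.stem               # e.g. models/lora/<lora_file_name>
    if not dst.exists():
        convert_lora_to_diffusers(src, dst)  # hypothetical converter
    return dst
```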

@neecapp commented Feb 19, 2023

huggingface/diffusers#2403 may be of some interest -- though you've already figured out the key mapping part.

I tried a variant of that in place of the cloneofsimo monkey patch, with better results (no errors for missing alphas). Still not perfect; it seems like either the math is off somewhere or the LoRA I am testing with is junk...

Using the conversion script does not work for me, likely due to missing text-model encoder layers (lora_te_text_model_encoder_layers). There's also nothing updating the text encoder in LoraManager for diffusers.

Edit: Conversion time from .safetensors to diffusers is fast, so I see no QOL impact from a one-time conversion.
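For reference, the merge these patches perform reduces to W' = W + scale * (alpha / rank) * up @ down, where a missing alpha conventionally defaults to the rank (making the ratio 1). A minimal sketch with illustrative tensor names:

```python
import torch

def merge_lora_pair(weight, up, down, alpha=None, scale=1.0):
    """Fold one LoRA pair into a base weight.

    weight: (out_features, in_features) base tensor
    up:     (out_features, rank)
    down:   (rank, in_features)
    """
    rank = down.shape[0]
    if alpha is None:
        alpha = rank  # an absent alpha is treated as equal to the rank
    return weight + scale * (alpha / rank) * (up @ down)
```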

@blessedcoolant (Collaborator) commented Feb 19, 2023

Okay. I managed to load a diffusers version of the LoRA model, but it doesn't seem to be working.

@damian0815 (Contributor) commented Feb 19, 2023

Re: the prompt syntax -- is this the way LoRAs are going to be activated, as a prompt term? If so, I'd suggest a more explicit syntax in line with the rest of Invoke, something like withLora(lora_name [, optional weight]), so e.g. a cat running in the forest withLora(tiger, 0.5) to apply the tiger.lora (or whatever) model at 50% strength.

@jordanramstad where is the prompt parser logic happening in your code? I didn't see it somewhere obvious.

@ChrGriffin

I believe the reasoning for the syntax is that it mimics A1111's syntax for loading LoRAs, making transitioning back and forth easier.

@felorhik (Author) commented Feb 20, 2023

Success!

@neecapp the PR you linked really helped.

It now supports safetensors files made in other formats; it will load and merge them into the current model when it runs, rather than trying to convert them or force them to load in diffusers format.

Converting may still be done, but this approach allows the text encoder to be supported as well.

EDIT: I got a little excited with it working -- it re-applies the weights on each execution without clearing them, leading to gradual burning. I'm working on resetting the weights after each run to get around that.
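One hedged way to stop that burn-in (the class and method names are illustrative, not the PR's code) is to snapshot each base tensor before the first merge and restore it after the run:

```python
import torch

class LoraWeightGuard:
    """Remembers base weights so merged LoRA deltas can be undone per run."""

    def __init__(self):
        self._originals: dict[str, torch.Tensor] = {}

    def merge(self, module: torch.nn.Module, name: str, delta: torch.Tensor):
        param = dict(module.named_parameters())[name]
        if name not in self._originals:            # snapshot once, pre-merge
            self._originals[name] = param.detach().clone()
        with torch.no_grad():
            param += delta

    def restore(self, module: torch.nn.Module):
        params = dict(module.named_parameters())
        with torch.no_grad():
            for name, original in self._originals.items():
                params[name].copy_(original)
        self._originals.clear()
```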

@neecapp commented Feb 20, 2023

I'll try to look at it again if I have some time, but it looks like you'll load the LoRA repeatedly on every successive run, which you likely do not want.

I'm on mobile at the moment, but it may be better to hijack text_encoder.forward and unet.forward to apply weights in a callback to a set of functions controlled by lora_manager, so that the prompt multiplier can be changed dynamically and the LoRA is loaded at most once (depending on the prompts).
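A sketch of that hijack idea for a single linear layer (the get_scale callback standing in for whatever lora_manager would expose is illustrative): the LoRA term is added at call time, so the base weights are never modified and the multiplier can change between prompts:

```python
import torch
import torch.nn.functional as F

def hijack_linear(linear: torch.nn.Linear, down: torch.Tensor,
                  up: torch.Tensor, get_scale):
    """Wrap forward so y = base(x) + scale * up(down(x)), applied at call time."""
    base_forward = linear.forward

    def forward(x):
        # down: (rank, in_features), up: (out_features, rank)
        return base_forward(x) + get_scale() * F.linear(F.linear(x, down), up)

    linear.forward = forward
```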

@felorhik (Author)

The latest commit will break what was working before.

I have started to adjust it to apply layers with the weight data, but I'm running into issues with matching tensors.
I don't think I'll solve it tonight, so I have committed it in case anyone else wants to take a look.

@neecapp commented Feb 23, 2023

Never mind, it looks like they want people to just use peft.

@felorhik (Author)

@neecapp @damian0815

I've added a peft setup. It does not work yet, but with a LoRA trained with peft (https://github.com/huggingface/peft) it will try to use it.

The issue is something related to sending tensors to the right device, but the error just dumps out the entire model, making it hard to diagnose.

It should only take effect with withLora(lora_name,1) when the folder contains a lora_config.json and a lora.pt file; otherwise it will use diffusers. The config is a little different than standard peft, since their training setup puts the instance prompt in front of the file name, though I don't think that is necessary, and it makes it hard to scan the directory properly.

On another note, this makes three variations of LoRA: diffusers, peft, and "legacy" (kohya scripts). While support for all is nice, it does feel like we should focus on one; the code is getting kind of messy supporting the different types at the moment.
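On the device issue, a common hedged fix (names are illustrative) is to map the saved state dict straight onto the pipeline's device when loading, so no tensor is left on the CPU when the weights are applied:

```python
import torch

def load_lora_state(path: str, device: torch.device):
    """Load lora.pt directly onto the target device to avoid mismatch errors."""
    state = torch.load(path, map_location=device)
    return {k: v.to(device) for k, v in state.items()}
```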

@neecapp commented Feb 26, 2023

I've been reading when I can. To be honest, as much as people may want the "legacy" variant to go away, that format covers 99.99% of all existing LoRA models. I don't think I've even seen a diffusers LoRA outside of engineering repos -- with only unet support, they aren't very good. I doubt the kohya variant is going anywhere anytime soon.

When the legacy format dominates and "just works" with A1111, most people will not think of it as legacy; they will see it as Invoke not having good LoRA support.

I haven't had much availability to think about a proper design for all of this, but there needs to be a good, low-friction way to support kohya LoRAs. In all reality, peft isn't a solution to the problem -- at least not yet.

Probably an unpopular opinion, but I see supporting the legacy format as the immediate need, with anything else as secondary; that can be done with follow-on PRs as PEFT et al. mature.

All of this is up to the Invoke team, of course. Apart from this PR, I'm not really active on here.

Edit: Source of the "use peft for lora" comment: huggingface/transformers#21770

@ChrGriffin commented Feb 26, 2023

That may be an unpopular opinion among the devs, but as a user, I'm here saying I don't care -- at all -- about supporting the diffusers format of LoRAs. Effectively none of the LoRAs available to "average users" are in the diffusers format. For example, take a quick browse of the LoRAs available on Civitai and you'll find that all of them are in safetensors format. It's less easily searchable, but similarly, you'll find that most or all LoRAs on HuggingFace are in safetensors format. And finally, Automatic1111, the de facto Stable Diffusion webui, uses safetensors-format LoRAs.

Maybe it's "legacy", but it's what the Stable Diffusion community overwhelmingly uses. Choosing not to support safetensors LoRAs is effectively choosing not to support LoRAs at all.

@simonfuhrmann

With all the work that has already been done here, and all the knowledge gathered, what is the state of things? I am really looking forward to using LoRAs with Invoke, as I personally much prefer the Invoke UI over A1111.

FWIW, I do agree with the general sentiment here that supporting the de facto standard format is a lot more valuable; efforts to support diffusers can be done in follow-up changes.

@felorhik (Author) commented Mar 9, 2023

Just wanted to post an update.

With the talk of a code freeze for the implementation of nodes, I have paused on doing much here.

There also appears to be another evolution of LoRA worth keeping an eye on: https://github.com/KohakuBlueleaf/LyCORIS
Right now there are various implementations, but the kohya method seems to be the standard; even LyCORIS utilizes it as a base. I am going to keep an eye on things for the time being and will make revisions here once things settle a little.

@ChrGriffin

It's disappointing to see Invoke lagging so far behind other solutions in the SD space, but I do understand the perspective of waiting for a "settled" solution before implementing anything. I wonder if a system for user-created extensions, like A1111 has, would ease these issues: users could then develop or install extensions for whichever LoRA implementation they use, and the core Invoke codebase wouldn't be polluted with three or four different ways of loading LoRAs.

@knoopx commented Mar 18, 2023

I see some confusion here; some clarifications:

  • Not sure what you are referring to as "legacy" here. Everything LoRA-related is bleeding edge, and if anything you could call it unadopted. CompVis torch state dumps (ckpts) are not legacy either; diffusers is just an alternative implementation.
  • safetensors is a serialization format; it has nothing to do with LoRA and can be used regardless of what strategy was used to train.
  • Different LoRA sizes depend on the rank (how many effective parameters are trained). Huge LoRAs (> 256 MB) usually come from people lacking understanding, as high rank barely offers any improvement in quality (see the sketch after this list).
  • https://github.com/cloneofsimo/lora/ was the first implementation, based on diffusers. It spits out two files, one for the unet and one for the text encoder. It is also used by the popular (and buggy) webui extension.
  • https://github.com/kohya-ss/sd-scripts took cloneofsimo's work (and others') and evolved it. It trains both the unet and the text encoder and spits out a single file with combined weights. It is probably the most popular implementation, given that it is the most flexible and portable.
  • https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py is the diffusers example; nobody uses it, and it trains only the unet.
  • https://github.com/huggingface/peft is a huggingface library meant to be universal. It is great, but nobody uses it for stable diffusion. It is HF-style, messes up key names, and doesn't have a proper "specification"; how it is serialized is a user choice.
  • https://github.com/KohakuBlueleaf/LoCon is an enhanced method that trains additional layers; some people are already using it. kohya experimented with it in the past and did not see any advantage.
  • https://github.com/KohakuBlueleaf/LyCORIS is too new and not yet adopted; it needs to be validated.
  • The webui supports kohya-ss and diffusers-based LoRA natively; the extension mentioned above acts as a platform to support additional "hypernetworks", including LoCon.
  • On-the-fly conversion of formats is feasible, even for whole checkpoints (as long as you can fit them in RAM, it is fast).
  • LoRA is probably the future for running/training models on consumer-grade hardware.
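To make the rank-vs-size point concrete, a back-of-the-envelope sketch (the layer count and dimensions are illustrative, not an exact SD 1.x census):

```python
def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """A LoRA pair replaces an (out x in) delta with (out x rank) + (rank x in)."""
    return rank * (in_features + out_features)

# Roughly 100 adapted 768x768 projections at fp16 (2 bytes per parameter):
for rank in (4, 128):
    size_mb = 100 * lora_params(768, 768, rank) * 2 / 1e6
    print(f"rank {rank:>3}: ~{size_mb:.0f} MB")   # rank 4: ~1 MB, rank 128: ~39 MB
```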

@Void2258 commented Mar 19, 2023

Quick comment on the syntax: just to be consistent with the rest of Invoke, the weight should probably go outside, i.e. <lora:name_of_lora>1 (or even (<lora:name_of_lora>)1, which is how textual inversions currently work, though I hope that changes since it's a bit heavy on the delimiters) if you go this way and not with the withLora version. This would still be close enough to A1111 not to be a huge hassle when swapping back and forth. (If this was already settled, please ignore me; this is a lot to read through.)

Regarding the safetensors format: if there is a ton of material in that format, it needs support. That's why .doc is still supported even though it's been over a decade since it was technically superseded by .docx. For many people, failing to support a very common format is equivalent to not supporting the feature at all.

@lstein (Collaborator) commented Mar 28, 2023

It's going to be a while before main is ready to receive LoRA support. Do you have any interest in putting just the kohya support into the legacy 2.3 branch?

@lstein (Collaborator) commented Mar 29, 2023

I just did an experimental merge of this PR into the 2.3 branch, and it went in cleanly. I will test the code out tomorrow, but if it works as advertised, I propose that we go with it in that branch until the nodes refactor stabilizes.

@lzardy commented Mar 30, 2023

Hello, just passing by to say that I think this PR is an example of what could be part of a plugin system.
Considering the high likelihood that we will see new formats of models, inversions, dreambooth, and now LoRA, I believe you will be well rewarded for implementing some kind of API/hooks at the very least.
This kind of thing is for the long term and is best done sooner rather than later.
Regards, lizard man.

@lstein (Collaborator) commented Mar 30, 2023

@felorhik @neecapp After some minor fixes to the way the LoRA paths are assigned in legacy_lora_manager, I was able to get a kohya LoRA model to load. Unfortunately, when I try to generate an image, I get a freeze before the first denoising step executes. Was the "legacy" support ever working in this PR? I'd be grateful for any help with this. As noted above, the 3.0 main release is still a few weeks away, and it would be nice to have a way to run LoRA files in the interim.

I've made a new PR rebased against 2.3: #3072

@felorhik (Author)

Thank you @lstein

Given your work, I am going to close this PR. I have been working mostly with LyCORIS now, so I may open another PR for that in the future.

@felorhik closed this Mar 31, 2023