Initial setup of lora support #2712
Conversation
Turns out that finding a LoRA model online in the diffusers format is harder than I expected. Most of them are shared as safetensors. I think I'll train my own LoRA, save it in the diffusers format, and then try this PR out. |
Getting LoRA support going is great, but as you point out, the vast majority are in safetensors format -- might be better to aim to support that up front? |
Working on getting safetensors to work, but there is some complexity to it. https://github.com/cloneofsimo/lora uses a format compatible with diffusers and can load safetensors-based LoRA files, provided some patching is done on the pipeline. There are guides in the repo for that which I have been experimenting with: https://github.com/cloneofsimo/lora/blob/71c8c1dba595d77d0eabdf9c278630168e5a8ce1/scripts/run_inference.ipynb

Most LoRAs have been trained with https://github.com/kohya-ss/sd-scripts, which uses its own format for the keys. I have been attempting to convert them and save in the diffusers format, but no luck yet. Given that, I am leaning towards needing a conversion step for the kohya-scripts files rather than supporting them natively, though knowing which method was used to train may be difficult.

Still experimenting, but overall diffusers can make a 3 MB LoRA file vs. 150 MB for the kohya method, and I have not seen a difference in quality either. For training as it currently stands: using the following diffusers training scripts, I have had success loading the results. |
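For reference, loading one of those diffusers-trained LoRAs is a one-liner against the pipeline. A minimal sketch, assuming diffusers >= 0.13; the base model ID and local path are placeholders:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# load_attn_procs swaps the UNet's attention processors for LoRA-augmented
# ones and loads the trained low-rank weights (the ~3 MB file noted above).
pipe.unet.load_attn_procs("./models/lora/my_lora")  # placeholder path

image = pipe("main prompt", num_inference_steps=30).images[0]
```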
Is there a script somewhere that allows us to convert LoRA models from safetensors (or whatever format) to diffusers? If there is one, maybe we could integrate it and load through that? |
Added an adjustment which should be able to load safetensors made by https://github.com/cloneofsimo/lora, though I am still testing and debugging it.
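Roughly, the patching from that repo looks like this. An untested sketch based on their run_inference notebook; the model ID and .safetensors path are placeholders:

```python
import torch
from diffusers import StableDiffusionPipeline
from lora_diffusion import patch_pipe, tune_lora_scale

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# patch_pipe injects LoRA layers into the unet/text encoder and loads the
# safetensors weights in cloneofsimo's format.
patch_pipe(
    pipe,
    "./models/lora/my_lora.safetensors",  # placeholder path
    patch_text=True,
    patch_ti=True,
    patch_unet=True,
)

# Scale the LoRA contribution, roughly the ":1" weight in the prompt syntax.
tune_lora_scale(pipe.unet, 1.0)
tune_lora_scale(pipe.text_encoder, 1.0)
```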
|
I have no idea if any of this information will be useful to you, but maybe it'll at least inspire some fresh Google searches! I've been using the HuggingFace Diffusers package, specifically the LoRA Dreambooth example (the same one you linked), with basically zero changes to train my LoRAs. Unfortunately, the package produces a .bin file that is not compatible with A1111 (changing the extension isn't enough). A little Googling brought me to this thread, with the script by ignacfetser at the bottom. It's hardly the most... uh... structurally sound solution, but I can confirm that, for now at least, it does work. Of course, this isn't the exact issue you're encountering, but I figured I'd drop it here in case any of the information was helpful. |
@ChrGriffin It did help; it gave me a good direction for a conversion script. It saves in the diffusers format and can be loaded after running
Although I have not tested it yet, it should work with ckpt too.
It should save to ./models/lora/lora_file_name and be usable from there. Not seeing great results out of it yet, but it is loading into the pipeline at least. |
If the conversion time is short, I think we can effectively do a one-time conversion of the safetensors/ckpt model to diffusers. I think that might be more ideal because, in the long run, we want to standardize on diffusers. And that way, we can avoid installing the original repo as a dependency and have everything work through the diffusers pipeline. |
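A sketch of that one-time-conversion idea; convert_lora here is a hypothetical stand-in for whatever conversion script ends up being used:

```python
from pathlib import Path

def convert_lora(src: Path, dst: Path) -> None:
    """Hypothetical stand-in: convert a safetensors/ckpt LoRA into a
    diffusers-format directory at dst."""
    raise NotImplementedError

def get_diffusers_lora(src: Path, cache_root: Path = Path("./models/lora")) -> Path:
    """Return the diffusers-format copy of src, converting once on first use."""
    dst = cache_root / src.stem
    if not dst.exists():  # cached from a previous run?
        convert_lora(src, dst)  # one-time cost; later runs just load dst
    return dst
```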
huggingface/diffusers#2403 may be of some interest -- though you've already figured out the key-mapping part. I tried a variant of that in place of the cloneofsimo monkey patch, with better results (no errors for missing alphas). Still not perfect; it seems like either the math is off somewhere or this LoRA I am testing with is junk. Using the conversion script does not work for me, likely due to missing text model encoder layers.

Edit: Conversion times from |
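For anyone following along, the merge approach in that diffusers PR boils down to updating each affected base weight in place as W += scale * (alpha / rank) * up @ down. A condensed sketch; the key parsing is omitted, and layer, up, down, and alpha are assumed to come from walking the LoRA state dict:

```python
import torch

def merge_lora_weight(layer: torch.nn.Module, up: torch.Tensor,
                      down: torch.Tensor, alpha: float, scale: float = 1.0):
    rank = down.shape[0]
    if up.dim() == 4:
        # Conv weights are 4-D; flatten to 2-D for the matmul, then restore.
        delta = torch.mm(up.squeeze(3).squeeze(2), down.squeeze(3).squeeze(2))
        delta = delta.unsqueeze(2).unsqueeze(3)
    else:
        delta = torch.mm(up, down)
    # In-place merge into the base model's weight.
    layer.weight.data += scale * (alpha / rank) * delta
```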
Okay. Managed to load a diffusers version of the LoRA model, but it doesn't seem to be working. |
Re: the prompt syntax. Is this the way LoRAs are going to be activated, as a prompt term? If so, I'd suggest a more explicit syntax in line with the rest of Invoke, something like

@jordanramstad, where is the prompt parser logic happening in your code? I didn't see it somewhere obvious. |
I believe the reasoning for the syntax is that it mimics A1111's syntax for loading LoRAs, making transitioning back and forth easier. |
Success! @neecapp, the PR you linked really helped. It now supports safetensors made in other formats; it will load and merge them into the current model when it runs, rather than trying to convert them or force them to load in the diffusers format. Converting may still be done, but this allows the text encoder to be supported as well.

EDIT: I got a little excited with it working; it will re-apply the weights on each execution without clearing them, leading to gradual burning. Working on resetting them after each run to get around that. |
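One way around that burning problem is to snapshot each original weight before the first merge and restore it after every run. A minimal sketch; the class and its names are hypothetical, not the PR's actual code:

```python
import torch

class LoRAWeightGuard:
    """Snapshot base weights before merging LoRA deltas; restore after a run."""

    def __init__(self):
        self._originals: dict[torch.nn.Module, torch.Tensor] = {}

    def apply(self, layer: torch.nn.Module, delta: torch.Tensor, weight: float):
        if layer not in self._originals:
            # Save the clean weight exactly once, before the first merge.
            self._originals[layer] = layer.weight.data.clone()
        layer.weight.data += weight * delta

    def restore(self):
        # Call after each generation so merged weights never accumulate.
        for layer, original in self._originals.items():
            layer.weight.data.copy_(original)
        self._originals.clear()
```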
I'll try to look at it again if I have some time, but it looks like you'll load the LoRA repeatedly for every successive run, which you likely do not want. On mobile at the moment, but it may be better to update hijack |
The latest commit will break what was working before. I have started to adjust it to apply layers with the weight data, but I am running into issues with matching tensors. |
Rewrite lora manager with hooks
…nvokeAI into add_lora_support
…d format prompt for compel
Never mind, looks like they want people to just use peft. |
tweaks and small refactors
Added a peft setup. It does not work yet, but with a LoRA trained with peft (https://github.com/huggingface/peft) it will try to use it. The issue is something related to sending to the right device, but the error just dumps out the entire model, making it hard to diagnose. It should only take effect with

On another note, this makes three variations of LoRA: diffusers, peft, and "legacy" (kohya scripts). While support for all is nice, it does feel like we should focus on one; the code is getting kind of messy having support for the different types at the moment. |
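For context, the peft loading path looks roughly like this. A sketch, assuming the adapter was saved with PeftModel.save_pretrained; the model ID and adapter path are placeholders, and the explicit move is aimed at the device issue described above:

```python
import torch
from diffusers import StableDiffusionPipeline
from peft import PeftModel

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Wrap the unet with the trained adapter, then move everything explicitly
# so the LoRA weights end up on the same device/dtype as the base model.
pipe.unet = PeftModel.from_pretrained(pipe.unet, "./models/lora/peft_lora")
pipe.to("cuda")
```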
I've been reading when I can. To be honest, as much as people may want the "legacy" variant to go away, that format covers 99.99% of all existing LoRA models. I don't think I've even seen a diffusers LoRA outside of engineer repos -- with only unet support, they aren't very good. I doubt the Kohya variant is going anywhere anytime soon. When the legacy format dominates and "just works" with A1111, most people will not think of it as legacy; they will see it as Invoke not having good LoRA support. Haven't had much availability to think about a proper design for all of this, but there needs to be a good, low-friction way to support Kohya LoRA. In all reality, peft isn't a solution to the problem -- at least not yet. Probably an unpopular opinion, but I see supporting legacy as the immediate need, with anything else as secondary, which can be done with follow-on PRs as PEFT et al. mature. All of this is up to the Invoke team, of course. Apart from this PR, I'm not really active on here. Edit: source of the "use peft for lora" comment: huggingface/transformers#21770 |
That may be an unpopular opinion within the devs, but as a user, I'm here saying I don't care -- at all -- about supporting the diffuser format of LoRAs. Effectively none of the LoRAs available to "average users" are in the diffuser format. For example, take a quick browse of the LoRAs available to users on Civitai and you'll find that all of them are in Safetensor format. It's less easily searchable, but similarly, you'll find that most or all LoRAs on HuggingFace are in Safetensor format. And finally, Automatic1111, the de facto Stable Diffusion webui, uses Safetensor format LoRAs. Maybe it's "legacy", but it's what the Stable Diffusion community uses, overwhelmingly. Choosing not to support Safetensor LoRAs is effectively choosing not to support LoRAs at all. |
With all the work that has already been done here, and all the knowledge gathered, what is the state of things? I am really looking forward to using LoRAs with Invoke, as personally I much prefer the Invoke UI over A1111. FWIW, I do agree with the general sentiment here that supporting the de facto standard format is a lot more valuable, and efforts to support diffusers can be done in follow-up changes. |
Just wanted to post an update. With the talk of a code freeze for the implementation of nodes, I have paused on doing much here. There also appears to be another evolution of LoRA worth keeping an eye on: https://github.com/KohakuBlueleaf/LyCORIS |
It's disappointing to see Invoke lagging so far behind other solutions in the SD space, but I do understand the perspective of waiting for a "settled" solution before implementing anything. I wonder if a system for user-created extensions, like A1111 has, would ease these issues. Then users could develop or install extensions for whichever LoRA implementation they use, and the core Invoke codebase wouldn't be polluted with three or four different ways of loading LoRAs. |
I see some confusion here; some clarifications:
|
Quick comment on the syntax. Just to be consistent with the rest of Invoke, the weight should probably go outside, i.e. <lora:name_of_lora>1 (or even (<lora:name_of_lora>)1, which is how textual inversions currently work, though I hope that changes since it's a bit heavy on the delimiters), if you go this way and not with the

Regarding the safetensors format: if there is a ton of models in that format, it needs support. That's why .doc is still supported even though it's been over a decade since it was technically superseded by .docx. Failing to support a very common format is, for many people, equivalent to the feature not being supported at all. |
It’s going to be a while before |
I just did an experimental merge of this PR into the 2.3 branch, and it went in cleanly. I will test the code out tomorrow, but if it works as advertised I would propose that we go with it in that branch until the nodes refactor stabilizes. |
Hello, just passing by to say I think this PR is an example of what could be part of a plugin system. |
@felorhik @neecapp After some minor fixes to the way the LoRA paths are assigned in

I've made a new PR rebased against 2.3: #3072 |
Thank you @lstein. Given your work, I am going to close this PR. I have been working mostly with LyCORIS now, so I may open another PR for that in the future. |
I wanted to get LoRA support going, so I decided to take a crack at it.
It uses the same syntax as https://github.com/AUTOMATIC1111/stable-diffusion-webui, like so:
prompt: main prompt, <lora:name_of_lora:1>
Right now the weight does nothing; it will always be 1.
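For anyone curious about the parsing side, a token like this can be pulled out of the prompt with a small regex. An illustrative sketch only, not the PR's actual parser:

```python
import re

LORA_RE = re.compile(r"<lora:(?P<name>[^:>]+)(?::(?P<weight>[\d.]+))?>")

def extract_loras(prompt: str):
    """Return (cleaned_prompt, [(lora_name, weight), ...])."""
    loras = [(m.group("name"), float(m.group("weight") or 1.0))
             for m in LORA_RE.finditer(prompt)]
    return LORA_RE.sub("", prompt).strip(), loras

# extract_loras("main prompt, <lora:name_of_lora:1>")
# -> ("main prompt,", [("name_of_lora", 1.0)])
```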
Also, the LoRA must be trained in the diffusers format: https://huggingface.co/docs/diffusers/training/lora
This also requires the base diffusers version to be raised from 0.11 to 0.13.
It should be able to support multiple LoRAs, but I have not tested that deeply. This is far from ready, but it might be useful for anyone wanting to experiment with LoRA in InvokeAI.