Cache IPAdapter instances to avoid expensive KV extraction on every generation #335
I think it's working: now when I use IPAdapters (in this example I use InstantID, so I get two IPAdapters) and generate images over and over with the same model, it starts without much delay. Here are my logs
I still have those
Do you also intend to add a feature to unload the previous checkpoint when you switch models? That's the biggest weakness of
Thanks for trying this out!!
Those are unrelated and are not expected to be changed by this PR.
Not sure I fully understand: this change isn't about checkpoints, it's about caching some of the data derived from the IPAdapter models. Perhaps unloading previous checkpoints is a separate concern we can tackle under a different PR.
Will take a look soon.
Hi, we are going to close PRs opened before Forge's recent major revision.
Hi all, I'm not deeply familiar with the code I've touched here, so I'm very open to feedback on this PR.
Description
Currently `apply_ipadapter` reconstructs an `IPAdapter` instance on every invocation, even when reusing the same model. This is an expensive operation, primarily due to the call to `To_KV()` and to a lesser degree due to the calls to `init_proj*()`.
This PR caches the `IPAdapter()` instance, keyed off the IPAdapter model filename, if running with `--always-high-vram`.
Why?
This has a significant performance impact on Deforum, which I'm currently porting to Forge (as per #96).
Without this change, Forge is slower than A1111 on a simple Deforum run with IPAdapter enabled, despite having higher it/s. With this change, it is substantially faster than A1111.
The attached 120 frame Deforum settings file runs as follows on my 3090 / i5-4590 @ 3.30GHz:
deforum_settings.txt
Outside of Deforum, this also benefits runs with batch size>1 or any repeated gens using IPAdapter (shaves a few seconds off the initialisation time before you see the it/s gauge).
Screenshots/videos:
n/a