feature: support TAESD - Tiny Autoencoder for Stable Diffusion #4316
Conversation
I was testing out the Tiny AutoEncoder last night. It is definitely a massive time and speed improvement, but the quality of the output is a little inferior to the full encoder, as expected. This is especially noticeable in realistic models, where detail-heavy areas like the eyes just do not translate well. Will give this PR a run in a bit. Maybe we should also consider adding TAESD progress previews?
Yep, I don't expect it to be a finishing step for any production work, but it'll be useful for previews, and I think also for certain kinds of intermediate steps that go on to be further mashed or re-noised. And I could imagine a workflow where, if the normal VAE is slow enough for some people, they might elect to do a quick batch generation and then only use the full VAE on the ones they want to keep?
Yeah. Definitely handy to have, especially if we can do progress previews with TAESD too. That would cut down generation times because each step-callback decode will take considerably less time. I'm actually half-minded to ship the TAESD models as core models, since they're pretty small, and provide the user with an option next to VAE to check Tiny. In which case, we use the Tiny AutoEncoder when checked and the regular VAE otherwise. Unless we anticipate more tiny AE models, in which case having it locked to just TAESD might be an issue later.
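The "check Tiny" toggle proposed above can be sketched as a simple selection between the TAESD repo for the matching base model and the regular VAE. This is a hypothetical illustration, not InvokeAI's actual API: `select_vae`, `use_tiny`, and the `default-vae` placeholder are invented names; only the `madebyollin/taesd` and `madebyollin/taesdxl` repo ids come from the discussion.

```python
# Hypothetical sketch of the proposed "Tiny" checkbox: if the user enables it,
# decode with the TAESD autoencoder for the matching base model; otherwise
# fall back to the regular full VAE.

TAESD_REPOS = {
    "sd-1": "madebyollin/taesd",    # TAESD for SD 1.x latents
    "sdxl": "madebyollin/taesdxl",  # TAESD variant for SDXL latents
}

def select_vae(base_model: str, use_tiny: bool) -> str:
    """Return the identifier of the VAE to load for decoding."""
    if use_tiny:
        return TAESD_REPOS[base_model]
    return f"{base_model}/default-vae"  # stand-in for the regular VAE id
```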
# Conflicts: # invokeai/app/invocations/latent.py
I think the encoder is buggy, and I think the bug is upstream: huggingface/diffusers#4676. Update: just waiting on a diffusers release > 0.20 for the fix (diffusers 0.20.2 is out but seems to be a narrowly focused release that doesn't include this; I assume that means we're waiting for 0.21).
`@singledispatchmethod` - cool!
I agree - we should make TAESD a core model. No reason not to.
Then the only question is how to manage preview images with it. The step_callback has access to the InvocationContext, and therefore to the model manager, so it could easily decode the image in there. I'm just concerned about the overhead of doing this up to 40 times per second.
Anyways, that's for a followup PR.
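One way to bound the overhead raised above is to throttle the step callback so a preview is decoded at most a few times per second instead of on every step. The sketch below is an illustrative approach, not InvokeAI's actual implementation; `PreviewThrottle`, `maybe_decode`, and the 0.25 s interval are all assumptions.

```python
import time

class PreviewThrottle:
    """Limit how often a step callback actually decodes a preview image.

    Illustrative sketch: decode_fn stands in for a TAESD decode, and the
    default interval (~4 previews/s instead of up to 40) is an arbitrary
    choice, not a measured value.
    """

    def __init__(self, min_interval_s: float = 0.25):
        self.min_interval_s = min_interval_s
        self._last = float("-inf")  # so the first call always decodes

    def maybe_decode(self, latents, decode_fn):
        """Decode a preview if enough time has passed, else return None."""
        now = time.monotonic()
        if now - self._last < self.min_interval_s:
            return None  # skip this step's preview
        self._last = now
        return decode_fn(latents)
```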
# Conflicts: # invokeai/app/invocations/latent.py
@lstein @StAlKeR7779 could you take a look at this? Would be awesome if we could get this in for 3.1.
If we are adding this, then @lstein will need to add the model support during installation.
We can have this PR merged, and include TAESD as a core model as a follow-up. My understanding is that this PR supports the use of TAESD in Invoke; if a user wants it, they'll have to install it manually.
…port. Now available in diffusers 0.21: huggingface/diffusers#4627
Doing some manual testing after upgrading to diffusers 0.21. TAESD encoding and decoding looks to be working okay, but the normal VAE is failing with
but I've just confirmed that same failure is on
I think this should be fixed by: #4534
Oh, in that
# Conflicts: # invokeai/backend/model_management/model_probe.py
The code that determines whether a VAE's base model is SD or SDXL is here: `invokeai/backend/model_management/model_probe.py`, lines 470 to 481 (at f222b87).
It checks certain values inside the VAE's config.json. However, for TAESD, the config.json files for taesd and taesdxl are identical. Indeed, there's no reason for them to have any different parameters, as the shapes of the two VAEs are the same. The only heuristic I can think of is to check whether the model name literally ends in a particular suffix. Or to define some other metadata field and ask @madebyollin to add it to the model's config.
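The name-suffix heuristic mentioned above could look something like the following. This is a hypothetical sketch, not InvokeAI's actual probe code; `taesd_base_from_name` and the `"sd-1"`/`"sdxl"` return values are invented for illustration, and the fragility is inherent: since taesd and taesdxl ship identical config.json files, the name is the only signal.

```python
def taesd_base_from_name(model_name: str) -> str:
    """Guess the base model of a TAESD checkpoint from its name.

    Fragile by design: taesd and taesdxl have identical configs, so the
    model name suffix is the only available signal. Illustrative only.
    """
    name = model_name.lower().rstrip("/")
    if name.endswith("taesdxl") or "sdxl" in name:
        return "sdxl"
    return "sd-1"  # default to SD 1.x when no SDXL marker is present
```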
TAESD - Tiny Autoencoder for Stable Diffusion - is a tiny VAE that provides significantly better results than my single-multiplication hack but is still very fast.
The entire TAESD model weights are under 10 MB!
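For contrast, the "single-multiplication hack" mentioned above can be sketched as a single 4x3 matrix multiply that projects the four SD latent channels straight to RGB. The coefficients below are an approximate community-circulated fit, not necessarily the exact values used in this PR's predecessor, and `fast_latent_preview` is an invented name.

```python
import numpy as np

# Approximate linear map from SD latent channels to RGB. The coefficients
# are an illustrative community fit; real implementations tune them
# empirically against full VAE decodes.
LATENT_TO_RGB = np.array([
    [ 0.298,  0.207,  0.208],
    [ 0.187,  0.286,  0.173],
    [-0.158,  0.189,  0.264],
    [-0.184, -0.271, -0.473],
])

def fast_latent_preview(latents: np.ndarray) -> np.ndarray:
    """Map latents of shape (4, H, W) to a rough RGB preview (H, W, 3) in [0, 1]."""
    # One multiply per pixel: sum over the channel axis against the 4x3 matrix.
    rgb = np.einsum("chw,cd->hwd", latents, LATENT_TO_RGB)
    return np.clip(rgb * 0.5 + 0.5, 0.0, 1.0)
```

This is why TAESD is a meaningful upgrade for previews: it is still fast, but it is a real (tiny) decoder rather than a per-pixel linear approximation.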
This PR requires diffusers 0.20:
To Do
Test with
Have you discussed this change with the InvokeAI team?
Have you updated all relevant documentation?
Related Tickets & Documents
QA Instructions, Screenshots, Recordings
Should be able to import these models:
and use them as VAE.
Added/updated tests?