[Feature]: Stable Video Diffusion from SAI #339

Closed
uk9921 opened this issue Nov 22, 2023 · 12 comments

Comments

@uk9921

uk9921 commented Nov 22, 2023

Expected behavior

Stability AI has released an img2video model (Stable Video Diffusion). Do you plan to support it?

uk9921 changed the title from "[Feature]: Could SVD inference be merged into this plugin?" to "[Feature]: Could SDV inference be merged into this plugin?" on Nov 22, 2023
uk9921 changed the title from "[Feature]: Could SDV inference be merged into this plugin?" to "[Feature]: Could SVD inference be merged into this plugin?" on Nov 22, 2023
continue-revolution changed the title from "[Feature]: Could SVD inference be merged into this plugin?" to "[Feature]: Stable Video Diffusion from SAI" on Nov 22, 2023
@continue-revolution
Owner

I've already noticed this model, and I do plan to support it in some way. Please be patient.

@drhead

drhead commented Nov 22, 2023

I have actually tested using the special VAE decoder from the SVD model with AnimateDiff outputs and it helps output quality quite a bit, removing a lot of noise. That could be implemented relatively easily and bring a great benefit.

@continue-revolution
Owner

Interesting. @drhead Do you mean replacing the original VAE with SVD VAE?

@uk9921
Author

uk9921 commented Nov 22, 2023

@drhead Can you elaborate on this? Do you feed the latents generated by AnimateDiff into the SVD VAE decoder?

@drhead

drhead commented Nov 22, 2023

> Interesting. @drhead Do you mean replacing the original VAE with SVD VAE?

Latents output from AnimateDiff can be decoded using the decoder half of the SVD VAE. They are then decoded with temporal awareness, which eliminates a large amount of noise in the output. I have only been able to do a modest amount of testing, and only on realistic-style videos (a lack of training data may mean there will be issues decoding animation-style videos), but from what I can tell so far it works rather well. The main issue is porting the model out of the SGM codebase.
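For readers who want to try the idea described above, here is a minimal sketch. It is not drhead's code: it assumes the Hugging Face `diffusers` port of the SVD VAE (`AutoencoderKLTemporalDecoder`) rather than the SGM codebase, assumes the `stabilityai/stable-video-diffusion-img2vid` checkpoint is accessible, and assumes the AnimateDiff latents arrive as one tensor of shape `(frames, 4, h, w)` that is still multiplied by the usual latent scaling factor. The helper names `frame_chunks` and `decode_with_svd_vae` are hypothetical.

```python
from typing import List, Tuple


def frame_chunks(num_frames: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split [0, num_frames) into consecutive (start, end) ranges of at most chunk_size frames."""
    if num_frames < 0 or chunk_size < 1:
        raise ValueError("num_frames must be >= 0 and chunk_size must be >= 1")
    return [(s, min(s + chunk_size, num_frames)) for s in range(0, num_frames, chunk_size)]


def decode_with_svd_vae(latents, chunk_size: int = 8,
                        model_id: str = "stabilityai/stable-video-diffusion-img2vid"):
    """Decode latents of shape (frames, 4, h, w) into images of shape (frames, 3, H, W)."""
    # Imported lazily so frame_chunks works without the heavy dependencies installed.
    import torch
    from diffusers import AutoencoderKLTemporalDecoder

    # Assumption: the checkpoint stores the VAE in a "vae" subfolder (diffusers layout).
    vae = AutoencoderKLTemporalDecoder.from_pretrained(
        model_id, subfolder="vae", torch_dtype=latents.dtype
    ).to(latents.device)
    vae.eval()

    # Assumption: the latents are still in scaled form, so undo the scaling first.
    latents = latents / vae.config.scaling_factor

    decoded = []
    with torch.no_grad():
        for start, end in frame_chunks(latents.shape[0], chunk_size):
            chunk = latents[start:end]
            # num_frames tells the temporal layers how many frames belong to one clip.
            decoded.append(vae.decode(chunk, num_frames=end - start).sample)
    return torch.cat(decoded, dim=0)
```

Decoding in chunks keeps peak VRAM down, but the temporal layers then only see the frames inside each chunk, so a larger `chunk_size` should give smoother results if memory allows.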

@continue-revolution
Owner

@drhead it would be helpful if you could share some code with me

@uk9921
Author

uk9921 commented Nov 23, 2023

@drhead That sounds very reasonable. We are looking forward to seeing some before-and-after examples with the SVD VAE.

@zhangrc
Contributor

zhangrc commented Nov 23, 2023

Is my 3060 still sufficient, or will I need to upgrade my GPU?

@andzejsp

When? The world is waiting.

@uk9921
Author

uk9921 commented Nov 23, 2023

@zhangrc It can now run with 6 GB of VRAM; see https://github.com/brycedrennan/imaginAIry

@uk9921
Author

uk9921 commented Nov 23, 2023

f5 is broken

@continue-revolution
Owner

This is now available via Forge. Unfortunately, supporting it on the original A1111 is too hard.


5 participants