Text to video pipeline #187

Open. Wants to merge 6 commits into base: main

@stronk-dev (Contributor) commented Sep 4, 2024

This PR adds support for a text-to-video pipeline using the THUDM/CogVideoX-2b and THUDM/CogVideoX-5b models.

Some notes:

  • The dl_checkpoints.sh script was modified to download both models, but without the --include "*.fp16.safetensors" argument, since neither model seems to ship an fp16 variant. For now, precision is forced via torch_dtype: FP16 for THUDM/CogVideoX-2b and BF16 for THUDM/CogVideoX-5b.
  • The num_frames parameter is currently hardcoded to the models' recommended value of 49.
  • The VAE speedups (enable_slicing, enable_tiling) are included; we might want to experiment with these.
  • Safety checker? This was added by @ad-astra-video.
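
For reviewers, here is a minimal sketch of how the settings in the notes above might fit together using the diffusers `CogVideoXPipeline`. It is not the code in this PR: the helper names (`dtype_name_for`, `load_pipeline`, `generate`) and the `MODEL_DTYPES` table are hypothetical, and the sketch assumes a diffusers release that includes CogVideoX support. The heavy imports are deferred so the dtype mapping can be checked without torch installed.

```python
# Illustrative sketch only; names here are assumptions, not the PR's code.

# Per-model precision as described in the notes: FP16 for 2b, BF16 for 5b.
MODEL_DTYPES = {
    "THUDM/CogVideoX-2b": "float16",
    "THUDM/CogVideoX-5b": "bfloat16",
}

# Frame count recommended for these models, hardcoded as in the PR notes.
NUM_FRAMES = 49


def dtype_name_for(model_id: str) -> str:
    """Return the torch dtype name to force for a supported model."""
    try:
        return MODEL_DTYPES[model_id]
    except KeyError:
        raise ValueError(f"unsupported model: {model_id}") from None


def load_pipeline(model_id: str):
    """Load the pipeline with the forced dtype and the VAE options enabled."""
    # Deferred imports: torch and diffusers are only needed when loading.
    import torch
    from diffusers import CogVideoXPipeline

    pipe = CogVideoXPipeline.from_pretrained(
        model_id,
        torch_dtype=getattr(torch, dtype_name_for(model_id)),
    )
    # VAE options mentioned in the notes.
    pipe.vae.enable_slicing()
    pipe.vae.enable_tiling()
    return pipe


def generate(pipe, prompt: str):
    """Run text-to-video with the hardcoded frame count; returns the frames."""
    return pipe(prompt=prompt, num_frames=NUM_FRAMES).frames[0]
```

Keeping the dtype in a small lookup table makes it easy to add another checkpoint later, and an unknown model id fails early instead of silently loading at default precision.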

Depends on livepeer/go-livepeer#3161.

@stronk-dev marked this pull request as ready for review September 4, 2024 14:21