
The released I2VGen-XL model seems like a single stage model #49

Closed
junbangliang opened this issue Jan 1, 2024 · 3 comments

@junbangliang

Hi,

Thanks for sharing the code for I2VGen-XL model.

On closer inspection of the unet_i2vgen.py file, it seems to me that the released I2VGen-XL model is a single-stage model that takes in both the image and the text at the same time, without performing the two-stage processing claimed in the paper. Is this the case?

Thanks!

@Steven-SWZhang
Collaborator

Yes, in this open-source project we have only released the single-stage high-definition video generation model, which can fully preserve the content of the input image, for the convenience of academic use. As for the multi-stage video generation models, we currently do not have plans to open-source them. Thank you for your interest in our work.
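For readers unfamiliar with the distinction being confirmed here: a single-stage model consumes the image and text conditioning in one forward pass, rather than cascading a base generator into a refinement stage. The toy sketch below illustrates that idea only; all names and shapes are hypothetical and it is not the actual code from unet_i2vgen.py.

```python
# Hypothetical sketch, NOT the real I2VGen-XL implementation: shows how a
# single-stage I2V UNet can take image and text conditioning in one call.
import numpy as np

def single_stage_forward(video_latents, image_latent, text_emb):
    """Toy stand-in for a single-stage image-to-video forward pass.

    video_latents: (frames, c, h, w) noisy video latents
    image_latent:  (c, h, w) latent of the conditioning image
    text_emb:      (tokens, d) text embeddings

    Both conditions enter the same forward pass: the image latent is
    broadcast across frames and concatenated along channels, and the text
    embedding is pooled into a scalar bias as a crude conditioning signal.
    """
    f, c, h, w = video_latents.shape
    image_cond = np.broadcast_to(image_latent, (f, c, h, w))
    x = np.concatenate([video_latents, image_cond], axis=1)  # (f, 2c, h, w)
    text_bias = text_emb.mean()  # a real UNet would use cross-attention
    return x + text_bias

out = single_stage_forward(
    np.zeros((4, 3, 8, 8)),   # 4 frames of noisy latents
    np.ones((3, 8, 8)),       # conditioning-image latent
    np.full((5, 16), 0.5),    # 5 text-token embeddings
)
print(out.shape)  # (4, 6, 8, 8)
```

A two-stage pipeline would instead run a second model over the first model's output; here everything happens in one call, which matches the behaviour described above.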

@junbangliang
Author

junbangliang commented Jan 8, 2024

Hi,

Thanks for your previous answer. I have another question about the released single-stage I2V model: is it also trained with 35 million videos and 6 billion images?

Thanks!

@junbangliang junbangliang reopened this Jan 8, 2024
@Steven-SWZhang
Collaborator

> Hi,
>
> Thanks for your previous answer. I have another question about the released single-stage I2V model: is it also trained with 35 million videos and 6 billion images?
>
> Thanks!

Yes.
