image_size = [256,512] #62
Comments
Yes, all models except LatteT2V are trained at 256 × 256 pixels. You could check whether you can get a higher-resolution video without training.
Hello, does the ucf101.pt model support fine-tuning?
Of course, you can fine-tune the ucf101.pt model.
You can use the train.py we provide to fine-tune it.
I tried setting image_size to 512 to get a higher-resolution video, and got the following error:
(latte) yueyc@super-AS-4124GS-TNR:~/Latte$ bash sample/ffs.sh
Using Ema!
Traceback (most recent call last):
File "/home/yueyc/Latte/sample/sample.py", line 143, in <module>
main(omega_conf)
File "/home/yueyc/Latte/sample/sample.py", line 67, in main
model.load_state_dict(state_dict)
File "/home/yueyc/anaconda3/envs/latte/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Latte:
size mismatch for pos_embed: copying a param with shape torch.Size([1, 256, 1152]) from checkpoint, the shape in current model is torch.Size([1, 1024, 1152]).
Was the pretrained ffs.pt model trained at 256px resolution? Does that mean that to sample higher-resolution videos I would need to train a model at 512px?
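For anyone reading the traceback: the two pos_embed shapes in the error (256 vs. 1024 tokens) are consistent with the number of spatial patches growing with image_size. The sketch below is only the arithmetic, not code from the repository. It assumes an 8× VAE downsampling factor and a patch size of 2; neither value is stated in this thread, and `num_spatial_tokens` is a hypothetical helper name.

```python
def num_spatial_tokens(image_size: int, vae_downsample: int = 8, patch_size: int = 2) -> int:
    """Count the spatial patch tokens for a square input image.

    vae_downsample=8 and patch_size=2 are assumptions, not values
    confirmed in this thread.
    """
    if image_size % vae_downsample != 0:
        raise ValueError("image_size must be divisible by vae_downsample")
    latent_size = image_size // vae_downsample  # side length of the latent grid
    if latent_size % patch_size != 0:
        raise ValueError("latent size must be divisible by patch_size")
    patches_per_side = latent_size // patch_size
    return patches_per_side ** 2  # one positional embedding per patch


if __name__ == "__main__":
    for size in (256, 512):
        print(size, num_spatial_tokens(size))
```

Under these assumptions a 256px input gives 256 tokens and a 512px input gives 1024, matching the checkpoint shape [1, 256, 1152] and the model shape [1, 1024, 1152] in the error. So the checkpoint's positional embedding does not fit a model built for 512px unless it is resized or the model is retrained or fine-tuned at that resolution.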