Is there an existing issue for this?
I have searched the existing issues and checked the recent builds/commits of both this extension and the webui
Are you using the latest version of the extension?
I have the ModelScope text2video extension updated to the latest version, and I still have the issue.
What happened?
Every time I click "Generate", the extension samples the tensors twice instead of once, so I have to wait double the time for a single video. I understand vid2vid was added, but even when I try a vid2vid generation, the output comes only from the text2video tab, and it still samples the tensors twice.
Steps to reproduce the problem
For text2video and vid2vid
Go to ModelScope text2video
Add a prompt, for example "sunrise from tokyo, by makoto shinkai"
Click the yellow "Generate" button.
Wait twice the expected time.
For vid2vid
Add a video that was generated from text2video
Add a prompt, for example "a boy with sunglasses"
Click "Generate" and again wait twice the time, because it samples the tensors twice.
Notice that vid2vid doesn't generate from the vid2vid tab but from text2video (which I left blank, so it outputs something unrelated, like a tortoise underwater)
What should have happened?
It should sample the tensors only once if only text2video is used.
It should sample the tensors twice if I add prompts for both text2video and vid2vid.
It should sample the tensors once if only vid2vid is selected.
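The expected behavior above amounts to dispatching on the selected tab so that one click triggers exactly one sampling pass. A minimal sketch of that idea (all names here, `sample_tensors` and `run_generation`, are hypothetical and not the extension's actual API):

```python
calls = {"count": 0}

def sample_tensors(prompt=None, init_video=None):
    # Stub standing in for the extension's real DDIM sampling pass.
    calls["count"] += 1
    return {"prompt": prompt, "init_video": init_video}

def run_generation(mode, txt_prompt=None, vid_input=None):
    """Dispatch on the selected tab so sampling runs exactly once per click."""
    if mode == "vid2vid":
        if vid_input is None:
            raise ValueError("vid2vid requires an input video")
        return sample_tensors(prompt=txt_prompt, init_video=vid_input)
    # Default: txt2vid -- sample from the text prompt alone.
    return sample_tensors(prompt=txt_prompt)

run_generation("txt2vid", txt_prompt="sunrise from tokyo, by makoto shinkai")
print(calls["count"])  # 1 -> one click, one sampling pass
```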
WebUI and Deforum extension Commit IDs
webui commit id - commit: a9fed7c3
txt2vid commit id - https://github.com/deforum-art/sd-webui-modelscope-text2video.git | 8402005 (Fri Mar 24 14:49:52 2023)
What GPU were you using for launching?
RTX 3060 12 GB VRAM, 16 GB RAM.
On which platform are you launching the webui backend with the extension?
Local PC setup (Windows)
Settings
--xformers --no-half-vae --api
I didn't change anything; I just added the prompts. Everything is at default, with fp16 enabled for the GPU.
Console logs
ModelScope text2video extension for auto1111 webui
Git commit: 84020058 (Fri Mar 24 14:49:52 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
device cuda
Working in txt2vid mode
latents torch.Size([1, 4, 24, 32, 32]) tensor(-0.0012, device='cuda:0') tensor(1.0001, device='cuda:0')
DDIM sampling tensor(1): 100%|███████████████████████████████████████| 31/31 [00:41<00:00, 1.33s/it]
STARTING VAE ON GPU. 24 CHUNKS TO PROCESS
VAE HALVED
DECODING FRAMES
VAE FINISHED
torch.Size([24, 3, 256, 256])
output/mp4s/20230324_215112403414.mp4
0%|| 0/1 [00:00<?, ?it/s]
latents torch.Size([1, 4, 24, 32, 32]) tensor(-0.0007, device='cuda:0') tensor(1.0037, device='cuda:0')
DDIM sampling tensor(1): 100%|███████████████████████████████████████| 31/31 [00:41<00:00, 1.34s/it]
STARTING VAE ON GPU. 24 CHUNKS TO PROCESS
VAE HALVED
DECODING FRAMES
VAE FINISHED
torch.Size([24, 3, 256, 256])
output/mp4s/20230324_215201616361.mp4
text2video finished, saving frames to C:\Stable Diffusion\stable-diffusion-webui\outputs/img2img-images\text2video-modelscope\20230324215000
Got a request to stitch frames to video using FFmpeg.
Frames:
C:\Stable Diffusion\stable-diffusion-webui\outputs/img2img-images\text2video-modelscope\20230324215000\%06d.png
To Video:
C:\Stable Diffusion\stable-diffusion-webui\outputs/img2img-images\text2video-modelscope\20230324215000\vid.mp4
Stitching *video*...
Stitching *video*...
Video stitching done in 0.26 seconds!
t2v complete, result saved at C:\Stable Diffusion\stable-diffusion-webui\outputs/img2img-images\text2video-modelscope\20230324215000
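As a quick sanity check against the log above, counting the "DDIM sampling" progress lines confirms the double pass (a trivial snippet; the excerpt is abbreviated from the console output):

```python
# Count DDIM sampling passes in the console log: one "Generate" click
# in pure txt2vid mode should produce exactly one pass, not two.
log_excerpt = """\
Working in txt2vid mode
DDIM sampling tensor(1): 100%| 31/31 [00:41<00:00, 1.33s/it]
STARTING VAE ON GPU. 24 CHUNKS TO PROCESS
DDIM sampling tensor(1): 100%| 31/31 [00:41<00:00, 1.34s/it]
"""
passes = log_excerpt.count("DDIM sampling")
print(passes)  # 2 -> the tensors were sampled twice for a single video
```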
Additional information
No response
I can confirm that vid2vid doesn't work, guys. Please revert to an older version or wait for our fix, but it might take 24 hours or more since it's the weekend and we need some time off <3
Watch out for updates anyway. And thanks for providing feedback!