Thank you for your work and the well-organized repo. Reading through the paper, I was unable to locate any ablations on the effect of batch size (or effective batch size) on generation performance. Could you share any insight into how batch size affects the quality of video generation? In particular, if the effective batch size is increased through gradient accumulation steps (sketched below), would you also increase the total number of training iterations to compensate?
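For reference, by "effective batch size through gradient accumulation" I mean the standard pattern below. This is a minimal PyTorch sketch, not the repo's actual training loop; `model`, `optimizer`, `loader`, and `accum_steps` are placeholder names.

```python
import torch

# Illustrative placeholders, not from this repo:
# model, optimizer, loader are assumed to be already constructed.
accum_steps = 4  # effective batch size = per-step batch size * accum_steps

optimizer.zero_grad()
for step, batch in enumerate(loader):
    loss = model(batch)              # forward pass returning the training loss
    (loss / accum_steps).backward()  # scale so accumulated grads average over the effective batch

    if (step + 1) % accum_steps == 0:
        optimizer.step()             # one optimizer update per effective batch
        optimizer.zero_grad()
```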
Intuitively, a larger batch size should correlate with better performance (as suggested by the efficacy of image-video joint training), but I was curious whether the benefits taper off at this model size, since the full pipeline is already expensive to train, especially if training has to be scaled further to accommodate gradient accumulation steps.
Thanks.
Thanks for your interest. I also think a larger batch size leads to better performance, but in my experience so far, gradient accumulation does not provide significant gains for text-to-video tasks.