About the image size #10
Comments
@SaulZhang Hi, thanks for your inquiry! For the first question: since the Stable Diffusion pretrained model (PTM) is trained at 512*512, I guess the performance will drop at 256x256. I recommend trying fp16 training at 512*512 resolution. For the second question, the answer is yes! You can also increase the batch size. We use a single GPU to avoid sampling the same case multiple times, which would affect the FID score. (For normal experiments, feel free to use multiple GPUs and a larger batch size.)
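The fp16 suggestion above could be sketched with PyTorch's automatic mixed precision. This is a minimal illustration, not the repo's actual training loop: the tiny `nn.Linear` stands in for the AR-LDM model, and the example falls back to bfloat16 autocast when no GPU is present.

```python
import torch
from torch import nn

# Hypothetical tiny model standing in for the AR-LDM UNet;
# the real model, data, and loss come from the repo.
model = nn.Linear(8, 8)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

use_cuda = torch.cuda.is_available()
# GradScaler prevents fp16 gradient underflow; it is a no-op on CPU.
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

x = torch.randn(4, 8)
with torch.autocast(device_type="cuda" if use_cuda else "cpu",
                    dtype=torch.float16 if use_cuda else torch.bfloat16):
    loss = model(x).pow(2).mean()  # placeholder loss

scaler.scale(loss).backward()
scaler.step(opt)
scaler.update()
```

Halving activation precision roughly halves activation memory, which is what makes 512*512 feasible on a 40 GB card.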
Thanks for your reply. I have tried setting
@SaulZhang Hi, I guess it may work, or you can also try enabling gradient checkpointing to save VRAM. It seems to already be implemented in Diffusers.
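On a Diffusers UNet this is a one-liner (`unet.enable_gradient_checkpointing()`). The underlying mechanism is `torch.utils.checkpoint`: activations inside the checkpointed module are discarded in the forward pass and recomputed during backward, trading compute for memory. A self-contained sketch with a toy layer (the real model would be the repo's UNet):

```python
import torch
from torch.utils.checkpoint import checkpoint

# Toy stand-in for one UNet block.
layer = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU())
x = torch.randn(2, 16, requires_grad=True)

# Activations inside `layer` are not stored; they are recomputed
# when backward reaches this point, saving VRAM at the cost of
# one extra forward pass per checkpointed block.
y = checkpoint(layer, x, use_reentrant=False)
y.sum().backward()  # gradients flow as usual
```

Expect roughly 20-30% extra compute per step in exchange for a substantial activation-memory reduction, though the exact numbers depend on the model.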
Okay, thank you for the suggestion. I will give it a try.
@SaulZhang Hi, do you calculate the FID score across the whole dataset, or only a subset? |
I calculate the FID score across the whole test set, and I don't modify the sampling code.
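For context on why duplicate samples matter here: FID is the Fréchet distance between two Gaussians fitted to Inception features of real and generated images, so repeated samples skew the estimated mean and covariance. A minimal sketch of the distance itself (the feature extraction via an Inception network is omitted; `feats` is random stand-in data):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between two Gaussians, the core of FID:
    ||mu1 - mu2||^2 + Tr(s1 + s2 - 2*sqrt(s1 @ s2))."""
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):  # discard tiny imaginary parts
        covmean = covmean.real
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)

# Identical statistics give a distance of (numerically) zero.
feats = np.random.RandomState(0).randn(1000, 8)
mu, sigma = feats.mean(0), np.cov(feats, rowvar=False)
val = frechet_distance(mu, sigma, mu, sigma)
```

Because the statistics are estimated from samples, FID is also biased by sample count, which is another reason to evaluate on the full test set rather than a subset.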
After thorough debugging, I've identified that the main issue lies within this particular line of code. |
Hi, thank you for your wonderful work. I have some questions.
1. Due to the limited memory of each GPU (8*A100 40G), I can only resize images to 256x256 (not 512x512) so that one GPU can fit one story when training AR-LDM. What impact will this change have on the FID score?
2. Besides, the sampling process is very slow. Can we perform sampling on multiple GPUs?
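For the multi-GPU sampling question, one common pattern (not from the repo, just a sketch) is to shard the test-set indices across ranks so each story is sampled exactly once, then compute FID over the union of outputs:

```python
def shard_indices(num_samples, world_size, rank):
    """Assign each test-set index to exactly one rank, so no story
    is sampled twice (duplicates would bias the FID statistics)."""
    return list(range(rank, num_samples, world_size))

# e.g. 10 stories over 2 GPUs: disjoint shards covering everything
shards = [shard_indices(10, 2, r) for r in range(2)]
```

Each process would then run the unmodified sampling loop over its own shard (e.g. selected via a `rank` argument or `LOCAL_RANK` environment variable) and write images to disk, with FID computed once over the combined output directory.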