Lowering GPU requirement: hyper-parameter settings #16
Comments
Hi, my suggestion is to start with gamma=10 and try different itc and itd values. The reasons are: 1) I only roughly tuned the hyper-parameters; the provided values are not optimal, so if you are looking for better performance, you may want to tune them yourself; 2) although each GPU has 16 samples, the total mini-batch size differs when you use a different number of GPUs, which I think will influence the hyper-parameter settings. That said, I think gamma=10 will lead to promising results.
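The suggested search over gamma, itc, and itd can be scripted as a simple grid. The flag names and value ranges below are assumptions based on this thread, not confirmed against the repo; check the training script for the exact spelling before running:

```python
import itertools

# Hypothetical grid over the hyper-parameters mentioned above.
# --gamma/--itc/--itd spellings and the value ranges are assumptions.
gammas = [5, 10, 20]
itcs = [5, 10]
itds = [5, 10]

commands = [
    f"python train.py --gamma={g} --itc={c} --itd={d}"
    for g, c, d in itertools.product(gammas, itcs, itds)
]
for cmd in commands:
    print(cmd)
```

Each command would be launched as a separate training run; comparing FID across runs then picks the best triple for your GPU budget.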
Thanks for your advice, it's a great help to me. I still have some questions about hyper-parameter tuning and batch size.
Yes, I think you can tune the hyper-parameters by searching in this range. I think a larger batch size will lead to a performance improvement, because it provides better discriminative information for training. But in that case you may also need to tune some hyper-parameters of the contrastive loss (--temp=0.5 and --lam=0. were tuned for a batch size of 16 per GPU).
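To illustrate why --temp interacts with batch size: in an InfoNCE-style contrastive loss, each sample is contrasted against the other B-1 samples in the mini-batch, so the logit distribution (and hence the best temperature) shifts with B. A generic NumPy sketch of such a loss, not the repo's actual implementation:

```python
import numpy as np

def softmax_xent_diag(logits):
    """Cross-entropy where each row's matching pair sits on the diagonal."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def contrastive_loss(img_feat, txt_feat, temp=0.5):
    """Symmetric InfoNCE-style loss over a batch of (image, text) pairs.
    With batch size B, each row has 1 positive and B-1 negatives, so the
    loss scale changes with B -- one reason `temp` may need retuning."""
    img = img_feat / np.linalg.norm(img_feat, axis=1, keepdims=True)
    txt = txt_feat / np.linalg.norm(txt_feat, axis=1, keepdims=True)
    logits = img @ txt.T / temp  # (B, B) similarity matrix
    return 0.5 * (softmax_xent_diag(logits) + softmax_xent_diag(logits.T))
```

Lowering the temperature sharpens the logits, which changes the gradient scale of the loss; that is why values tuned at batch size 16 per GPU may not transfer directly to a different batch size.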
Thank you for answering my questions. I will close this issue and run some of the experiments mentioned above.
@StolasIn Have you reproduced the results with one GPU now? I have reproduced the results of the paper under the four-GPU setting (batch=32, batch_gpu=8 gives better results than the paper). But when I try to run with one GPU, I only get poor results.
The performance is related to many things: batch size, learning rate, regularizer, and so on. For example, even for StyleGAN2 without a contrastive loss (i.e., plain image generation rather than text-to-image generation), the number of GPUs still matters a lot.
Assuming each GPU has N samples under the 4-GPU setting, the total batch size is 4N. Are you using a batch size of 4N when using one GPU?
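The arithmetic above can be written down directly; the helper below is illustrative, not code from the repo:

```python
def per_gpu_batch(total_batch, num_gpus):
    """Samples each GPU must process to keep the *total* mini-batch fixed
    (standard data-parallel rule: total = per_gpu * num_gpus)."""
    assert total_batch % num_gpus == 0, "total batch must divide evenly across GPUs"
    return total_batch // num_gpus

# Matching the 4-GPU run above (4 GPUs x 8 samples -> total 32) on one GPU:
print(per_gpu_batch(32, 4))  # -> 8
print(per_gpu_batch(32, 1))  # -> 32
```

Note that even with the total batch size matched, some per-GPU statistics still differ: StyleGAN2's minibatch-stddev layer, for instance, operates on each GPU's local samples, so a 1-GPU run with batch 32 is not numerically identical to a 4-GPU run with 8 per GPU, which may partly explain the remaining gap.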
@drboog Yes, I did, but the performance with one card is still significantly worse than with four cards. Thank you very much for providing this picture. I originally thought the difference was caused by the different hyper-parameters that cfg=auto selects for different numbers of GPUs.
First of all, thank you very much for this great work; it has really helped me a lot. But I have a few questions I would like to ask.
I'm a beginner in the text-to-image synthesis task, so my questions may be a little naive.
I think gamma is mainly sensitive to the dataset and the batch size per GPU.