How should the batch size be set on a single card? #39
Comments
As described in the README, we use 8 GPUs for training and the total batch size is 64 (8 per GPU × 8 GPUs). We didn't try training our TDN on a single GPU, but you can try this yourself :)
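The arithmetic in the reply above (8 GPUs × 8 samples per GPU = 64 total) can be sketched as a quick sanity check. The variable names here are illustrative, not taken from the TDN code:

```python
# Effective (total) batch size in multi-GPU data-parallel training:
# each GPU processes its own mini-batch, so the gradient step sees
# per_gpu_batch * num_gpus samples. Numbers from the README quote above.
per_gpu_batch = 8
num_gpus = 8

total_batch = per_gpu_batch * num_gpus
print(total_batch)  # 64
```

So a single-card run that wants to match the paper's setting needs a batch of 64 on that one card (or gradient accumulation to emulate it).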
Thank you very much for your reply. I'll check the code again, set the parameters, and try again.
Hello @1145335145. Can you tell me on which dataset your training results were lower?
Hello. My dataset is sthv1. Training on a single 3090 with BS = 16 and LR = 0.02 gave ACC = 46.3%. The author said they use BS = 8 per GPU across 8 GPUs, so the batch size on a single card should be 64. I ran another training with BS = 32 and LR = 0.005 (32 is the largest batch size that fits in video memory, so I linearly scaled the BS and LR down together), which gave ACC = 50.76%. Compared with the 52.3% reported by the author, that is still about 1.5% lower.
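The linear scaling mentioned above (halve the batch size, halve the learning rate) can be written as a small helper. This is a sketch of the general rule, not code from the TDN repo; the function name and the 0.01 baseline LR are assumptions for illustration, chosen so that scaling to BS = 32 reproduces the 0.005 used in this thread:

```python
def scale_lr(base_lr, base_batch, new_batch):
    """Linear scaling rule: adjust the learning rate in proportion
    to the change in total batch size."""
    return base_lr * new_batch / base_batch

# Hypothetical baseline (base_lr=0.01 at batch 64), scaled to batch 32:
print(scale_lr(0.01, 64, 32))  # 0.005
```

Note that this rule is a heuristic: when the batch size drops far below the original setting, gradient noise and batch-norm statistics change, which is one common explanation for a residual accuracy gap even after scaling the LR.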
@1145335145
Hello, I'm sorry I didn't reply to your message in time. Per-GPU batch size × number of GPUs = the batch size for your single GPU. Congratulations on solving the problem!
On 2022-03-31 16:37:21, "ZChengLong578" ***@***.***> wrote:
@1145335145
Problem solved!
My test result is 6% lower than yours. How is your batch size set on 8 cards? How do you think the batch size should be set on a single card? Thanks for your reply!