How should the batch size be set on a single card? #39
Comments
As described in the README, we use 8 GPUs for training and the total batch size is 64 (8 per GPU × 8 GPUs). We didn't try training our TDN on a single GPU, but you can try this yourself :)
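The arithmetic in the reply above (8 GPUs × 8 samples per GPU = 64 total) can be sketched as a quick sanity check. The variable names here are illustrative, not taken from the TDN code:

```python
# Effective (total) batch size in multi-GPU data-parallel training:
# each GPU processes its own mini-batch, so the gradient step sees
# per_gpu_batch * num_gpus samples. Numbers from the README quote above.
per_gpu_batch = 8
num_gpus = 8

total_batch = per_gpu_batch * num_gpus
print(total_batch)  # 64
```

So a single-card run that wants to match the paper's setting needs a batch of 64 on that one card (or gradient accumulation to emulate it).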
Thank you very much for your reply. I'll check the code again, set the parameters, and try again.
Hello @1145335145. Can you tell me on which dataset your training results were lower?
Hello. My dataset is sthv1. Training on a single 3090 with BS = 16 and LR = 0.02 gave ACC = 46.3%. The author said they use BS = 8 per GPU across 8 GPUs, so the batch size on a single card should be 64. I ran another training with BS = 32 and LR = 0.005 (32 is the largest batch size that fits in video memory, so I linearly scaled the BS and LR down together), which gave ACC = 50.76%. Compared with the 52.3% reported by the author, that is still about 1.5% lower.
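The linear scaling mentioned above (halve the batch size, halve the learning rate) can be written as a small helper. This is a sketch of the general rule, not code from the TDN repo; the function name and the 0.01 baseline LR are assumptions for illustration, chosen so that scaling to BS = 32 reproduces the 0.005 used in this thread:

```python
def scale_lr(base_lr, base_batch, new_batch):
    """Linear scaling rule: adjust the learning rate in proportion
    to the change in total batch size."""
    return base_lr * new_batch / base_batch

# Hypothetical baseline (base_lr=0.01 at batch 64), scaled to batch 32:
print(scale_lr(0.01, 64, 32))  # 0.005
```

Note that this rule is a heuristic: when the batch size drops far below the original setting, gradient noise and batch-norm statistics change, which is one common explanation for a residual accuracy gap even after scaling the LR.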
@1145335145
Hello, I'm sorry I didn't reply to your message in time. Per-GPU batch size × number of GPUs = the batch size for your single GPU. Congratulations on solving the problem!
On 2022-03-31 16:37:21, "ZChengLong578" ***@***.***> wrote:
@1145335145
Problem solved!
My test result is 6% lower than yours. How is your batch size set on 8 cards? How do you think the batch size should be set on a single card? Thanks for your reply!