Training with multiple GPUs #6
Hi, I also want to retrain the model, but it needs a CSV file with annotations. Do you know about this file, or how to design it? Thank you very much.
@yinyin-llll hi bro, have you successfully retrained the model?
No
Hi all, thanks for your interest in this work and code. This codebase was written quite a few years ago, and many things have changed since then. I will try my best to recall and answer these questions, but apologies if anything is unclear.
For multi-GPU training, I think you are right: you can assign a different batch of data to each GPU, which lets you increase your total training batch size. That is to say, this is data-parallel training. The PyTorch version used here may be quite old now, so you may want to check how data-parallel training is done in recent PyTorch versions. Hope this helps.
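For reference, here is a minimal sketch of data-parallel training in recent PyTorch using DistributedDataParallel, which is what newer versions recommend over nn.DataParallel. The model, dataset, and hyperparameters below are placeholders, not this repo's actual code:

```python
# Minimal DDP sketch; launch with: torchrun --nproc_per_node=NUM_GPUS train_ddp.py
# Model/dataset here are stand-ins for the real reconstruction network and data.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def main():
    # torchrun sets LOCAL_RANK (and the rendezvous env vars) for each process
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model; wrap it so gradients are synchronized across GPUs
    model = torch.nn.Linear(128, 128).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # DistributedSampler shards the dataset so each GPU sees a different batch
    dataset = TensorDataset(torch.randn(1024, 128), torch.randn(1024, 128))
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = torch.nn.MSELoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle differently each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()  # gradients are all-reduced across processes here
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

With this setup, the effective batch size is the per-GPU batch size times the number of GPUs, which matches the idea above of putting a different batch on each GPU.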
Hello Mrs. Liyues,
I would like to use more than one GPU; how can I achieve this? In the trainer, at line 25: `self.model = nn.DataParallel(self.model).cuda()`, I should add GPU ids inside `DataParallel` besides `self.model`, right? Should I also make the batch size larger than the number of GPUs used? Also, did you try a batch size greater than 1? If so, what was the result?
Thanks in advance.
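For reference, the change being asked about would look roughly like the sketch below. The `device_ids` argument is a real parameter of `nn.DataParallel`; the model itself is a placeholder, not the repo's network:

```python
# Sketch of the DataParallel variant discussed above (trainer line 25).
# device_ids is optional; by default DataParallel uses all visible GPUs.
import torch
import torch.nn as nn

model = nn.Linear(128, 128)  # placeholder for the reconstruction network
model = nn.DataParallel(model, device_ids=[0, 1]).cuda()

# The input batch is split along dim 0 across the listed GPUs, so the total
# batch size should be at least (ideally a multiple of) the GPU count.
x = torch.randn(8, 128).cuda()  # batch of 8 -> 4 samples per GPU
out = model(x)
```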