
training with multiple GPU #6

Open
gokceay opened this issue May 8, 2021 · 5 comments
gokceay commented May 8, 2021

Hello Mrs. Liyues,

I would like to train with more than one GPU; how can I achieve this? In the trainer net, at line 25: self.model = nn.DataParallel(self.model).cuda(), I should pass the GPU ids to DataParallel besides self.model, right? Should I also increase the batch size so it is larger than the number of GPUs used? Also, did you try a batch size greater than 1? If so, what was the result?
Thanks in advance.

@gokceay gokceay changed the title training with a new dataset training with more gpu May 22, 2021
@gokceay gokceay changed the title training with more gpu training with multiple GPU May 22, 2021
@yinyin-llll

Hi, I also want to retrain the model, but it needs the CSV file with annotations. Do you know what this file is, or how to create it? Thank you very much.

@quocbao2772004

@yinyin-llll Hi, have you successfully retrained the model?


yinyin-llll commented Sep 3, 2024 via email

Owner

liyues commented Sep 3, 2024

Hi all, thanks for your interest in this work and code. This codebase is from quite a few years ago, and many changes have happened since; I will try to recall things and answer these questions as best I can, but apologies if anything is unclear.
For the CSV file, that is just a file storing the paths to the data files where your data lives. You will want to change this file so it points to your own data paths, so the images can be loaded. Hope this helps, and let me know if there are any questions.
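
For illustration, a minimal version of such a CSV might look like the following. The single image_path column here is an assumption for this sketch, not necessarily the repo's actual schema, so adjust it to whatever the data loader expects:

```
image_path
data/train/img_0001.png
data/train/img_0002.png
data/val/img_0103.png
```

And a minimal Python sketch that reads it (the file name data_list.csv is also just an example):

```python
import csv

# Collect the image paths listed in the CSV so a dataset class can load them.
# "image_path" is the hypothetical column name from the example above.
with open("data_list.csv", newline="") as f:
    paths = [row["image_path"] for row in csv.DictReader(f)]

print(f"{len(paths)} images listed")
```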

Owner

liyues commented Sep 3, 2024

For the multi-GPU training, I think you are right: you can assign a different batch of data to each GPU, which lets you increase your total training batch size. That is to say, this is data-parallel training. The PyTorch version used here may be quite old by now, so you may want to check how data-parallel training works in the newer PyTorch versions. Hope this is helpful.
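
As a rough sketch of that idea (illustrative only, not the repo's trainer code: the stand-in model, dummy dataset, device ids, and per-GPU batch size are all assumptions; note that recent PyTorch documentation recommends DistributedDataParallel over DataParallel for multi-GPU training):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Minimal data-parallel sketch. nn.DataParallel splits each input batch
# across the listed GPUs, so the batch size should be at least the number
# of GPUs (ideally a multiple of it).
device_ids = [0, 1]                           # illustrative: use GPUs 0 and 1
model = nn.Linear(32, 1)                      # stand-in for self.model
model = nn.DataParallel(model, device_ids=device_ids).cuda()

dataset = TensorDataset(torch.randn(64, 32))  # dummy data for the sketch
loader = DataLoader(dataset, batch_size=8 * len(device_ids), shuffle=True)

for (x,) in loader:
    out = model(x.cuda())                     # DataParallel scatters x along
                                              # dim 0 and gathers the outputs
```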
