Training on Multiple GPUs #40

Closed
baranataman opened this issue Feb 18, 2020 · 5 comments

@baranataman

baranataman commented Feb 18, 2020

Hello, thanks for the model and the explanations, this is very helpful for my research. I obtained very good results from the (fine-tuned) pre-trained model, and I want to improve the quality by training the model on the whole VoxCeleb2 dataset. I have the full dataset folder prepared, but I couldn't run the training on multiple GPUs. I have 2 GPUs and I want to use both of them, since a single GPU cannot hold that much data and fails with RuntimeError: CUDA out of memory.
How should I modify the code to take advantage of both GPUs on my system?
Any help would be appreciated.
Thank you

@baranataman baranataman changed the title Trainning on Multiple GPU Training on Multiple GPU Feb 18, 2020
@baranataman baranataman changed the title Training on Multiple GPU Training on Multiple GPUs Feb 18, 2020
@vincent-thevenin
Owner

Hi @baranataman, to use multiple GPUs, you can wrap the networks in PyTorch DataParallel modules. I have code in my local repo that does this, but it's on a secondary branch and not yet online on GitHub. I will let you know when a commit is ready for that.
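
For readers who want to try this before the branch is published, here is a minimal sketch of the DataParallel wrapping described above. It is not the code from the repository: TinyNet is a hypothetical stand-in for the project's networks, and the layer sizes are arbitrary. The sketch falls back to a single device when fewer than two GPUs are visible.

```python
import torch
import torch.nn as nn


# Hypothetical stand-in network; the repository's real networks are not shown here.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 4)

    def forward(self, x):
        return self.fc(x)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TinyNet().to(device)

# Wrap the network only when more than one GPU is visible.
# DataParallel replicates the module on each GPU and splits the
# input batch along dimension 0 across the replicas.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

batch = torch.randn(8, 16, device=device)
out = model(batch)  # outputs are gathered back on the first device
print(out.shape)
```

Two things to keep in mind with this approach: the parameters are copied to every GPU, so only the activation memory is divided across devices, and a wrapped model stores its parameters under a module. prefix in state_dict(), so checkpoints saved from model.module load more easily into an unwrapped network.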

@baranataman
Author

Thanks, I am waiting to hear from you

@vincent-thevenin
Owner

vincent-thevenin commented Apr 6, 2020

@baranataman You can now train on multiple GPUs from the save_disc branch.

@baranataman
Author

Thanks for the answer, but one last question: what are init_Wi.py and the path_to_Wi parameter? I am running train.py, but should I modify anything in initWi.py?
Thanks in advance

@vincent-thevenin
Owner

No need to run initWi.py. I wrote it because some time ago I lost my Wi weights during training; it recreates the Wi weights from the embedder for every video. I should comment it better.

No need to change path_to_Wi unless you want to put the weights on a separate drive. This is where your Wi weights are saved for each video. I load and save them separately because there are too many of them to be loaded onto the GPU all at once.
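
To illustrate the idea of keeping one Wi tensor per video on disk and loading only the ones needed, here is a rough sketch. It is an assumption about how such storage could work, not the repository's implementation: the helper names (wi_file, load_wi, save_wi), the W_<index>.pt file naming, and the 512-dimensional shape are all hypothetical.

```python
import os
import tempfile

import torch


def wi_file(path_to_Wi, video_idx):
    # Hypothetical naming scheme: one file per video index.
    return os.path.join(path_to_Wi, f"W_{video_idx}.pt")


def load_wi(path_to_Wi, video_idx, dim=512, device="cpu"):
    """Load the Wi tensor for one video, or create a random one if none exists."""
    path = wi_file(path_to_Wi, video_idx)
    if os.path.exists(path):
        return torch.load(path, map_location=device)
    # The shape (dim, 1) is an assumption made for this sketch.
    return torch.randn(dim, 1, device=device)


def save_wi(path_to_Wi, video_idx, w):
    """Write the Wi tensor for one video back to disk, stored on the CPU."""
    torch.save(w.detach().cpu(), wi_file(path_to_Wi, video_idx))


# Usage: only the tensors for the videos in the current batch are in memory.
with tempfile.TemporaryDirectory() as path_to_Wi:
    w = load_wi(path_to_Wi, video_idx=3)       # no file yet, so freshly initialised
    save_wi(path_to_Wi, video_idx=3, w=w)      # persist after a training step
    loaded = load_wi(path_to_Wi, video_idx=3)  # read back from disk
    print(torch.equal(w, loaded))
```

With a layout like this, pointing path_to_Wi at another drive only changes where the per-video files live; the training loop itself does not need to change.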
