
multiple gpu when running 'sjSDM' #63

Closed

chnpenny opened this issue Feb 4, 2021 · 1 comment

Comments

@chnpenny (Collaborator) commented Feb 4, 2021

Hi,
We noticed that the 'n_gpu' argument is available only in 'sjSDM_cv', not in 'sjSDM'. Is it possible to run the 'sjSDM' function on multiple GPUs at all? If so, has this been implemented in 'sjSDM' yet?
Thanks a lot!

@MaximilianPi (Member) commented
Hi,
no, it is not supported, and I'm not sure it would be worth the time. Let me explain; there are two different scenarios here:
a) sjSDM_cv is used to train up to hundreds of models. Distributing this workload across several GPUs is very favorable here, particularly for small models / datasets, because several small models can run at the same time on the same GPU. An example: 3 GPUs, 21 tuning steps, and 10-fold CV mean we have to fit 21 × 10 = 210 models. If they are small, sjSDM_cv can run 21 parallel workers to train 21 models simultaneously, i.e. 7 models on each GPU (a sketch follows below).
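For illustration, a minimal R sketch of scenario a). The tuning setup mirrors the numbers above; `simulate_SDM()` only provides toy data, and the `n_gpu` / `n_cores` arguments follow the `sjSDM_cv()` interface discussed in this thread (check `?sjSDM_cv` for the exact signature of your installed version):

```r
library(sjSDM)

# Toy community data: 3 environmental predictors, 10 species, 100 sites
com <- simulate_SDM(env = 3L, species = 10L, sites = 100L)

tune <- sjSDM_cv(
  Y          = com$response,    # species occurrence matrix
  env        = com$env_weights, # environmental covariates
  tune_steps = 21L,             # 21 hyperparameter candidates
  CV         = 10L,             # 10-fold CV -> 21 * 10 = 210 model fits
  n_gpu      = 3L,              # distribute the fits over 3 GPUs
  n_cores    = 21L              # 21 parallel workers, i.e. 7 per GPU
)
```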
b) sjSDM fits only one model. In principle, a single model can be trained on more than one GPU (see https://www.tensorflow.org/tutorials/distribute/keras or https://pytorch.org/tutorials/beginner/dist_overview.html). There, the training of the model itself is distributed, but this adds communication overhead, and it is usually only worthwhile for very large data and models. As long as you can fit one model within minutes, I don't think distributed learning is necessary (a sketch follows below).
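And a minimal sketch of scenario b): a single `sjSDM()` fit runs on one device, selected via the `device` argument (this assumes the `device` and `iter` arguments of the current R interface; multi-GPU data parallelism as in the linked tutorials is not exposed here):

```r
library(sjSDM)

# Same toy data as above
com <- simulate_SDM(env = 3L, species = 10L, sites = 100L)

model <- sjSDM(
  Y      = com$response,
  env    = com$env_weights,
  iter   = 100L,      # default number of optimization steps
  device = "gpu"      # run this single fit on one GPU ("cpu" is the default)
)
```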

Cheers,
Max
