
Multi-gpu support for pytorch #153

Merged

Conversation

SteffenCzolbe
Contributor

The PR adds multi-GPU support to the PyTorch train script. This functionality is already present in the TensorFlow implementation, but was absent from the PyTorch one.

Command-line arguments are unchanged and in line with the TensorFlow implementation.

Newly supported features:

  • Train on a single GPU that isn't the first one. Example: the command-line argument "--gpu 3" trains only on CUDA device 3.
  • Train on any number of GPUs. Example: the command-line argument "--gpu 0,1" trains on GPUs 0 and 1. Parallelism is achieved by splitting the batch along the first dimension. If the batch size is less than the number of GPUs, an error is raised (behavior in line with the TensorFlow backend).
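The device selection and batch-size check described above can be sketched as follows. This is an illustrative sketch, not the PR's actual code; the helper names `parse_gpu_arg` and `check_batch_size` are hypothetical, and the commented `DataParallel` lines show the standard PyTorch pattern for splitting a batch along dimension 0 across devices.

```python
def parse_gpu_arg(gpu: str) -> list:
    """Turn a command-line value like '3' or '0,1' into a list of
    CUDA device indices (hypothetical helper, for illustration)."""
    return [int(g) for g in gpu.split(',')]


def check_batch_size(batch_size: int, device_ids: list) -> None:
    """DataParallel splits the batch along dim 0, so each GPU must
    receive at least one sample (mirroring the TensorFlow backend)."""
    if batch_size < len(device_ids):
        raise ValueError(
            'batch size (%d) must be at least the number of GPUs (%d)'
            % (batch_size, len(device_ids)))


# In the training script, the parsed ids would then be used roughly as:
#   device_ids = parse_gpu_arg(args.gpu)
#   check_batch_size(args.batch_size, device_ids)
#   device = torch.device('cuda:%d' % device_ids[0])
#   if len(device_ids) > 1:
#       model = torch.nn.DataParallel(model, device_ids=device_ids)
#   model.to(device)
```

With this pattern, "--gpu 3" yields a single-element list and no `DataParallel` wrapper, while "--gpu 0,1" wraps the model so each forward pass scatters the batch across both devices.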

Verification:

The implementation was verified by benchmarking training speed on synthetic data. An almost linear speedup was achieved when scaling from 1 to 4 GPUs.

@adalca
Collaborator

adalca commented Feb 24, 2020

Thank you @SteffenCzolbe , we'll take a look at this and do the pull in a bit.

@ahoopes ahoopes merged commit f61ec34 into voxelmorph:redesign May 5, 2020
