Add flag to disable MPI threads - HOROVOD_MPI_THREADS_DISABLE #324
Conversation
horovod/common/operations.cc
Outdated
// be used together with Horovod if multi-threaded MPI is installed.
auto mpi_threads_disable = std::getenv("HOROVOD_MPI_THREADS_DISABLE");
int required;
This seems cleaner.
int required = MPI_THREAD_MULTIPLE;
if (mpi_threads_disable != nullptr && std::atoi(mpi_threads_disable) > 0) {
required = MPI_THREAD_FUNNELED;
}
Seconding Jie's comment.
Approved, with minor edits
docs/running.md
Outdated
@@ -12,7 +12,9 @@ allows you to have a mixture of different NUMA configurations because the defaul
`-mca pml ob1` and `-mca btl ^openib` flags force the use of TCP for MPI communication. This avoids many multiprocessing
issues that Open MPI has with RDMA which typically result in segmentation faults. Using TCP for MPI does not have
noticeable performance impact since most of the heavy communication is done by NCCL, which will use RDMA via RoCE or
InfiniBand if they're available (see [Horovod on GPU](gpus.md)).
InfiniBand if they're available (see [Horovod on GPU](gpus.md)). Notable exception from this rule are models that heavily
exception --> exceptions
docs/running.md
Outdated
InfiniBand if they're available (see [Horovod on GPU](gpus.md)). Notable exception from this rule are models that heavily
use `hvd.broadcast()` and `hvd.allgather()` operations. To make those operations use RDMA read [Open MPI with RDMA](#open-mpi-with-rdma)
"RDMA read [Open MPI...]" --> "RDMA, read the [Open MPI...]"
docs/running.md
Outdated
@@ -39,6 +41,25 @@ $ mpirun -np 16 \
    python train.py
```

### Open MPI with RDMA

As noted above, using TCP for MPI communication does not have any significant affects on performance in the majority of cases.
affects --> effects
docs/running.md
Outdated
Models that make heavy use of `hvd.broadcast()` and `hvd.allgather()` operations are exceptions to that rule.

Default Open MPI `openib` BTL that provides RDMA functionality does not work well with MPI multi-threading. In order to use
RDMA with `openib`, multi-threading must be disabled via `-x HOROVOD_MPI_THREADS_DISABLE=1` option. See example below:
See example --> See the example
docs/running.md
Outdated
    python train.py
```

Other MPI RDMA implementations may or may not benefit from disabling multi-threading, please consult vendor documentation.
multi-threading, please --> multithreading, so please