Fix document around DDP uneven inputs (#57448)
Summary:
Typo fix and additional clarifications on the API.

Pull Request resolved: #57448

Reviewed By: SciPioneer

Differential Revision: D28153264

Pulled By: rohan-varma

fbshipit-source-id: 9bd35d918299ad7e080785d755f97b966f826615
rohan-varma authored and facebook-github-bot committed May 10, 2021
1 parent 4d181ba commit d115e81
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions torch/nn/parallel/distributed.py
@@ -958,11 +958,14 @@ def join(
             modifications to the model or data loading is required.
 
         .. warning::
-            If the model or training loop this context manager is wrapepd around
+            If the model or training loop this context manager is wrapped around
             has additional distributed collective operations, such as
             ``SyncBatchNorm`` in the model's forward pass, then the flag
             ``throw_on_early_termination`` must be enabled. This is because this
             context manager is not aware of non-DDP collective communication.
+            This flag will cause all ranks to throw when any one rank
+            exhausts inputs, allowing these errors to be caught and recovered
+            from across all ranks.
 
         Args:
             divide_by_initial_world_size (bool): If ``True``, will divide
@@ -993,7 +996,8 @@ def join(
                 of data. If ``False``, will continue training with a smaller
                 effective world size until all ranks are joined. Note that if
                 this flag is specified, then the flag
-                ``divide_by_initial_world_size`` would be ignored.
+                ``divide_by_initial_world_size`` would be ignored. Default
+                is ``False``.
 
         Example::
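
For context on the behavior this commit documents, here is a minimal sketch (not part of the commit) of how the ``throw_on_early_termination`` flag might be used. The function, loader, optimizer, and loss names are illustrative, and the exception type (``RuntimeError``) is an assumption based on the docstring's statement that all ranks "throw" when one rank exhausts its inputs.

from torch.nn.parallel import DistributedDataParallel as DDP

def train_epoch(ddp_model: DDP, loader, optimizer, loss_fn):
    # Sketch of a per-rank training loop over possibly uneven inputs.
    try:
        # join() keeps DDP collectives matched while ranks have differing
        # numbers of batches; throw_on_early_termination=True makes input
        # exhaustion an explicit error on every rank.
        with ddp_model.join(throw_on_early_termination=True):
            for inputs, targets in loader:
                optimizer.zero_grad()
                loss = loss_fn(ddp_model(inputs), targets)
                loss.backward()
                optimizer.step()
    except RuntimeError:
        # Per the docstring above, all ranks throw once any one rank runs
        # out of inputs, so every rank can recover (e.g. end the epoch) here.
        pass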
