Closed
Labels: docs (Documentation related), needs triage (Waiting to be triaged by maintainers)
Description
📚 Documentation
The documentation on DDP currently says:
Using DDP this way has a few disadvantages over torch.multiprocessing.spawn():
- All processes (including the main process) participate in training and have the updated state of the model and Trainer state.
- No multiprocessing pickle errors
- Easily scales to multi-node training
Are these meant to be advantages instead of disadvantages?
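For context on the "No multiprocessing pickle errors" item: spawn-style launching has to pickle the target function and its arguments to hand them to child processes, so unpicklable objects fail there. Below is a minimal sketch of that failure mode using only the standard library. The helper `is_picklable` and the names `importable_fn` and `anonymous_fn` are hypothetical, made up for illustration; they are not part of PyTorch or Lightning.

```python
import math
import pickle


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A function importable by name is pickled by reference,
# so it can be sent to a spawned worker process.
importable_fn = math.sqrt

# A lambda has no importable name, so pickling it fails.
# This is the kind of object that breaks spawn-based launching.
anonymous_fn = lambda x: x * 2

if __name__ == "__main__":
    print("math.sqrt picklable:", is_picklable(importable_fn))
    print("lambda picklable:", is_picklable(anonymous_fn))
```

Avoiding this class of error is a benefit, which supports reading the quoted list as advantages.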