Datamodule and DDP #17857
Unanswered
astrocyted asked this question in DDP / multi-GPU / multi-node
I'm trying to wrap my head around why the DataModule's setup() hook is supposed to be called independently on every process/GPU:
https://github.com/Lightning-AI/lightning/blob/cb17fe90ec3b04b680e1b425431e3201529501c5/src/lightning/pytorch/core/hooks.py#LL357C1-L360C24
In the docs' examples, dataset splitting/partitioning is very commonly done in setup(), e.g.:
https://github.com/Lightning-AI/lightning/blob/cb17fe90ec3b04b680e1b425431e3201529501c5/src/lightning/pytorch/demos/mnist_datamodule.py#L199
I just don't understand why anyone would want dataset splitting to be carried out independently on different nodes. Even if they all share the same random seed, it's still redundant. Why not just do it once on rank zero and then copy the datamodule, similar to prepare_data()? Could you explain why any operation inside setup() needs to be done independently on all processes?
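To make the question concrete, here is a minimal sketch of the pattern I mean (the class name, data directory, and split sizes are placeholders, loosely based on the linked MNIST demo, not the actual Lightning code): prepare_data() runs on a single process, while setup() is repeated on every DDP rank, and the splits only stay consistent across ranks because the generator seed is fixed.

```python
import torch
from torch.utils.data import random_split
from torchvision import transforms
from torchvision.datasets import MNIST

import lightning.pytorch as pl


class MyDataModule(pl.LightningDataModule):
    def __init__(self, data_dir: str = "./data"):
        super().__init__()
        self.data_dir = data_dir

    def prepare_data(self):
        # Runs on a single process (rank 0, or local rank 0 per node,
        # depending on prepare_data_per_node), so downloads aren't duplicated.
        MNIST(self.data_dir, train=True, download=True)

    def setup(self, stage=None):
        # Runs independently on every DDP process: each rank repeats this
        # split, and the results only match because the seed is fixed.
        full = MNIST(self.data_dir, train=True, transform=transforms.ToTensor())
        self.train_set, self.val_set = random_split(
            full, [55000, 5000], generator=torch.Generator().manual_seed(42)
        )
```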