DataParallel copies the model onto GPUs sequentially #51385
Labels
enhancement
Not as big of a feature, but technically not a bug. Should be easy to fix
module: data parallel
module: performance
Issues related to performance, either of kernel code or framework glue
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
I have 8 GPUs and can watch the model being copied to them one device at a time with
watch -n 0.1 nvidia-smi
. DataParallel could save some time by doing these copies asynchronously, in parallel. The same seems to apply to scattering the input batch across devices, but I'm less sure about that.
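To illustrate the idea, here is a minimal sketch of replicating a model's parameters to several devices concurrently instead of one device at a time. The helper name `replicate_async` is hypothetical (this is not DataParallel's actual implementation), and real host-to-device overlap additionally requires the source tensors to be in pinned memory:

```python
import contextlib
import torch

def replicate_async(model, devices):
    """Copy the model's parameters to each device concurrently.

    Hypothetical sketch: one CUDA stream per destination GPU so the
    transfers can overlap, instead of a sequential loop over devices.
    """
    # Only use CUDA streams when every destination is actually a GPU.
    use_streams = torch.cuda.is_available() and all(
        str(d).startswith("cuda") for d in devices
    )
    streams = (
        [torch.cuda.Stream(device=d) for d in devices]
        if use_streams
        else [None] * len(devices)
    )
    replicas = []
    for d, s in zip(devices, streams):
        ctx = torch.cuda.stream(s) if use_streams else contextlib.nullcontext()
        with ctx:
            # non_blocking=True only overlaps when the source is pinned host memory
            replicas.append(
                {n: p.detach().to(d, non_blocking=True)
                 for n, p in model.named_parameters()}
            )
    if use_streams:
        for s in streams:
            s.synchronize()  # wait for all async copies to finish
    return replicas

# CPU-only demo of the same code path (no GPUs required):
m = torch.nn.Linear(4, 2)
reps = replicate_async(m, ["cpu", "cpu"])
print(len(reps))  # 2
```

On a multi-GPU box the same loop would be called with `["cuda:0", "cuda:1", ...]`, and the per-device streams are what let the copies proceed in parallel.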
P.S. I know that DP is not recommended in favor of DDP, but DP remains important for legacy code and for simplicity: it allows easier recovery from OOMs and exceptions, and simpler logging.
cc @VitalyFedyunin @ngimel