I would like to pass custom DDP communication hooks as arguments to deepspeed.initialize() so that they can be "registered" internally (in quotes because DeepSpeed doesn't use DDP directly) and used alongside the ZeRO stages. The hook I currently have in mind is the PowerSGD communication hook that is available in PyTorch 1.8.
Can this be supported with DeepSpeed?
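
For context, here is a minimal sketch of how the PowerSGD hook is registered with plain PyTorch DDP, which is the behavior I would like to reproduce through DeepSpeed. It assumes a single-process gloo group on CPU so that it runs anywhere; the function name `run_powersgd_step`, the toy `nn.Linear` model, and the file-based rendezvous are only for illustration, and nothing here is an existing DeepSpeed API.

```python
import os
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.algorithms.ddp_comm_hooks import powerSGD_hook as powerSGD
from torch.nn.parallel import DistributedDataParallel as DDP


def run_powersgd_step():
    """Run one backward pass through a DDP model with the PowerSGD hook attached.

    Illustrative only: a single-process gloo group stands in for a real
    multi-GPU job.
    """
    # File-based rendezvous avoids depending on a free TCP port.
    store_path = os.path.join(tempfile.mkdtemp(), "store")
    dist.init_process_group(
        "gloo", init_method=f"file://{store_path}", rank=0, world_size=1
    )
    try:
        model = DDP(nn.Linear(8, 4))

        # The hook state carries the compression settings.
        # process_group=None means the default (world) group.
        state = powerSGD.PowerSGDState(
            process_group=None,
            matrix_approximation_rank=1,
            start_powerSGD_iter=2,  # plain allreduce before this iteration
        )

        # This registration step is what would need an equivalent in DeepSpeed.
        model.register_comm_hook(state, powerSGD.powerSGD_hook)

        loss = model(torch.randn(2, 8)).sum()
        loss.backward()  # gradients go through the registered hook
        return model.module.weight.grad.clone()
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    print(run_powersgd_step().shape)
```

With DDP, the hook receives each gradient bucket and returns a future holding the reduced tensor. Since DeepSpeed performs its own gradient reduction under ZeRO, the open question is where a callable with this kind of signature could be plugged in.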
@samyam @jeffra @tjruwase