Comprehensively test NCCL's get_future() API
#56838
Labels
better-engineering
Relatively self-contained tasks for better engineering contributors
oncall: distributed
Add this issue/PR to distributed oncall triage queue
pt_distributed_rampup
Ramp up tasks for new developers on PT distributed
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🚀 Feature
In ProcessGroupNCCL, we added a `get_future()` API to support gradient compression use cases, where a user can call `get_future()` to schedule additional callbacks when implementing custom gradient compression algorithms.

However, `get_future()` can be more generally useful. It is currently created for all NCCL collectives as well as the recv P2P op, but it does not appear to be tested anywhere. It would be great to add tests that call `get_future()`, enqueue more CUDA operations on the result, and verify that all synchronization happens appropriately, to ensure this API works as expected.

cc @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @agolynski @SciPioneer @H-Huang @mrzzd @cbalioglu @gcramer23