
Pull requests: pytorch/torchft



ProcessGroupNCCL,Manager: surface async abort errors correctly (label: CLA Signed)
#147 opened Mar 21, 2025 by d4l3k
fork ProcessGroupNCCL (label: CLA Signed)
#134 opened Mar 14, 2025 by d4l3k Draft
abort PG on error (label: CLA Signed)
#133 opened Mar 12, 2025 by d4l3k Draft
[WIP] Fix pipe close warnings (label: CLA Signed)
#129 opened Mar 10, 2025 by H-Huang Draft
Add option to skip init sync (label: CLA Signed)
#127 opened Mar 10, 2025 by dl541 Draft
Disable async quorum for the first quorum sync (label: CLA Signed)
#112 opened Feb 19, 2025 by fegin Draft
Implementing bucketized model averaging for LocalSGD (label: CLA Signed)
#111 opened Feb 18, 2025 by Krishn1412
make torchft work for llama3_8b 8x (label: CLA Signed)
#104 opened Feb 8, 2025 by d4l3k Draft
[WIP][RFC] Required changes for integration with TorchTitan (label: CLA Signed)
#82 opened Jan 27, 2025 by fegin
rust: add open telemetry tracing (label: CLA Signed)
#80 opened Jan 24, 2025 by d4l3k Draft
[WIP] FSDP example (label: CLA Signed)
#77 opened Jan 22, 2025 by mreso Draft
Test manager join (label: CLA Signed)
#62 opened Jan 8, 2025 by Jackmin801 Draft