Pull requests: pytorch/torchft

Pull requests list

ci: lint + unittest CLA Signed This label is managed by the Meta Open Source bot.
#1 by d4l3k was merged Oct 27, 2024
Sorry, didn't mean to merge directly to main CLA Signed
#2 by ZainRizvi was merged Oct 30, 2024
Testing versions CLA Signed
#3 by ZainRizvi was closed Oct 31, 2024
ci: switch to amazon runners CLA Signed
#4 by d4l3k was merged Oct 31, 2024
train_ddp, process_group: fixes so CUDA works e2e CLA Signed
#5 by d4l3k was merged Nov 3, 2024
lighthouse: add heartbeats CLA Signed
#6 by d4l3k was merged Nov 7, 2024
lighthouse: add dashboard CLA Signed
#7 by d4l3k was merged Nov 8, 2024
lighthouse, manager: dashboard kill and heartbeat old ui CLA Signed
#8 by d4l3k was merged Nov 9, 2024
ci: use stable rust and gate on number of gpus CLA Signed
#9 by d4l3k was merged Nov 9, 2024
dashboard: show quorum status, age, old replicas CLA Signed
#10 by d4l3k was merged Nov 9, 2024
train, manager, dashboard: show world size on dashboard, manual replica_id, convergence tweaks CLA Signed
#11 by d4l3k was merged Nov 11, 2024
[checkpointing] support ipv6 CLA Signed
#12 by d4l3k was merged Nov 16, 2024
process_group: added registration to support DeviceMesh and functional_collectives CLA Signed
#13 by d4l3k was merged Nov 17, 2024
process_group: register via public API CLA Signed
#14 by d4l3k was merged Nov 22, 2024
Update README.md to include Rust installation CLA Signed
#15 by fegin was merged Nov 22, 2024
Specify the devices when registering the backend to avoid warnings CLA Signed
#16 by fegin was merged Nov 23, 2024
[WIP] A test case to show how to use DeviceMesh API to create the customized PG CLA Signed
#17 by fegin was closed Dec 24, 2024 Draft
docs: add sphinx documentation and add missing documentation CLA Signed
#18 by d4l3k was merged Nov 27, 2024
manager: expand API to include errors, participant information and numeric test CLA Signed
#19 by d4l3k was merged Nov 28, 2024
docs: add legal info + fix jinja2 security warning CLA Signed
#20 by d4l3k was merged Dec 3, 2024
process_group: wrapper updates and ErrorSwallowingProcessGroup CLA Signed
#21 by d4l3k was merged Dec 4, 2024
lintrunner: added black,isort,rustfmt CLA Signed
#22 by d4l3k was merged Dec 6, 2024
lintrunner: enable pyre CLA Signed
#23 by d4l3k was merged Dec 6, 2024
manager: added FIXED_WITH_SPARES mode CLA Signed
#24 by d4l3k was merged Dec 6, 2024
manager: added E2E tests and support getting lighthouse and manager addresses CLA Signed
#25 by d4l3k was merged Dec 7, 2024