Pull requests: pytorch/torchft
ci: lint + unittest
CLA Signed
This label is managed by the Meta Open Source bot.
#1
by d4l3k
was merged Oct 27, 2024
Sorry, didn't mean to merge directly to main
CLA Signed
#2
by ZainRizvi
was merged Oct 30, 2024
Testing versions
CLA Signed
#3
by ZainRizvi
was closed Oct 31, 2024
ci: switch to amazon runners
CLA Signed
#4
by d4l3k
was merged Oct 31, 2024
train_ddp, process_group: fixes so CUDA works e2e
CLA Signed
#5
by d4l3k
was merged Nov 3, 2024
lighthouse: add heartbeats
CLA Signed
#6
by d4l3k
was merged Nov 7, 2024
lighthouse: add dashboard
CLA Signed
#7
by d4l3k
was merged Nov 8, 2024
lighthouse, manager: dashboard kill and heartbeat old ui
CLA Signed
#8
by d4l3k
was merged Nov 9, 2024
ci: use stable rust and gate on number of gpus
CLA Signed
#9
by d4l3k
was merged Nov 9, 2024
dashboard: show quorum status, age, old replicas
CLA Signed
#10
by d4l3k
was merged Nov 9, 2024
train, manager, dashboard: show world size on dashboard, manual replica_id, convergence tweaks
CLA Signed
#11
by d4l3k
was merged Nov 11, 2024
[checkpointing] support ipv6
CLA Signed
#12
by d4l3k
was merged Nov 16, 2024
process_group: added registration to support DeviceMesh and functional_collectives
CLA Signed
#13
by d4l3k
was merged Nov 17, 2024
process_group: register via public API
CLA Signed
#14
by d4l3k
was merged Nov 22, 2024
Update README.md to include Rust installation
CLA Signed
#15
by fegin
was merged Nov 22, 2024
Specify the devices when registering the backend to avoid warnings
CLA Signed
#16
by fegin
was merged Nov 23, 2024
[WIP] A test case to show how to use DeviceMesh API to create the customized PG
CLA Signed
docs: add sphinx documentation and add missing documentation
CLA Signed
#18
by d4l3k
was merged Nov 27, 2024
manager: expand API to include errors, participant information and numeric test
CLA Signed
#19
by d4l3k
was merged Nov 28, 2024
docs: add legal info + fix jinja2 security warning
CLA Signed
#20
by d4l3k
was merged Dec 3, 2024
process_group: wrapper updates and ErrorSwallowingProcessGroup
CLA Signed
#21
by d4l3k
was merged Dec 4, 2024
lintrunner: added black,isort,rustfmt
CLA Signed
#22
by d4l3k
was merged Dec 6, 2024
lintrunner: enable pyre
CLA Signed
#23
by d4l3k
was merged Dec 6, 2024
manager: added FIXED_WITH_SPARES mode
CLA Signed
#24
by d4l3k
was merged Dec 6, 2024
manager: added E2E tests and support getting lighthouse and manager addresses
CLA Signed
#25
by d4l3k
was merged Dec 7, 2024