[pipelining] Add _PipelineStage runtime #125729

Closed
wants to merge 3 commits into from

pytorch-bot bot commented May 8, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125729

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit db46937 with merge base 946b96f:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot added the oncall: distributed, release notes: distributed (pipeline), and ci-td-distributed labels May 8, 2024
kwen2501 added a commit that referenced this pull request May 8, 2024
ghstack-source-id: 4a462a8623af8e4a20fc36ed53d4acaabffe74ad
Pull Request resolved: #125729
cc mrshenli pritamdamania87 zhaojuanmao satgera gqchen aazzolini osalpekar jiayisuse H-Huang awgu penguinwu fegin XilunWu wanchaol fduwjj wz337 tianyu-l wconstab yf225 chauhang d4l3k

[ghstack-poisoned]
kwen2501 added a commit that referenced this pull request May 8, 2024
ghstack-source-id: cd9fb47480cc752361de28ced0f0d66ca87fec3d
Pull Request resolved: #125729
Contributor

@wconstab wconstab left a comment

Any reason ManualPipelineStage isn't in the same file? It tripped me up a few times that I couldn't figure out what file it was in, and it seems logical to keep all in the same place.

Any particular code to be reviewed carefully? I think most of the changes are reviewed on the branch already so this is mostly code movement but lmk if there are important changes.

@kwen2501
Contributor Author

kwen2501 commented May 9, 2024

The original plan was for the tracer's stage and the manual stage to be in different files (files were mapped more or less 1:1 to classes back then), with the base on the manual side. But as consolidation went on, the base ended up on the tracer side (because its implementation is more general). I guess it wouldn't be a big deal to merge ManualPipelineStage into _PipelineStage.py, and it would indeed be better in terms of file structure. FYI @H-Huang

Re 2nd q: no substantial code change. @wconstab

@kwen2501
Contributor Author

@pytorchbot merge

pytorch-bot added the ciflow/trunk label May 10, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

pytorchmergebot pushed a commit that referenced this pull request May 11, 2024
1. Add pipeline schedules:
- GPipe
- 1F1B
- Interleaved 1F1B
- LoopedBFS

2. Add basic forward and backward tests:
test_schedule.py

Pull Request resolved: #125975
Approved by: https://github.com/wconstab
ghstack dependencies: #125729
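
As a rough illustration of how two of the schedules listed above differ, the sketch below generates the per-stage order of forward (F) and backward (B) microbatch steps for GPipe and 1F1B. The function names and the tuple encoding are hypothetical and are not taken from the PR; this is a plain-Python sketch of the textbook orderings, not the code added in #125975.

```python
# Hypothetical helpers, for illustration only (not part of torch.distributed.pipelining).

def gpipe_order(num_microbatches):
    """GPipe: run every forward microbatch first, then every backward."""
    forwards = [("F", m) for m in range(num_microbatches)]
    backwards = [("B", m) for m in range(num_microbatches)]
    return forwards + backwards


def one_f_one_b_order(stage, num_stages, num_microbatches):
    """1F1B: a few warmup forwards, then alternate one forward and one
    backward, then drain the remaining backwards."""
    # Earlier stages need more warmup forwards before their first backward
    # can arrive; the last stage needs none.
    warmup = min(num_microbatches, num_stages - stage - 1)
    order = [("F", m) for m in range(warmup)]
    fwd, bwd = warmup, 0
    # Steady state: one forward followed by one backward.
    while fwd < num_microbatches:
        order.append(("F", fwd))
        fwd += 1
        order.append(("B", bwd))
        bwd += 1
    # Cooldown: finish the outstanding backwards.
    while bwd < num_microbatches:
        order.append(("B", bwd))
        bwd += 1
    return order


if __name__ == "__main__":
    print(gpipe_order(3))
    for s in range(2):
        print(s, one_f_one_b_order(s, num_stages=2, num_microbatches=3))
```

Under these assumptions, 1F1B lets each stage start backward work before all forwards finish, so fewer microbatch activations are held in memory at once than with GPipe.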
tinglvv pushed a commit to tinglvv/pytorch that referenced this pull request May 14, 2024
pytorchmergebot pushed a commit that referenced this pull request May 14, 2024
Resolves pytorch/PiPPy#1062.

Also added a gradient equivalence test.

Pull Request resolved: #126114
Approved by: https://github.com/H-Huang
ghstack dependencies: #125729, #125975
@github-actions github-actions bot deleted the gh/kwen2501/23/head branch June 11, 2024 01:54
Labels
ci-td-distributed, ciflow/trunk, Merged, oncall: distributed, release notes: distributed (pipeline)

3 participants