@wconstab
Contributor

@wconstab wconstab commented Jan 16, 2025

Stack from ghstack (oldest at bottom):

Minor changes so the test exercises the public pipeline parallelism (PP) API instead of the
internal/private one, which also saves a few lines of microbatch-splitting code.

cc @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wz337 @d4l3k @c-p-i-o

@pytorch-bot pytorch-bot bot added the "oncall: distributed" (Add this issue/PR to distributed oncall triage queue) and "topic: not user facing" (topic category) labels Jan 16, 2025
@pytorch-bot

pytorch-bot bot commented Jan 16, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/145011

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 5 Pending

As of commit ea1fc2d with merge base adbbcd8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@wconstab wconstab mentioned this pull request Jan 16, 2025
@wconstab wconstab requested a review from H-Huang January 17, 2025 00:40

Member

@H-Huang H-Huang left a comment

Looks good


# Run the pipeline
if pp_group.rank() == 0:
    pipeline_schedule._step_microbatches(arg_mbs=input_mb)
Member

Nice, thanks for removing! I checked the test folder, and it doesn't look like any of our tests are using _step_microbatches anymore.
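
For readers outside the thread, the difference between the two call styles can be sketched without any distributed setup. The toy class below is not PyTorch's schedule implementation: `ToySchedule`, its `n_microbatches` argument, and the list-based batches are hypothetical stand-ins. It only assumes that the public entry point is a `step`-style method that accepts the whole batch, splits it, and delegates to the private `_step_microbatches`.

```python
class ToySchedule:
    """Hypothetical stand-in for a pipeline schedule (not the PyTorch class)."""

    def __init__(self, n_microbatches):
        self.n_microbatches = n_microbatches
        self.seen = []  # records the microbatches the schedule was given

    def _step_microbatches(self, arg_mbs):
        # Private entry point: the caller has already split the batch.
        if len(arg_mbs) != self.n_microbatches:
            raise ValueError("expected %d microbatches" % self.n_microbatches)
        self.seen = list(arg_mbs)

    def step(self, batch):
        # Public entry point: split the whole batch, then delegate.
        if len(batch) == 0 or len(batch) % self.n_microbatches != 0:
            raise ValueError("batch size must be a positive multiple of n_microbatches")
        size = len(batch) // self.n_microbatches
        mbs = [batch[i:i + size] for i in range(0, len(batch), size)]
        self._step_microbatches(mbs)


batch = list(range(8))

# Before: the test splits the batch by hand and calls the private method.
manual = ToySchedule(n_microbatches=4)
manual._step_microbatches([batch[i:i + 2] for i in range(0, 8, 2)])

# After: the test hands over the whole batch and the schedule splits it.
public = ToySchedule(n_microbatches=4)
public.step(batch)

same_result = manual.seen == public.seen
```

In this sketch both paths end up with the same microbatches, which is why moving a test to the public method can drop the hand-written splitting code.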

@wconstab
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the "ciflow/trunk" (Trigger trunk jobs on your pull request) label Jan 17, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

@pytorch-bot pytorch-bot bot temporarily deployed to upload-benchmark-results January 17, 2025 22:49 Inactive
@pytorchmergebot
Collaborator

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

pytorchmergebot pushed a commit that referenced this pull request Jan 18, 2025
Allows test classes using MPCT to set their own timeout as a class
property, which is good enough since the process group is shared across
test instances and the timeout is set at process group init.

Also sets a default timeout of 2 minutes, which is probably (?) long
enough for reasonable tests, but can be changed if it causes flakiness.
It's preferable to have as short a default timeout as possible, since
getting a timeout quickly helps when debugging tests.
Pull Request resolved: #145099
Approved by: https://github.com/d4l3k, https://github.com/fduwjj
ghstack dependencies: #145010, #145011
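
The class-level timeout described in the commit message above can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `ToyProcessGroup`, `ToyContinuousTest`, and the `timeout` attribute name are stand-ins chosen for illustration, not the actual MPCT code. The sketch only shows why a class attribute is enough when the process group is created once per test class.

```python
from datetime import timedelta


class ToyProcessGroup:
    """Hypothetical stand-in for a process group; it only stores its timeout."""

    def __init__(self, timeout):
        self.timeout = timeout


class ToyContinuousTest:
    """Hypothetical base class: one process group shared by all tests in a class."""

    timeout = timedelta(seconds=120)  # default of 2 minutes, as in the commit message
    pg = None

    @classmethod
    def setUpClass(cls):
        # The timeout is read once, when the shared group is created,
        # so a per-instance setting would have no effect.
        cls.pg = ToyProcessGroup(timeout=cls.timeout)


class SlowTests(ToyContinuousTest):
    timeout = timedelta(minutes=10)  # per-class override


ToyContinuousTest.setUpClass()
SlowTests.setUpClass()
default_seconds = ToyContinuousTest.pg.timeout.total_seconds()
slow_seconds = SlowTests.pg.timeout.total_seconds()
```

A subclass that needs more time overrides the attribute; every other class keeps the short default and fails fast on a hang.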
@github-actions github-actions bot deleted the gh/wconstab/388/head branch February 20, 2025 02:06
jurgen-paul pushed a commit to jurgen-paul/pytorch.git.file that referenced this pull request Mar 19, 2025
minor changes to test public PP api instead of internal/private one and
also save a few lines of code for microbatch splitting in the process

ghstack-source-id: b173162
Pull Request resolved: pytorch/pytorch#145011

Labels

ciflow/trunk, Merged, merging, oncall: distributed, topic: not user facing

4 participants