
Conversation

@lamroger (Contributor) commented Jan 11, 2024

@pytorch-bot pytorch-bot bot added the release notes: distributed (c10d) release notes category label Jan 11, 2024
pytorch-bot bot commented Jan 11, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/117224

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 3c8a1a3 with merge base 5046b49:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions github-actions bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Jan 11, 2024
@lamroger (Contributor, Author):

I'll write tests tomorrow


 @_exception_logger
-def all_gather_into_tensor(output_tensor, input_tensor, group=None, async_op=False):
+def all_gather_into_tensor(output_tensor, input_tensor, group: ProcessGroup = None, async_op=False):
Contributor (Author):

I matched the type in the doc below but lmk if this is wrong

Collaborator:

I don't think we should add a single ProcessGroup type to this API; the other args like output_tensor and input_tensor don't have types either

Contributor (Author):

The docstring below specifies this:

Args:
        output_tensor (Tensor): Output tensor to accommodate tensor elements
            from all ranks. It must be correctly sized to have one of the
            following forms:
            (i) a concatenation of all the input tensors along the primary
            dimension; for definition of "concatenation", see ``torch.cat()``;
            (ii) a stack of all the input tensors along the primary dimension;
            for definition of "stack", see ``torch.stack()``.
            Examples below may better explain the supported output forms.
        input_tensor (Tensor): Tensor to be gathered from current rank.
            Different from the ``all_gather`` API, the input tensors in this
            API must have the same size across all ranks.
        group (ProcessGroup, optional): The process group to work on. If None,
            the default process group will be used.

I'm cool with taking it out, but wondering if there's a reason why this method shouldn't have type annotations.
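
For illustration, a sketch of what a fully annotated signature matching that docstring could look like (not the change this PR makes; note that because the default is None, Optional[ProcessGroup] is the strictly correct annotation):

    from typing import Optional

    import torch
    from torch.distributed import ProcessGroup

    def all_gather_into_tensor(
        output_tensor: torch.Tensor,
        input_tensor: torch.Tensor,
        group: Optional[ProcessGroup] = None,  # None selects the default process group
        async_op: bool = False,
    ):
        ...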

-    group,  # TODO add a type,
     output_tensor: torch.Tensor,
     input_tensor: torch.Tensor,
+    group: ProcessGroup = None,
Collaborator:

Functional collectives don't only accept a ProcessGroup, so please don't add a type annotation with ProcessGroup

Contributor (Author):

Sounds good - thanks for the heads up. I think explicit typing is still helpful but that can be in another PR

Collaborator:

Oh, I didn't mean we shouldn't add types just because the other fields don't have them; I think we should add RANK_TYPES like the other APIs in this file.
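
For context, RANK_TYPES is a Union alias in that file covering the several group-like inputs the functional collectives accept. The sketch below is an approximation of its shape, not the canonical definition, and the exact members may differ by PyTorch version:

    from typing import List, Tuple, Union

    import torch.distributed as dist
    from torch.distributed.device_mesh import DeviceMesh

    # Approximate shape of the RANK_TYPES alias used by the functional
    # collectives APIs in this file; the members here are an assumption.
    RANK_TYPES = Union[
        List[int],               # a flat list of ranks
        List[List[int]],         # rank lists for several groups
        dist.ProcessGroup,       # an explicit process group
        DeviceMesh,              # an N-D device mesh
        Tuple[DeviceMesh, int],  # a mesh plus a mesh dimension
    ]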


@lamroger lamroger marked this pull request as ready for review January 11, 2024 17:59
@lamroger lamroger requested a review from wanchaol January 11, 2024 17:59
@lamroger (Contributor, Author):

Hi @wanchaol - I'm on a MacBook, so probably not the best ticket to pick up for local testing. For the test, I mostly adapted an existing test and switched it to use kwargs. LMK if there's a better way or an example I can follow. Thanks!
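
For readers following along, a minimal sketch of the kind of kwargs-based check being described; the helper name and setup here are hypothetical, and PyTorch's real distributed tests run under its multi-process test harness:

    import torch
    import torch.distributed as dist

    def _check_all_gather_into_tensor_kwargs(world_size: int) -> None:
        # Hypothetical helper; assumes a process group is already initialized.
        rank = dist.get_rank()
        inp = torch.full((2,), float(rank))
        out = torch.empty(world_size * 2)
        # The point of the PR: the keyword-argument form should behave the
        # same as the positional form, including under the dynamo rewrite.
        dist.all_gather_into_tensor(output_tensor=out, input_tensor=inp, async_op=False)
        expected = torch.cat([torch.full((2,), float(r)) for r in range(world_size)])
        assert torch.equal(out, expected)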

@colesbury colesbury added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Jan 11, 2024
@wconstab (Contributor) left a comment:

LGTM, thanks for adding the test. IIUC you matched the status quo for type annotations, but if you want to improve the file, the type for process group should be what wanchao said.

@wanchaol (Collaborator) left a comment:

lgtm, thanks for addressing comments.

@lamroger (Contributor, Author) commented Jan 15, 2024

Hi @wconstab @wanchaol @voznesenskym - thanks for the reviews! If it looks good, could someone help merge for me? I'm not authorized. Not sure if there are more steps. Thanks!

@wconstab (Contributor) commented Jan 17, 2024

You just need to use pytorchbot to help you merge. There is probably a wiki about it; let me ask it for help to find out where it is.

@pytorchbot help

pytorch-bot bot commented Jan 17, 2024

❌ 🤖 pytorchbot command failed:

@pytorchbot: error: argument command: invalid choice: 'help' (choose from 'merge', 'revert', 'rebase', 'label', 'drci', 'close')

usage: @pytorchbot [-h] {merge,revert,rebase,label,drci,close} ...

Try @pytorchbot --help for more info.

@wconstab (Contributor):

@pytorchbot --help

pytorch-bot bot commented Jan 17, 2024

PyTorchBot Help

usage: @pytorchbot [-h] {merge,revert,rebase,label,drci,close} ...

In order to invoke the bot on your PR, include a line that starts with
@pytorchbot anywhere in a comment. That line will form the command; no
multi-line commands are allowed. Some commands may be used on issues as specified below.

Example:
    Some extra context, blah blah, wow this PR looks awesome

    @pytorchbot merge

optional arguments:
  -h, --help            Show this help message and exit.

command:
  {merge,revert,rebase,label,drci,close}
    merge               Merge a PR
    revert              Revert a PR
    rebase              Rebase a PR
    label               Add label to a PR
    drci                Update Dr. CI
    close               Close a PR

Merge

usage: @pytorchbot merge [-f MESSAGE | -i] [-ic] [-r [{viable/strict,main}]]

Merge an accepted PR, subject to the rules in .github/merge_rules.json.
By default, this will wait for all required checks (lint, pull) to succeed before merging.

optional arguments:
  -f MESSAGE, --force MESSAGE
                        Merge without checking anything. This requires a reason for auditing purposes, for example:
                        @pytorchbot merge -f 'Minor update to fix lint. Expecting all PR tests to pass'
                        
                        Please use `-f` as last resort, prefer `--ignore-current` to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.
  -i, --ignore-current  Merge while ignoring the currently failing jobs.  Behaves like -f if there are no pending jobs.
  -ic                   Old flag for --ignore-current. Deprecated in favor of -i.
  -r [{viable/strict,main}], --rebase [{viable/strict,main}]
                        Rebase the PR to re run checks before merging.  Accepts viable/strict or main as branch options and will default to viable/strict if not specified.

Revert

usage: @pytorchbot revert -m MESSAGE -c
                          {nosignal,ignoredsignal,landrace,weird,ghfirst}

Revert a merged PR. This requires that you are a Meta employee.

Example:
  @pytorchbot revert -m="This is breaking tests on trunk. hud.pytorch.org/" -c=nosignal

optional arguments:
  -m MESSAGE, --message MESSAGE
                        The reason you are reverting, will be put in the commit message. Must be longer than 3 words.
  -c {nosignal,ignoredsignal,landrace,weird,ghfirst}, --classification {nosignal,ignoredsignal,landrace,weird,ghfirst}
                        A machine-friendly classification of the revert reason.

Rebase

usage: @pytorchbot rebase [-s | -b BRANCH]

Rebase a PR. Rebasing defaults to the stable viable/strict branch of pytorch.
Repeat contributors may use this command to rebase their PR.

optional arguments:
  -s, --stable          [DEPRECATED] Rebase onto viable/strict
  -b BRANCH, --branch BRANCH
                        Branch you would like to rebase to

Label

usage: @pytorchbot label labels [labels ...]

Adds label to a PR or Issue [Can be used on Issues]

positional arguments:
  labels  Labels to add to given Pull Request or Issue [Can be used on Issues]

Dr CI

usage: @pytorchbot drci 

Update Dr. CI. Updates the Dr. CI comment on the PR in case it's gotten out of sync with actual CI results.

Close

usage: @pytorchbot close

Close a PR [Can be used on issues]

@wconstab (Contributor):

anyway I will attempt to merge
@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jan 17, 2024
@pytorchmergebot (Collaborator):

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@lamroger (Contributor, Author):

> anyway I will attempt to merge
> @pytorchbot merge

Ah appreciate it


Labels

ciflow/inductor
ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
oncall: distributed (Add this issue/PR to distributed oncall triage queue)
open source
release notes: distributed (c10d) (release notes category)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

funccol collectives rewrite in dynamo does not work w/ kwargs
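
For context, a hedged sketch of that failure mode: under torch.compile, dynamo rewrites c10d collectives into functional collectives, and before this PR the rewrite only handled positional arguments, so a keyword-argument call like the one below could break (identifiers are illustrative):

    import torch
    import torch.distributed as dist

    @torch.compile
    def gather(out: torch.Tensor, inp: torch.Tensor) -> torch.Tensor:
        # Keyword-argument form that the rewrite previously mishandled.
        dist.all_gather_into_tensor(output_tensor=out, input_tensor=inp)
        return out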

6 participants