Match all_gather_into_tensor args names in remapping #117224
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/117224
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 3c8a1a3 with merge base 5046b49. FLAKY - The following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
I'll write tests tomorrow.
  @_exception_logger
- def all_gather_into_tensor(output_tensor, input_tensor, group=None, async_op=False):
+ def all_gather_into_tensor(output_tensor, input_tensor, group: ProcessGroup = None, async_op=False):
I matched the type in the docstring below, but let me know if this is wrong.
I don't think we should add a single ProcessGroup type to this API; the other args like output_tensor and input_tensor don't have types either.
The docstring below specifies this:
Args:
output_tensor (Tensor): Output tensor to accommodate tensor elements
from all ranks. It must be correctly sized to have one of the
following forms:
(i) a concatenation of all the input tensors along the primary
dimension; for definition of "concatenation", see ``torch.cat()``;
(ii) a stack of all the input tensors along the primary dimension;
for definition of "stack", see ``torch.stack()``.
Examples below may better explain the supported output forms.
input_tensor (Tensor): Tensor to be gathered from current rank.
Different from the ``all_gather`` API, the input tensors in this
API must have the same size across all ranks.
group (ProcessGroup, optional): The process group to work on. If None,
the default process group will be used.
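For readers skimming this thread, here is a minimal sketch of the two output forms the docstring describes (assuming an initialized NCCL process group with one GPU per rank; the shapes and values are illustrative, not from the PR):

```python
import torch
import torch.distributed as dist

# Minimal sketch: assumes dist.init_process_group("nccl") has already run
# and each rank owns one GPU. Shapes/values are illustrative.
rank = dist.get_rank()
world_size = dist.get_world_size()
device = torch.device("cuda", rank)

input_tensor = torch.arange(2, device=device) + 1 + 2 * rank

# Form (i): concatenation along the primary dimension -> shape [world_size * 2]
output_cat = torch.empty(world_size * 2, dtype=input_tensor.dtype, device=device)
dist.all_gather_into_tensor(output_cat, input_tensor)

# Form (ii): stack along a new primary dimension -> shape [world_size, 2]
output_stack = torch.empty(world_size, 2, dtype=input_tensor.dtype, device=device)
dist.all_gather_into_tensor(output_stack, input_tensor)
```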
I'm cool with taking it out, but I'm wondering if there's a reason why this method shouldn't have type annotations.
  output_tensor: torch.Tensor,
  input_tensor: torch.Tensor,
- group,  # TODO add a type,
+ group: ProcessGroup = None,
Functional collectives don't only accept a ProcessGroup, so please don't add a type annotation of ProcessGroup here.
Sounds good - thanks for the heads up. I think explicit typing is still helpful, but that can be in another PR.
Oh, I didn't mean don't add types - the other fields already have types. I think we should add RANK_TYPES like the other APIs in this file.
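For context on that suggestion, a hedged sketch of what a RANK_TYPES-style union looks like (reconstructed from memory; the exact members and import path in _functional_collectives.py may differ):

```python
from typing import List, Tuple, Union

import torch.distributed as dist
from torch.distributed.device_mesh import DeviceMesh  # import path varies by version

# Assumption: this member list is a reconstruction, not the exact source.
RANK_TYPES = Union[
    List[int],               # explicit list of ranks
    List[List[int]],         # list of rank lists
    dist.ProcessGroup,       # a concrete process group
    DeviceMesh,              # an N-D device mesh
    Tuple[DeviceMesh, int],  # a mesh plus a mesh dimension
    str,                     # a group tag
]
```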
Hi @wanchaol - I'm on a MacBook, so probably not the best ticket to pick up for local testing. For the test, I mostly grabbed from an existing test and used kwargs instead. Let me know if there's a better way or an example I can follow. Thanks!
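A hedged sketch of the kwargs-style call such a test exercises (variable names here are illustrative, not the PR's actual test code):

```python
import torch
import torch.distributed as dist

# Illustrative only: calling by keyword breaks if the wrapper's parameter
# names drift from the documented ones, which is what this PR guards against.
world_size = dist.get_world_size()
inp = torch.ones(2, device="cuda")
out = torch.empty(world_size * 2, device="cuda")
dist.all_gather_into_tensor(output_tensor=out, input_tensor=inp, group=None, async_op=False)
```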
LGTM, thanks for adding the test. IIUC you matched the status quo for the type annotations, but if you want to improve the file, the type for the process group should be as wanchao said.
LGTM, thanks for addressing comments.
Hi @wconstab @wanchaol @voznesenskym - thanks for the reviews! If it looks good, could someone help merge for me? I'm not authorized, and not sure if there are more steps. Thanks!
You just need to use pytorchbot to help you merge. There's probably a wiki about it; let me ask it for help to find out where it is. @pytorchbot --help
❌ 🤖 pytorchbot command failed. Try @pytorchbot --help for more info.
@pytorchbot --help
PyTorchBot Help
Merge
Revert
Rebase
Label
Dr CI
Close
usage: @pytorchbot close   Close a PR [Can be used on issues]
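(For reference, the merge itself is triggered by commenting @pytorchbot merge on the PR once approvals and checks are in place.)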
Anyway, I will attempt to merge.
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Ah, appreciate it!
Fixes #114179
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225