kumpera
Contributor

@kumpera kumpera commented Jun 15, 2022

Fixes #66329

@facebook-github-bot
Contributor

facebook-github-bot commented Jun 15, 2022

✅ No Failures (0 Pending)

As of commit 344eb74 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).

@facebook-github-bot added the 'oncall: distributed' label (Add this issue/PR to distributed oncall triage queue) Jun 15, 2022
Contributor

@fduwjj fduwjj left a comment

LGTM, just a high-level n00b question: what's the purpose of this wrapper? Also, you'll want to make the linter happy.

Contributor

@rohan-varma rohan-varma left a comment

LGTM, thanks!

@fduwjj The purpose of this wrapper is to work with TORCH_DISTRIBUTED_DEBUG in order to detect collective mismatches.
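
For readers unfamiliar with the term, here is a toy, pure-Python sketch of what "detecting a collective mismatch" can mean. It does not use PyTorch and is not the wrapper's actual implementation; `CollectiveFingerprint` and `check_collectives` are hypothetical names made up for this illustration. The assumption is that each rank describes the collective it is about to run (op name, tensor shapes, dtype) and that those descriptions are compared before the real collective is issued.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class CollectiveFingerprint:
    """Hypothetical description of the collective a rank is about to run."""
    op: str                               # e.g. "_allgather_base"
    shapes: Tuple[Tuple[int, ...], ...]   # shapes of the tensors involved
    dtype: str                            # e.g. "float32"


def check_collectives(per_rank: Dict[int, CollectiveFingerprint]) -> List[str]:
    """Compare each rank's fingerprint with the lowest rank's.

    Returns one message per rank that disagrees; an empty list means
    every rank is about to run the same collective.
    """
    if not per_rank:
        return []
    ref_rank = min(per_rank)
    reference = per_rank[ref_rank]
    errors = []
    for rank in sorted(per_rank):
        fp = per_rank[rank]
        if fp != reference:
            errors.append(
                f"rank {rank} is running {fp.op} {fp.shapes} {fp.dtype}, "
                f"but rank {ref_rank} is running "
                f"{reference.op} {reference.shapes} {reference.dtype}"
            )
    return errors


# Toy usage: rank 1 calls a different collective than rank 0.
calls = {
    0: CollectiveFingerprint("_reduce_scatter_base", ((4,),), "float32"),
    1: CollectiveFingerprint("_allgather_base", ((4,),), "float32"),
}
for message in check_collectives(calls):
    print(message)
```

Without such a check, mismatched collectives tend to show up as hangs or confusing errors rather than a clear message. In PyTorch itself this kind of checking is opt-in via the TORCH_DISTRIBUTED_DEBUG environment variable (for example, setting it to DETAIL before the process group is created).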

@kumpera
Contributor Author

kumpera commented Jun 24, 2022

@pytorchmergebot merge

@kumpera
Contributor Author

kumpera commented Jun 24, 2022

I manually verified that the new failures are unrelated.

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@pytorchmergebot
Collaborator

Merge failed due to Refusing to merge as mandatory check(s) pull failed for rule superuser
Raised by https://github.com/pytorch/pytorch/actions/runs/2556036662

@kumpera
Contributor Author

kumpera commented Jun 28, 2022

@pytorchmergebot merge

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@pytorchmergebot
Collaborator

Merge failed due to This PR is too stale; the last push date was more than 3 days ago. Please rebase and try again.
Raised by https://github.com/pytorch/pytorch/actions/runs/2578484374

@kumpera
Contributor Author

kumpera commented Jun 28, 2022

@pytorchmergebot rebase

@pytorchmergebot
Collaborator

@pytorchbot successfully started a rebase job. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased fix_66329 onto refs/remotes/origin/master, please pull locally before adding more changes (for example, via git checkout fix_66329 && git pull --rebase)

@kumpera
Contributor Author

kumpera commented Jun 29, 2022

@pytorchmergebot merge

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@pytorchmergebot
Collaborator

@kumpera your PR has been successfully merged.

@github-actions
Contributor

Hey @kumpera.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Jun 30, 2022
Summary:
Fixes #66329

Pull Request resolved: #79633
Approved by: https://github.com/fduwjj, https://github.com/rohan-varma

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/08795f9afc7091f54c88f582c0056820cc7666f8

Reviewed By: b0noI

Differential Revision: D37523070

Pulled By: kumpera

fbshipit-source-id: b11019fc91c79676ad962222ba02e10f5f68897d

Labels

cla signed, Merged, oncall: distributed (Add this issue/PR to distributed oncall triage queue)

Development

Successfully merging this pull request may close these issues.

Add _reduce_scatter_base and _allgather_base to processGroupWrapper