Collaborator

@wanchaol wanchaol commented Nov 21, 2022


pytorch-bot bot commented Nov 21, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/89442

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Failures

As of commit 997b32a:

The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Contributor

@yhcharles yhcharles left a comment

LGTM

@pytorchmergebot
Collaborator

Rebased gh/wanchaol/217/orig onto refs/remotes/origin/viable/strict because #89443 was rebased, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/89442)

Contributor

@fduwjj fduwjj left a comment

LGTM, just have a n00b question.

    start_reduction[i] = True
else:
    with torch.no_grad():
        dest_tensor_on_rank_i[0].add_(to_scatter[i])
Contributor

Interesting, does this mean the overall aggregation logic is executed in one single thread?

Collaborator Author

Yep, the aggregation isn't actually running in multiple threads; only the PG (process group) is mocked with multiple threads so that together they act as a world.
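
To illustrate the point in the reply above, here is a minimal sketch of ranks mocked as threads while the aggregation itself runs sequentially in a single thread. It is not the threaded PG implementation from this PR: it uses only the standard library, plain integers stand in for tensors, and `simulated_reduce_scatter`, `reduce_all`, `rank_main`, and the choice of "whichever thread gets barrier index 0 does the reduction" are all hypothetical names and assumptions made for illustration. Only `start_reduction` and `to_scatter` echo the names in the diff snippet.

```python
import threading


def simulated_reduce_scatter(world_size=4):
    """Mock a world of `world_size` ranks with threads (illustrative only)."""
    # inputs[src][dst] is what rank `src` contributes to rank `dst`.
    inputs = [None] * world_size
    outputs = [None] * world_size   # outputs[dst] holds the reduced value
    results = [None] * world_size   # what each rank thread reads back
    reducer_threads = []            # ids of threads that ran the reduction
    barrier = threading.Barrier(world_size)

    def reduce_all():
        # Executed by exactly one thread: fold every rank's contribution
        # sequentially, copying the first one and accumulating the rest.
        reducer_threads.append(threading.get_ident())
        start_reduction = [False] * world_size
        for to_scatter in inputs:
            for i in range(world_size):
                if not start_reduction[i]:
                    outputs[i] = to_scatter[i]
                    start_reduction[i] = True
                else:
                    outputs[i] += to_scatter[i]

    def rank_main(rank):
        # Each "rank" publishes its input, then waits for the whole world.
        inputs[rank] = [rank * 10 + dst for dst in range(world_size)]
        # Barrier.wait() hands each thread a distinct index in
        # range(world_size); the thread that receives 0 does all the work.
        if barrier.wait() == 0:
            reduce_all()
        barrier.wait()  # nobody reads a result before the reduction is done
        results[rank] = outputs[rank]

    threads = [
        threading.Thread(target=rank_main, args=(rank,))
        for rank in range(world_size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, len(reducer_threads)
```

Calling `simulated_reduce_scatter()` with the default world size of 4 returns `([60, 64, 68, 72], 1)`: every rank gets its reduced slice, and the second element confirms that only one thread performed the reduction.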

@wanchaol
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label Nov 21, 2022
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 2 additional jobs have failed, first few of them are: trunk, trunk / linux-bionic-cuda11.7-py3.10-gcc7 / test (jit_legacy, 1, 1, linux.4xlarge.nvidia.gpu)

Details for Dev Infra team Raised by workflow job

@wanchaol
Collaborator Author

@pytorchbot merge -f "failures not related"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Dec 10, 2022
@facebook-github-bot facebook-github-bot deleted the gh/wanchaol/216/head branch June 8, 2023 19:06
Labels

ciflow/trunk (Trigger trunk jobs on your pull request), Merged, topic: not user facing (topic category)

4 participants