Skip to content

Conversation

ezyang
Copy link
Contributor

@ezyang ezyang commented Oct 5, 2023

Stack from ghstack (oldest at bottom):

I tested by adding some warning logs in C++, run a distributed program and show that they now had [rank0]: in the messages. There is no existing test infra for C++ logging so I couldn't easily add a unit test.

The implementation strategy is to setup a global variable in C++, and then poke it when we initialize a process group. This was the simplest thing I could think of that would work.

This PR only works for non-glog logging. Probably need to come up with some other strategy for glog, e.g., a custom prefix, but need to make sure this doesn't conflict with fbcode. I can't easily test this from OSS, will leave as follow up work.

Signed-off-by: Edward Z. Yang ezyang@meta.com

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Oct 5, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110623

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d9c6d76 with merge base b5f9696 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ezyang added a commit that referenced this pull request Oct 5, 2023
Signed-off-by: Edward Z. Yang <ezyangmeta.com>

ghstack-source-id: d623b28
Pull Request resolved: #110623
@ezyang ezyang added ciflow/trunk Trigger trunk jobs on your pull request topic: new features topic category labels Oct 5, 2023
@albanD albanD removed their request for review October 5, 2023 18:12
I tested by adding some warning logs in C++, run a distributed program and show that they now had `[rank0]:` in the messages. There is no existing test infra for C++ logging so I couldn't easily add a unit test.

The implementation strategy is to setup a global variable in C++, and then poke it when we initialize a process group. This was the simplest thing I could think of that would work.

This PR only works for non-glog logging. Probably need to come up with some other strategy for glog, e.g., a custom prefix, but need to make sure this doesn't conflict with fbcode. I can't easily test this from OSS, will leave as follow up work.

Signed-off-by: Edward Z. Yang <ezyangmeta.com>

[ghstack-poisoned]
I tested by adding some warning logs in C++, run a distributed program and show that they now had `[rank0]:` in the messages. There is no existing test infra for C++ logging so I couldn't easily add a unit test.

The implementation strategy is to setup a global variable in C++, and then poke it when we initialize a process group. This was the simplest thing I could think of that would work.

This PR only works for non-glog logging. Probably need to come up with some other strategy for glog, e.g., a custom prefix, but need to make sure this doesn't conflict with fbcode. I can't easily test this from OSS, will leave as follow up work.

Signed-off-by: Edward Z. Yang <ezyangmeta.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Oct 6, 2023
Signed-off-by: Edward Z. Yang <ezyangmeta.com>

ghstack-source-id: 0c7f749
Pull Request resolved: #110623
I tested by adding some warning logs in C++, run a distributed program and show that they now had `[rank0]:` in the messages. There is no existing test infra for C++ logging so I couldn't easily add a unit test.

The implementation strategy is to setup a global variable in C++, and then poke it when we initialize a process group. This was the simplest thing I could think of that would work.

This PR only works for non-glog logging. Probably need to come up with some other strategy for glog, e.g., a custom prefix, but need to make sure this doesn't conflict with fbcode. I can't easily test this from OSS, will leave as follow up work.

Signed-off-by: Edward Z. Yang <ezyangmeta.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Oct 9, 2023
Signed-off-by: Edward Z. Yang <ezyangmeta.com>

ghstack-source-id: da4caf1
Pull Request resolved: #110623
Copy link
Collaborator

@wanchaol wanchaol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@ezyang
Copy link
Contributor Author

ezyang commented Oct 9, 2023

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@facebook-github-bot facebook-github-bot deleted the gh/ezyang/2380/head branch October 13, 2023 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged release notes: distributed (c10d) release notes category topic: new features topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants