[tensor] redistribute among different process groups #1247

Merged: 9 commits merged into hpcaitech:main on Jul 12, 2022

Conversation

feifeibear (Contributor)

No description provided.

if pg is not None and pg != self.get_process_group():
    print('here _redistribute')
    # if the pg is not equal, convert the current tensor to replicated
    self._redistribute(ReplicaSpec())
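
For context, the replicate-then-reshard strategy this branch takes can be illustrated in a single process. The sketch below is illustrative only (plain PyTorch, not the ColoTensor API; the helper names are made up): moving a sharded tensor to a differently sized process group amounts to first rebuilding the replicated tensor, then slicing it for the new group.

import torch

def replicate(shards):
    # step 1: gather all shards back into one replicated (full) tensor
    return torch.cat(shards, dim=0)

def reshard(full, world_size):
    # step 2: split the replicated tensor across the new group's ranks
    return list(torch.chunk(full, world_size, dim=0))

old_shards = list(torch.chunk(torch.arange(12.0), 4))  # sharded over a 4-rank group
full = replicate(old_shards)                           # convert to replicated
new_shards = reshard(full, 3)                          # reshard over a 3-rank group
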
@1SAA (Contributor) commented on Jul 11, 2022:

I think the updated operation, redistribute, is always used for non-model data in training.
It is natural that we need to keep redistribute as an autograd operation.
But here, _redistribute is not capable of being used as an autograd operation.
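
To make the distinction concrete: an autograd-capable redistribute must route gradients back through the inverse layout change, which a bare _redistribute skips. A minimal sketch, assuming a hypothetical convert_layout helper (not the ColossalAI API):

import torch

def convert_layout(t, src_spec, dst_spec):
    # placeholder for the real communication (gather/scatter/all-to-all)
    # that moves data from src_spec to dst_spec; an identity copy here
    return t.clone()

class Redistribute(torch.autograd.Function):
    # a redistribute that can be used as an autograd operation:
    # the backward pass applies the inverse layout change to the gradient
    @staticmethod
    def forward(ctx, tensor, old_spec, new_spec):
        ctx.old_spec, ctx.new_spec = old_spec, new_spec
        return convert_layout(tensor, old_spec, new_spec)

    @staticmethod
    def backward(ctx, grad_output):
        # gradients flow in the reverse direction, so the specs swap
        return convert_layout(grad_output, ctx.new_spec, ctx.old_spec), None, None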

@feifeibear (Contributor, Author) replied:

I see. I will discuss this with you offline.
I hope ColoTensor can be used independently; we don't need to assume it is only used for training.

@YuliangLiu0306 merged commit 1aad903 into hpcaitech:main on Jul 12, 2022