Enable torch.Generator to support pytorch/xla generator implementation #161369
base: main
Conversation
…f-tree backends.
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161369
Note: links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: there is 1 currently active SEV. If your PR is affected, please view it below.
✅ No Failures as of commit d838990 with merge base 76f81b5.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Related issue: pytorch/xla#9159
Regarding out-of-tree accelerators, I think the PrivateUse1-based mechanism should be the recommended option: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/detail/PrivateUse1HooksInterface.h#L76-L77
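For context, a minimal sketch of that PrivateUse1 route is below. The at::RegisterPrivateUse1HooksInterface entry point and the generator hook come from that header as I understand it, while make_my_backend_generator is a hypothetical backend-specific factory; the exact virtual signatures should be checked against PrivateUse1HooksInterface.h.

```cpp
// Sketch only: an out-of-tree backend exposing its RNG through PrivateUse1 hooks.
// make_my_backend_generator() is hypothetical; check the exact hook signatures
// against ATen/detail/PrivateUse1HooksInterface.h.
#include <ATen/detail/PrivateUse1HooksInterface.h>

// Hypothetical backend-specific factory, implemented elsewhere by the backend.
at::Generator make_my_backend_generator(c10::DeviceIndex device_index);

struct MyBackendHooks : public at::PrivateUse1HooksInterface {
  const at::Generator& getDefaultGenerator(
      c10::DeviceIndex device_index) const override {
    // A single process-wide default generator is enough for this sketch.
    static at::Generator gen = make_my_backend_generator(device_index);
    return gen;
  }
};

// Called once when the backend's extension module is loaded.
void register_my_backend_hooks() {
  static MyBackendHooks hooks;
  at::RegisterPrivateUse1HooksInterface(&hooks);
}
```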
Hi @EikanWang, thanks for reviewing!
@EikanWang Thanks for the mention. @iwknow PyTorch supports registering generators for out-of-tree backends by inheriting from the AcceleratorHooksInterface class. You can refer to the link below for more information.
Therefore, it seems to me that you might need to add new files named …
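Roughly, such a hooks class might look like the sketch below. The registration step (the new files mentioned above) and the backend factory are assumptions, and the virtual signatures follow ATen/detail/AcceleratorHooksInterface.h as I understand them.

```cpp
// Sketch only: a backend hooks class deriving from AcceleratorHooksInterface
// and handing out backend-specific generators.
#include <ATen/detail/AcceleratorHooksInterface.h>
#include <ATen/core/Generator.h>

// Hypothetical factory implemented by the backend, e.g. wrapping its own
// c10::GeneratorImpl subclass via at::make_generator<...>.
at::Generator make_xla_generator(c10::DeviceIndex device_index);

struct XLAHooks : public at::AcceleratorHooksInterface {
  bool hasPrimaryContext(c10::DeviceIndex device_index) const override {
    return true;  // the backend initializes lazily in this sketch
  }

  at::Generator getNewGenerator(c10::DeviceIndex device_index) const override {
    return make_xla_generator(device_index);
  }
};

// How Context learns about these hooks (the registration/new files mentioned
// above) is intentionally omitted here.
```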
torch/csrc/Generator.cpp (Outdated)
```cpp
} else if (c10::impl::hasGenerator(device_type)) {
  self->cdata = at::Generator(c10::impl::makeGenerator(device));
} else {
  throw std::runtime_error(
      "No generator available for device type: " + c10::toString(device_type));
```
So we do not need to modify this one; all the accelerators except CPU will follow the same AcceleratorHooksInterface logic.
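To illustrate the idea, here is a minimal sketch of such a device-generic construction path. It is not the PR's final diff; getAcceleratorHooksInterface and getNewGenerator are assumed from the hooks-based design discussed here.

```cpp
// Sketch only: build a generator by deferring to the per-backend accelerator
// hooks instead of hard-coding device types. Backends that have not registered
// hooks throw from the default hook implementation.
#include <ATen/CPUGeneratorImpl.h>
#include <ATen/Context.h>
#include <ATen/core/Generator.h>

static at::Generator make_generator_for(const at::Device& device) {
  if (device.type() == at::kCPU) {
    return at::detail::createCPUGenerator();
  }
  return at::globalContext()
      .getAcceleratorHooksInterface(device.type())
      .getNewGenerator(device.index());
}
```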
This is actually a brilliant idea, I will take a closer look.
… from off-tree backends." This reverts commit 32c2b34.
Hi @FFFrog
```cpp
virtual DeviceIndex getNumDevices() const {
  return 0;
}
```
Why do we need this one? Can deviceCount from the base class AcceleratorHooksInterface help?
Good catch. I guess it was copied over from getNumGPUs in XPUHooksInterface.h. Removed it in favor of DeviceIndex deviceCount from AcceleratorHooksInterface.
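For completeness, a small sketch of what overriding the inherited method looks like; xla_visible_device_count is a hypothetical backend runtime query, and the rest mirrors the hooks sketch earlier in the thread.

```cpp
// Sketch only: override the inherited AcceleratorHooksInterface::deviceCount()
// rather than adding a separate getNumDevices().
#include <ATen/detail/AcceleratorHooksInterface.h>

// Hypothetical backend runtime query.
c10::DeviceIndex xla_visible_device_count();

struct XLADeviceCountHooks : public at::AcceleratorHooksInterface {
  bool hasPrimaryContext(c10::DeviceIndex device_index) const override {
    return true;
  }
  c10::DeviceIndex deviceCount() const override {
    return xla_visible_device_count();
  }
};
```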
It would be better if we change the code below into: pytorch/aten/src/ATen/Context.h, lines 182 to 183 in f44ad54
@FFFrog Thanks for reviewing. PR updated.
Thanks, LGTM, let's trigger CI first.
@iwknow, I don't have valid approval permissions, so we'll have to get approval from @albanD.
Hey, @albanD, please take a look at this, thanks.
Also, it's necessary to find a better way to implement Hooks registration, otherwise every time we add a new backend to PyTorch, we'll need to add a Hooks interface for it.
@albanD, a kind ping.
Hi @FFFrog, can you please nominate another approver, as it seems that albanD is unresponsive?
That sounds OK to add to make pytorch/XLA work. Do you have a sister PR on the XLA side that we should look at at the same time?
Also, I would like one of the pytorch/XLA maintainers to comment here to make sure this is good on their end.
@FFFrog XLA is a bit of a weird backend for historical reasons. Indeed, we don't plan on adding any new backend that doesn't go through PrivateUse1 going forward.
Sister PR for XLA: yes, I do have the code change for XLA. It's basically a concrete implementation of the interface, but it will live in the pytorch/xla package. Would you like to review it? @qihqi has been reviewing the generator change; I think he can take a look at this change as well. @qihqi, please leave a comment if this change looks good to you and is what the torch/XLA team wants.
Thank you, I got it.
Title changed from "Enable torch.Generator to support off-tree generator implementation" to "Enable torch.Generator to support pytorch/xla generator implementation".
@zhanyong-wan, can you please take a look or nominate someone from the XLA side to confirm this change is good from their end? Thanks!
@qihqi ping again.
Currently, the implementation of torch.Generator only supports the "cpu" and "cuda" device types (https://github.com/pytorch/pytorch/blob/main/torch/csrc/Generator.cpp#L55-L61). This change enables torch.Generator to support more device types by allowing any device backend to register its own generator factory through a Generator Registry, similar to what the "DeviceGuardImpl registry" does today.
Key Changes:
New registry API:
Python/C++ integration:
Backend extensibility:
Example usage:
This allows torch.Generator(device='xla') to return an XlaGeneratorImpl when the torch_xla extension is imported.
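As an illustration of that flow, here is a hedged C++ sketch reusing the hasGenerator/makeGenerator names from the snippet reviewed above; the registry itself is code introduced by this PR, so the exact header and namespaces are assumptions.

```cpp
// Sketch only: device-generic generator construction via the proposed registry.
// c10::impl::hasGenerator/makeGenerator follow the reviewed snippet; the header
// providing them is part of this PR and is assumed here.
#include <ATen/core/Generator.h>
#include <c10/core/Device.h>
#include <stdexcept>

at::Generator generator_for(const c10::Device& device) {
  if (!c10::impl::hasGenerator(device.type())) {
    throw std::runtime_error(
        "No generator registered for device type: " +
        c10::DeviceTypeName(device.type()));
  }
  // A backend such as torch_xla registers its factory at import time, so this
  // returns a Generator backed by, e.g., its XlaGeneratorImpl.
  return at::Generator(c10::impl::makeGenerator(device));
}
```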