Make LazyGraphExecutor extensible #87218
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/87218
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures, 1 Pending as of commit 11d1021.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@alanwaketan Does this align with what you plan to implement?
Overall LGTM.
LazyGraphExecutor::~LazyGraphExecutor() = default;
Is there any reason why this isn't in the header?
I guess we don't necessarily need to make all methods virtual all at once. We can just make them on demand.
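The two points above (an out-of-line virtual destructor, and making methods virtual only on demand) can be sketched together. The class and member names below are simplified stand-ins, not the actual PyTorch declarations: defining the virtual destructor in one .cpp file gives the class a single translation unit to anchor its vtable, and a vendor backend overrides only the methods it needs.

```cpp
#include <cassert>

// Simplified stand-in for LazyGraphExecutor (names are illustrative only).
class GraphExecutorBase {
 public:
  // Declared here, defined out of line below. Keeping the destructor's
  // definition in one .cpp file gives the vtable a single home instead of
  // emitting it in every translation unit that includes the header.
  virtual ~GraphExecutorBase();

  // Made virtual "on demand" because a backend needed to customize it.
  virtual bool ShouldSyncTensor(int tensor_id) const { return true; }

  // Left non-virtual: no backend has needed to override it yet.
  int Version() const { return 1; }
};

// In the real code this would live in lazy_graph_executor.cpp.
GraphExecutorBase::~GraphExecutorBase() = default;

// A vendor backend overrides only what it needs.
class VendorGraphExecutor : public GraphExecutorBase {
 public:
  bool ShouldSyncTensor(int tensor_id) const override {
    return tensor_id % 2 == 0;  // hypothetical vendor-specific policy
  }
};

// Callers go through the base interface, so the override is picked up.
bool should_sync(const GraphExecutorBase& exec, int id) {
  return exec.ShouldSyncTensor(id);
}
```

The on-demand approach keeps the base class's non-virtual calls devirtualized and avoids committing to a stable override surface before backends actually need one.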
@@ -41,7 +42,7 @@ class TORCH_API BackendImplInterface {
  virtual const IrBuilder* GetIrBuilder() const = 0;
  virtual bool ShouldSyncTensor(const LazyTensorPtr tensor) const;
  virtual LazyGraphExecutor* GetLazyGraphExecutor() const;
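The diff above adds a `GetLazyGraphExecutor` hook to the backend interface. A minimal sketch of how a vendor backend could use it to hand back a subclassed executor follows; the stub types and singleton scaffolding are invented for illustration and are not the actual PyTorch implementation.

```cpp
#include <cassert>
#include <string>

// Stand-ins for LazyGraphExecutor and BackendImplInterface (illustrative only).
class LazyGraphExecutorStub {
 public:
  virtual ~LazyGraphExecutorStub() = default;
  virtual const char* Name() const { return "default"; }
};

class BackendImplInterfaceStub {
 public:
  virtual ~BackendImplInterfaceStub() = default;
  // The stock backend falls back to a default executor instance.
  virtual LazyGraphExecutorStub* GetLazyGraphExecutor() const {
    static LazyGraphExecutorStub default_executor;
    return &default_executor;
  }
};

// A vendor backend returns its own executor subclass through the same hook.
class VendorExecutor : public LazyGraphExecutorStub {
 public:
  const char* Name() const override { return "vendor"; }
};

class VendorBackend : public BackendImplInterfaceStub {
 public:
  LazyGraphExecutorStub* GetLazyGraphExecutor() const override {
    static VendorExecutor executor;
    return &executor;
  }
};
```

Core code that asks the active backend for its executor then transparently dispatches to the vendor subclass.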
Since this is an API now, maybe a few words to describe it would be great.
Also, have you compared this approach with providing a registration method directly in the LazyGraphExecutor class?
> Also, have you compared this approach with providing a registration method directly in the LazyGraphExecutor class?
I have not. Do we anticipate that the additional layer of indirection will have much impact on performance?
I don't think performance is the main concern. Since the LazyGraphExecutor becomes a backend interface anyway, it just doesn't feel necessary to have the registration part in the BackendInterface to make the design cleaner.
> it just doesn't feel necessary to have the registration part in the BackendInterface to make the design cleaner
I'm not sure what you mean by this. When you say registration, are you referring to the static object initialization?
I'm talking about things similar to BackendRegistrar.
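A BackendRegistrar-style mechanism usually means a small registrar object whose constructor installs the vendor's instance at static-initialization time. The sketch below shows the general pattern under invented names (`ExecutorRegistrar`, `RegisteredExecutor`); it is not the actual PyTorch API, just the shape of the idea being discussed.

```cpp
#include <cassert>

// Stand-in for the executor base class (illustrative only).
class ExecutorStub {
 public:
  virtual ~ExecutorStub() = default;
};

// Meyers-singleton slot holding the currently registered executor.
ExecutorStub*& RegisteredExecutor() {
  static ExecutorStub* executor = nullptr;
  return executor;
}

// Constructing a registrar installs the given executor. A vendor library
// defines one at namespace scope, so registration happens at load time.
struct ExecutorRegistrar {
  explicit ExecutorRegistrar(ExecutorStub* executor) {
    RegisteredExecutor() = executor;
  }
};

// What a vendor translation unit would typically contain:
class VendorExecutor : public ExecutorStub {};
static VendorExecutor g_vendor_executor;
static ExecutorRegistrar g_registrar(&g_vendor_executor);
```

With this shape, core code queries `RegisteredExecutor()` and never needs to know which backend library was linked in, which is the "cleaner design" point being made above.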
Ah, I see now what you mean. I can implement that for LazyGraphExecutor.
void UnregisterTensor(LazyTensor::Data* data);
virtual ~LazyGraphExecutor();
virtual void RegisterTensor(std::shared_ptr<LazyTensor::Data> data);
Since these become APIs now, maybe provide a few words explaining why people would want to override these methods.
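One plausible reason to override tensor (un)registration is so a vendor executor can keep its own bookkeeping alongside the base class's live-tensor tracking. The sketch below uses simplified stand-ins for `LazyTensor::Data` and `LazyGraphExecutor`; the counting behavior is a hypothetical example, not what PyTorch's executor does.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Simplified stand-in for LazyTensor::Data.
struct TensorData {
  int id = 0;
};

// Simplified stand-in for LazyGraphExecutor with virtual (un)registration.
class ExecutorWithRegistry {
 public:
  virtual ~ExecutorWithRegistry() = default;
  virtual void RegisterTensor(std::shared_ptr<TensorData> data) { ++live_; }
  virtual void UnregisterTensor(TensorData* data) { --live_; }
  int live() const { return live_; }

 private:
  int live_ = 0;
};

// A vendor executor augments registration with its own side counter, then
// defers to the base implementation so core tracking still works.
class CountingExecutor : public ExecutorWithRegistry {
 public:
  void RegisterTensor(std::shared_ptr<TensorData> data) override {
    ++total_registered_;
    ExecutorWithRegistry::RegisterTensor(std::move(data));
  }
  int total_registered() const { return total_registered_; }

 private:
  int total_registered_ = 0;
};
```

The important convention, worth documenting on the API, is that overrides should normally call the base implementation so the executor's core invariants are preserved.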
LGTM, thanks for doing this @antoniojkim, and for the feedback @alanwaketan! I like the idea of having the registrar in LazyGraphExecutor, to avoid piling stuff into the backend interface.
Force-pushed from ef4cc8c to fa0d4ab.
@JackCaoG I don't know if PyTorch/XLA is using the lazy graph executor yet, but if so, this PR will break PyTorch/XLA. If not, please ignore.
Yea.. I saw
Working on a companion pull request now. @antoniojkim Here is the README on how to land such patches together, if you haven't seen it before. Basically, we need the PyTorch PR to update the xla pin pointing to the companion PR, and then land the PR once the CI is all green.
Here is the companion PR: pytorch/xla#4106. You will need to update https://github.com/pytorch/pytorch/blob/master/.github/ci_commit_pins/xla.txt#L1 with eff277e81fcfdeccba71e75ff40b6e2f3e29e27b.
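The pin update described above amounts to replacing the single line in that file with the companion PR's commit hash (both the path and the hash come from the comment above). A minimal sketch, assuming you are in the pytorch repository root:

```shell
# Point the XLA CI pin at the companion PR's commit.
pin_file=".github/ci_commit_pins/xla.txt"
mkdir -p "$(dirname "$pin_file")"
echo "eff277e81fcfdeccba71e75ff40b6e2f3e29e27b" > "$pin_file"
cat "$pin_file"
```

The change to `xla.txt` is then committed as part of the PyTorch PR so CI checks out the matching PyTorch/XLA revision.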
@pytorchbot merge -g
Merge started. Your change will be merged once all checks on your PR pass, since you used the green (-g) flag (ETA: 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 2 additional jobs have failed; the first few of them are: trunk, trunk / linux-bionic-cuda11.7-py3.10-gcc7 / test (default, 1, 4, linux.4xlarge.nvidia.gpu). Details for Dev Infra team: raised by workflow job.
Force-pushed from e1cf63a to 08cbf54.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA: 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: the following mandatory check(s) failed (Rule). Dig deeper by viewing the failures on hud. Details for Dev Infra team: raised by workflow job.
Force-pushed from 08cbf54 to 2946121.
Force-pushed from 2946121 to 11d1021.
@pytorchbot merge -g
Merge started. Your change will be merged once all checks on your PR pass, since you used the green (-g) flag (ETA: 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: HTTP Error 502: Bad Gateway. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA: 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Hey @antoniojkim.
Add `LazyGraphExecutor` to the backend interface so that it is extensible by a vendor backend. I've made some preliminary methods virtual. Not sure if we want to make all methods in `LazyGraphExecutor` virtual.
Pull Request resolved: pytorch#87218
Approved by: https://github.com/wconstab, https://github.com/alanwaketan