
enable dispatch stub for backend PrivateUse1 #99611

Closed
wants to merge 3 commits into from

Conversation

@caizhi-mt (Contributor) commented Apr 20, 2023

When extending PyTorch with a new backend out of tree, the PrivateUse1 key is reused, so we also need to support PrivateUse1 in the dispatch stub module.
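For context, a minimal sketch (not the PR's exact code) of what this enables for an out-of-tree backend: once PrivateUse1 has a slot in DispatchStub, the backend can plug its own kernel into an existing ATen stub. The macro name comes from the test diff later in this thread; the kernel name is illustrative.

```cpp
// Sketch only: give an existing ATen stub (abs_stub here) a PrivateUse1 entry,
// so aten::abs on a PrivateUse1 tensor dispatches to the backend's own kernel.
#include <ATen/native/DispatchStub.h>   // internal ATen headers; paths may vary by version
#include <ATen/native/UnaryOps.h>       // declares abs_stub

void my_backend_abs_kernel(at::TensorIteratorBase& iter);  // illustrative, defined by the backend

namespace at::native {
REGISTER_PRIVATEUSE1_DISPATCH(abs_stub, &my_backend_abs_kernel);
} // namespace at::native
```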

cc @bdhirsh

@pytorch-bot bot commented Apr 20, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/99611

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit bf93485:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@caizhi-mt (Contributor, Author)

@bdhirsh Please help me review this commit, thank you

@@ -170,6 +172,10 @@ struct DispatchStub<rT (*)(Args...), T> {
impl.mps_dispatch_ptr = reinterpret_cast<void*>(fn_ptr);
}

void set_private_use1_dispatch_ptr(FnPtr fn_ptr) {
Review comment (Contributor):

nit: can you make it set_privateuse1_dispatch_ptr (for greppability)

Reply (Contributor Author):

done

@bdhirsh (Contributor) commented Apr 24, 2023

One comment: I don't think getting DispatchStub to work with external backends is strictly necessary, since for any ATen kernels that are implemented with it, you can always register your own dedicated kernel to the PrivateUse1 dispatch key for that operator.

Is the idea that you want to be able to re-use some kernels from ATen that are implemented in terms of dispatch stub? That sounds pretty useful, I just wanted to confirm.

If so, it would be good to add a test. Do you think you can add a DispatchStub handler in test/cpp_extensions/open_registration_extension.cpp, and exercise that logic in test/test_cpp_extensions_open_device_registration.py?
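For comparison, the alternative mentioned above (a dedicated kernel registered directly to the PrivateUse1 dispatch key, no DispatchStub involved) looks roughly like this. The names and the CPU round-trip body are purely illustrative; a real backend would run its own device kernel here.

```cpp
#include <ATen/ATen.h>
#include <torch/library.h>

// Illustrative placeholder: compute on the CPU and copy back. A real backend
// would launch its own kernel (and needs working copy kernels for .cpu()/.to()).
at::Tensor my_abs(const at::Tensor& self) {
  return at::abs(self.cpu()).to(self.device());
}

// Register the kernel for aten::abs under the PrivateUse1 key.
TORCH_LIBRARY_IMPL(aten, PrivateUse1, m) {
  m.impl("abs", &my_abs);
}
```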

@bdhirsh added the "module: backend" (non-standard backend support) label on Apr 24, 2023.
@caizhi-mt force-pushed the enable_privateuse1_dispatch branch 2 times, most recently from d217ca1 to a287040, on April 25, 2023 at 14:26.
@caizhi-mt (Contributor, Author)

Yes, that's exactly what I want to do: "re-use some kernels" through the dispatch stub.
A simple unit test has been added, which re-uses the CPU abs kernel through abs_stub.

@mikaylagawarecki added the "triaged" (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) label on Apr 25, 2023.
}
}

REGISTER_PRIVATEUSE1_DISPATCH(abs_stub, &abs_kernel);
Review comment (Contributor):

it would be nice to be able to test that we actually ran this kernel (right now the "test" just calls abs() and checks that the outputs match).

One way to do it would be the way it's done in custom_add_Tensor below - add a global counter for whether our custom abs kernel was called, that we increment inside of the kernel. Then expose a custom_abs_called helper to python that you can assert in the python code.

Reply (Contributor Author):

Thank you for the suggestion. I have modified the unit test accordingly and added custom_abs_called and abs_counter to the test.
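A rough sketch of that pattern in the test extension, using the abs_counter / custom_abs_called names from the discussion above (the kernel body and the pybind wiring are illustrative; the actual test code may differ):

```cpp
#include <ATen/native/DispatchStub.h>
#include <ATen/native/UnaryOps.h>   // declares abs_stub (internal ATen header)
#include <torch/extension.h>

// Global counter bumped by the PrivateUse1 stub entry.
static int abs_counter = 0;

void abs_kernel(at::TensorIteratorBase& iter) {
  abs_counter += 1;
  // The test device's storage is ordinary host memory, so the body can stay
  // trivial; the point of the test is to prove this entry actually ran.
}

namespace at::native {
REGISTER_PRIVATEUSE1_DISPATCH(abs_stub, &abs_kernel);
} // namespace at::native

// Helper exposed to Python so the test can assert the stub entry was hit.
bool custom_abs_called() {
  return abs_counter > 0;
}

// In the real extension this m.def goes into the existing PYBIND11_MODULE block.
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("custom_abs_called", &custom_abs_called);
}
```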

@caizhi-mt force-pushed the enable_privateuse1_dispatch branch 2 times, most recently from bde88cf to effa7de, on May 1, 2023 at 07:49.
@bdhirsh (Contributor) commented May 2, 2023

Can you take a look at the failing CI?

It looks like the custom cpp extension is failing to compile:

2023-05-01T10:03:59.2852263Z  const at::Tensor& custom_resize_(const at::Tensor& self, at::IntArrayRef size,
2023-05-01T10:03:59.2852610Z                    ^~~~~~~~~~~~~~
2023-05-01T10:03:59.2853410Z /var/lib/jenkins/workspace/test/cpp_extensions/open_registration_extension.cpp:63:19: note: ‘const at::Tensor& custom_resize_(const at::Tensor&, c10::IntArrayRef, c10::optional<c10::MemoryFormat>)’ previously defined here
2023-05-01T10:03:59.2854072Z  const at::Tensor& custom_resize_(
2023-05-01T10:03:59.2854350Z                    ^~~~~~~~~~~~~~
2023-05-01T10:03:59.2854649Z ninja: build stopped: subcommand failed.

c10::IntArrayRef size,
c10::optional<c10::MemoryFormat> optional_memory_format) {
// Since this custom device is just for testing, not bothering to implement kernels.
return self;
Review comment (Contributor):

it looks like it's failing because custom_resize_() is already defined further down.

You can just use the existing one instead of defining this new one.
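i.e. something along these lines (sketch; the declaration is copied from the compiler output above, and the registration mirrors what the extension already does for PrivateUse1):

```cpp
#include <ATen/ATen.h>
#include <torch/library.h>

// The helper that already exists in open_registration_extension.cpp.
const at::Tensor& custom_resize_(const at::Tensor& self, at::IntArrayRef size,
                                 c10::optional<at::MemoryFormat> optional_memory_format);

// Reuse it in the PrivateUse1 registration instead of defining a second copy.
TORCH_LIBRARY_IMPL(aten, PrivateUse1, m) {
  m.impl("resize_", &custom_resize_);
}
```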

Reply (Contributor Author):

ok, I have updated the code

Reply (Contributor Author):

all checks have passed, could this PR be merged?

@caizhi-mt force-pushed the enable_privateuse1_dispatch branch 3 times, most recently from 9253ca6 to ce72f84, on May 5, 2023 at 01:39.
@bdhirsh added the "release notes: composability" (release notes category) label on May 5, 2023.
@bdhirsh (Contributor) commented May 5, 2023

@pytorchbot merge

@pytorch-bot bot commented May 5, 2023

This PR needs to be approved by an authorized maintainer before merge.

@caizhi-mt force-pushed the enable_privateuse1_dispatch branch 2 times, most recently from 6099fdc to aba8401, on May 6, 2023 at 02:27.
@caizhi-mt (Contributor, Author)

This PR needs to be approved by an authorized maintainer before merge. Could you approve this PR? @bdhirsh

@caizhi-mt (Contributor, Author)

@bdhirsh Could you approve this PR? Thank you

@caizhi-mt (Contributor, Author)

@ezyang Could you review and approve this PR? Thank you

@ezyang (Contributor) commented May 9, 2023

Sorry, y'all have already done a review on this, but I want to put the brakes on it. The dispatch stub API is not intended for external users, and for the most part we've expunged the cases where we registered kernels that call into stubs, so that they aren't CompositeExplicitAutograd anymore. Concretely, what operators do you think you need this for?

@caizhi-mt (Contributor, Author)

Actually, I want to bring up a new hardware backend based on the "PrivateUse1" key, and I want to use the DispatchStub mechanism to call my kernels, for example: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/UnaryOps.cpp#L446
This is why I submitted this PR.
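For readers following the link, the stub-based call path in ATen looks roughly like this (simplified paraphrase, not the exact code at that line): the op builds a TensorIterator and hands it to the stub, which indexes a per-device-type table, so adding a PrivateUse1 slot lets the same operator route to an out-of-tree kernel.

```cpp
#include <ATen/ATen.h>
#include <ATen/TensorIterator.h>
#include <ATen/native/UnaryOps.h>   // declares abs_stub (internal ATen header)

// Simplified paraphrase of ATen's stub-based unary-op wiring.
at::Tensor& abs_out_sketch(const at::Tensor& self, at::Tensor& result) {
  auto iter = at::TensorIterator::unary_op(result, self);
  // Invokes whichever kernel is registered for iter.device_type(); with this
  // PR, DeviceType::PrivateUse1 gets its own slot in that table.
  at::native::abs_stub(iter.device_type(), iter);
  return result;
}
```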

@ezyang (Contributor) commented May 10, 2023

Are you really using TensorIterator to implement your kernels though? Seems... unlikely.

@caizhi-mt (Contributor, Author)

Yes, I actually use TensorIterator.

  1. I am doing some CUDA-compatible work on our hardware using the PrivateUse1 key, which will reuse CUDA kernels.
  2. We will refer to the CUDA kernels to implement our own kernels using TensorIterator (a rough sketch follows below).

The PrivateUse1 key is just like the "CPU" and "CUDA" keys, so I think it is necessary to support PrivateUse1 in DispatchStub.
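To make that concrete, here is a hedged sketch of a TensorIterator-based kernel plugged into abs_stub for PrivateUse1. For the sketch the data is walked on the host through the iterator (which only works when the backend's storage is host-addressable, as it is for the test device in this PR); a real hardware backend would launch its own device kernel from the same entry point.

```cpp
#include <cmath>
#include <ATen/Dispatch.h>
#include <ATen/TensorIterator.h>
#include <ATen/native/DispatchStub.h>
#include <ATen/native/UnaryOps.h>   // declares abs_stub (internal ATen header)

// Illustrative PrivateUse1 kernel: reuses the TensorIterator that ATen already
// set up (dtype promotion, broadcasting, strides) and only supplies the loop.
void privateuse1_abs_kernel(at::TensorIteratorBase& iter) {
  AT_DISPATCH_FLOATING_TYPES(iter.common_dtype(), "privateuse1_abs", [&] {
    iter.for_each([](char** data, const int64_t* strides, int64_t size0, int64_t size1) {
      // Two operands (output, input): strides[0..1] are inner-dim strides,
      // strides[2..3] are outer-dim strides, all in bytes.
      for (int64_t j = 0; j < size1; ++j) {
        char* out = data[0] + j * strides[2];
        const char* in = data[1] + j * strides[3];
        for (int64_t i = 0; i < size0; ++i) {
          *reinterpret_cast<scalar_t*>(out + i * strides[0]) =
              std::abs(*reinterpret_cast<const scalar_t*>(in + i * strides[1]));
        }
      }
    });
  });
}

namespace at::native {
REGISTER_PRIVATEUSE1_DISPATCH(abs_stub, &privateuse1_abs_kernel);
} // namespace at::native
```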

@ezyang (Contributor) left a review comment:

ok fine

@ezyang (Contributor) commented May 11, 2023

@pytorchbot merge

@pytorch-bot bot added the "ciflow/trunk" (trigger trunk jobs on your pull request) label on May 11, 2023.
@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 hours).

@pytorchmergebot (Collaborator)

Merge failed

Reason: This PR is too stale; the last push date was more than 3 days ago. Please rebase and try again. You can rebase and merge by leaving the following comment on this PR:
@pytorchbot merge -r
Or just rebase by leaving a @pytorchbot rebase comment.


@ezyang (Contributor) commented May 12, 2023

@pytorchbot merge -r

@pytorchmergebot (Collaborator)

@pytorchbot successfully started a rebase job.

@pytorchmergebot (Collaborator)

Successfully rebased enable_privateuse1_dispatch onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout enable_privateuse1_dispatch && git pull --rebase)

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 hours).

@caizhi-mt (Contributor, Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

The merge job was canceled. If you believe this is a mistake, then you can re-trigger it through pytorch-bot.

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 hours).

Labels: ciflow/trunk (trigger trunk jobs on your pull request), Merged, module: backend (non-standard backend support), open source, release notes: composability (release notes category), triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

6 participants