[pytorch/cuda] Concat fast path w/ zero tensor #46805
This pull request was exported from Phabricator. Differential Revision: D24524441
CI failures summary and remediations, as of commit 6656908 (more details on the Dr. CI page): Looks good so far! There are no failures yet. 1 failure was confirmed as flaky and can be ignored.
bdfd149 to 6cf1b1a
Summary:
Pull Request resolved: pytorch#46805

The current implementation falls back to the slow path whenever the input list contains an empty (zero-element) tensor, which is inefficient. Use the fast path for torch.cat even when some inputs are empty. This wastes one thread block per empty tensor, but it is still much better than the slow path.

Test Plan: CI + sandcastle

Differential Revision: D24524441

fbshipit-source-id: 522dea42628207bd77a8dfba39476b1dc3c1de45
6cf1b1a to 5eac620
5eac620 to 6656908
This pull request has been merged in e3b55a8.
Summary: The current implementation falls back to the slow path whenever the input list contains an empty (zero-element) tensor, which is inefficient. Use the fast path for torch.cat even when some inputs are empty. This wastes one thread block per empty tensor, but it is still much better than the slow path.
Test Plan: CI + sandcastle
Differential Revision: D24524441
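
To illustrate the trade-off described in the summary, here is a minimal pure-Python model of the path selection. It is not the actual CUDA code from this PR: `plan_cat` and `allow_empty_on_fast_path` are hypothetical names, all other fast-path eligibility checks are omitted, and the "one block per empty input" accounting simply follows the wording of the summary.

```python
from typing import List, Tuple


def plan_cat(sizes: List[int], allow_empty_on_fast_path: bool) -> Tuple[str, List[int], int]:
    """Toy model of torch.cat path selection (hypothetical, not PyTorch code).

    sizes: length of each input along the concatenation dimension.
    Returns (path, offsets, wasted_blocks).
    """
    has_empty = any(s == 0 for s in sizes)

    # Before the change: any empty input forces the slow path.
    # After the change: empty inputs no longer disqualify the fast path.
    path = "fast" if (allow_empty_on_fast_path or not has_empty) else "slow"

    # Output offset of each input is the running sum of earlier sizes;
    # an empty input contributes nothing, so the next offset is unchanged.
    offsets = []
    running = 0
    for s in sizes:
        offsets.append(running)
        running += s

    # Assumption taken from the summary: on the fast path each empty
    # input still occupies one thread block that copies no data.
    wasted_blocks = sum(1 for s in sizes if s == 0) if path == "fast" else 0
    return path, offsets, wasted_blocks


sizes = [4, 0, 3]
print(plan_cat(sizes, allow_empty_on_fast_path=False))  # ('slow', [0, 4, 4], 0)
print(plan_cat(sizes, allow_empty_on_fast_path=True))   # ('fast', [0, 4, 4], 1)
```

In this model the output layout is identical on both paths; the only difference is which path is taken and that the empty input costs one idle block on the fast path.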