Add 64bit indexing support for softmax #52713
Conversation
💊 CI failures summary and remediations
As of commit 48f0eba (more details on the Dr. CI page):
🕵️ 1 new failure recognized by patterns
The following CI failures do not appear to be due to upstream breakages: pytorch_xla_linux_bionic_py3_6_clang9_test (1/1), step "Run tests".
cc: @ptrblck
Can you please add tests?
@ngimel I have fixed the bug you caught, and added a test. The test passes on my 3090.
@@ -11975,6 +11975,25 @@ def test_softmax_results(self, device, dtype):
        self.assertEqual(grad_input, ref_grad_input)
        self.assertEqual(input.grad, ref_input.grad)

    @onlyCUDA
    @dtypesIfCUDA(torch.float, torch.half)
    @largeTensorTest("20GB")
On my 3090, half takes ~18GB of memory, and float takes ~19.8GB.
Will these tests run in your CI?
Our CI has A100 and 3090, so yes!
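For context, the hunk above only adds the decorators. A minimal, standalone sketch of the kind of check such a test performs is below; the shape, the spot-check against a CPU reference, and the tolerances are assumptions for illustration, not the merged test code.

import torch
import torch.nn.functional as F

def check_softmax_64bit_indexing(device="cuda", dtype=torch.half):
    # More than 2**31 elements, so 32-bit index arithmetic would overflow
    # (shape is an assumption chosen only to exceed the 32-bit limit).
    x = torch.randn(1_100_000_000, 2, device=device, dtype=dtype,
                    requires_grad=True)
    y = F.softmax(x, dim=-1)
    y.backward(y)  # reuse y as the incoming gradient to save memory
    # Spot-check a small slice against a CPU float32 reference; comparing
    # the full tensor on CPU would be prohibitively slow and memory-hungry.
    ref = F.softmax(x[:8].detach().float().cpu(), dim=-1)
    assert torch.allclose(y[:8].float().cpu(), ref, atol=1e-3, rtol=1e-3)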
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
fixes pytorch#52715 pytorch#52716
split across batch dimension
Pull Request resolved: pytorch#52713
Reviewed By: ailzhang
Differential Revision: D26640033
Pulled By: ngimel
fbshipit-source-id: f169cb0d6abc1cfbddf658d9775759a7d56f5c12
Fixes #52715 and #52716.
Split the work across the batch dimension.
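The "split across batch dimension" idea can be illustrated at the Python level; the actual change lives in the CUDA softmax code, and the helper below, its name, and the chunking heuristic are assumptions for illustration only. When the input has more elements than a 32-bit index can address, the outer (batch) dimension is partitioned so each chunk stays within 32-bit range; since rows along the softmax dimension are independent, concatenating the per-chunk results is exact.

import torch

INT32_MAX = 2 ** 31 - 1

def softmax_in_32bit_chunks(x, dim=-1):
    # Assumes dim is not the outer (0th) dimension, so rows split along
    # dim 0 are independent and the chunked result equals the full softmax.
    assert x.dim() >= 2 and dim % x.dim() != 0
    elems_per_row = x.numel() // x.size(0)
    rows_per_chunk = max(1, INT32_MAX // elems_per_row)
    chunks = [torch.softmax(c, dim=dim) for c in x.split(rows_per_chunk, dim=0)]
    return torch.cat(chunks, dim=0)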