maximum number of threads per block for sm_86 is 1536 #45889
Thanks for the PR.
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
I'm concerned, though: since we don't (and won't) compile binaries for 8.6, does that mean we could compute the wrong number of blocks per SM when 8.0 binaries run on 8.6? I think it's benign and should not lead to an "insufficient resources" error, but could you guys double-check?
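To make the concern concrete, here is a minimal C++ sketch of the arithmetic involved. It is not code from this PR: the constant and function names are hypothetical, and the two limits are the resident-threads-per-SM values listed for compute capabilities 8.0 and 8.6 in the CUDA C Programming Guide table linked in the PR description. It considers only the thread limit and ignores register, shared-memory, and blocks-per-SM limits.

```cpp
// Hypothetical illustration only; these names do not come from PyTorch.
// Maximum resident threads per SM, per the CUDA C Programming Guide.
constexpr int kMaxThreadsPerSM_sm80 = 2048;  // compute capability 8.0
constexpr int kMaxThreadsPerSM_sm86 = 1536;  // compute capability 8.6

// Upper bound on resident blocks per SM implied by the thread limit alone.
// Assumption: registers, shared memory, and the per-SM block cap are not limiting.
constexpr int blocks_per_sm(int max_threads_per_sm, int threads_per_block) {
  return max_threads_per_sm / threads_per_block;
}

// With 512 threads per block, a value baked in for 8.0 suggests 4 blocks per SM,
// while an 8.6 device can keep only 3 such blocks resident.
static_assert(blocks_per_sm(kMaxThreadsPerSM_sm80, 512) == 4, "8.0 estimate");
static_assert(blocks_per_sm(kMaxThreadsPerSM_sm86, 512) == 3, "8.6 estimate");
```

If a launch heuristic in an 8.0 binary used the 8.0 constant on an 8.6 device, it would overestimate by one resident block per SM at this block size. Whether that is benign depends on how the estimate is used, which is the question raised above.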
Codecov Report
@@           Coverage Diff           @@
##           master   #45889   +/-   ##
=======================================
  Coverage   68.19%   68.19%
=======================================
  Files         410      410
  Lines       53226    53226
=======================================
  Hits        36297    36297
  Misses      16929    16929

Continue to review full report at Codecov.
Summary: according to https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications
Pull Request resolved: pytorch#45889
Reviewed By: albanD
Differential Revision: D24131188
Pulled By: ngimel
fbshipit-source-id: 31d3038f7b1bc403751448c62b19609573c67a49
according to https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications