[FP8][cuBLAS][H100] only test fp32 outputs for rowwise _scaled_mm on H100 #162022
Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/162022
Note: Links to docs will display an error until the docs builds have been completed.
No failures as of commit bf995fe with merge base 5a2da09. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
# only cuBLAS supports rowwise with fp32 output and cuBLAS only supports
# rowwise on SM 9.0
if torch.cuda.get_device_capability() != (9, 0) and output_dtype == torch.float:
but H100 is sm_90 so the test should not have been failing?
This is added by the PR and checks that it indeed fails on non-sm90?
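A minimal sketch of the condition discussed in this thread, written without a `torch` dependency so it runs anywhere. The names `expects_failure`, `SM90`, and `fake_get_device_capability` are hypothetical and not part of the PR; the dtype is passed as a plain string here, whereas the real test compares against `torch.float`, and the capability tuple would come from `torch.cuda.get_device_capability()`. The last two assertions illustrate a general Python pitfall: a function object compared to a tuple is always unequal, so the capability function has to be called.

```python
from typing import Tuple

# Compute capability of H100 (sm_90).
SM90 = (9, 0)


def expects_failure(capability: Tuple[int, int], output_dtype: str) -> bool:
    """Return True when rowwise scaling with this output dtype is expected
    to be rejected, following the rule stated in the PR description:
    fp32 output is only tested on SM 9.0.

    `output_dtype` is a plain string in this sketch (an assumption made to
    keep the example torch-free).
    """
    return capability != SM90 and output_dtype == "float32"


def fake_get_device_capability() -> Tuple[int, int]:
    # Hypothetical stand-in for torch.cuda.get_device_capability() on an H100.
    return (9, 0)


# Comparing the function object itself to a tuple is always "not equal" ...
assert (fake_get_device_capability != SM90) is True
# ... while comparing its return value gives the intended answer on sm_90.
assert (fake_get_device_capability() != SM90) is False
```

With this helper, `expects_failure((8, 9), "float32")` is true (fp32 output off SM 9.0), while `expects_failure((9, 0), "float32")` and `expects_failure((8, 9), "bfloat16")` are both false.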
@pytorchmergebot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours).
@ngimel is right, `ciflow/h100` doesn't actually appear to test the PR :(
Pull Request resolved: #163354
Approved by: https://github.com/ngimel, https://github.com/Skylion007

…n H100 (pytorch#162022)
only cuBLAS supports float32 output and cuBLAS only supports rowwise for SM 9.0
Intended to land after pytorch#161305
Pull Request resolved: pytorch#162022
Approved by: https://github.com/ngimel
only cuBLAS supports float32 output and cuBLAS only supports rowwise for SM 9.0
Intended to land after #161305
cc @csarofeen @ptrblck @xwang233