[ROCm] warn unsupported PYTORCH_CUDA_FUSER_DISABLE_FMA #50508
💊 CI failures summary and remediations. As of commit 7e2fb17 (more details on the Dr. CI page): 💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI. |
|
How am I supposed to resolve the clang-format error? |
|
ROCm CI failures are not related to this change. Will retest. |
|
@pytorchbot retest this please |
|
Again, ROCm CI failures are not related to this change. There was a JIT-related commit in master that broke ROCm that was recently reverted. Will retest. |
|
@pytorchbot retest this please |
Both the lint and test failures seem unrelated to your change. I am trying to land this PR. |
@mrshenli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
|
Hey @jeffdaily, are the test errors real? |
|
@mrshenli where did those errors come from? The last ROCm CI jobs in this PR showed green? Unless... does our ROCm CI not cover the JIT tests? I was leaning heavily on our CI here, and I wouldn't have allowed this PR for review had it not first passed CI. |
|
Hey @jeffdaily It's this one below. The failed CI tests are from |
|
The only code change in this PR was protected by an ifdef; CUDA failures shouldn't be related. However, I have confirmed that the change in this PR is not covered currently by ROCm CI. That said, I am going to revise this PR. Our internal teams do need to skip unsupported compiler flags, but updating our CI scripts to cover this test is outside the scope of this PR. (Meaning, there could be other unrelated JIT test failures if we enabled all of them for ROCm CI.) Since the ROCm change to use |
|
Accidentally committed the revision in a hipified file. Fixed in 12f0b4a. |
|
Hey @jeffdaily, sorry that I dropped the ball on this. Is this still relevant? If yes, I will land when all tests pass. |
|
@mrshenli no worries. Yes, this PR is still needed for upcoming HIP compiler changes. I merged upstream hoping it would resolve the CI failures, but note that all CI failures were unrelated to this change. |
|
@mrshenli there is no way all these new CI failures are due to this PR. tensorpipe, sccache failed to connect, fbgemm. ROCm CI failed to build. It's like CI gets worse every time I rebase. |
|
Hey @jeffdaily, looks like this PR unintentionally included third-party changes (e.g., fbgemm, kineto, tensorpipe, etc.). Could you please remove those from this PR? |
Force-pushed: d63242f to 7e2fb17
|
facepalm. I can't wait to put this 5-line change behind us. |
Codecov Report
@@ Coverage Diff @@
## master #50508 +/- ##
=======================================
Coverage 80.63% 80.63%
=======================================
Files 1959 1959
Lines 214878 214878
=======================================
+ Hits 173269 173271 +2
+ Misses 41609 41607 -2 |
Summary: nvcc's `--fmad=false` is not valid for the HIP compiler. Upcoming ROCm releases will start treating unrecognized compiler flags as an error.
Pull Request resolved: pytorch#50508
Reviewed By: albanD
Differential Revision: D25920291
Pulled By: mrshenli
fbshipit-source-id: c0ff3b74dd07f3d0661ba29efafaab291ef3621c
nvcc's `--fmad=false` is not valid for the HIP compiler. Upcoming ROCm releases will start treating unrecognized compiler flags as an error.
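For readers skimming the thread, here is a rough sketch of the kind of guard being discussed: pass `--fmad=false` only when building for nvcc, and warn instead on ROCm. This is not the code from the PR. The function name `fma_compile_args`, the macro `SKETCH_BUILD_FOR_HIP`, the warning text, and the assumption that `PYTORCH_CUDA_FUSER_DISABLE_FMA` is an environment variable parsed as an integer are all hypothetical and used only for illustration.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for a platform macro; a real build would rely on
// whatever macro its toolchain defines when compiling for ROCm/HIP.
#ifdef SKETCH_BUILD_FOR_HIP
constexpr bool kBuildForHip = true;
#else
constexpr bool kBuildForHip = false;
#endif

// Returns the extra runtime-compilation arguments implied by the setting.
// `env_value` is assumed to be the result of
// std::getenv("PYTORCH_CUDA_FUSER_DISABLE_FMA") and may be nullptr.
// `is_hip` defaults to the compile-time platform but can be overridden,
// which keeps both branches testable from a single build.
std::vector<std::string> fma_compile_args(const char* env_value,
                                          bool is_hip = kBuildForHip) {
  std::vector<std::string> args;

  // Assumption: any non-zero integer value means "disable FMA".
  const bool disable_fma = env_value != nullptr && std::atoi(env_value) != 0;
  if (!disable_fma) {
    return args;
  }

  if (is_hip) {
    // The HIP compiler does not accept nvcc's --fmad flag, so warn and skip
    // it rather than pass a flag that may be rejected as an error.
    std::cerr << "warning: PYTORCH_CUDA_FUSER_DISABLE_FMA is not supported "
                 "on ROCm; ignoring\n";
  } else {
    args.push_back("--fmad=false");
  }
  return args;
}
```

A caller would typically invoke `fma_compile_args(std::getenv("PYTORCH_CUDA_FUSER_DISABLE_FMA"))` and append the result to its compiler argument list. Taking `is_hip` as a parameter rather than branching only on the preprocessor is a choice made for this sketch; the thread itself describes the actual change as being protected by an ifdef.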