-
Notifications
You must be signed in to change notification settings - Fork 75
[rocm7.0_internal_testing]fp8: optimize skip rowwise tests #2476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rocm7.0_internal_testing]fp8: optimize skip rowwise tests #2476
Conversation
Skip based on ROCm version and gfx type. Signed-off-by: Jagadish Krishnamoorthy <jagadish.krishnamoorthy@amd.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR optimizes the skipping of ROCm-specific FP8 rowwise tests by introducing more granular version and architecture checks. Instead of skipping tests based solely on architecture, it now combines both ROCm version and architecture criteria for more precise test filtering.
- Replaces
skipIfRocmArchwith a newskipIfRocmVersionAndArchdecorator that checks both ROCm version and GPU architecture - Updates test decorators to skip FP8 rowwise tests only for ROCm 7.1+ on gfx950 architecture
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| torch/testing/_internal/common_utils.py | Adds new skipIfRocmVersionAndArch decorator function for combined version and architecture filtering |
| test/test_matmul_cuda.py | Updates import and test decorators to use the new combined version/architecture skip logic |
| rocm_version = str(torch.version.hip) | ||
| rocm_version = rocm_version.split("-")[0] # ignore git sha | ||
| rocm_version_tuple = tuple(int(x) for x in rocm_version.split(".")) | ||
| if rocm_version_tuple is None or version is None or rocm_version_tuple < tuple(version): |
Copilot
AI
Aug 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition rocm_version_tuple is None can never be true since rocm_version_tuple is always assigned a tuple value on line 1997. This check should be removed or the logic needs to be restructured.
| if rocm_version_tuple is None or version is None or rocm_version_tuple < tuple(version): | |
| if version is None or rocm_version_tuple < tuple(version): |
| rocm_version_tuple = tuple(int(x) for x in rocm_version.split(".")) | ||
| if rocm_version_tuple is None or version is None or rocm_version_tuple < tuple(version): | ||
| prop = torch.cuda.get_device_properties(0) | ||
| if prop.gcnArchName.split(":")[0] in arch: |
Copilot
AI
Aug 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will raise an exception if arch is None, but the function parameter allows None values. Add a null check: if arch and prop.gcnArchName.split(":")[0] in arch:
| if prop.gcnArchName.split(":")[0] in arch: | |
| if arch and prop.gcnArchName.split(":")[0] in arch: |
| self.assertEqual(out_fp8, out_fp8_s) | ||
|
|
||
| @skipIfRocmArch("gfx950") | ||
| @skipIfRocmVersionAndArch((7, 1), "gfx950") |
Copilot
AI
Aug 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new decorator expects an iterable for the arch parameter (based on the in operator usage), but a string is passed. This should be a list: @skipIfRocmVersionAndArch((7, 1), ["gfx950"])
| @skipIfRocmVersionAndArch((7, 1), "gfx950") | |
| @skipIfRocmVersionAndArch((7, 1), ["gfx950"]) |
| ) | ||
|
|
||
| @skipIfRocmArch("gfx950") | ||
| @skipIfRocmVersionAndArch((7, 1), "gfx950") |
Copilot
AI
Aug 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new decorator expects an iterable for the arch parameter (based on the in operator usage), but a string is passed. This should be a list: @skipIfRocmVersionAndArch((7, 1), ["gfx950"])
| @skipIfRocmVersionAndArch((7, 1), "gfx950") | |
| @skipIfRocmVersionAndArch((7, 1), ["gfx950"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"in" operator can also used for string comparison !
|
Jenkins build for 30d22a1cf728bebb4f1226d1aaba94fc2e950480 commit finished as FAILURE |
ce7fd34
into
ROCm:rocm7.0_internal_testing
Skip based on ROCm version and gfx type.