skip test cutlass mxfp8_gemm on unsupported arches #5810
```diff
@@ -6,9 +6,9 @@
 import pytest
 import torch
 from nvfuser_direct import nvf_cutlass
+from python.direct_utils import microarchitecture_is
-compute_cap = torch.cuda.get_device_capability()
-if compute_cap < (10, 0) or compute_cap >= (12, 0):
+if not microarchitecture_is(10, 0):
     pytest.skip(
         reason="MxFp8 Requires compute capability 10.",
         allow_module_level=True,
     )
```
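`microarchitecture_is` comes from `python/direct_utils`; its implementation is not part of this diff. A minimal sketch of the semantics the review comments below attribute to it, assuming an exact match on the `(major, minor)` compute capability:

```python
import torch

def microarchitecture_is(major: int, minor: int) -> bool:
    # Assumed semantics (not confirmed by this diff): True only when the
    # current device's compute capability is exactly (major, minor),
    # e.g. (10, 0) for B200/GB200.
    return torch.cuda.get_device_capability() == (major, minor)
```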
Contributor

[P1] Logic change narrows supported architectures more than original

The original condition, `compute_cap < (10, 0) or compute_cap >= (12, 0)`, skipped the test only outside the range [10.0, 12.0), so compute capabilities 10.0, 10.1-10.3, and 11.x all ran it. The new condition, `not microarchitecture_is(10, 0)`, skips everything except exactly 10.0.

If the intent was to maintain the original behavior while using the utility function, consider using a range check or adding support for additional architectures that were previously supported (10.1-10.3, 11.x). If restricting to exactly 10.0 is intentional, this is fine but represents a significant behavior change from the original code.
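If the goal is to keep the original [10.0, 12.0) range behind a named helper, one option is a range check; a minimal sketch, where `microarchitecture_in_range` is a hypothetical name rather than an existing utility:

```python
import pytest
import torch

def microarchitecture_in_range(low: tuple, high: tuple) -> bool:
    # Hypothetical helper: True when low <= compute capability < high.
    # Python compares tuples lexicographically, so (11, 5) < (12, 0)
    # behaves as expected for capability ranges.
    return low <= torch.cuda.get_device_capability() < high

if not microarchitecture_in_range((10, 0), (12, 0)):
    pytest.skip(
        reason="MxFp8 requires compute capability in [10.0, 12.0).",
        allow_module_level=True,
    )
```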
Contributor

Consider adding explanatory comments about which architectures are supported/tested, similar to test_cutlass_gemm.py (lines 12-16). This would help future maintainers understand why the test is restricted to compute capability 10.0 only.

Example from test_cutlass_gemm.py:

```python
# GPU Compute Capability: https://developer.nvidia.com/cuda/gpus
# tested on blackwell compute 10.0 (B200 and GB200)
# doesn't support 12.0 (RTX PRO 6000 and RTX 50XX)
# Not tested on 10.3 (B300 and GB300)
# Not tested on 12.1 (DGX Spark)
if not microarchitecture_is(10, 0):
```

This documentation helps clarify the intentional restriction and provides context for when the restriction might be relaxed in the future.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Contributor

The error message "MxFp8 Requires compute capability 10." is misleading. The new logic using `microarchitecture_is(10, 0)` allows only compute capability 10.0, not everything under major version 10.

Previous behavior: allowed 10.0, 10.1, 10.2, 10.3, 11.x (anything from 10.0 up to but excluding 12.0). New behavior: allows only 10.0.

Consider updating the message to be more specific, e.g., "MxFp8 requires compute capability 10.0. Other architectures have not been tested." This matches the pattern used in test_cutlass_gemm.py, which has detailed comments about tested vs untested architectures.
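Suggested change (a sketch based on the message proposed above):

```diff
-        reason="MxFp8 Requires compute capability 10.",
+        reason="MxFp8 requires compute capability 10.0. Other architectures have not been tested.",
```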
Note: test_cutlass_nvfp4_gemm.py still uses the old pattern with `torch.cuda.get_device_capability()` and allows all architectures in the range [10.0, 12.0). For consistency, consider updating that test file as well if NVFP4 should also be restricted to only tested architectures.

Current state in test_cutlass_nvfp4_gemm.py:
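The guard presumably mirrors the lines removed in this PR (skip body elided):

```python
compute_cap = torch.cuda.get_device_capability()
if compute_cap < (10, 0) or compute_cap >= (12, 0):
    ...
```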
If NVFP4 has the same testing limitations as MxFp8 and grouped_mm, it should follow the same pattern for maintainability and clarity.
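If NVFP4 is restricted the same way, a sketch of the consistent version (the skip message here is illustrative, not taken from the PR):

```python
import pytest
from python.direct_utils import microarchitecture_is

# Skip the whole module on anything other than compute capability 10.0,
# matching the pattern this PR introduces for the MxFp8 test.
if not microarchitecture_is(10, 0):
    pytest.skip(
        reason="Nvfp4 requires compute capability 10.0. Other architectures have not been tested.",
        allow_module_level=True,
    )
```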
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!