The change modifies the compute capability check from a range-based approach (>= 10.0 and < 12.0) to an exact match (== 10.0). This significantly narrows the supported architectures and may exclude valid compute capabilities like 10.1, 10.2, 11.0, 11.1 that were previously supported. The rationale for this restriction change should be validated, especially given the referenced error from nvfp4_scaled_mm.cu:255.
```python
if not microarchitecture_is(10, 0):
    pytest.skip(
        reason="Nvfp4 Requires compute capability 10.0, other arches are not tested.",
        allow_module_level=True,
    )
```
The PR description mentions an exception from runGemm at nvfp4_scaled_mm.cu:255, but doesn't provide details about the root cause. The change to restrict to only compute capability 10.0 may be a workaround rather than a proper fix. The underlying issue causing failures on other architectures should be investigated to determine if this restriction is appropriate or if a more comprehensive solution is needed.
```python
if not microarchitecture_is(10, 0):
    pytest.skip(
        reason="Nvfp4 Requires compute capability 10.0, other arches are not tested.",
        allow_module_level=True,
    )
```
This PR restricts nvfp4 GEMM tests to run only on compute capability 10.0 devices, preventing runtime errors on unsupported architectures. The change follows the same pattern as PR #5810 which fixed similar issues for mxfp8 GEMM tests.
Changes:
- Replaced the manual compute capability check with the `microarchitecture_is(10, 0)` helper function
- Updated the skip message to clarify that only 10.0 is tested
- Made the check more restrictive: the old logic allowed 10.0-11.x, the new logic allows only 10.0
Note: This is more restrictive than the previous implementation which allowed compute capabilities 10.x up to (but not including) 12.0. The commit message confirms this is intentional ("only test nvfp4_gemm on 10.0 device").
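The difference between the two checks can be sketched in plain Python. The function names below are illustrative only, not the actual test code; they model the old range-based condition and the new exact-match condition on the `(major, minor)` compute capability pair:

```python
def old_check(major: int, minor: int) -> bool:
    # Previous logic: any compute capability in [10.0, 12.0).
    # Tuple comparison orders (major, minor) lexicographically.
    return (10, 0) <= (major, minor) < (12, 0)

def new_check(major: int, minor: int) -> bool:
    # New logic: exact match on compute capability 10.0 only.
    return (major, minor) == (10, 0)

# Capabilities such as 10.1 or 11.0 passed the old check but fail the new one.
print(old_check(10, 1), new_check(10, 1))  # True False
print(old_check(10, 0), new_check(10, 0))  # True True
```

This makes the narrowing concrete: any 10.x or 11.x device other than exactly 10.0 is now skipped.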
Confidence Score: 5/5
This PR is safe to merge with minimal risk
The change is a straightforward test infrastructure update that prevents test failures on unsupported hardware. It follows an established pattern from PR #5810 ("skip test cutlass mxfp8_gemm on unsupported arches"), uses an existing utility function correctly, and only affects test execution (not production code). The more restrictive logic is intentional, as confirmed by the commit message.
No files require special attention
Important Files Changed
| Filename | Overview |
| --- | --- |
| tests/python/direct/test_cutlass_nvfp4_gemm.py | Updated architecture check to restrict testing to compute capability 10.0 only, consistent with the PR #5810 pattern |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Test as test_cutlass_nvfp4_gemm.py
    participant Utils as direct_utils.microarchitecture_is
    participant CUDA as torch.cuda
    participant Pytest as pytest.skip
    Note over Test: Module import phase
    Test->>Utils: microarchitecture_is(10, 0)
    Utils->>CUDA: get_device_properties(current_device())
    CUDA-->>Utils: device properties (major, minor)
    Utils->>Utils: Check: major == 10 and minor == 0
    alt Not compute capability 10.0
        Utils-->>Test: False
        Test->>Pytest: skip(reason="...", allow_module_level=True)
        Note over Pytest: Tests skipped for this module
    else Compute capability 10.0
        Utils-->>Test: True
        Note over Test: Proceed with test execution
        Test->>Test: test_nvfp4_gemm()
        Test->>Test: test_nvfp4_gemm_epilogue()
        Test->>Test: test_nvfp4_grouped_mm()
    end
```
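Based on the diagram above, the helper presumably reduces to an exact comparison against the device's `(major, minor)` pair. The sketch below is an assumption, not nvfuser's actual `direct_utils` code; the `props` parameter is injected here for illustration, where the real helper would read `torch.cuda.get_device_properties(torch.cuda.current_device())`:

```python
from dataclasses import dataclass

@dataclass
class DeviceProps:
    """Stand-in for torch.cuda device properties (illustrative only)."""
    major: int
    minor: int

def microarchitecture_is(major: int, minor: int, props: DeviceProps) -> bool:
    # Exact match: both major and minor compute capability must agree.
    return props.major == major and props.minor == minor

print(microarchitecture_is(10, 0, DeviceProps(10, 0)))  # True
print(microarchitecture_is(10, 0, DeviceProps(11, 0)))  # False
```

An exact-match helper like this is why 10.1 or 11.x devices now fall through to the module-level `pytest.skip`.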
Same as #5810.

Error message:
```
Exception raised from runGemm at /opt/pytorch/nvfuser/cutlass/nvfp4_scaled_mm.cu:255
```