skip test nvfp4_gemm on unsupported arches#5815

Merged
github-actions[bot] merged 1 commit into main from llu/skip_unsupported_arches_nvfp4_gemm on Jan 13, 2026

Conversation

@liqiangxl
Collaborator

Same as #5810.
Error message: Exception raised from runGemm at /opt/pytorch/nvfuser/cutlass/nvfp4_scaled_mm.cu:255

@liqiangxl liqiangxl requested a review from jacobhinkle January 13, 2026 21:47
@liqiangxl liqiangxl added the enable-auto-merge Auto-merge a PR when: 1) PR mergeable 2) Internal CI complete 3) No failures label Jan 13, 2026
@github-actions

github-actions Bot commented Jan 13, 2026

Auto-merge Status

✅ Internal CI is finished
✅ No failed checks
✅ PR is mergeable
ℹ️ PR mergeable_state: clean

Description

  • Replace manual compute capability checking with dedicated microarchitecture_is utility function

  • Update skip condition to only test nvfp4_gemm on compute capability 10.0

  • Improve skip reason message for better clarity about architecture requirements

Changes walkthrough

Relevant files
Bug fix
test_cutlass_nvfp4_gemm.py
Update nvfp4_gemm test skip logic with utility function   

tests/python/direct/test_cutlass_nvfp4_gemm.py

  • Import microarchitecture_is utility function from python.direct_utils
  • Replace manual compute capability check with microarchitecture_is(10, 0) condition
  • Update skip reason to specify compute capability 10.0 requirement
  • +3/-3     
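
The microarchitecture_is helper the walkthrough refers to is not shown in this PR. Below is a minimal sketch of what such a utility might look like, assuming it wraps torch.cuda.get_device_properties as the sequence diagram later in this page suggests; the actual python.direct_utils implementation may differ, and the get_capability parameter is a hypothetical addition here purely so the logic can be exercised without a GPU.

```python
# Hypothetical sketch of a microarchitecture_is utility; NOT the actual
# python.direct_utils implementation. The optional get_capability hook is
# an illustration-only addition for GPU-free testing.
from typing import Callable, Optional, Tuple


def microarchitecture_is(
    major: int,
    minor: int,
    get_capability: Optional[Callable[[], Tuple[int, int]]] = None,
) -> bool:
    """Return True iff the CUDA device's compute capability is exactly
    (major, minor)."""
    if get_capability is None:
        import torch  # imported lazily so the sketch runs without torch

        if not torch.cuda.is_available():
            return False
        props = torch.cuda.get_device_properties(torch.cuda.current_device())
        capability = (props.major, props.minor)
    else:
        capability = get_capability()
    return capability == (major, minor)
```

With this exact-match semantics, an sm_100 (compute capability 10.0) device passes while 10.1, 11.x, or 12.0 devices do not, matching the new skip condition.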

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    Architecture Restriction Change

    The change modifies the compute capability check from a range-based approach (>= 10.0 and < 12.0) to an exact match (== 10.0). This significantly narrows the supported architectures and may exclude valid compute capabilities like 10.1, 10.2, 11.0, 11.1 that were previously supported. The rationale for this restriction change should be validated, especially given the referenced error from nvfp4_scaled_mm.cu:255.

    if not microarchitecture_is(10, 0):
        pytest.skip(
            reason="Nvfp4 Requires compute capability 10.0, other arches are not tested.",
            allow_module_level=True,
        )
    Error Root Cause Analysis

    The PR description mentions an exception from runGemm at nvfp4_scaled_mm.cu:255, but doesn't provide details about the root cause. The change to restrict to only compute capability 10.0 may be a workaround rather than a proper fix. The underlying issue causing failures on other architectures should be investigated to determine if this restriction is appropriate or if a more comprehensive solution is needed.

    if not microarchitecture_is(10, 0):
        pytest.skip(
            reason="Nvfp4 Requires compute capability 10.0, other arches are not tested.",
            allow_module_level=True,
        )

    @liqiangxl
    Copy link
    Copy Markdown
    Collaborator Author

    !build

    @greptile-apps
    Copy link
    Copy Markdown
    Contributor

    greptile-apps Bot commented Jan 13, 2026

    Greptile Summary

    This PR restricts nvfp4 GEMM tests to run only on compute capability 10.0 devices, preventing runtime errors on unsupported architectures. The change follows the same pattern as PR #5810 which fixed similar issues for mxfp8 GEMM tests.

    Changes:

    • Replaced manual compute capability check with microarchitecture_is(10, 0) helper function
    • Updated skip message to clarify that only 10.0 is tested
    • Made the check more restrictive: old logic allowed 10.0-11.x, new logic allows only 10.0

    Note: This is more restrictive than the previous implementation which allowed compute capabilities 10.x up to (but not including) 12.0. The commit message confirms this is intentional ("only test nvfp4_gemm on 10.0 device").
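
The narrowing described in the note above can be made concrete with a small comparison. The predicates below are illustrative reconstructions of the old range-based and new exact-match conditions, inferred from the PR discussion rather than copied from the test file:

```python
# Illustrative reconstruction of the old vs. new skip predicates; names
# and exact boundaries are inferred from the PR discussion, not quoted
# from the repository.

def old_supported(major: int, minor: int) -> bool:
    # Old logic: any compute capability in [10.0, 12.0) was allowed.
    return (10, 0) <= (major, minor) < (12, 0)


def new_supported(major: int, minor: int) -> bool:
    # New logic: only an exact 10.0 match is allowed.
    return (major, minor) == (10, 0)


# Capabilities such as 10.1 or 11.0 passed the old check but are now skipped.
```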

    Confidence Score: 5/5

    • This PR is safe to merge with minimal risk
    • The change is a straightforward test infrastructure update that prevents test failures on unsupported hardware. It follows an established pattern from PR skip test cutlass mxfp8_gemm on unsupported arches #5810, uses an existing utility function correctly, and only affects test execution (not production code). The more restrictive logic is intentional as confirmed by the commit message.
    • No files require special attention

    Important Files Changed

    Filename Overview
    tests/python/direct/test_cutlass_nvfp4_gemm.py Updated architecture check to restrict testing to compute capability 10.0 only, consistent with PR #5810 pattern

    Sequence Diagram

    sequenceDiagram
        participant Test as test_cutlass_nvfp4_gemm.py
        participant Utils as direct_utils.microarchitecture_is
        participant CUDA as torch.cuda
        participant Pytest as pytest.skip
    
        Note over Test: Module import phase
        Test->>Utils: microarchitecture_is(10, 0)
        Utils->>CUDA: get_device_properties(current_device())
        CUDA-->>Utils: device properties (major, minor)
        Utils->>Utils: Check: major == 10 and minor == 0
        alt Not compute capability 10.0
            Utils-->>Test: False
            Test->>Pytest: skip(reason="...", allow_module_level=True)
            Note over Pytest: Tests skipped for this module
        else Compute capability 10.0
            Utils-->>Test: True
            Note over Test: Proceed with test execution
            Test->>Test: test_nvfp4_gemm()
            Test->>Test: test_nvfp4_gemm_epilogue()
            Test->>Test: test_nvfp4_grouped_mm()
        end
    

    @github-actions github-actions Bot merged commit 3dee5aa into main Jan 13, 2026
    20 checks passed
    @github-actions github-actions Bot removed the enable-auto-merge Auto-merge a PR when: 1) PR mergeable 2) Internal CI complete 3) No failures label Jan 13, 2026
    @github-actions github-actions Bot deleted the llu/skip_unsupported_arches_nvfp4_gemm branch January 13, 2026 22:08
