
Conversation

mgoin (Member) commented on Sep 12, 2025

Purpose

Fix the CI failure in test_flashinfer_cutlass_mxfp4_mxfp8_fused_moe (device mismatch in the dequantized reference weights). Also possibly found the culprit for the failing Blackwell CUTLASS MLA test: https://buildkite.com/vllm/ci/builds/30554/steps/canvas?jid=01993edf-720e-4749-81eb-da58099b7c78

E       RuntimeError: _C::sm100_cutlass_mla_decode() expected at most 9 argument(s) but received 10 argument(s). Declaration: _C::sm100_cutlass_mla_decode(Tensor($0! -> ) out, Tensor q_nope, Tensor q_pe, Tensor kv_c_and_k_pe_cache, Tensor seq_lens, Tensor page_table, Tensor workspace, float scale, int num_kv_splits) -> ()
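
For context, a minimal sketch (hypothetical `demo` namespace and op, not vLLM's actual registration) of how this class of error arises with torch.library: the call site passes more arguments than the registered schema declares, so dispatch rejects the call with this style of message.

```python
# Minimal sketch, not vLLM code: a custom op whose schema takes 3 arguments.
import torch

lib = torch.library.Library("demo", "DEF")  # hypothetical namespace
lib.define("mla_decode(Tensor(a!) out, Tensor q, float scale) -> ()")

def mla_decode_cpu(out, q, scale):
    # Trivial stand-in for the real kernel: write scaled q into out.
    out.copy_(q * scale)

lib.impl("mla_decode", mla_decode_cpu, "CPU")

out, q = torch.empty(4), torch.ones(4)
torch.ops.demo.mla_decode(out, q, 2.0)  # OK: matches the declared schema
# Passing an extra argument (e.g. a parameter the schema no longer declares)
# fails at dispatch time with the same style of error as above:
# torch.ops.demo.mla_decode(out, q, 2.0, 1)
# RuntimeError: demo::mla_decode() expected at most 3 argument(s) but
# received 4 argument(s).
```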

Test Plan

Test Result



Signed-off-by: mgoin <mgoin64@gmail.com>
gemini-code-assist bot left a comment


Code Review

This pull request addresses a CI failure in test_flashinfer_cutlass_mxfp4_mxfp8_fused_moe by ensuring that the dequantized reference weight tensors are moved to the correct CUDA device. The change is correct and resolves the device mismatch issue. I have provided suggestions to consolidate the tensor conversion and device placement calls for improved code clarity and efficiency.
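
For reference, a small illustration (not the actual test code) of the consolidation the bot suggests: torch.Tensor.to accepts dtype and device together, so the cast and the device move can be a single call.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.randn(8, 16, dtype=torch.bfloat16)

ref_two_hops = x.to(torch.float32).to(device)            # cast, then move
ref_one_hop = x.to(device=device, dtype=torch.float32)   # single call

assert torch.equal(ref_two_hops, ref_one_hop)
```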

@mgoin added the ready (ONLY add when PR is ready to merge/full CI is needed) and ci-failure (Issue about an unexpected test failure in CI) labels on Sep 12, 2025
yewentao256 (Member) left a comment


Thanks for the work!

      w2_q.view(torch.uint8),
      w2_scale.view(torch.uint8).reshape(-1)).to(torch.float32).reshape(
-         num_experts, hidden_size, intermediate_size)
+         num_experts, hidden_size, intermediate_size).to(device)

What will happen if we don't add the to(device) here?
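
For illustration (hypothetical tensors, not the actual test), a sketch of the failure this guards against: if the dequantized reference stays on CPU while the fused-MoE output is on CUDA, the comparison fails with a device mismatch.

```python
import torch

if torch.cuda.is_available():
    kernel_out = torch.randn(4, device="cuda")
    reference = torch.randn(4)  # dequantized reference left on the CPU
    try:
        torch.testing.assert_close(kernel_out, reference)
    except AssertionError as err:
        # Reports that the tensors live on different devices
        # (cuda:0 vs cpu) before any values are compared.
        print(err)
```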

@DarkLight1337 enabled auto-merge (squash) on September 13, 2025 at 07:10
@DarkLight1337 merged commit 59d7ffc into vllm-project:main on Sep 13, 2025
81 of 82 checks passed
simon-mo pushed a commit that referenced this pull request Sep 13, 2025
shyeh25 pushed a commit to shyeh25/vllm that referenced this pull request Sep 15, 2025
dsxsteven pushed a commit to dsxsteven/vllm_splitPR that referenced this pull request Sep 15, 2025
bbartels pushed a commit to bbartels/vllm that referenced this pull request Sep 15, 2025
shyeh25 pushed a commit to shyeh25/vllm that referenced this pull request Sep 23, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025