
Conversation

mgoin (Member) commented on Sep 12, 2025

Purpose

Fix the CI failure in test_flashinfer_cutlass_mxfp4_mxfp8_fused_moe (device mismatch in the dequantized reference weights). Also possibly found the culprit for the failing Blackwell CUTLASS MLA test: https://buildkite.com/vllm/ci/builds/30554/steps/canvas?jid=01993edf-720e-4749-81eb-da58099b7c78

E       RuntimeError: _C::sm100_cutlass_mla_decode() expected at most 9 argument(s) but received 10 argument(s). Declaration: _C::sm100_cutlass_mla_decode(Tensor($0! -> ) out, Tensor q_nope, Tensor q_pe, Tensor kv_c_and_k_pe_cache, Tensor seq_lens, Tensor page_table, Tensor workspace, float scale, int num_kv_splits) -> ()
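
For context, a minimal sketch (hypothetical `demo` namespace and op, not vLLM's actual registration) of how this class of error arises with torch.library: the call site passes more arguments than the registered schema declares, so dispatch rejects the call with this style of message.

```python
# Minimal sketch, not vLLM code: a custom op whose schema takes 3 arguments.
import torch

lib = torch.library.Library("demo", "DEF")  # hypothetical namespace
lib.define("mla_decode(Tensor(a!) out, Tensor q, float scale) -> ()")

def mla_decode_cpu(out, q, scale):
    # Trivial stand-in for the real kernel: write scaled q into out.
    out.copy_(q * scale)

lib.impl("mla_decode", mla_decode_cpu, "CPU")

out, q = torch.empty(4), torch.ones(4)
torch.ops.demo.mla_decode(out, q, 2.0)  # OK: matches the declared schema
# Passing an extra argument (e.g. a parameter the schema no longer declares)
# fails at dispatch time with the same style of error as above:
# torch.ops.demo.mla_decode(out, q, 2.0, 1)
# RuntimeError: demo::mla_decode() expected at most 3 argument(s) but
# received 4 argument(s).
```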

Test Plan

Test Result



Signed-off-by: mgoin <mgoin64@gmail.com>
gemini-code-assist bot left a comment


Code Review

This pull request addresses a CI failure in test_flashinfer_cutlass_mxfp4_mxfp8_fused_moe by ensuring that the dequantized reference weight tensors are moved to the correct CUDA device. The change is correct and resolves the device mismatch issue. I have provided suggestions to consolidate the tensor conversion and device placement calls for improved code clarity and efficiency.
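
For reference, a small illustration (not the actual test code) of the consolidation the bot suggests: torch.Tensor.to accepts dtype and device together, so the cast and the device move can be a single call.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.randn(8, 16, dtype=torch.bfloat16)

ref_two_hops = x.to(torch.float32).to(device)            # cast, then move
ref_one_hop = x.to(device=device, dtype=torch.float32)   # single call

assert torch.equal(ref_two_hops, ref_one_hop)
```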

@mgoin added the ready (ONLY add when PR is ready to merge/full CI is needed) and ci-failure (Issue about an unexpected test failure in CI) labels on Sep 12, 2025
yewentao256 (Member) left a comment


Thanks for the work!

      w2_q.view(torch.uint8),
      w2_scale.view(torch.uint8).reshape(-1)).to(torch.float32).reshape(
-         num_experts, hidden_size, intermediate_size)
+         num_experts, hidden_size, intermediate_size).to(device)

What will happen if we don't add the to(device) here?
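
For illustration (hypothetical tensors, not the actual test), a sketch of the failure this guards against: if the dequantized reference stays on CPU while the fused-MoE output is on CUDA, the comparison fails with a device mismatch.

```python
import torch

if torch.cuda.is_available():
    kernel_out = torch.randn(4, device="cuda")
    reference = torch.randn(4)  # dequantized reference left on the CPU
    try:
        torch.testing.assert_close(kernel_out, reference)
    except AssertionError as err:
        # Reports that the tensors live on different devices
        # (cuda:0 vs cpu) before any values are compared.
        print(err)
```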

@DarkLight1337 enabled auto-merge (squash) on September 13, 2025 at 07:10
@DarkLight1337 merged commit 59d7ffc into vllm-project:main on Sep 13, 2025
81 of 82 checks passed
simon-mo pushed a commit that referenced this pull request Sep 13, 2025
shyeh25 pushed a commit to shyeh25/vllm that referenced this pull request Sep 15, 2025
dsxsteven pushed a commit to dsxsteven/vllm_splitPR that referenced this pull request Sep 15, 2025
bbartels pushed a commit to bbartels/vllm that referenced this pull request Sep 15, 2025
shyeh25 pushed a commit to shyeh25/vllm that referenced this pull request Sep 23, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025