[Bugfix][Mamba] - Fix Conv State Kernel FP32 Support #24883

Josephasafg · 2025-09-15T14:10:05Z

Purpose

The FP32 support recently added to the Mamba conv state had a bug: passing --mamba-cache-dtype "float32" produced this Triton compilation error:

E       triton.compiler.errors.CompilationError: at 102:8:
E           w_base = w_ptr + (idx_feats * stride_w_dim)  # [BLOCK_N,]
E
E           # Does 2 things:
E           # 1. READ prior-block init-state data - [done by every Triton programs]
E           # 2. update conv_state with new data [only by the Triton program handles chunk_offset=0]
E           if chunk_offset == 0:
E               # read from conv_states
E               load_init_state = False
E               if HAS_INITIAL_STATES:  # the new HAS_INITIAL_STATES
E                   load_init_state = tl.load(has_initial_states_ptr + idx_seq).to(
E                       tl.int1)
E               if load_init_state:
E               ^
E       AssertionError("Mismatched type for col0 between then block (<['256'], fp32>) and else block (<['256'], bf16>)")

The fix is to cast the input to the same dtype as the conv state, so that both branches of the kernel produce the same type.
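As a rough illustration of why the cast resolves the error, the following toy Python model mimics the rule that a value assigned in both branches of an `if` must end up with a single dtype. This is not Triton's API: `unify_branch_dtypes` and `conv_branch_dtype` are hypothetical names. The sketch assumes the then block yields the conv state dtype and the else block yields the input dtype, which matches the fp32/bf16 pair in the error message but is not taken from the kernel source.

```python
# Toy model only: NOT Triton's API. It mimics the rule behind the error above,
# where a value assigned in both branches of an `if` must have one dtype.


def unify_branch_dtypes(name: str, then_dtype: str, else_dtype: str) -> str:
    """Return the shared dtype of a variable assigned in both branches."""
    if then_dtype != else_dtype:
        raise TypeError(
            f"Mismatched type for {name} between then block ({then_dtype}) "
            f"and else block ({else_dtype})"
        )
    return then_dtype


def conv_branch_dtype(x_dtype: str, conv_state_dtype: str, cast_input: bool) -> str:
    """Hypothetical stand-in for the kernel's branch on load_init_state."""
    if cast_input:
        # The fix: cast the input to the conv state dtype before the kernel runs.
        x_dtype = conv_state_dtype
    # Assumption: the then block yields the conv state dtype and the else block
    # yields the input dtype.
    return unify_branch_dtypes("col0", conv_state_dtype, x_dtype)


# Without the cast, a bf16 input and an fp32 conv state disagree.
try:
    conv_branch_dtype("bf16", "fp32", cast_input=False)
except TypeError as err:
    print(f"without cast: {err}")

# With the cast, both branches agree on the conv state dtype.
print("with cast:", conv_branch_dtype("bf16", "fp32", cast_input=True))
```

The string dtypes stand in for real tensor dtypes; the point is only that aligning the input with the conv state removes the disagreement the compiler reported.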

Test Plan

Added two tests to test_hybrid.py, which were previously missing. The tests pass.

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

gemini-code-assist

Code Review

This pull request correctly addresses a Triton compilation error for Mamba's FP32 convolution state during prefill by casting the input tensor to match the state's data type. The fix in causal_conv1d_fn is appropriate. However, the same bug likely exists in the causal_conv1d_update function, which handles the decode path, and this has not been addressed. This could lead to failures during token generation after the prefill phase. I've added a critical comment highlighting this omission. The test enhancements are good, improving coverage for different cache dtype parameters.

vllm/model_executor/layers/mamba/ops/causal_conv1d.py

mergify · 2025-09-17T22:48:41Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @Josephasafg.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Josephasafg · 2025-09-18T06:45:23Z

@heheda12345 @tdoublep Can I please get a review?

tdoublep · 2025-09-18T06:49:24Z

vllm/model_executor/layers/mamba/ops/causal_conv1d.py

+    original_x_dtype = x.dtype
+    x = x.to(conv_state.dtype)


Do we definitely want to cast x to the conv_state dtype, rather than casting conv_state to the x_dtype?

It's a good question. Since the user picks fp32 for the cache type, I'm afraid that downcasting it to fp16 and then back to fp32 could lose accuracy. I had also figured that by choosing fp32, we want the computations to be done in that type, don't we?

How does it work for the SSM state? I guess we want it to be consistent.

we cast to float in the kernel
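
To make the accuracy concern above concrete, here is a small standard-library sketch that rounds an fp32 value to fp16 or bf16 and compares the result with the original. The helper names are illustrative and not part of vLLM. The bf16 helper assumes round-to-nearest-even on the upper 16 bits of the fp32 encoding and is only meant for ordinary finite values.

```python
import struct


def round_to_fp32(value: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def round_to_fp16(value: float) -> float:
    """Round a Python float to the nearest fp16 value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def round_to_bf16(value: float) -> float:
    """Round to bf16 by keeping the upper 16 bits of the fp32 encoding."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Round to nearest, ties to even, before dropping the low 16 bits.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


state_value = round_to_fp32(0.1)  # a value as it would sit in an fp32 cache
for name, narrow in (("fp16", round_to_fp16), ("bf16", round_to_bf16)):
    round_trip = round_to_fp32(narrow(state_value))
    print(f"{name}: round-trip error = {abs(round_trip - state_value):.3e}")
```

Whether such a difference matters in practice depends on the model; the sketch only shows that the narrower formats cannot represent every fp32 value exactly.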

tdoublep

LGTM

Add support for fp32 to conv state kernel

3148be5

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

Josephasafg requested a review from tdoublep as a code owner September 15, 2025 14:10

gemini-code-assist bot reviewed Sep 15, 2025

Added fp32 cast to conv update

b81854a

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

Josephasafg changed the title ~~[Bug][Mamba] - Fix Conv State Kernel FP32 Support~~ [Bugfix][Mamba] - Fix Conv State Kernel FP32 Support Sep 15, 2025

Merge branch 'main' of https://github.com/vllm-project/vllm into fp32…

94b2fe5

…_conv_state

mergify bot added the needs-rebase label Sep 17, 2025

Merge branch 'main' of https://github.com/vllm-project/vllm into fp32…

bfdaf6f

…_conv_state

Josephasafg force-pushed the fp32_conv_state branch from fe6756e to bfdaf6f Compare September 18, 2025 06:44

mergify bot removed the needs-rebase label Sep 18, 2025

tdoublep reviewed Sep 18, 2025

Josephasafg requested a review from tdoublep September 18, 2025 08:10

tdoublep approved these changes Sep 18, 2025

tdoublep enabled auto-merge (squash) September 18, 2025 10:48

github-actions bot added the ready label Sep 18, 2025

tdoublep merged commit 66072b3 into vllm-project:main Sep 18, 2025
53 checks passed

debroy-rh pushed a commit to debroy-rh/vllm that referenced this pull request Sep 19, 2025

[Bugfix][Mamba] - Fix Conv State Kernel FP32 Support (vllm-project#24883)

8dc45d7

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

[Bugfix][Mamba] - Fix Conv State Kernel FP32 Support (vllm-project#24883)

ebc3450

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025

[Bugfix][Mamba] - Fix Conv State Kernel FP32 Support (vllm-project#24883)

6d9fdc4

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com> Signed-off-by: charlifu <charlifu@amd.com>
