luccafong
Collaborator

Summary: FlashInfer prefill on GB200 is not compatible with CutlassMLA FP8, so this PR adds an option to disable it for now.

Differential Revision: D81994905

@facebook-github-bot

@luccafong has exported this pull request. If you are a Meta employee, you can view the originating diff in D81994905.

@mergify mergify bot added the v1 label Sep 19, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new environment variable, VLLM_DISABLE_FLASHINFER_PREFILL, to provide an option to disable FlashInfer prefill. This change addresses a compatibility issue on GB200 with CutlassMLA FP8. The implementation adds the new flag in vllm/envs.py and correctly uses it in vllm/v1/attention/backends/mla/common.py to control the feature. The default behavior is unchanged. The changes are correct and well-contained.
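
To illustrate the kind of gating described above, here is a minimal sketch, not the PR's actual code. Only the variable name `VLLM_DISABLE_FLASHINFER_PREFILL` comes from this PR; the helper names `disable_flashinfer_prefill` and `use_flashinfer_prefill`, their parameters, and the "0"/"1" parsing convention are assumptions made for illustration.

```python
import os


def disable_flashinfer_prefill() -> bool:
    """Read the opt-out flag (hypothetical helper).

    Assumes the flag is parsed as an integer, so unset or "0" leaves
    FlashInfer prefill enabled and "1" disables it.
    """
    return bool(int(os.getenv("VLLM_DISABLE_FLASHINFER_PREFILL", "0")))


def use_flashinfer_prefill(flashinfer_available: bool,
                           device_supported: bool) -> bool:
    """Decide whether to take the FlashInfer prefill path (hypothetical gate).

    The two boolean arguments stand in for whatever availability and
    hardware checks the backend performs; they are placeholders.
    """
    return (not disable_flashinfer_prefill()
            and flashinfer_available
            and device_supported)
```

With the variable unset the gate behaves as before, which matches the review's note that the default behavior is unchanged; setting it to `1` in the serving process's environment would force the fallback prefill path under these assumptions.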

    Summary: FlashInfer prefill on GB200 is not compatible with CutlassMLA FP8, so this PR adds an option to disable it for now.

    Differential Revision: D81994905

Signed-off-by: Lu Fang <fanglu@fb.com>
@mgoin
Member

mgoin commented Sep 19, 2025

Seems reasonable for now, thanks

@mgoin mgoin enabled auto-merge (squash) September 19, 2025 21:08
@github-actions github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Sep 19, 2025
@mgoin mgoin merged commit ee7a66d into vllm-project:main Sep 19, 2025
52 checks passed
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: Lu Fang <fanglu@fb.com>
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: charlifu <charlifu@amd.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>