[Fix] Improve CPU backend compatibility for RISC-V #25816
Conversation
Code Review
This pull request effectively addresses a crash on RISC-V architectures by improving CPU backend compatibility. The changes are well-structured into two parts: proactively disabling the `chunked_prefill` feature for RISC-V with a clear warning, and adding a safeguard to raise a `NotImplementedError` if the feature is used without its required `intel_extension_for_pytorch` dependency. Both changes are implemented correctly and improve the robustness and user experience of vLLM on non-x86 platforms. The code is clean and the logic is sound. Great work!
Force-pushed from 28268a4 to b1d7bcf
Hi @hmellor,
Force-pushed from b1d7bcf to 536a0a6
Thanks for being receptive to my feedback.
I'm still not sure if these changes are necessary though. It's not clear to me why they're needed.
Force-pushed from 536a0a6 to 00caf1c
Hi @hmellor,
Force-pushed from 8d44d6a to 40a0083
Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn> Signed-off-by: ihb2032 <1355790728@qq.com>
Force-pushed from 40a0083 to cf276f5
LGTM. Thanks for fixing RISC-V
Great! Thanks for your review and guidance.
Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn> Signed-off-by: ihb2032 <1355790728@qq.com>
Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn> Signed-off-by: ihb2032 <1355790728@qq.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
Purpose
Fixes #25737
This PR aims to fix crashes and improve the compatibility of vLLM's CPU backend when running on the RISC-V architecture. It addresses two specific issues:

1. IPEX Dependency Crash: The `chunked_prefill` feature in the CPU attention backend unconditionally imports `intel_extension_for_pytorch`, causing a `ModuleNotFoundError` on non-x86 platforms. This PR fixes this by guarding the import with the existing `_use_ipex` flag and raising a `NotImplementedError` if the feature is used without its dependency (see the sketch after this list).
2. Proactive Disabling for RISC-V: To improve the user experience, this PR also adds `riscv64` to the platform exclusion list for the `chunked_prefill` feature. This provides a clear warning at startup and prevents users from attempting to use a feature that is known to be unsupported on their hardware, similar to the existing handling for ARM and POWER architectures.

Together, these changes allow vLLM to initialize and run on RISC-V CPUs without crashing due to these architecture-specific dependencies.
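For illustration, a minimal sketch of the dependency guard described in item 1. The `_use_ipex` flag name comes from the PR description; the function below and its signature are hypothetical simplifications, not the actual vLLM code:

```python
# Minimal sketch (not the actual vLLM code) of the guard described above.
# `_use_ipex` is the flag named in the PR; the function is a hypothetical
# simplification of the CPU attention backend's chunked-prefill path.
try:
    import intel_extension_for_pytorch  # noqa: F401
    _use_ipex = True
except ImportError:
    _use_ipex = False


def _run_chunked_prefill(query, key_cache, value_cache):
    """Hypothetical chunked-prefill entry point for the CPU backend."""
    if not _use_ipex:
        raise NotImplementedError(
            "chunked_prefill requires intel_extension_for_pytorch, which is "
            "not available on this platform (e.g. riscv64)."
        )
    # The IPEX-backed chunked-prefill computation would run here.
    ...
```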
Test Plan
Environment
The fix was tested in the following RISC-V environment:
Test Command
Run any vLLM process that uses the CPU backend, for example, the latency benchmark:
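An illustrative latency-benchmark invocation is shown below; the script path, model, environment variable value, and flags are placeholders and may differ between vLLM versions:

```bash
# Illustrative only; adjust the script path, model, and flags to your vLLM version.
VLLM_CPU_KVCACHE_SPACE=8 python benchmarks/benchmark_latency.py \
    --model facebook/opt-125m \
    --num-iters 3
```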
Test Result
Before this PR
The vLLM engine crashes during initialization with a `ModuleNotFoundError` because it tries to import `intel_extension_for_pytorch` on a RISC-V machine.
After this PR
The vLLM engine now starts successfully. A warning is logged to the console indicating that `chunked_prefill` has been disabled for the RISC-V platform, and the program proceeds to run without crashing. The logged warning looks like this:
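The exact warning text is not reproduced here; as a rough sketch only, platform-based disabling of this kind could look like the following. The architecture names, function, and message are illustrative assumptions, not vLLM's actual configuration code or log output:

```python
import logging
import platform

logger = logging.getLogger(__name__)

# Hypothetical exclusion list; the real vLLM logic and warning text differ.
_CHUNKED_PREFILL_EXCLUDED_ARCHS = {"aarch64", "ppc64le", "riscv64"}


def maybe_disable_chunked_prefill(enable_chunked_prefill: bool) -> bool:
    """Disable chunked prefill on architectures known not to support it."""
    arch = platform.machine()
    if enable_chunked_prefill and arch in _CHUNKED_PREFILL_EXCLUDED_ARCHS:
        logger.warning(
            "chunked_prefill is not supported on %s CPUs; disabling it.", arch
        )
        return False
    return enable_chunked_prefill
```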