ihb2032
Contributor

@ihb2032 ihb2032 commented Sep 28, 2025

Purpose

Fixes #25737
This PR aims to fix crashes and improve the compatibility of vLLM's CPU backend when running on the RISC-V architecture. It addresses two specific issues:

  1. IPEX Dependency Crash: The chunked_prefill feature in the CPU attention backend unconditionally imports intel_extension_for_pytorch, causing a ModuleNotFoundError on non-x86 platforms. The fix guards the import with the existing _use_ipex flag and raises a NotImplementedError if the feature is used without its dependency.

  2. Proactive Disabling for RISC-V: To improve the user experience, this PR also adds riscv64 to the platform exclusion list for the chunked_prefill feature. This provides a clear warning at startup and prevents users from attempting to use a feature that is known to be unsupported on their hardware, similar to the existing handling for ARM and POWER architectures.

Together, these changes allow vLLM to initialize and run on RISC-V CPUs without crashing due to these architecture-specific dependencies.

Test Plan

Environment

The fix was tested in the following RISC-V environment:

  • CPU: Sophgo SG2044
  • OS: EulixOS 3.0
  • Compiler: GCC 15.1
  • Python: 3.11
  • PyTorch: 2.8.0

Test Command

Run any vLLM process that uses the CPU backend, for example, the latency benchmark:

vllm bench latency \
  Qwen/Qwen1.5-0.5B \
  --input-len 128 \
  --output-len 32 \
  --enforce-eager --dtype float16 --max_model_len 4096 --max_num_batched_tokens 4096

Test Result

Before this PR

The vLLM engine crashes during initialization with a ModuleNotFoundError because it tries to import intel_extension_for_pytorch on a RISC-V machine.

After this PR

The vLLM engine now starts successfully. An INFO-level message is logged to the console indicating that chunked_prefill has been disabled for the RISC-V platform, and the program runs without crashing. The logged message looks like this:

INFO ... Chunked prefill is not supported for ARM and POWER, S390X and riscv64 CPUs; disabling it for V1 backend.
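
The kind of architecture check that produces a message like this can be sketched as follows. This is only an illustration under stated assumptions: the function names, the logger name, and the prefix list matched against `platform.machine()` are hypothetical and do not reproduce vLLM's own platform detection.

```python
import logging
import platform
from typing import Optional

logger = logging.getLogger("chunked_prefill_sketch")  # hypothetical logger name

# Assumption: prefixes of platform.machine() values for the excluded
# architectures (ARM, POWER, S390X, RISC-V).
_UNSUPPORTED_PREFIXES = ("arm", "aarch64", "ppc64", "s390x", "riscv")


def chunked_prefill_supported(machine: Optional[str] = None) -> bool:
    """Return False when the CPU architecture is on the exclusion list."""
    name = (machine or platform.machine()).lower()
    return not name.startswith(_UNSUPPORTED_PREFIXES)


def resolve_chunked_prefill(requested: bool, machine: Optional[str] = None) -> bool:
    """Disable the feature, with a log message, on excluded architectures."""
    if requested and not chunked_prefill_supported(machine):
        logger.info(
            "Chunked prefill is not supported on this CPU architecture; "
            "disabling it."
        )
        return False
    return requested
```

Resolving the setting once at startup means the unsupported code path is never reached later, so the user sees a single log line rather than a crash.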

@mergify mergify bot added the v1 label Sep 28, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses a crash on RISC-V architectures by improving CPU backend compatibility. The changes are well-structured into two parts: proactively disabling the chunked_prefill feature for RISC-V with a clear warning, and adding a safeguard to raise a NotImplementedError if the feature is used without its required intel_extension_for_pytorch dependency. Both changes are implemented correctly and improve the robustness and user experience of vLLM on non-x86 platforms. The code is clean and the logic is sound. Great work!

@ihb2032 ihb2032 changed the title from "[Fix] Improve CPU backend compatibility for RISC-V" to "[Fix] Improve CPU backend compatibility for RISC-V issue#25737" Sep 28, 2025
@ihb2032 ihb2032 changed the title from "[Fix] Improve CPU backend compatibility for RISC-V issue#25737" to "[Fix] Improve CPU backend compatibility for RISC-V" Sep 28, 2025
@github-project-automation github-project-automation bot moved this from To Triage to In progress in gpt-oss Issues & Enhancements Sep 29, 2025
@mergify mergify bot removed the tpu Related to Google TPUs label Sep 29, 2025
@ihb2032 ihb2032 force-pushed the fix/cpu-riscv-compatibility branch 3 times, most recently from 28268a4 to b1d7bcf Compare September 29, 2025 23:32
@ihb2032
Contributor Author

ihb2032 commented Sep 29, 2025

Hi @hmellor,
Thanks again for all your detailed feedback and guidance. This has been a great learning experience.
I've pushed an updated version of the PR that addresses all the points you raised, and I've also taken the opportunity to clean up the commit history to make it as easy as possible to review.
Here is a summary of the changes:

  1. File Permissions: All file permissions have been corrected and are now perfectly in sync with the upstream/main branch.
  2. Redundant Code: You were right about the consistency issue. The redundant NotImplementedError check has been removed.
  3. Commit History & Scope: I have rebased the branch to clean up the commit history. The necessary CI fixes you pointed out as "unrelated" have been squashed into the relevant feature commit, making the history logical and focused.

The PR should now be in good shape. Please let me know if there is anything else I can address. Thank you!

@ihb2032 ihb2032 force-pushed the fix/cpu-riscv-compatibility branch from b1d7bcf to 536a0a6 Compare September 30, 2025 00:09
@ihb2032 ihb2032 requested a review from hmellor September 30, 2025 00:10
Member

@hmellor hmellor left a comment

Thanks for being receptive to my feedback.

I'm still not sure if these changes are necessary though. It's not clear to me why they're needed.

@ihb2032 ihb2032 force-pushed the fix/cpu-riscv-compatibility branch from 536a0a6 to 00caf1c Compare September 30, 2025 09:06
@ihb2032
Contributor Author

ihb2032 commented Sep 30, 2025

Hi @hmellor,
My apologies for the complexity and the unrelated changes in the previous version. You were absolutely right that the PR had become unfocused.
I have now reset the branch and submitted a new version that contains only the single, essential change required to fix the RISC-V crash: adding riscv to the chunked_prefill exclusion list.
This resolves the permission issues and removes all the other refactoring changes that you pointed out.
I hope this new version is much clearer and easier to review. Thank you for your patience and guidance.

@ihb2032 ihb2032 force-pushed the fix/cpu-riscv-compatibility branch 3 times, most recently from 8d44d6a to 40a0083 Compare September 30, 2025 10:03
Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>
Signed-off-by: ihb2032 <1355790728@qq.com>
@ihb2032 ihb2032 force-pushed the fix/cpu-riscv-compatibility branch from 40a0083 to cf276f5 Compare September 30, 2025 10:11
@ihb2032 ihb2032 requested a review from hmellor September 30, 2025 11:33
Member

@hmellor hmellor left a comment

LGTM. Thanks for fixing RISC-V

@github-project-automation github-project-automation bot moved this from In progress to Ready in gpt-oss Issues & Enhancements Sep 30, 2025
@hmellor hmellor enabled auto-merge (squash) September 30, 2025 12:02
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 30, 2025
@ihb2032
Contributor Author

ihb2032 commented Sep 30, 2025

Great! Thanks for your review and guidance.

@hmellor hmellor merged commit bb6d430 into vllm-project:main Sep 30, 2025
45 checks passed
@ihb2032 ihb2032 deleted the fix/cpu-riscv-compatibility branch September 30, 2025 14:01
pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025
Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>
Signed-off-by: ihb2032 <1355790728@qq.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: lyd1992 <liuyudong@iscas.ac.cn>
Signed-off-by: ihb2032 <1355790728@qq.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Labels
ci/build deepseek Related to DeepSeek models documentation Improvements or additions to documentation frontend gpt-oss Related to GPT-OSS models kv-connector llama Related to Llama models multi-modality Related to multi-modality (#4194) new-model Requests to new models performance Performance-related issues qwen Related to Qwen models ready ONLY add when PR is ready to merge/full CI is needed rocm Related to AMD ROCm speculative-decoding structured-output tool-calling v1
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Bug]: RISC-V and non-Intel CPU architectures fail due to widespread unconditional IPEX dependencies
3 participants