[Platform] Allow platform use V1 Engine by default #19792

wangxiyuan · 2025-06-18T06:39:07Z

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

More platforms now support the V1 Engine. This PR lets each platform decide whether to use the V1 Engine by default.

With this PR, users no longer need to set VLLM_USE_V1=1 by hand if the platform works well on V1 by default.
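
A minimal sketch of the behaviour described above, assuming that an explicit VLLM_USE_V1 setting takes precedence and that the platform's preference is consulted only when the variable is unset. The function name `resolve_use_v1` and its parameters are hypothetical and are not taken from the vLLM code base.

```python
import os
from typing import Mapping, Optional


def resolve_use_v1(platform_default_v1: bool,
                   environ: Optional[Mapping[str, str]] = None) -> bool:
    """Decide whether to use the V1 engine (illustrative only)."""
    env = os.environ if environ is None else environ
    explicit = env.get("VLLM_USE_V1")
    if explicit is not None:
        # Assumption: an explicit user choice wins over the platform preference.
        return explicit == "1"
    # Otherwise fall back to what the platform declares as its default.
    return platform_default_v1
```

Under these assumptions, a platform that declares V1 as its default gets V1 without any environment variable, while `VLLM_USE_V1=0` still opts out.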

Test Plan

This change mainly targets out-of-tree (OOT) platforms. The existing tests ensure that the current logic is not broken.

Test Result

(Optional) Documentation Update

This is an internal code change, so no doc update is needed.

github-actions · 2025-06-18T06:39:16Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

gemini-code-assist

Summary of Changes

Hello @wangxiyuan, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant improvement in how the V1 Engine is enabled by default across different hardware platforms. By modifying the platform interface to allow platforms to declare their default V1 Engine preference, I've made the system more flexible and user-friendly, eliminating the need for manual configuration in many common scenarios.

Highlights

Platform-driven V1 Engine Defaulting: I've refactored the V1 Engine compatibility check to allow each platform (e.g., CUDA, ROCm, TPU, CPU) to explicitly declare not only if it supports the V1 Engine, but also if the V1 Engine should be the default for that platform. This removes the previous hardcoded logic for determining V1 defaults based on hardware type.
Updated supports_v1 Interface: The supports_v1 class method in vllm/platforms/interface.py and its implementations (in cpu.py, cuda.py, rocm.py, tpu.py) now return a tuple (is_v1_support, is_v1_default) instead of a single boolean. This new return type provides granular control over V1 Engine defaulting.
Simplified User Experience: With this change, users on platforms that default to the V1 Engine will no longer need to manually set the VLLM_USE_V1=1 environment variable, streamlining the setup process.
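
The tuple-returning form of supports_v1 summarized above could be sketched roughly as follows. The classes here are stand-ins written for illustration (the real interface lives in vllm/platforms/interface.py), and the platform names and the `describe` helper are hypothetical.

```python
from typing import Tuple


class Platform:
    """Stand-in base class, not the real vLLM Platform."""

    @classmethod
    def supports_v1(cls, model_config: object) -> Tuple[bool, bool]:
        # (is_v1_support, is_v1_default): unsupported and not the default.
        return False, False


class FullySupportedPlatform(Platform):
    """Hypothetical platform where V1 is supported and on by default."""

    @classmethod
    def supports_v1(cls, model_config: object) -> Tuple[bool, bool]:
        return True, True


class PartiallySupportedPlatform(Platform):
    """Hypothetical platform where V1 works but stays opt-in."""

    @classmethod
    def supports_v1(cls, model_config: object) -> Tuple[bool, bool]:
        return True, False


def describe(platform: type) -> str:
    """Summarize a platform's V1 status from the returned tuple."""
    is_v1_support, is_v1_default = platform.supports_v1(model_config=None)
    if not is_v1_support:
        return "unsupported"
    return "default" if is_v1_default else "opt-in"
```

One drawback of this shape is that every platform implementation has to change its return value, even when support and default always coincide.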

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request successfully refactors the supports_v1 method across platform interfaces and implementations to allow each platform to determine if V1 engine is supported and if it's the default. The changes are logical and well-contained. The primary area for improvement is ensuring consistency in type hints and docstrings for the modified supports_v1 methods in platform-specific files.

vllm/platforms/cpu.py

vllm/platforms/cuda.py

vllm/platforms/rocm.py

vllm/platforms/tpu.py

Isotr0py · 2025-06-18T07:17:52Z

vllm/platforms/interface.py

+    def supports_v1(cls, model_config: ModelConfig) -> tuple[bool, bool]:
        """Returns whether the current platform can support v1 for the supplied
        model configuration.
+
+        Returns:
+            tuple[bool, bool]: (is_v1_support, is_v1_default)
        """
-        return False
+        return False, False


Since the CPU platform is a special case with only partial V1 support, I think we can add an extra default_v1 method to the interface so that we don't need to touch the other platforms' implementations. WDYT?

@classmethod
def default_v1(cls, model_config: ModelConfig) -> bool:
    """Returns whether the current platform can use v1 by default for the
    supplied model configuration.
    """
    return cls.supports_v1(model_config)

For the CPU platform, it would look like this:

@classmethod
def default_v1(cls, model_config: ModelConfig) -> bool:
    """Returns whether the current platform can use v1 by default for the
    supplied model configuration.
    """
    return (cls.supports_v1(model_config)
            and cls.get_cpu_architecture() == CpuArchEnum.X86)
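
To see how this suggestion plays out end to end, here is a self-contained sketch. It assumes supports_v1 keeps returning a plain bool; `Platform`, `CpuPlatform`, and `CpuArchEnum` are simplified stand-ins for the vLLM classes, and `ArmCpuPlatform` and the fixed `arch` attribute are hypothetical, added only so the example runs without hardware detection.

```python
import enum


class CpuArchEnum(enum.Enum):
    X86 = "x86"
    ARM = "arm"


class Platform:
    """Stand-in base class, not the real vLLM Platform."""

    @classmethod
    def supports_v1(cls, model_config: object) -> bool:
        return False

    @classmethod
    def default_v1(cls, model_config: object) -> bool:
        # Unless overridden, a platform that supports V1 also defaults to it.
        return cls.supports_v1(model_config)


class CpuPlatform(Platform):
    arch = CpuArchEnum.X86  # hypothetical: fixed here instead of detected

    @classmethod
    def get_cpu_architecture(cls) -> CpuArchEnum:
        return cls.arch

    @classmethod
    def supports_v1(cls, model_config: object) -> bool:
        return True

    @classmethod
    def default_v1(cls, model_config: object) -> bool:
        # V1 is supported on CPU, but only x86 opts in by default.
        return (cls.supports_v1(model_config)
                and cls.get_cpu_architecture() == CpuArchEnum.X86)


class ArmCpuPlatform(CpuPlatform):
    arch = CpuArchEnum.ARM  # hypothetical subclass to exercise the non-x86 path
```

With this split, only the platform with partial support overrides default_v1; every other platform inherits the base behaviour and keeps its existing supports_v1 untouched.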

Isotr0py

LGTM now!

gemini-code-assist bot reviewed Jun 18, 2025

mergify bot added rocm tpu labels Jun 18, 2025

gemini-code-assist bot reviewed Jun 18, 2025

wangxiyuan force-pushed the check_v1_default branch from ec76bc5 to 75e768c Compare June 18, 2025 06:44

Isotr0py reviewed Jun 18, 2025

[Platform] Allow platform use V1 Engine by default

5a7f6a7

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

wangxiyuan force-pushed the check_v1_default branch from 75e768c to 5a7f6a7 Compare June 18, 2025 07:44

mergify bot removed the tpu label Jun 18, 2025

Isotr0py removed the rocm label Jun 18, 2025

Isotr0py approved these changes Jun 18, 2025

Isotr0py enabled auto-merge (squash) June 18, 2025 07:46

github-actions bot added the ready label Jun 18, 2025

Isotr0py merged commit 257ab95 into vllm-project:main Jun 18, 2025
79 checks passed

yeqcharlotte pushed a commit to yeqcharlotte/vllm that referenced this pull request Jun 22, 2025

[Platform] Allow platform use V1 Engine by default (vllm-project#19792)

fdb5f5a

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

minpeter pushed a commit to minpeter/vllm that referenced this pull request Jun 24, 2025

[Platform] Allow platform use V1 Engine by default (vllm-project#19792)

dcb0f02

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: minpeter <kali2005611@gmail.com>

yangw-dev pushed a commit to yangw-dev/vllm that referenced this pull request Jun 24, 2025

[Platform] Allow platform use V1 Engine by default (vllm-project#19792)

9eac5fc

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: Yang Wang <elainewy@meta.com>

gmarinho2 pushed a commit to gmarinho2/vllm that referenced this pull request Jun 26, 2025

[Platform] Allow platform use V1 Engine by default (vllm-project#19792)

9211536

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jun 30, 2025

[Platform] Allow platform use V1 Engine by default (vllm-project#19792)

9cd1875

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

wseaton pushed a commit to wseaton/vllm that referenced this pull request Jun 30, 2025

[Platform] Allow platform use V1 Engine by default (vllm-project#19792)

36c1bd4

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

avigny pushed a commit to avigny/vllm that referenced this pull request Jul 31, 2025

[Platform] Allow platform use V1 Engine by default (vllm-project#19792)

62e8021

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: avigny <47987522+avigny@users.noreply.github.com>
