[Bugfix] When using an S3 model, the default load_format cannot be used #24435
Conversation
Code Review
This pull request aims to fix an issue where loading models from S3 fails with the default `load_format`. The change introduces a check in `EngineArgs.__post_init__` to enforce that S3 models use the `runai_streamer` load format.

However, the current implementation has some critical issues. It is overly restrictive and will raise an error in the common case of providing an S3 path with the default `load_format="auto"`, leading to a poor user experience. Additionally, this bugfix lacks unit tests, which are essential for verifying the correctness of the new logic and preventing regressions. I have provided a detailed comment with a suggested code change to improve the logic and a request to add comprehensive tests. Addressing these points is critical to ensure the change is robust and user-friendly.
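For illustration, here is a minimal sketch of the override behaviour the review suggests. This is not the actual vLLM code: the helper names (`is_s3_path`, `resolve_load_format`) and the exact handling of `"auto"` are assumptions used to show the idea of overriding the default instead of erroring, while still rejecting an explicitly incompatible choice.

```python
# Minimal sketch, not the vLLM implementation; helper names are hypothetical.
import logging

logger = logging.getLogger(__name__)

RUNAI_STREAMER = "runai_streamer"


def is_s3_path(model: str) -> bool:
    """Treat any s3:// URI as an S3-hosted model."""
    return model.startswith("s3://")


def resolve_load_format(model: str, load_format: str) -> str:
    """Pick a load format for `model`, overriding 'auto' for S3 paths."""
    if not is_s3_path(model):
        return load_format
    if load_format == "auto":
        # Override rather than raise, so the default keeps working for S3
        # paths, and log the decision so the override is not silent.
        logger.info(
            "Model %s is on S3; overriding load_format 'auto' -> '%s'.",
            model, RUNAI_STREAMER)
        return RUNAI_STREAMER
    if load_format != RUNAI_STREAMER:
        # An explicit, incompatible choice is surfaced as an error instead
        # of being overridden quietly.
        raise ValueError(
            f"S3 models require load_format='{RUNAI_STREAMER}', "
            f"got load_format='{load_format}'.")
    return load_format
```

With this shape, `load_format="auto"` keeps working for S3 paths while an explicitly wrong format still fails loudly.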
Force-pushed 8c3c485 to efb08cf
The failure would occur for other formats too if the user does not set the right one. Do we want to clutter `arg_utils` with many custom overrides? Is the expectation for users to set it correctly, or for vLLM to always infer the right format? If we decide on the second approach, we should infer everything.
We can get some second opinions. cc @hmellor
This is a very good point.
So it's a trade-off.
Instead of `arg_utils` we could move the validation into … (FYI, currently …)
As for the design decision, if we have a value called …
Closing as superseded by #23845
@DarkLight1337 I'll verify whether #23845 solves my problem. If not, this pull request may still be needed.
@DarkLight1337 After testing, I think this problem is not solved; #23845 is similar to PR #23842. Can we reopen this PR?
Force-pushed 6300f6c to 9196a86
Force-pushed 9196a86 to a12ab8d
@hmellor yes, it needs …
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Force-pushed a12ab8d to b26328d
Added logging to avoid silent overrides. `auto` makes some sense and there's already a bunch of `load_format` override logic.
…ct#24435) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
…ct#24435) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com> Signed-off-by: charlifu <charlifu@amd.com>
Purpose
#23236 (comment)
Currently a model stored on S3 cannot be loaded with the default load format, so we should check for this.
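For context, a hypothetical snippet that hits this path (the `s3://` path is a placeholder, and passing `load_format` through the `LLM` constructor is assumed here):

```python
# Hypothetical example; the S3 path below is a placeholder.
from vllm import LLM

# Before this change, pointing at an S3 path with the default load_format
# fails; explicitly selecting the RunAI streamer load format works.
llm = LLM(model="s3://my-bucket/my-model", load_format="runai_streamer")
```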
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.