
Conversation

lengrongfu
Contributor

@lengrongfu lengrongfu commented Sep 8, 2025

Purpose

#23236 (comment)

Currently the default load format cannot be used to load a model from S3, so we should check for this.

Test Plan

Test Result

image

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to fix an issue where loading models from S3 fails with the default load_format. The change introduces a check in EngineArgs.__post_init__ to enforce that S3 models use the runai_streamer load format.

However, the current implementation has some critical issues. It is overly restrictive and will cause an error on the common use case of providing an S3 path with the default load_format="auto", leading to a poor user experience. Additionally, this bugfix lacks any unit tests, which is essential for verifying the correctness of the new logic and preventing regressions. I have provided a detailed comment with a suggested code change to improve the logic and a request to add comprehensive tests. Addressing these points is critical to ensure the change is robust and user-friendly.
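The reviewer's point is that an S3 URI with the default `load_format="auto"` should have the streamer inferred, not rejected. A minimal sketch of the kind of inference being suggested (the helper function here is illustrative, not vLLM's actual API):

```python
import logging

logger = logging.getLogger(__name__)


def resolve_load_format(model: str, load_format: str) -> str:
    """Illustrative sketch: pick a load format for S3-hosted models.

    With the default "auto", upgrade to "runai_streamer" and log the
    override; with any other explicit format, fail fast instead of
    surfacing a confusing download error later.
    """
    if not model.startswith("s3://"):
        return load_format
    if load_format == "auto":
        logger.info(
            "Model path is an S3 URI; using load_format='runai_streamer'.")
        return "runai_streamer"
    if load_format != "runai_streamer":
        raise ValueError(
            f"S3 models require load_format='runai_streamer', "
            f"got {load_format!r}.")
    return load_format
```

This keeps the common `auto` path working while still rejecting an explicitly wrong format.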

@lengrongfu lengrongfu force-pushed the feat/add-verify-runai branch from 8c3c485 to efb08cf Compare September 8, 2025 10:13
Collaborator

@22quinn 22quinn left a comment


The failure would occur for other formats too if the user does not set the right one. Do we want to clutter arg_utils with many custom overrides? Is the expectation for users to set it right, or for vLLM to always infer the right format? If we decide on the second, we should infer everything.
We can get some second opinions. cc @hmellor

@lengrongfu
Contributor Author

The failure would occur for other formats too if the user does not set the right one. Do we want to clutter arg_utils with many custom overrides? Is the expectation for users to set it right, or for vLLM to always infer the right format? If we decide on the second, we should infer everything. We can get some second opinions. cc @hmellor

This is a very good point.

  • From a user-friendly perspective, it's probably better for vLLM to always infer the correct format.
  • From a software maintenance perspective, it's probably better for users to set it correctly.

So it's a trade-off.

@hmellor
Member

hmellor commented Sep 9, 2025

Instead of arg_utils we could move the validation into ModelConfig? This is then localised to where it is most relevant.

(FYI, currently ModelConfig lives in config/__init__.py but it will eventually be moved to config/model.py.)

@hmellor
Member

hmellor commented Sep 9, 2025

As for the design decision, if we have a value called "auto" it'd be nice if it worked for all the supported load_formats.

@lengrongfu
Contributor Author

Instead of arg_utils we could move the validation into ModelConfig? This is then localised to where it is most relevant.

(FYI, currently ModelConfig lives in config/__init__.py but it will eventually be moved to config/model.py.)

  1. Because the load_format field is in EngineArgs, it cannot be accessed from ModelConfig; there was already similar logic in the create_model_config method, so I moved the check there.
  2. The config/model.py file does not exist yet; I can submit a separate PR to move ModelConfig there.

@DarkLight1337
Member

Closing as superseded by #23845

@lengrongfu
Contributor Author

@DarkLight1337 I'll verify whether #23845 solves my problem. If not, this pull request may still be needed.

@lengrongfu
Contributor Author

@DarkLight1337 After testing, I think this problem is not solved: #23845 is similar to PR #23842. Can we reopen this PR?

@DarkLight1337 DarkLight1337 reopened this Sep 14, 2025
@lengrongfu
Contributor Author

Hey @hmellor @22quinn the PR has been updated. Could you please review it when you have a moment? Thanks!

@hmellor
Member

hmellor commented Sep 15, 2025

  1. Because the load_format field is in EngineArgs, it cannot be accessed from ModelConfig; there was already similar logic in the create_model_config method, so I moved the check there.

load_format is a Literal, not a struct. I made a mistake before: load_format is part of LoadConfig. Since this validation requires information from both load_config and model_config, can it be moved to VllmConfig instead?

@lengrongfu
Contributor Author

  1. Because the load_format field is in EngineArgs, it cannot be accessed from ModelConfig; there was already similar logic in the create_model_config method, so I moved the check there.

load_format is a Literal, not a struct. I made a mistake before: load_format is part of LoadConfig. Since this validation requires information from both load_config and model_config, can it be moved to VllmConfig instead?

@hmellor Yes, it needs both load_config and model_config, so I added this validation in VllmConfig.try_verify_and_update_config.
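For reference, the shape of such a cross-config check, sketched outside vLLM (the VllmConfig, LoadConfig, and ModelConfig classes below are simplified stand-ins for the real ones, which carry many more fields):

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)


@dataclass
class ModelConfig:
    model: str  # model name, local path, or S3 URI


@dataclass
class LoadConfig:
    load_format: str = "auto"


@dataclass
class VllmConfig:
    model_config: ModelConfig
    load_config: LoadConfig = field(default_factory=LoadConfig)

    def try_verify_and_update_config(self) -> None:
        # The check needs both sub-configs, which is why it lives at
        # this level rather than in ModelConfig or LoadConfig alone.
        if (self.model_config.model.startswith("s3://")
                and self.load_config.load_format == "auto"):
            logger.info("S3 model detected; overriding load_format "
                        "to 'runai_streamer'.")
            self.load_config.load_format = "runai_streamer"


cfg = VllmConfig(ModelConfig("s3://bucket/llama"))
cfg.try_verify_and_update_config()
```

Logging the override, as requested later in the review, avoids silently changing a value the user may believe is still "auto".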

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
@lengrongfu lengrongfu force-pushed the feat/add-verify-runai branch from a12ab8d to b26328d Compare September 17, 2025 08:01
Collaborator

@22quinn 22quinn left a comment


Added logging to avoid silent overrides. `auto` makes some sense, and there's already a bunch of load_format override logic.

@22quinn 22quinn enabled auto-merge (squash) September 18, 2025 05:19
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 18, 2025
@22quinn 22quinn merged commit 350c94d into vllm-project:main Sep 18, 2025
43 checks passed
debroy-rh pushed a commit to debroy-rh/vllm that referenced this pull request Sep 19, 2025
…ct#24435)

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
…ct#24435)

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
…ct#24435)

Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Signed-off-by: charlifu <charlifu@amd.com>
Labels
ready ONLY add when PR is ready to merge/full CI is needed