[Bugfix] When using an S3 model, the default load_format cannot be used #24435
Conversation
Code Review
This pull request aims to fix an issue where loading models from S3 fails with the default `load_format`. The change introduces a check in `EngineArgs.__post_init__` to enforce that S3 models use the `runai_streamer` load format.

However, the current implementation has some critical issues. It is overly restrictive and will raise an error in the common case of providing an S3 path with the default `load_format="auto"`, leading to a poor user experience. Additionally, this bugfix lacks unit tests, which are essential for verifying the correctness of the new logic and preventing regressions. I have provided a detailed comment with a suggested code change to improve the logic and a request to add comprehensive tests. Addressing these points is critical to ensure the change is robust and user-friendly.
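For illustration, here is a minimal sketch of the override behaviour the review suggests. This is not the actual vLLM code: the helper names (`is_s3_path`, `resolve_load_format`) and the exact handling of `"auto"` are assumptions used to show the idea of overriding the default instead of erroring, while still rejecting an explicitly incompatible choice.

```python
# Minimal sketch, not the vLLM implementation; helper names are hypothetical.
import logging

logger = logging.getLogger(__name__)

RUNAI_STREAMER = "runai_streamer"


def is_s3_path(model: str) -> bool:
    """Treat any s3:// URI as an S3-hosted model."""
    return model.startswith("s3://")


def resolve_load_format(model: str, load_format: str) -> str:
    """Pick a load format for `model`, overriding 'auto' for S3 paths."""
    if not is_s3_path(model):
        return load_format
    if load_format == "auto":
        # Override rather than raise, so the default keeps working for S3
        # paths, and log the decision so the override is not silent.
        logger.info(
            "Model %s is on S3; overriding load_format 'auto' -> '%s'.",
            model, RUNAI_STREAMER)
        return RUNAI_STREAMER
    if load_format != RUNAI_STREAMER:
        # An explicit, incompatible choice is surfaced as an error instead
        # of being overridden quietly.
        raise ValueError(
            f"S3 models require load_format='{RUNAI_STREAMER}', "
            f"got load_format='{load_format}'.")
    return load_format
```

With this shape, `load_format="auto"` keeps working for S3 paths while an explicitly wrong format still fails loudly.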
Force-pushed 8c3c485 to efb08cf
The failure would occur for other formats too if the user does not set the right one. Do we want to clutter `arg_utils` with many custom overrides? Is the expectation for users to set it correctly, or for vLLM to always infer the right format? If we decide on the second approach, we should infer everything.
We can get some second opinions. cc @hmellor
This is a very good point.
So it's a trade-off.
Instead of `arg_utils` we could move the validation into … (FYI, currently …)
As for the design decision, if we have a value called …
Closing as superseded by #23845
@DarkLight1337 I'll verify whether #23845 solves my problem. If not, this pull request may still be needed.
@DarkLight1337 After testing, I think this problem is not solved; #23845 is similar to PR #23842. Can we reopen this PR?
Force-pushed 6300f6c to 9196a86
Force-pushed 9196a86 to a12ab8d
@hmellor yes, it needs …
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Force-pushed a12ab8d to b26328d
Added logging to avoid silent overrides. `auto` makes some sense and there's already a bunch of `load_format` override logic.
…ct#24435) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com>
…ct#24435) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io> Co-authored-by: 22quinn <33176974+22quinn@users.noreply.github.com> Signed-off-by: charlifu <charlifu@amd.com>
Purpose
#23236 (comment)
Currently a model stored on S3 cannot be loaded with the default load format, so we should check for this.
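For context, a hypothetical snippet that hits this path (the `s3://` path is a placeholder, and passing `load_format` through the `LLM` constructor is assumed here):

```python
# Hypothetical example; the S3 path below is a placeholder.
from vllm import LLM

# Before this change, pointing at an S3 path with the default load_format
# fails; explicitly selecting the RunAI streamer load format works.
llm = LLM(model="s3://my-bucket/my-model", load_format="runai_streamer")
```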
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.