-
-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[Misc] Fix examples openai_pooling_client.py #24853
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request successfully fixes a broken example in openai_pooling_client.py
and improves the project structure by reorganizing pooling-related examples into a dedicated pooling
subdirectory. The changes are generally solid. However, I've identified a few inconsistencies in the newly added example commands within docstrings. Specifically, the --runner pooling
flag is missing in several places. While vLLM might auto-detect the correct runner, explicitly including this flag would make the examples more robust, consistent with the documentation, and less error-prone for users who might adapt them for other models. I've provided specific suggestions to address this.
These example files no longer exist in the doc preview. cc @hmellor do you know why this happens? |
The generation of multi-file examples was hand written so does not cover every corner case. For this case you just have to add a Here is an example of a multi-file example with a readme https://github.com/vllm-project/vllm/tree/main/examples/online_serving/chart-helm and its corresponding docs page which:
|
I shouldn't touch these files QvQ |
Signed-off-by: wang.yuqi <noooop@126.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you updated the links to these examples in the docs, could you also verify that either:
- None of these examples are executed in testing
- Any examples that are executed in testing have their paths updated too
As far as I know, none of these examples are executed in testing vllm ci is very complicated, please help me do a double check |
If none of the example appear in vllm/.buildkite/test-pipeline.yaml Lines 320 to 343 in 59e17dd
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you shorten the top level headings? We are already in the example section for online/offline examples. Shortening the titles improves readablility in the nav drawer
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for consolidating these examples!
Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Purpose
Use jason9693/Qwen2.5-1.5B-apeach to demonstrate pooling api, but #20930 defaults to chunked prefill, while all pooling does not support chunked prefill, so the encode task is disabled.
So this demonstration will output an Error
Use internlm/internlm2-1_8b-reward to demonstrate that pooling api is more suitable
address #24650 (comment)
Put pooling related examples into a separate folder for easy access by users. ( Other examples might also need to be organized into folders, but I'm not very familiar with these examples.
Verify that the output of all other pooling examples is correct.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md
andexamples
for a new model.