
Conversation


@noooop noooop commented Sep 15, 2025

Purpose

jason9693/Qwen2.5-1.5B-apeach is used to demonstrate the Pooling API, but #20930 makes chunked prefill the default, and since pooling models do not support chunked prefill, the encode task is disabled.

As a result, this demonstration now outputs an error:

Pooling Response:
{'error': {'code': 400,
           'message': 'The model does not support Pooling API',
           'param': None,
           'type': 'BadRequestError'}}

Use internlm/internlm2-1_8b-reward instead, which is a better fit for demonstrating the Pooling API.
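For reference, the request this example sends can be sketched as follows. The helper names here are illustrative, not vLLM's actual client code, and it assumes a server started with something like `vllm serve internlm/internlm2-1_8b-reward --runner pooling`:

```python
import json
import urllib.request


def build_pooling_payload(model: str, text: str) -> bytes:
    """Serialize the JSON body for a POST to the server's /pooling endpoint."""
    return json.dumps({"model": model, "input": text}).encode()


def pooling_request(model: str, text: str, host: str = "http://localhost:8000") -> dict:
    """Send the payload to /pooling and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{host}/pooling",
        data=build_pooling_payload(model, text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a model whose encode task is disabled, the response is the 400 BadRequestError shown above; with a reward model, it carries the pooled output.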

Addresses #24650 (comment):

  • Put pooling-related examples into a separate folder for easy access by users. (Other examples might also need to be organized into folders, but I'm not very familiar with them.)

  • Verify that the output of all other pooling examples is correct.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: wang.yuqi <noooop@126.com>
@noooop noooop requested a review from hmellor as a code owner September 15, 2025 06:14
@mergify mergify bot added documentation Improvements or additions to documentation qwen Related to Qwen models labels Sep 15, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request successfully fixes a broken example in openai_pooling_client.py and improves the project structure by reorganizing pooling-related examples into a dedicated pooling subdirectory. The changes are generally solid. However, I've identified a few inconsistencies in the newly added example commands within docstrings. Specifically, the --runner pooling flag is missing in several places. While vLLM might auto-detect the correct runner, explicitly including this flag would make the examples more robust, consistent with the documentation, and less error-prone for users who might adapt them for other models. I've provided specific suggestions to address this.

Signed-off-by: wang.yuqi <noooop@126.com>
@DarkLight1337
Member

These example files no longer exist in the doc preview. cc @hmellor do you know why this happens?

@noooop noooop changed the title [Misc] Fix openai_pooling_client.py examples [Misc] Fix examples openai_pooling_client.py Sep 15, 2025

hmellor commented Sep 15, 2025

The generation of multi-file examples was hand-written, so it does not cover every corner case.

For this case you just have to add a README.md to each subfolder. Each script will then appear in the Example materials section of the documentation page.

Here is an example of a multi-file example with a README, https://github.com/vllm-project/vllm/tree/main/examples/online_serving/chart-helm, and its corresponding docs page, which:

  • Links to the directory on GitHub
  • Includes the content of the README
  • Lists all the files that are not the README in expandable admonitions (the titles of these admonitions are the file names, so make sure they're informative!)
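Applied to this PR, the resulting layout would look roughly like this (file names beyond README.md and openai_pooling_client.py are illustrative):

```
examples/online_serving/pooling/
├── README.md                 # its content is inlined into the docs page
├── openai_pooling_client.py  # rendered as an expandable admonition
└── ...
```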


noooop commented Sep 15, 2025

  • lists all the files that are not the README in expandable admonitions (the titles of these admonitions are the file names, so make sure they're informative!)

I shouldn't touch these files QvQ

Signed-off-by: wang.yuqi <noooop@126.com>

@hmellor hmellor left a comment


I see you updated the links to these examples in the docs, could you also verify that either:

  • None of these examples are executed in testing
  • Any examples that are executed in testing have their paths updated too


noooop commented Sep 15, 2025

  • None of these examples are executed in testing

As far as I know, none of these examples are executed in testing.

vLLM CI is very complicated; please help me double-check.

Signed-off-by: wang.yuqi <noooop@126.com>

hmellor commented Sep 15, 2025

If none of the examples appear in

```yaml
- label: Examples Test # 30min
  timeout_in_minutes: 45
  mirror_hardwares: [amdexperimental]
  working_dir: "/vllm-workspace/examples"
  source_file_dependencies:
  - vllm/entrypoints
  - examples/
  commands:
  - pip install tensorizer # for tensorizer test
  - python3 offline_inference/basic/generate.py --model facebook/opt-125m
  - python3 offline_inference/basic/generate.py --model meta-llama/Llama-2-13b-chat-hf --cpu-offload-gb 10
  - python3 offline_inference/basic/chat.py
  - python3 offline_inference/prefix_caching.py
  - python3 offline_inference/llm_engine_example.py
  - python3 offline_inference/audio_language.py --seed 0
  - python3 offline_inference/vision_language.py --seed 0
  - python3 offline_inference/vision_language_pooling.py --seed 0
  - python3 offline_inference/vision_language_multi_image.py --seed 0
  - VLLM_USE_V1=0 python3 others/tensorize_vllm_model.py --model facebook/opt-125m serialize --serialized-directory /tmp/ --suffix v1 && python3 others/tensorize_vllm_model.py --model facebook/opt-125m deserialize --path-to-tensors /tmp/vllm/facebook/opt-125m/v1/model.tensors
  - python3 offline_inference/encoder_decoder_multimodal.py --model-type whisper --seed 0
  - python3 offline_inference/basic/classify.py
  - python3 offline_inference/basic/embed.py
  - python3 offline_inference/basic/score.py
  - VLLM_USE_V1=0 python3 offline_inference/profiling.py --model facebook/opt-125m run_num_steps --num-steps 2
```
I think we're good
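The double-check noooop asked for can be sketched as a tiny script over the command list above (the list of moved files is a hypothetical stand-in, not taken from the PR's actual diff):

```python
# Commands copied from the Examples Test CI step quoted above (abridged).
ci_commands = [
    "python3 offline_inference/basic/generate.py --model facebook/opt-125m",
    "python3 offline_inference/basic/chat.py",
    "python3 offline_inference/basic/classify.py",
    "python3 offline_inference/basic/embed.py",
    "python3 offline_inference/basic/score.py",
]

# Hypothetical list of example files this PR relocates.
moved_examples = [
    "online_serving/pooling/openai_pooling_client.py",
]


def referenced_in_ci(path, commands):
    """Return True if any CI command mentions the given example path."""
    return any(path in cmd for cmd in commands)


# None of the moved files should be invoked by CI, so their paths
# can change without breaking the pipeline.
assert not any(referenced_in_ci(p, ci_commands) for p in moved_examples)
```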

Signed-off-by: wang.yuqi <noooop@126.com>

@hmellor hmellor left a comment


Could you shorten the top-level headings? We are already in the examples section for online/offline examples, and shortening the titles improves readability in the nav drawer.

noooop and others added 3 commits September 15, 2025 18:35
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <noooop@126.com>

@hmellor hmellor left a comment


LGTM, thanks for consolidating these examples!

@hmellor hmellor enabled auto-merge (squash) September 15, 2025 11:25
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 15, 2025
@hmellor hmellor merged commit bf214ca into vllm-project:main Sep 15, 2025
27 of 29 checks passed
@noooop noooop deleted the fix_pooling_examples branch September 15, 2025 13:08
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>