Conversation

@yewentao256 (Member) commented Nov 6, 2025

Purpose

Idea from @mgoin

vllm bench serve --model deepseek-ai/DeepSeek-R1  --dataset-name random --host 127.0.0.1 --port 9256 --random-input-len 4 --random-output-len 1024 --request-rate inf --num-prompts 160 --max-concurrency 32

With overlap:

============ Serving Benchmark Result ============
Successful requests:                     160       
Failed requests:                         0         
Maximum request concurrency:             32        
Benchmark duration (s):                  171.55    
Total input tokens:                      480       
Total generated tokens:                  123383    
Request throughput (req/s):              0.93      
Output token throughput (tok/s):         719.24    
Peak output token throughput (tok/s):    832.00    
Peak concurrent requests:                49.00     
Total Token throughput (tok/s):          722.04    
---------------Time to First Token----------------
Mean TTFT (ms):                          168.90    
Median TTFT (ms):                        87.14     
P99 TTFT (ms):                           420.81    
-----Time per Output Token (excl. 1st token)------
Mean TPOT (ms):                          39.90     
Median TPOT (ms):                        40.11     
P99 TPOT (ms):                           40.23     
---------------Inter-token Latency----------------
Mean ITL (ms):                           39.90     
Median ITL (ms):                         39.90     
P99 ITL (ms):                            46.70     
==================================================
Without (main):

============ Serving Benchmark Result ============
Successful requests:                     160       
Failed requests:                         0         
Maximum request concurrency:             32        
Benchmark duration (s):                  166.48    
Total input tokens:                      480       
Total generated tokens:                  124313    
Request throughput (req/s):              0.96      
Output token throughput (tok/s):         746.73    
Peak output token throughput (tok/s):    864.00    
Peak concurrent requests:                50.00     
Total Token throughput (tok/s):          749.62    
---------------Time to First Token----------------
Mean TTFT (ms):                          135.40    
Median TTFT (ms):                        83.12     
P99 TTFT (ms):                           286.72    
-----Time per Output Token (excl. 1st token)------
Mean TPOT (ms):                          38.42     
Median TPOT (ms):                        38.61     
P99 TPOT (ms):                           38.76     
---------------Inter-token Latency----------------
Mean ITL (ms):                           38.43     
Median ITL (ms):                         38.43     
P99 ITL (ms):                            44.55     
==================================================

The total generated token count differs between the two runs because ignore_eos is not set, which can skew benchmark comparisons.
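
For reference, ignore_eos is an existing field on vLLM's SamplingParams: when True, generation continues past the EOS token until max_tokens is reached, which is what makes per-request output lengths deterministic. A minimal sketch of the difference:

from vllm import SamplingParams

# Without ignore_eos, a request may stop early at EOS, so the number of
# generated tokens varies from run to run.
variable_len = SamplingParams(max_tokens=1024)

# With ignore_eos=True, each request generates exactly max_tokens tokens,
# so total generated tokens is stable across benchmark runs.
fixed_len = SamplingParams(max_tokens=1024, ignore_eos=True)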

This PR makes ignore_eos default to True if random_output_len is set.

Update:
We no longer need to check random_output_len, since it already defaults to 128.
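
A minimal sketch of the resulting default (the function and parameter names here are illustrative of the benchmark's argument handling, not the exact vLLM code):

# Illustrative sketch only, not the exact implementation in vllm bench serve.
def resolve_ignore_eos(dataset_name: str, ignore_eos_flag: bool) -> bool:
    # An explicit --ignore-eos from the user always wins.
    if ignore_eos_flag:
        return True
    # For the random dataset, default ignore_eos to True so every request
    # generates its full requested output length (random_output_len defaults
    # to 128) and token counts stay comparable across runs.
    return dataset_name == "random"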

Signed-off-by: yewentao256 <zhyanwentao@126.com>
@mergify mergify bot added the performance Performance-related issues label Nov 6, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request changes the default behavior of the serve benchmark for random datasets by setting ignore_eos to True. This is a sensible change to ensure that generation proceeds for the full requested length, improving benchmark consistency. My review focuses on improving the clarity and correctness of the implementation. The current logic contains a redundant check and a misleading comment. I've suggested changes to simplify the code and align the comments with the actual behavior, making the code easier to understand and maintain.

Signed-off-by: yewentao256 <zhyanwentao@126.com>
@yewentao256 changed the title from "[Feature] Default ignore_eos True when random_output_len is set" to "[Feature] Default ignore_eos True for random dataset" Nov 6, 2025
@yewentao256 yewentao256 added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 6, 2025
@mgoin mgoin merged commit 4b1ff13 into main Nov 7, 2025
49 checks passed
@mgoin mgoin deleted the wentao-default-ignore_eos-when-random_output_len-is-set branch November 7, 2025 12:35
ZhengHongming888 pushed a commit to ZhengHongming888/vllm that referenced this pull request Nov 8, 2025
…t#28227)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Nov 13, 2025
…t#28227)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>