[Misc] Fix Benchmark TTFT Calculation for Chat Completions #3768

ywang96 · 2024-04-01T05:42:07Z

This PR fixes the TTFT calculation when running serving benchmark with chat completions API. Previously it didn't check if actual content is Null in the delta, and thus role chunk with empty content could be mistakenly treated as a real token response.

Note: This does not affect existing CI results since by default vLLM uses completions instead of chat completions for benchmarking.

Fixes #3759

zhuohan123

LGTM!

fix

26f5922

zhuohan123 approved these changes Apr 1, 2024

View reviewed changes

zhuohan123 merged commit ccb58b2 into vllm-project:main Apr 1, 2024
34 checks passed

dtrifiro mentioned this pull request May 15, 2024

bump ubi base image tag opendatahub-io/vllm#24

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Misc] Fix Benchmark TTFT Calculation for Chat Completions #3768

[Misc] Fix Benchmark TTFT Calculation for Chat Completions #3768

ywang96 commented Apr 1, 2024

zhuohan123 left a comment

[Misc] Fix Benchmark TTFT Calculation for Chat Completions #3768

[Misc] Fix Benchmark TTFT Calculation for Chat Completions #3768

Conversation

ywang96 commented Apr 1, 2024

zhuohan123 left a comment

Choose a reason for hiding this comment