
[Misc] Fix Benchmark TTFT Calculation for Chat Completions #3768

Merged
1 commit merged into vllm-project:main on Apr 1, 2024

Conversation

@ywang96 (Collaborator) commented Apr 1, 2024

This PR fixes the TTFT calculation when running the serving benchmark against the chat completions API. Previously, the code did not check whether the content in the delta was actually null, so the initial role chunk with empty content could be mistakenly counted as the first real token.

Note: This does not affect existing CI results, since vLLM benchmarks against the completions API by default rather than chat completions.
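The fix described above can be sketched as follows. This is a minimal, hypothetical illustration of the TTFT check, not the actual diff: it assumes a stream of OpenAI-style chat-completion chunks whose first delta carries only the role with empty or null content, and measures TTFT from the first delta that carries real text.

```python
import json
import time


def first_token_latency(chunks, start_time):
    """Return the time-to-first-token for a stream of chat-completion
    chunk payloads (hypothetical helper, modeled on the benchmark loop).

    Chat-completions streams begin with a role-only delta whose "content"
    is null or empty; counting that chunk as the first token would
    understate TTFT.
    """
    for raw in chunks:
        data = json.loads(raw)
        delta = data["choices"][0]["delta"]
        content = delta.get("content")
        # The fix: only record TTFT when the delta actually carries text,
        # not when it merely announces the assistant role.
        if content:
            return time.perf_counter() - start_time
    return None


if __name__ == "__main__":
    # Simulated stream: a role-only chunk, then the first real token.
    stream = [
        json.dumps({"choices": [{"delta": {"role": "assistant", "content": None}}]}),
        json.dumps({"choices": [{"delta": {"content": "Hello"}}]}),
    ]
    ttft = first_token_latency(stream, time.perf_counter())
    print(f"TTFT: {ttft:.6f}s")
```

Without the `if content:` guard, the role-only chunk would be treated as the first token and the reported TTFT would be artificially low.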

Fixes #3759

@zhuohan123 (Collaborator) left a comment:


LGTM!

@zhuohan123 zhuohan123 merged commit ccb58b2 into vllm-project:main Apr 1, 2024
34 checks passed
Successfully merging this pull request may close these issues.

TTFT in openllm with vllm backend VS vllm