o11y: Add TTFT and TPOT histograms for SLOs #126
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
✅ Deploy Preview for vllm-semantic-router ready!
👥 vLLM Semantic Team Notification: The following members have been identified for the changed files in this PR and have been automatically assigned: 📁
  // handleResponseHeaders processes the response headers
- func (r *OpenAIRouter) handleResponseHeaders(_ *ext_proc.ProcessingRequest_ResponseHeaders) (*ext_proc.ProcessingResponse, error) {
+ func (r *OpenAIRouter) handleResponseHeaders(_ *ext_proc.ProcessingRequest_ResponseHeaders, ctx *RequestContext) (*ext_proc.ProcessingResponse, error) {
+ 	// Best-effort TTFT measurement: record on first response headers if we have a start time and model
We haven't tried streaming mode yet. In buffered mode, the response from the LLM has to be fully received before it is passed on, so a timestamp taken at the response headers is not a true time to first token. If you can open an issue to track TTFT in streaming mode, that would be great.
Sure
#128 for tracking
@tao12345666333 go.mod needs an update. Once CI is green, this is ready to go. Thanks.
What type of PR is this?
o11y: Add TTFT and TPOT histograms for SLOs
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #121
Release Notes: Yes/No