feat: make model metrics endpoints configurable #1000
base: main
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: nayihz
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
✅ Deploy Preview for gateway-api-inference-extension ready!
I have some doubts about adding additional fields to InferencePool. cc @kfswain @ahg-g @robscott @danehans @elevran
/hold for others to comment.
ffad486 to f57478d
@nirrozenbaum, thanks for your advice. I think you are right.
@nayihz I would start with command-line args with default values (the existing ones).
f57478d to 9fa17f7
@nayihz @nirrozenbaum this introduces a fixed endpoint for all model servers in the pool.

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "8080"
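
For illustration only, here is a minimal Go sketch of how a per-pod endpoint could be derived from annotations like the ones above, rather than from a single pool-wide setting. The function name `endpointFromAnnotations`, the `http` scheme, and the `/metrics` fallback are assumptions made for this sketch; none of them come from the PR.

```go
package main

import (
	"fmt"
	"net"
)

// endpointFromAnnotations is a hypothetical helper (not part of this PR).
// It builds a scrape URL from prometheus.io/* style pod annotations and
// reports false when the pod does not opt in or does not declare a port.
func endpointFromAnnotations(podIP string, annotations map[string]string) (string, bool) {
	if annotations["prometheus.io/scrape"] != "true" {
		return "", false
	}
	port := annotations["prometheus.io/port"]
	if port == "" {
		return "", false
	}
	path := annotations["prometheus.io/path"]
	if path == "" {
		path = "/metrics" // assumed fallback for this sketch
	}
	// The http scheme is an assumption; JoinHostPort also handles IPv6 addresses.
	return "http://" + net.JoinHostPort(podIP, port) + path, true
}

func main() {
	annotations := map[string]string{
		"prometheus.io/scrape": "true",
		"prometheus.io/path":   "/metrics",
		"prometheus.io/port":   "8080",
	}
	if url, ok := endpointFromAnnotations("10.0.0.5", annotations); ok {
		fmt.Println(url) // http://10.0.0.5:8080/metrics
	}
}
```

The trade-off is that every model server pod has to carry the annotations, whereas a pool-wide flag needs no changes to the pods.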
@elevran we already have a fixed endpoint, so this PR is not introducing it :). The intention was to make that endpoint configurable.
@@ -73,6 +73,10 @@ func (f *FakePodMetricsClient) FetchMetrics(ctx context.Context, pod *backend.Po
	return res.Clone(), nil
}

func (f *FakePodMetricsClient) GetMetricEndpoint(_ *backend.Pod, _ int32) string {
	return "fake-metric-endpoint"
nit: Pluralize func name and return value
cmd/epp/runner/runner.go
Outdated
@@ -110,6 +110,9 @@ var (
		"vllm:lora_requests_info",
		"Prometheus metric for the LoRA info metrics (must be in vLLM label format).")

	metricPort = flag.Int("metricPort", 0, "Port to scrape metrics from pods")
nit: Pluralize CLI args
The CLI arg names should be more descriptive, e.g. podMetricsPort or modelMetricsPort.
@nirrozenbaum @kfswain any thoughts on the names for these CLI args?
podMetricsPort or modelMetricsPort seem reasonable enough. If we wanted to be very specific, we could call it modelServerMetricsPort, as we are scraping for very specific metrics. But that's honestly being very nitty.
modelServerMetricsPort sounds good to me.
Addressed. PTAL.
/unhold
9fa17f7 to 7466a28
fix: #16