feat: Make metrics stale time configurable #1046

Open
nayihz wants to merge 2 commits into main from feat_metric_stale_time

Conversation

nayihz
Contributor

@nayihz nayihz commented Jun 23, 2025

fix: #336
changes ref: #336 (comment)

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 23, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: nayihz
Once this PR has been reviewed and has the lgtm label, please assign arangogutierrez for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from ahg-g and robscott June 23, 2025 12:33
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 23, 2025

netlify bot commented Jun 23, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 75047b3
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/6864a429fcd1c300083f7e11
😎 Deploy Preview https://deploy-preview-1046--gateway-api-inference-extension.netlify.app

@nayihz nayihz force-pushed the feat_metric_stale_time branch from 12f8bfe to 2d42a53 Compare June 23, 2025 12:35
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 25, 2025
@nayihz nayihz force-pushed the feat_metric_stale_time branch from a339897 to 1005486 Compare June 25, 2025 05:27
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 25, 2025
@nayihz nayihz force-pushed the feat_metric_stale_time branch from 1005486 to 9b1e7e2 Compare June 25, 2025 05:28
@nayihz nayihz marked this pull request as ready for review June 25, 2025 09:22
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 25, 2025
@k8s-ci-robot k8s-ci-robot requested a review from danehans June 25, 2025 09:22
@nayihz nayihz force-pushed the feat_metric_stale_time branch from 9b1e7e2 to 12943a7 Compare June 25, 2025 09:29
@nayihz
Contributor Author

nayihz commented Jun 25, 2025

/cc @liu-cong

@k8s-ci-robot k8s-ci-robot requested a review from liu-cong June 25, 2025 09:48
@nayihz nayihz force-pushed the feat_metric_stale_time branch from 12943a7 to 518655c Compare June 29, 2025 07:11
@nayihz nayihz force-pushed the feat_metric_stale_time branch from 518655c to e54be57 Compare June 29, 2025 13:31
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 30, 2025
@nayihz
Contributor Author

nayihz commented Jul 1, 2025

I found that it becomes very inconvenient to write unit tests after updating PodGetAll to PodGetAllWithFreshMetrics. But after reading the code in depth, I still couldn't come up with a good solution. Any ideas on this? @nirrozenbaum @liu-cong
https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1046/files#diff-1b7741fc131b712835ea0040fe1dc86b62403c0b124f0d672ef8bfadb84d32d3R325-R328

https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1046/files#diff-1b7741fc131b712835ea0040fe1dc86b62403c0b124f0d672ef8bfadb84d32d3R353

@nirrozenbaum
Contributor

> I found that it becomes very inconvenient to write unit tests after updating PodGetAll to PodGetAllWithFreshMetrics. But after reading the code in depth, I still couldn't come up with a good solution. Any ideas on this? @nirrozenbaum @liu-cong
> https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1046/files#diff-1b7741fc131b712835ea0040fe1dc86b62403c0b124f0d672ef8bfadb84d32d3R325-R328
>
> https://github.com/kubernetes-sigs/gateway-api-inference-extension/pull/1046/files#diff-1b7741fc131b712835ea0040fe1dc86b62403c0b124f0d672ef8bfadb84d32d3R353

@nayihz I don’t want to nitpick too much, but to be honest I’m not sure why the interface change was required.
Today, before this PR, the datastore already has PodGetAll and PodList(predicate).
Couldn’t we implement “get pods with fresh metrics” as PodList(predicate), where the predicate returns true only for pods with fresh metrics?

@nirrozenbaum
Contributor

nirrozenbaum commented Jul 1, 2025

I mean: leave the PodGetAll function as is, and use PodList with that predicate only in the specific places where it’s needed. Would that help?
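
For readers following the thread, here is a minimal sketch of the predicate-based approach suggested above. It is not the project's code: `PodMetrics`, `Datastore`, the `UpdateTime` field, and the `FreshMetrics` helper are hypothetical stand-ins, and the real `PodList` signature may differ. It only illustrates that `PodGetAll` can stay unchanged while freshness filtering becomes a predicate with a configurable stale threshold.

```go
package main

import (
	"fmt"
	"time"
)

// PodMetrics is a hypothetical stand-in for the project's pod metrics type.
type PodMetrics struct {
	Name       string
	UpdateTime time.Time // assumed: when the metrics were last refreshed
}

// Datastore is a hypothetical in-memory store; the real datastore differs.
type Datastore struct {
	pods []PodMetrics
}

// PodList returns only the pods for which predicate returns true.
func (d *Datastore) PodList(predicate func(PodMetrics) bool) []PodMetrics {
	var out []PodMetrics
	for _, p := range d.pods {
		if predicate(p) {
			out = append(out, p)
		}
	}
	return out
}

// PodGetAll keeps its existing meaning: every pod, fresh or stale.
func (d *Datastore) PodGetAll() []PodMetrics {
	return d.PodList(func(PodMetrics) bool { return true })
}

// FreshMetrics (hypothetical name) builds a predicate that accepts a pod
// only if its metrics were refreshed within staleThreshold of now.
func FreshMetrics(now time.Time, staleThreshold time.Duration) func(PodMetrics) bool {
	return func(p PodMetrics) bool {
		return now.Sub(p.UpdateTime) <= staleThreshold
	}
}

// demo builds a two-pod store and returns the total pod count and the
// names of the pods whose metrics are fresh under a 2s threshold.
func demo() (int, []string) {
	now := time.Date(2025, 7, 1, 0, 0, 0, 0, time.UTC)
	ds := &Datastore{pods: []PodMetrics{
		{Name: "pod-a", UpdateTime: now.Add(-500 * time.Millisecond)},
		{Name: "pod-b", UpdateTime: now.Add(-10 * time.Second)},
	}}
	var fresh []string
	for _, p := range ds.PodList(FreshMetrics(now, 2*time.Second)) {
		fresh = append(fresh, p.Name)
	}
	return len(ds.PodGetAll()), fresh
}

func main() {
	total, fresh := demo()
	fmt.Println(total, fresh) // prints: 2 [pod-a]
}
```

Passing `now` into the predicate builder, rather than calling `time.Now()` inside it, is a design choice made here so that tests can control time; whether the PR does this is not stated in the thread.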

@nayihz nayihz force-pushed the feat_metric_stale_time branch from e54be57 to 6bde389 Compare July 2, 2025 02:30
@nayihz nayihz force-pushed the feat_metric_stale_time branch from 6bde389 to bff4272 Compare July 2, 2025 02:38
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 2, 2025
@nayihz
Contributor Author

nayihz commented Jul 2, 2025

> to leave PodGetAll function as is.. and use the ListPod with that predicate only in the specific places it’s needed.

Makes sense to me.

@nayihz nayihz force-pushed the feat_metric_stale_time branch from bff4272 to 1fbad64 Compare July 2, 2025 02:58
@nayihz nayihz force-pushed the feat_metric_stale_time branch from 1fbad64 to 75047b3 Compare July 2, 2025 03:14
@nayihz
Contributor Author

nayihz commented Jul 2, 2025

if !found {
	return schedulingtypes.ToSchedulerPodMetrics(d.datastore.PodGetAll())
}
// Check if endpoint key is present in the subset map and ensure there is at least one value
endpointSubsetList, found := subsetMap[subsetHintKey].([]any)
if !found {
	return schedulingtypes.ToSchedulerPodMetrics(d.datastore.PodGetAll())

I will change d.datastore.PodGetAll to d.datastore.PodList(backendmetrics.FreshMetricsFn) in a follow-up, because it requires refactoring the unit tests. @nirrozenbaum
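
One possible shape for that follow-up, sketched under assumptions: if the call site depends on a narrow interface that exposes only `PodList`, a unit test can supply a small fake instead of a full datastore. Every name below (`Pod`, `podLister`, `fakeDatastore`, `freshMetricsFn`, `candidatePods`) is hypothetical, the `Fresh` flag replaces real timestamp logic, and the subset branch is an illustration rather than the project's behavior.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified pod record. Fresh stands in for
// "metrics were updated within the configured stale threshold".
type Pod struct {
	Name  string
	Fresh bool
}

// podLister is a hypothetical narrow interface: the only datastore
// method this call site needs.
type podLister interface {
	PodList(predicate func(Pod) bool) []Pod
}

// fakeDatastore is a test double that filters a fixed slice of pods.
type fakeDatastore struct {
	pods []Pod
}

func (f fakeDatastore) PodList(predicate func(Pod) bool) []Pod {
	var out []Pod
	for _, p := range f.pods {
		if predicate(p) {
			out = append(out, p)
		}
	}
	return out
}

// freshMetricsFn is a stand-in for the backendmetrics.FreshMetricsFn
// predicate mentioned in the comment; its real definition is not shown
// in this thread.
func freshMetricsFn(p Pod) bool { return p.Fresh }

// candidatePods mirrors the fallback in the quoted snippet: with no
// subset hint it returns every pod with fresh metrics. The subset
// branch (assumed here) also keeps only fresh pods.
func candidatePods(ds podLister, subset map[string]bool) []Pod {
	if len(subset) == 0 {
		return ds.PodList(freshMetricsFn)
	}
	return ds.PodList(func(p Pod) bool {
		return subset[p.Name] && freshMetricsFn(p)
	})
}

func main() {
	ds := fakeDatastore{pods: []Pod{
		{Name: "pod-a", Fresh: true},
		{Name: "pod-b", Fresh: false},
		{Name: "pod-c", Fresh: true},
	}}
	fmt.Println(len(candidatePods(ds, nil)))                             // prints: 2
	fmt.Println(len(candidatePods(ds, map[string]bool{"pod-c": true}))) // prints: 1
}
```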

@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 10, 2025
@nayihz nayihz requested review from nirrozenbaum and liu-cong July 10, 2025 01:16
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make metrics stale time configurable
4 participants