Add subsetting logic for epp #981

Merged
merged 1 commit into from
Jun 25, 2025

Conversation

rlakhtakia
Contributor

Issue: #415
Proposal

Add a subsetting filter so that the EPP selects only from the list of endpoints passed in through request metadata.

Changes:

  • Filter logic + unit test + integration test
  • Update request type to pass in metadata context
  • Update scheduler profiles to use filter

netlify bot commented Jun 13, 2025

Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 4364512 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/gateway-api-inference-extension/deploys/685c58d0b17b570008c0605e |
| 😎 Deploy Preview | https://deploy-preview-981--gateway-api-inference-extension.netlify.app |

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 13, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 13, 2025
@k8s-ci-robot
Contributor

Hi @rlakhtakia. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 13, 2025
Contributor

@nirrozenbaum nirrozenbaum left a comment

Envoy specifics should live only in server.go.
The rest of the code should work with general Go structures such as maps.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 13, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 13, 2025
@nirrozenbaum
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 16, 2025
Collaborator

@kfswain kfswain left a comment

just a couple comments to future proof us! otherwise lgtm

@kfswain
Collaborator

kfswain commented Jun 17, 2025

/retest

@rlakhtakia
Contributor Author

/retest

Contributor

@liu-cong liu-cong left a comment

just a few nits

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 18, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 18, 2025
@ahg-g
Contributor

ahg-g commented Jun 23, 2025

It is part of conformance.

My point is that this doesn't need to be the deciding factor of how to implement a feature. Plugins is an architectural design pattern where features are implemented as callbacks on defined extension points. The configuration of the plugins is where we decide which ones are a must (part of conformance) and which are optional. The former can be done in code (or via validation on the config api), the latter via the config api.

@kfswain
Collaborator

kfswain commented Jun 23, 2025

Yeah, I see your point, just making sure we are considering this fundamental.

I think it's fine, so long as it's only filters, and we don't add any extension points that run before filters.

To me it makes sense that any immutable features run strictly before or after any user configured code. So that works for Filters for now, but may not always be the case.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 23, 2025
@rlakhtakia
Contributor Author

/retest

@nirrozenbaum
Contributor

nirrozenbaum commented Jun 23, 2025

> To me it makes sense that any immutable features run strictly before or after any user configured code. So that works for Filters for now, but may not always be the case.

++. agreed.
Subset Filter works for now, but this may not always be the case.
anyway, I didn't intend to block this PR, just to raise a point to think about.
let's make progress with merging this one as soon as it passes the tests and we can think about this more as we make progress.

@ahg-g
Contributor

ahg-g commented Jun 23, 2025

@rlakhtakia pls rebase and run unit tests locally, this should allow you to find compile time problems earlier.

@nirrozenbaum
Contributor

nirrozenbaum commented Jun 25, 2025

I went over the PR. overall it LGTM if we decide to keep it as filter.

one additional concern about the earlier discussion on this thread -
in case this is implemented via filter and we have multiple scheduling profiles - this filter has to run in each.
so if for example we have 4 profiles, 10000 candidate pods, and we specify in the subset key a list of 10 pods to choose from - we will need to filter the 10 pods out of the 10000 candidates 4 times.

> To me it makes sense that any immutable features run strictly before or after any user configured code. So that works for Filters for now, but may not always be the case.

so in addition to the above comment @kfswain wrote, this works fine in current design as long as we have only one profile (not the case in llm-d).
the alternative is to implement it here (replace ds.PodGetAll with ds.PodList(predicate)):

candidatePods := schedulingtypes.ToSchedulerPodMetrics(d.datastore.PodGetAll())
results, err := d.scheduler.Schedule(ctx, reqCtx.SchedulingRequest, candidatePods)

and then candidate pods are filtered only once before scheduling, no matter how many profiles we have.
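
A minimal sketch of the alternative described above, assuming the datastore exposed a predicate-based listing call. `Pod`, `Datastore`, `PodList`, `subsetPredicate`, and `newDemoDatastore` are hypothetical stand-ins written for illustration, not the project's actual types or signatures, and matching on the pod address is an assumption as well. The point is only that the subset is applied in a single pass before scheduling, so each profile starts from the narrowed list.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for the datastore's pod type.
type Pod struct {
	Name    string
	Address string
}

// Datastore is a hypothetical in-memory stand-in for the real datastore.
type Datastore struct {
	pods []Pod
}

// PodList returns the pods accepted by predicate. The signature is assumed.
func (d *Datastore) PodList(predicate func(Pod) bool) []Pod {
	matched := []Pod{}
	for _, p := range d.pods {
		if predicate(p) {
			matched = append(matched, p)
		}
	}
	return matched
}

// subsetPredicate accepts only pods whose address appears in subset.
func subsetPredicate(subset []string) func(Pod) bool {
	allowed := make(map[string]struct{}, len(subset))
	for _, addr := range subset {
		allowed[addr] = struct{}{}
	}
	return func(p Pod) bool {
		_, ok := allowed[p.Address]
		return ok
	}
}

// newDemoDatastore builds n pods named and addressed "pod-0" .. "pod-(n-1)".
func newDemoDatastore(n int) *Datastore {
	ds := &Datastore{}
	for i := 0; i < n; i++ {
		name := fmt.Sprintf("pod-%d", i)
		ds.pods = append(ds.pods, Pod{Name: name, Address: name})
	}
	return ds
}

func main() {
	ds := newDemoDatastore(10000)
	subset := []string{"pod-1", "pod-2", "pod-3"}

	// One pass over the datastore, before any profile runs.
	candidates := ds.PodList(subsetPredicate(subset))

	// Every profile now starts from the already-narrowed candidate list.
	const profiles = 4
	fmt.Println("candidates per profile:", len(candidates))
	fmt.Println("pods examined across profiles:", profiles*len(candidates))
}
```

With the filter-per-profile approach, the same 10000 pods would be scanned once in each of the 4 profiles; here the scan happens once and the profiles only see the 3 matches.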

@ahg-g @kfswain leaving the final stamp for you to decide if we want to merge or not.

Contributor

@ahg-g ahg-g left a comment

Thanks! One comment on how to deal with cases where the metadata exists but the list is empty, otherwise looks great.

@@ -304,6 +305,7 @@ func (r *Runner) initializeScheduler() (*scheduling.Scheduler, error) {
kvCacheScorerWeight := envutil.GetEnvInt("KV_CACHE_SCORE_WEIGHT", scorer.DefaultKVCacheScorerWeight, setupLog)

schedulerProfile := framework.NewSchedulerProfile().
WithFilters(filter.NewSubsetFilter()).
Contributor

This is fine for now, but please open an issue to allow configuring this across profiles irrespective of the source of the configuration (see the discussion we had on the issue)

if !found {
return pods
} else if len(endpointSubsetList) == 0 {
return pods
Contributor

This needs to return an empty list here, not all pods: when the metadata exists but the list is empty, every pod should be filtered out.

Contributor Author

fixed now, and added a testcase to verify this.
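
For reference, the three cases discussed in this thread can be sketched roughly as below. This is not the merged code: `Pod` and `filterBySubset` are hypothetical names, and matching on the pod address is an assumption made for illustration. It shows only the agreed semantics: no subset metadata keeps all pods, an empty list keeps none, and a non-empty list keeps just the listed endpoints.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for the scheduler's pod type.
type Pod struct {
	Name    string
	Address string
}

// filterBySubset is an illustrative helper, not the PR's actual function.
// found reports whether the subset key was present in the request metadata.
func filterBySubset(pods []Pod, subset []string, found bool) []Pod {
	if !found {
		// No subset hint at all: leave the candidate list untouched.
		return pods
	}
	// Hint present: keep only the listed endpoints.
	// An empty list therefore keeps nothing.
	allowed := make(map[string]struct{}, len(subset))
	for _, addr := range subset {
		allowed[addr] = struct{}{}
	}
	filtered := []Pod{}
	for _, p := range pods {
		if _, ok := allowed[p.Address]; ok {
			filtered = append(filtered, p)
		}
	}
	return filtered
}

func main() {
	pods := []Pod{{"a", "10.0.0.1"}, {"b", "10.0.0.2"}, {"c", "10.0.0.3"}}
	fmt.Println(len(filterBySubset(pods, nil, false)))                 // 3
	fmt.Println(len(filterBySubset(pods, []string{}, true)))           // 0
	fmt.Println(len(filterBySubset(pods, []string{"10.0.0.2"}, true))) // 1
}
```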

@ahg-g
Contributor

ahg-g commented Jun 25, 2025

@nirrozenbaum agree on the need for a followup, my thinking is to have a "framework" for configuring mandatory plugins via code (i.e., this is not a user-facing API). An initial cut would be as simple as a list of filters that get prepended to all profiles. Since this is hardcoded, we can evolve it slowly and by reacting to new needs, it doesn't need to be fully fleshed out from the beginning since it is not user facing.
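
One way to picture that initial cut, as a sketch only: a hardcoded list of filters that is prepended to every profile's filter chain. `Filter`, `namedFilter`, `Profile`, `mandatoryFilters`, and `prependMandatory`, along with the filter and profile names used in `main`, are hypothetical and do not reflect the project's framework types.

```go
package main

import "fmt"

// Filter is a hypothetical, minimal filter plugin interface.
type Filter interface {
	Name() string
}

// namedFilter is a toy Filter that only carries a name.
type namedFilter string

func (f namedFilter) Name() string { return string(f) }

// Profile is a hypothetical scheduler profile with an ordered filter chain.
type Profile struct {
	Filters []Filter
}

// mandatoryFilters is hardcoded on purpose: it is not user-configurable.
var mandatoryFilters = []Filter{namedFilter("subset")}

// prependMandatory returns copies of the profiles whose filter chains start
// with the mandatory filters, followed by the profile's own filters.
func prependMandatory(profiles map[string]Profile) map[string]Profile {
	out := make(map[string]Profile, len(profiles))
	for name, p := range profiles {
		chain := make([]Filter, 0, len(mandatoryFilters)+len(p.Filters))
		chain = append(chain, mandatoryFilters...)
		chain = append(chain, p.Filters...)
		out[name] = Profile{Filters: chain}
	}
	return out
}

func main() {
	profiles := map[string]Profile{
		"default": {Filters: []Filter{namedFilter("low-queue")}},
		"prefill": {},
	}
	for name, p := range prependMandatory(profiles) {
		names := []string{}
		for _, f := range p.Filters {
			names = append(names, f.Name())
		}
		fmt.Println(name, names)
	}
}
```

Because the list lives in code, a profile author cannot accidentally drop the subset filter, and the mandatory filters always run before any user-configured ones.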

@ahg-g
Contributor

ahg-g commented Jun 25, 2025

Created #1068 as a follow up

@ahg-g
Contributor

ahg-g commented Jun 25, 2025

/lgtm
/approve

Congrats @rlakhtakia on your first inference gateway PR!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 25, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, rlakhtakia

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 25, 2025
@k8s-ci-robot k8s-ci-robot merged commit 6b82b89 into kubernetes-sigs:main Jun 25, 2025
9 checks passed
@rlakhtakia rlakhtakia deleted the filter branch June 26, 2025 07:36
rlakhtakia added a commit to rlakhtakia/gateway-api-inference-extension that referenced this pull request Jun 26, 2025
shmuelk pushed a commit to shmuelk/gateway-api-inference-extension that referenced this pull request Jun 26, 2025
rlakhtakia added a commit to rlakhtakia/gateway-api-inference-extension that referenced this pull request Jun 26, 2025
EyalPazz pushed a commit to EyalPazz/gateway-api-inference-extension that referenced this pull request Jul 9, 2025
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
8 participants