-
Notifications
You must be signed in to change notification settings - Fork 102
feat(conformance): Add EPP conformance test for Gateway routing #961
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: zetxqx The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hi @zetxqx. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @zetxqx!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the work on this @zetxqx!
/ok-to-test |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @zetxqx! This is looking pretty close
# --- Backend Service --- | ||
# Service for the infra-backend-deployment. | ||
apiVersion: inference.networking.x-k8s.io/v1alpha2 | ||
kind: InferenceModel | ||
metadata: | ||
name: conformance-fake-model-server | ||
namespace: gateway-conformance-app-backend | ||
spec: | ||
modelName: conformance-fake-model | ||
criticality: Critical # Mark it as critical to bypass the saturation check since the model server is fake and don't have such metrics. | ||
poolRef: | ||
name: normal-gateway-pool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a Service + I don't think we need it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename the comment.
"I don't think we need it"
I think we still need the InferenceModel to direct the traffic to the correct POD hosting the model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't think we're testing EPP behavior? Instead I thought that every request would tell EPP which Pod IP(s) to return, and then our test would validate EPP behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about the test's intent. The need for the InferenceModel is an implementation detail of the testingEPP.
Even though we control the final endpoint list with request headers, the testingEPP still requires the InferenceModel resource to:
- Discover the initial list of all possible pods from the referenced InferencePoo. The requestHeadBased filter(testingEPP used) then selects from this list.
- Identify the modelName it is serving
So, while the test logic focuses on the gateway's reaction to the EPP's output, the testingEPP itself still needs the InferenceModel to know the universe of possible backends and the model being served
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little hard and needs more refactoring because currently the Pod lists is from the datastore(https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/epp/scheduling/scheduler.go#L108). And it's getting updated from some reconciler logic I believe. Hence there is not a easy way we can pass in a input parameters to EPP as what we did for custom filter.
Also, currently the EPP request control flow tries to find a inferenceModel(https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/epp/requestcontrol/director.go#L94-L97), so w/o inferenceModel the EPP will always return error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we can do that. @zetxqx will you cut an issue and assign it to me? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks both,
- I created an issue to track the EPP refactoring: Refactor EPP to decouple core logic from
InferenceModel
resource #1001. @kfswain - I created another issue for remove inferenceModel dependency in the conformance tests: Conformance: refactor
GatewayFollowingEPPRouting
Conformance Test to RemoveInferenceModel
Dependency #1002
This pull request introduces a new conformance test,
GatewayFollowingEPPRouting
, to validate that a Gateway correctly implements routing logic based on an EPP.The new test suite ensures that the Gateway honors the EPP's endpoint selection. We used the testing-epp to control the EPP behavior using request header. The two test cases are
Verification
The new conformance test was run against the
gke-l7-regional-external-managed
GatewayClass and passed successfully.against gke-l7
Test command
Test results
against istio (FAILED)
The istio implementation may still send traffic to pods not selected by EPP.
Test results
TODO(all done now)