-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: use /status/ready as readiness probe of gateway for gateway discovery #4368
Conversation
E2E (targeted) tests with KIND-based clusters were started at https://github.com/Kong/kubernetes-ingress-controller/actions/runs/5591322167 |
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #4368 +/- ##
=======================================
+ Coverage 65.7% 65.9% +0.2%
=======================================
Files 157 158 +1
Lines 18159 18274 +115
=======================================
+ Hits 11939 12052 +113
- Misses 5481 5484 +3
+ Partials 739 738 -1
☔ View full report in Codecov by Sentry. |
E2E (targeted) tests with KIND-based clusters were started at https://github.com/Kong/kubernetes-ingress-controller/actions/runs/5591805351 |
b099152
to
e086e3e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect this to have any affect on interaction with older versions? I don't see any testing with 3.0 (proxy pods come up fine and get config, no controller errors for the most part) but have a minor bugbear that there should be some odd behavior on versions where the endpoint doesn't exist.
There were some errors on scale up about failing to post config, but posts were successful after a few seconds, so it seems like we maybe try and push config before the instance is fully ready, but that's benign (and happens with 3.3 too, so it's not an older version behavior difference).
I don't think so. The behavior of the controller will be compatible with any version of Gateway we supported so far. Despite the readiness endpoint used for Gateway's
That's interesting - I was able to reproduce it on scale up as you mentioned. The initial readiness check passes (we succeed to create a client and it returns no error on Perhaps this could be solved by making the checks made by KIC to keep track of success calls and consider clients ready only after they reach some success threshold (and similarly - we could have such a threshold for failures). I'd say that if we'd like to implement this, we can do that in a separate PR, WDYT? |
36cf2c3
to
68a21f2
Compare
Yeah, it's not a concern for what this PR is trying to do, nor was it introduced by the changes here. I just noticed it when checking for these changes' backwards compatibility. Backwards compatibility for this is indeed fine. |
e04f1d5
to
92ffd58
Compare
What this PR does / why we need it:
Changes the Gateway's Pod readiness probe from
/status
to/status/ready
./status/ready
returns 200 after receiving the first non-empty configuration. That allows to make Gateway pods to serve proxy traffic only after they're initially configured by KIC.In KIC's Gateway discovery, instead of relying on
EndpointSlices
Endpoints
' being ready, we use the discovered endpoints despite their readiness (we only discard terminating ones) and verify their readiness on our own with the use of the Admin API's/status
in theReadinessChecker
. It requires running a periodic readiness reconciliation loop to check whether the endpoints that were ready should be moved to the pending list and vice versa.Fundamental changes that were made in this PR:
KongAdminAPIServiceReconciler
watches Admin API service endpoints and pushes all but terminating ones to theAdminAPIClientsManager
(viaNotify
method)AdminAPIClientsManager
accepts the discovered endpoints and verifies their readiness with use of a newReadinessChecker
first:GatewayClients()
to the upper layers)AdminAPIClientsManager
is responsible for running a readiness reconciliation loop in which it usesReadinessChecker
to verify if:Which issue this PR fixes:
Closes #3499.
Special notes for your reviewer:
PR Readiness Checklist:
Complete these before marking the PR as
ready to review
:CHANGELOG.md
release notes have been updated to reflect any significant (and particularly user-facing) changes introduced by this PR