OCPBUGS-24363: Full implementation of KEP-1669 ProxyTerminatingEndpoints #4072
Conversation
go-controller/pkg/util/kube.go
// GetEndpointAddressesWithPreCondition first applies the (optionally) provided function fn
// to filter endpoints from the given list of endpoint slices; from the resulting list,
// it returns the IP addresses of eligible endpoints.
func GetEndpointAddressesWithPreCondition(endpointSlices []*discovery.EndpointSlice, service *kapi.Service, fn func(discovery.Endpoint) bool) sets.Set[string] {
Maybe pedantic, but why the change from GetEndpointAddressesWithCondition()?
No, that's a good question :)
In the previous implementation, the selection of endpoints was simpler and had no fallback case: I would just take all ready endpoints plus all serving terminating endpoints. GetEndpointAddressesWithCondition first selected endpoints that way and only afterwards applied the provided filter function (in our case, to keep only local endpoints).
With the new selection logic in this PR, the order between the two steps becomes important: I need to first apply the filter function (e.g. keep all local endpoints) and only then apply the new selection logic (take all ready endpoints, but if none are ready take serving terminating endpoints). The ordering here changes the result.
Imagine you have three nodes and you need to get the "eligible" endpoints local to node1:
- node1: endpoint1_serving&terminating, endpoint2_serving&terminating
- node2: endpoint3_serving&terminating, endpoint4_ready
- node3: endpoint5_ready, endpoint6_ready
If I first apply the new selection logic and only then keep local endpoints, I end up with no endpoints:
- [endpoint selection] all ready endpoints in the cluster: endpoint4, endpoint5, endpoint6
- [filter function] none, since the ready endpoints are not on node1
If I reverse the steps I get the result I want:
- [filter function] endpoint1, endpoint2 --> they are local to node1
- [endpoint selection] endpoint1, endpoint2 --> they are serving & terminating
So I replaced Condition with PreCondition in the function name to highlight that it gets applied first.
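The ordering argument above can be sketched with a minimal, self-contained model. The `endpoint` struct and the function names here are illustrative only (the real code operates on `discovery.Endpoint` from `k8s.io/api/discovery/v1`), but the two compositions reproduce the node1 example:

```go
package main

import "fmt"

// endpoint is a hypothetical, simplified stand-in for discovery.Endpoint.
type endpoint struct {
	name     string
	nodeName string
	ready    bool
	serving  bool // serving && terminating in the real API
}

// selectEligible sketches the KEP-1669 fallback: take all ready endpoints;
// only if none are ready, fall back to serving (terminating) endpoints.
func selectEligible(eps []endpoint) []endpoint {
	var ready, serving []endpoint
	for _, ep := range eps {
		if ep.ready {
			ready = append(ready, ep)
		} else if ep.serving {
			serving = append(serving, ep)
		}
	}
	if len(ready) > 0 {
		return ready
	}
	return serving
}

// filterLocal is the "precondition": keep only endpoints on the given node.
func filterLocal(eps []endpoint, node string) []endpoint {
	var out []endpoint
	for _, ep := range eps {
		if ep.nodeName == node {
			out = append(out, ep)
		}
	}
	return out
}

func main() {
	eps := []endpoint{
		{"endpoint1", "node1", false, true},
		{"endpoint2", "node1", false, true},
		{"endpoint3", "node2", false, true},
		{"endpoint4", "node2", true, true},
		{"endpoint5", "node3", true, true},
		{"endpoint6", "node3", true, true},
	}
	// Selection first, filter second: the ready endpoints 4-6 survive
	// selection but none are on node1, so the result is empty.
	fmt.Println(len(filterLocal(selectEligible(eps), "node1"))) // 0
	// Filter first, selection second: endpoints 1-2 are local, none are
	// ready, so the serving-terminating fallback keeps both.
	fmt.Println(len(selectEligible(filterLocal(eps, "node1")))) // 2
}
```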
I've just added a test case to cover the scenario I described above.
https://github.com/ovn-org/ovn-kubernetes/pull/4072/files#diff-4c82fea05fd74e2aaa346d3d3caaa27442ed190cf4057e8e1a810982eafdd286R1807
/retest
/retest-failed
	serviceStr = fmt.Sprintf(" for service %s/%s", service.Namespace, service.Name)
}
// separate IPv4 from IPv6 addresses for eligible endpoints
for _, endpoint := range getEligibleEndpoints(validSlices, service) {
So the logic above this line, could it be a condFn for getEligibleEndpoints / getSelectedEligibleEndpoints?
Here we're separating IPv4 from IPv6 endpoint addresses. Do I need a condFn for that?
Actually, here I should use the existing getEligibleEndpointAddresses, but at the cost of adding one more iteration through the endpoints. I'm a little hesitant to do that because it gets executed by ovnkube-controller, which configures /all/ the endpoints for a given service: https://github.com/ovn-org/ovn-kubernetes/blob/master/go-controller/pkg/ovn/controller/services/services_controller.go#L392-L402
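The IPv4/IPv6 separation being discussed can be sketched in isolation with the standard library; `splitByIPFamily` is a hypothetical helper name, not the controller's actual function:

```go
package main

import (
	"fmt"
	"net"
)

// splitByIPFamily separates endpoint addresses into IPv4 and IPv6 groups,
// mirroring the per-family bookkeeping done for dual-stack services.
// Illustrative only: the real controller uses its own helpers.
func splitByIPFamily(addrs []string) (v4, v6 []string) {
	for _, a := range addrs {
		ip := net.ParseIP(a)
		if ip == nil {
			continue // skip unparsable addresses
		}
		if ip.To4() != nil {
			v4 = append(v4, a)
		} else {
			v6 = append(v6, a)
		}
	}
	return v4, v6
}

func main() {
	v4, v6 := splitByIPFamily([]string{"10.244.0.5", "fd00::5", "10.244.1.7"})
	fmt.Println(v4) // [10.244.0.5 10.244.1.7]
	fmt.Println(v6) // [fd00::5]
}
```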
I mean the logic above this line, not the logic below. So the one where you are matching the port name & protocol of the service.
Oh OK, I see. condFn as I defined it is applied to single endpoints, because we need to check their node name, while in GetLbEndpoints we're filtering out entire slices. I don't know if rewriting it once again for this particular usage would improve the overall readability :)
The previous implementation was an approximation of KEP-1669 ProxyTerminatingEndpoints: we simply included terminating serving endpoints (ready=false, serving=true, terminating=true) along with ready ones in the endpoint selection logic. Let's fully implement KEP-1669 and only include terminating endpoints if none are ready. The selection follows two simple steps:
1) Take all ready endpoints.
2) If no ready endpoints were found, take all serving terminating endpoints.
This should also help with an issue found in a production cluster (https://issues.redhat.com/browse/OCPBUGS-24363) where, due to infrequent readiness probes, terminating endpoints were declared as non-serving (that is, their readiness probe failed) only quite late and were included as valid endpoints for quite a while, even though the existing ready endpoints should have been preferred.
Extended the test cases to include testing against multiple slices and dual-stack scenarios.
Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
Fix included in accepted release 4.16.0-0.nightly-2024-01-31-073538
The previous implementation was an approximation of KEP-1669 ProxyTerminatingEndpoints: we simply included terminating serving endpoints (ready=false, serving=true, terminating=true) along with ready ones in the endpoint selection logic. Let's fully implement KEP-1669 and only include terminating endpoints if none are ready. The selection follows two simple steps:
1) Take all ready endpoints.
2) If no ready endpoints were found, take all serving terminating endpoints.
This is done for all traffic policies (kubernetes/kubernetes#108691).
This should also help with an issue found in a production cluster (https://issues.redhat.com/browse/OCPBUGS-24363) where, due to infrequent readiness probes, terminating endpoints were declared as non-serving (that is, their readiness probe failed) only quite late and were included as valid endpoints for quite a while, even though the existing ready endpoints should have been preferred.
Extended the test cases to include testing against multiple slices and dual-stack scenarios.
Successfully tested downstream on 4.16 (openshift/ovn-kubernetes#2008) and 4.14 (openshift/ovn-kubernetes#2009, which is relevant because of the OVNK architecture change)
Fixes #OCPBUGS-24363