KEP-3037: client-go alternative services #3034

aojea · 2021-11-03T12:11:55Z

One-line PR description: client-go capability to connect to multiple apiservers

Issue link: client-go: alternative services #3037

Other comments: Implementation [WIP] Alternative services kubernetes#106115

keps/sig-api-machinery/NNNN-client-go-alternative-services/README.md

deads2k · 2021-11-03T16:53:41Z

keps/sig-api-machinery/NNNN-client-go-alternative-services/README.md

+1103 12:30:59.066469 1558329 round_trippers.go:454] GET https://127.0.0.1:44267/api/v1/namespaces/default/pods?limit=500 200 OK in 1 milliseconds
+I1103 12:30:59.066484 1558329 round_trippers.go:460] Response Headers:
+I1103 12:30:59.066491 1558329 round_trippers.go:463]     Cache-Control: no-cache, private
+I1103 12:30:59.066502 1558329 round_trippers.go:463]     Alt-Svc: h2="10.0.0.2:6443", h2="10.0.0.3:6443", h2="10.0.0.4:6443


it's not clear how this will interact with readiness. the LB is checking every second or so, but the external clients will not. Because of this, it can fail over to a not-ready server (hard fail and restart perhaps), and the clients will be unaware.

the goal is not about nines of availability, is about being able to recover on a network or connectivity problem, it only failover when a network problem is detected, the trade off is, as you say, that readiness is not completely followed. On another note, only ready endpoints are published, since the apiservers remove their endpoint when they are not ready, that guarantees that only ready apiservers are published.

Per example, the most critical problem this solves is when the connection goes stale, the only way to get out of the loop is because the http2 readIdleTimeout (after 30s IIRC) will raise an error because the ping frame that will cause the client to fail over.

keps/sig-api-machinery/NNNN-client-go-alternative-services/README.md

keps/sig-api-machinery/NNNN-client-go-alternative-services/kep.yaml

keps/sig-api-machinery/NNNN-client-go-alternative-services/README.md

aojea · 2021-11-04T12:08:28Z

/assign @thockin @deads2k @lavalamp

thockin

Overall this seems plausible to me. Is there any notion of TTL for alt services?

keps/sig-api-machinery/3037-client-go-alternative-services/README.md

aojea · 2021-12-22T08:37:17Z

Overall this seems plausible to me. Is there any notion of TTL for alt services?

indeed, there is
https://datatracker.ietf.org/doc/html/rfc7838#section-2.2
https://datatracker.ietf.org/doc/html/rfc7838#section-3.1

thockin · 2021-12-22T16:55:50Z

Do we need to consider TTL here?

…

On Wed, Dec 22, 2021 at 12:37 AM Antonio Ojea ***@***.***> wrote: Overall this seems plausible to me. Is there any notion of TTL for alt services? indeed, there is https://datatracker.ietf.org/doc/html/rfc7838#section-2.2 https://datatracker.ietf.org/doc/html/rfc7838#section-3.1 — Reply to this email directly, view it on GitHub <#3034 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABKWAVAU6EI4BWSRF3HLVYDUSGE4RANCNFSM5HIYLO6Q> . Triage notifications on the go with GitHub Mobile for iOS <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>. You are receiving this because you were assigned.Message ID: ***@***.***>

aojea · 2021-12-22T17:04:31Z

Do we need to consider TTL here?

I've included it in my first prototype but then I didn't see the benefit:

the code is much more complicated with it, right now is less than 500 LOC including comments, ... kubernetes/kubernetes@8d8ae06#diff-2b78b84e7e41ca070ae0d042da6105c1f886d9d8347269ababe21812c95c4af5
kubernetes control-plane is not a high dynamic environment
apiserver already has another mechanisms to warn clients using http codes
if the alt-service has gone you will have a connection error, the client will expire that entry from the cache and will try another alt-svc

thockin · 2021-12-22T18:14:50Z

Is it worth saying that in the KEP? :)

…

On Wed, Dec 22, 2021 at 9:04 AM Antonio Ojea ***@***.***> wrote: Do we need to consider TTL here? I've included it in my first prototype but then I didn't see the benefit: - the code is much more complicated with it, right now is less than 500 LOC including comments, ... ***@***.*** #diff-2b78b84e7e41ca070ae0d042da6105c1f886d9d8347269ababe21812c95c4af5 <kubernetes/kubernetes@8d8ae06#diff-2b78b84e7e41ca070ae0d042da6105c1f886d9d8347269ababe21812c95c4af5> - kubernetes control-plane is not a high dynamic environment - apiserver already has another mechanisms to warn clients using http codes - if the alt-service has gone you will have a connection error, the client will expire that entry from the cache and will try another alt-svc — Reply to this email directly, view it on GitHub <#3034 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABKWAVDGWFTJTR7IPHVLGO3USIAKXANCNFSM5HIYLO6Q> . Triage notifications on the go with GitHub Mobile for iOS <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>. You are receiving this because you were assigned.Message ID: ***@***.***>

thockin · 2021-12-23T17:36:36Z

I am LGTM on this, but I defer to @deads2k and/or @lavalamp for approval

lavalamp · 2022-01-24T17:07:28Z

2 ways to configure client side HA , which are useful at different times of cluster lifecycle:

If you can keep kubeconfigs up to date, there's not really a need for a dynamic cache of the same data; if you can't keep them up to date, then it will conflict with such a cache.

IMO putting all addresses in the config file only makes sense if the addresses are static and never change.

I don't think we can do this as designed, but if some design can address those concerns, maybe there'd be room for both static and dynamic options; I just would not expect both to be on at the same time in the same cluster.

thockin · 2022-01-25T00:37:42Z

IMO putting all addresses in the config file only makes sense if the addresses are static and never change.

If I can add to that: and if they are not static, you can't put aany of them in the config file. It seems dubious to assume that one of them is static.

The potential use-case I see here is elastic apiservers (growing and shrinking number or replicas) and no load-balancer. Is that really a situation that arises?

aojea · 2022-01-25T08:35:21Z

The potential use-case I see here is elastic apiservers (growing and shrinking number or replicas) and no load-balancer. Is that really a situation that arises?

yeah, I've tried to cover that case too, there are some projects that has this elastic control plane feature

thockin · 2022-01-26T21:54:03Z

I think multiple IPs in client-go is a relatively obvious thing to do.

I don't find alt-svc to be bad, just very very niche. I don't feel like my opinion (with sig-net hat on) should be on the same tier with @lavalamp though, since he is much more in tune with api-machinery.

deads2k · 2022-01-27T17:09:14Z

After reviewing the KEP, reading the opinions of other leads, and speaking with @aojea I think there are two least-common-denominator concerns with this KEP that are not feasible to address using this design.

This design does not remove the need to have a load balancer, meta-server, or fixed IP of some kind that is always available to handle the first request. Since the first request still requires one of these, the overall operational complexity is increased, not constant or reduced.
This particular design requires that the kube-apiserver either know which IPs a client can reach or allow the kube-apiserver to return IPs that may have not-kube-apiservers listening in a client's environment.

While some individuals have additional concerns, those two are commonly shared. On balance, we (apimachinery leads) think that this particular design is not worth the additional complexity. To provide a more complete history for future attempts, we'd like to merge this KEP in a rejected state with an explanation of why.

aojea · 2022-01-27T17:49:36Z

👍 fine with me,

keps/sig-api-machinery/3037-client-go-alternative-services/kep.yaml

deads2k · 2022-01-27T17:57:37Z

/label tide/merge-method-squash

deads2k · 2022-01-27T17:58:16Z

fine with me,

Thanks.

/lgtm
/approve

deads2k · 2022-01-27T17:58:34Z

/hold

the verify failure looks real.

k8s-ci-robot · 2022-01-27T17:58:40Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, deads2k

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/sig-api-machinery/OWNERS~~ [deads2k]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

aojea · 2022-01-27T18:36:01Z

/hold cancel
it should be fixed now

deads2k · 2022-01-27T18:57:07Z

/lgtm

k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 3, 2021

k8s-ci-robot requested review from apelisse and jpbetz November 3, 2021 12:12

k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Nov 3, 2021

deads2k reviewed Nov 3, 2021

View reviewed changes