
Ingress HA, Scheduling, and Provisioning Proposal #34013

Closed
wants to merge 1 commit

Conversation

@mqliang (Contributor) commented Oct 4, 2016

The idea in this doc is proposed at a high level; the API/implementation design is also rough and mainly serves to illustrate the idea behind it. Comments are welcome.

@kubernetes/sig-network

cc @ddysher @hongchaodeng



@mqliang changed the title from "ingress proposal" to "HA, Scheduling, and Provisioning Proposal" Oct 4, 2016
@mqliang changed the title from "HA, Scheduling, and Provisioning Proposal" to "Ingress HA, Scheduling, and Provisioning Proposal" Oct 4, 2016
@k8s-github-robot added the kind/design, size/L, and release-note-label-needed labels Oct 4, 2016
@k8s-ci-robot (Contributor)

Jenkins GCI GKE smoke e2e failed for commit afa6d7a344ae1b106598396a2f6b0964b1b0d9b8. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot (Contributor)

Jenkins GKE smoke e2e failed for commit afa6d7a344ae1b106598396a2f6b0964b1b0d9b8. Full PR test history.

The magic incantation to run this job again is @k8s-bot gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot (Contributor)

Jenkins verification failed for commit d95360a. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@bprashanth (Contributor) left a comment

@kubernetes/sig-network
overall suggest streamlining this with the claims issue: #30151

## Goal
This Proposal aims to address the above issues by the following mechanism:

* Ingress HA: using keepalived and VIP to provide High Availability(mainly
Contributor

clarify how much this gives us over just running a Service over the ingress controller(s)?

Contributor (Author)

Do you mean running a NodePort-type Service over the ingress controller(s) and using keepalived-vip to provide HA for that Service?

I just want to simplify Ingress creation. Currently it requires:

  • create an Ingress ReplicaSet
  • create a NodePort-type Service over the Ingress ReplicaSet
  • create a keepalived-vip DaemonSet to provide HA for the Ingress Service

That's not a "happy path" and it hinders automation (especially considering Ingress auto-provisioning). Instead, it would be much more helpful to just create an Ingress ReplicaSet and implement the HA logic in the Ingress Pod.
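
For concreteness, here is a minimal sketch of the middle step above (the NodePort-type Service fronting the ingress controller pods), written against client-go API types; the names, labels, and ports are illustrative assumptions, not anything defined in this proposal.

```go
// Hypothetical sketch of the "NodePort-type Service over the Ingress
// ReplicaSet" step. All names, labels, and ports are illustrative only.
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// ingressFrontendService builds the NodePort Service that exposes the
// ingress controller pods (assumed here to carry the label app=ingress-lb).
func ingressFrontendService() *corev1.Service {
	return &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "ingress-lb",
			Namespace: "kube-system",
		},
		Spec: corev1.ServiceSpec{
			Type:     corev1.ServiceTypeNodePort,
			Selector: map[string]string{"app": "ingress-lb"},
			Ports: []corev1.ServicePort{
				{Name: "http", Port: 80, TargetPort: intstr.FromInt(80)},
				{Name: "https", Port: 443, TargetPort: intstr.FromInt(443)},
			},
		},
	}
}
```

A keepalived-vip DaemonSet advertising a VIP for this Service would then be the third manual step; the argument above is that collapsing these steps into the Ingress pool itself would make auto-provisioning much easier.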

for nginx/haproxy implementation, cloud implementation usually already
provide HA)
* Ingress Scheduling: schedule Ingress Resource to Ingress Pod/ReplicaSet
* Ingress Provisioning: allow user to dynamically add Ingress Pod/ReplicaSet
Contributor

This is a good goal, one we thought to solve with ingress claims: #30151. We haven't fleshed out the model yet.

Contributor

This is a pretty important feature. Is anyone actively working on it? We've already started working on some of it; @mqliang will update the proposal in a few days.

* Ingress HA: using keepalived and VIP to provide High Availability(mainly
for nginx/haproxy implementation, cloud implementation usually already
provide HA)
* Ingress Scheduling: schedule Ingress Resource to Ingress Pod/ReplicaSet
Contributor

Do you really want scheduling, or is that taking it too far?
With claims you have different qos classes, and a user picks a class.
The ingress is satisfied by whatever's behind that class, be it a single pod, a group of pods, a cloud lb etc.

@mqliang (Contributor, Author) commented Oct 8, 2016

I think there should be three mechanisms to choose from:

  • If the user knows exactly which Ingress Pod/RS to use, they just use it.
  • If the user doesn't know exactly which Ingress Pod/RS to use, but knows they want an Ingress Pod with certain properties (cpu/mem/bandwidth available, nginx/haproxy/cloud lb implementation, etc.), they can claim one, and Kubernetes will iterate over all existing Ingress Pods/RSs to find the best match (in other words, scheduling).
  • The user can also claim an Ingress Pod/RS with an "auto-provision" annotation; in that case, Kubernetes will dynamically provision one.

This is just like the relationship between PV and PVC:

  • If the user wants to use a specific cloud disk (and knows the cloud disk id), they just use it.
  • If the user just wants a PV of some size with certain properties, they can claim one by creating a PVC: Kubernetes will find the best-matching PV (scheduling) for them.
  • The user can also create a PVC with an auto-provision annotation; in that case Kubernetes will call the cloud provider API to create a cloud disk.
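
To make the PV/PVC analogy concrete, here is a rough Go type sketch of what a claim covering these three mechanisms might look like. None of these types exist in Kubernetes; every type and field name below is hypothetical.

```go
// Hypothetical API sketch of an "IngressClaim", modelled on
// PersistentVolumeClaim. Types and fields are illustrative only.
package sketch

import (
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// IngressClaim requests an ingress load balancer, either by naming one
// explicitly (mechanism 1) or by describing required properties (mechanism 2).
type IngressClaim struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   IngressClaimSpec   `json:"spec,omitempty"`
	Status IngressClaimStatus `json:"status,omitempty"`
}

// IngressClaimSpec describes what the user wants.
type IngressClaimSpec struct {
	// IngressRef names a specific Ingress Pod/RS pool to use (mechanism 1).
	IngressRef string `json:"ingressRef,omitempty"`

	// Implementation restricts matching to "nginx", "haproxy", "cloud",
	// and so on (mechanism 2).
	Implementation string `json:"implementation,omitempty"`

	// Resources lists required spare capacity on the ingress pool,
	// e.g. cpu, memory, or bandwidth (mechanism 2).
	Resources map[string]resource.Quantity `json:"resources,omitempty"`

	// Mechanism 3 (auto-provisioning) would be requested via an annotation
	// on the claim, mirroring the early PV dynamic-provisioning annotation,
	// so it needs no extra field here.
}

// IngressClaimStatus records the binding decision.
type IngressClaimStatus struct {
	// BoundRef names the Ingress Pod/RS pool this claim was bound to.
	BoundRef string `json:"boundRef,omitempty"`
	// VIP is the virtual IP serving this claim, once bound.
	VIP string `json:"vip,omitempty"`
}
```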

Contributor

What's involved in scheduling, QoS? I agree with @bprashanth this is a little far; maybe you just mean bound?

> If the user doesn't know exactly which Ingress Pod/RS to use, but knows they want an Ingress Pod with certain properties (cpu/mem/bandwidth available, nginx/haproxy/cloud lb implementation, etc.), they can claim one, and Kubernetes will iterate over all existing Ingress Pods/RSs to find the best match (in other words, scheduling).

* using AntiAffinity feature so that Ingress Pod created by the same Ingress
ReplicaSet could be scheduled to different node
* cluster admin choose a CIDR for Ingress VIP(AKA IngreeVIPCIDR)
* each Ingress Replicaset will be allocated a VIP from IngreeVIPCIDR(allocated by
Contributor

In the claims proposal, each ingress claim would get a vip.
"Ingress ReplicaSet" is a term which doesn't make sense to me; today an Ingress points to Services, which may point to ReplicaSets.

@ddysher (Contributor) commented Oct 12, 2016

I believe by "Ingress ReplicaSet", @mqliang means something that actually holds the vip, kind of like the ingress controller in the current setup. However, by "each ingress claim would get a vip", you seem to suggest that the lifecycle of the vip is bound to the claim? I'm trying to figure out the difference, and whether we want to apply the PV/PVC model to ingress claims.

If we do want to use the PV/PVC model, then there needs to be such an "Ingress ReplicaSet" that actually holds the VIP, rather than the claim holding it. Users can then create ingress claims to claim a VIP. It is the VIP, not the claim, that has attributes on it, like qos.

If not, then the only new type we need is the ingress claim; in that case, how do we add more ingress resources to an existing claim? For example, if a user creates an ingress claim for ".foo.com", a vip is allocated for it but is not yet useful; next, the user creates ingress resources to consume the DNS name and vip. Now suppose some more ingress resources are needed for the DNS name ".bar.com" and the user wants to use the same vip; how do they achieve this? Do they have to edit their claim?

* Ingress ReplicaSet rolling update

## Ingress HA
(AKA: Ingress Virtual IP using keepalived)
Contributor

I think we need to separate the keepalived details from the API. We need a way to get a vip, public or private. That may be keepalived, or an iptables proxy, or something new.

Contributor (Author)

I agree

* Ingress ReplicaSets are created by cluster admin in advance
* If all Ingress Pods are saturated, it's cluster admin's duty to create
more Ingress ReplicaSets
* There is a Ingress Scheduler which will schedule Ingress Resources to Ingress
Contributor

IMO this might not be necessary. Users pick an Ingress claim based on QoS needs. The Ingress claim has a vip. The vip is backed by pods. If the pods are saturated, the admin or an autoscaler needs to scale them, but the vip doesn't change.

@mqliang (Contributor, Author) commented Oct 12, 2016

> The Ingress claim has a vip

It seems more reasonable that:

  • An Ingress Service has a vip, external or internal (if the user wants in-cluster L7 load balancing), and is backed by several Ingress Pods (possibly created by a ReplicaSet).
  • The user can specify an Ingress Service for an Ingress Resource.
  • If the user doesn't know which Ingress Service to specify, they can use an IngressClaim; an ingress-claim-controller will then iterate through all Ingress Services, find the best match, and bind that Ingress Service to the IngressClaim.
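
A sketch of the binding step such an ingress-claim-controller could run is shown below; the types and the matching rule are deliberately simplified stand-ins (first unbound service with an acceptable implementation), not an existing API.

```go
// Hypothetical sketch of the ingress-claim-controller binding step: walk the
// existing Ingress Services, pick a match, and record the claim on it.
// Types and matching logic are illustrative stand-ins only.
package sketch

import "errors"

// ingressService stands in for an "Ingress Service": a vip backed by a pool
// of Ingress Pods of a given implementation.
type ingressService struct {
	Name           string
	Implementation string // "nginx", "haproxy", "cloud", ...
	VIP            string
	BoundClaim     string // empty while unbound
}

// ingressClaim stands in for the user's IngressClaim.
type ingressClaim struct {
	Name           string
	Implementation string // empty means any implementation is acceptable
}

// bindClaim mirrors PVC binding: iterate the candidates, take the first
// acceptable unbound service, and mark it as bound to this claim.
func bindClaim(claim *ingressClaim, services []*ingressService) (*ingressService, error) {
	for _, svc := range services {
		if svc.BoundClaim != "" {
			continue // already serving another claim
		}
		if claim.Implementation != "" && claim.Implementation != svc.Implementation {
			continue // wrong implementation
		}
		svc.BoundClaim = claim.Name
		return svc, nil
	}
	return nil, errors.New("no matching Ingress Service; auto-provisioning could be triggered here")
}
```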

@mqliang (Contributor, Author) commented Oct 12, 2016

> the admin or an autoscaler needs to scale them

Scale up or scale out?

@bgrant0607 assigned bprashanth and unassigned bgrant0607 Oct 5, 2016
the IP addresss of the node where Ingress Pod is running. In case of a
failure the Ingress Pod can be be moved to a different node.
* How many Ingress Pod should run in a cluster? Should all Ingress Pod
list&watch all Ingress Resource with out distinction? There is no way
Contributor

s/with out/without

## Ingress Provisoning

#### High level design
* Ingress ReplicaSets could be dynamically provisioned on deman, instead of
Contributor

s/deman/demand

@ddysher mentioned this pull request Oct 13, 2016
@k8s-github-robot added the do-not-merge label Nov 10, 2016
@k8s-github-robot

This PR hasn't been active in 30 days. It will be closed in 59 days (Jan 10, 2017).

cc @bprashanth @mqliang

You can add the 'keep-open' label to prevent this from happening, or add a comment to keep it open for another 90 days.

@apelisse removed the do-not-merge label Nov 11, 2016
@mqliang (Contributor, Author) commented Nov 22, 2016

Closing this in favor of #37269.
