Cloud load-balancers should have health checks for nodes #14661
Nodes will already flip to unhealthy after ~40s of kubelet silence, or immediately when something like docker death is observed by kubelet. You can probably fix it by swapping the nodelister: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/service/servicecontroller.go#L77 with the conditional node lister https://github.com/kubernetes/kubernetes/blob/master/pkg/client/cache/listers.go#L119, or diffing them.
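For illustration only, a minimal sketch of the filtering such a conditional lister would provide, written against today's client-go types rather than the pkg/api types linked above; `readyNodes` is a hypothetical helper, not the lister's actual API:

```go
package nodehealth

import (
	v1 "k8s.io/api/core/v1"
)

// readyNodes keeps only nodes whose NodeReady condition is True, which is
// roughly what swapping in the conditional node lister (or diffing the two
// listers) would give the service controller before it syncs the LB
// target pool.
func readyNodes(nodes []v1.Node) []v1.Node {
	var out []v1.Node
	for _, n := range nodes {
		for _, c := range n.Status.Conditions {
			if c.Type == v1.NodeReady && c.Status == v1.ConditionTrue {
				out = append(out, n)
				break
			}
		}
	}
	return out
}
```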
I mean, swap those and remove unhealthy nodes from the target instance group.
That's a lot of propagation. 40s is an eternity.
I mean your daemon could die too, so you'll need a timeout. Avoiding the GCE health checks suggestion because I'd rather do this in kube and have it work xplat.
Healthchecking the kube-proxy seems like the best idea to me. Port 10249 is currently used for both healthz and pprof, and we might not want pprof data exposed to the internet. But it wouldn't be difficult to make healthz bind to 0.0.0.0 by default (even if we moved pprof to a different port) and then allow 10249 to be opened to healthcheckers as necessary. I don't think we should be doing our own healthchecking and manually modifying the target instance group. That code would be GCE-specific and seems like it's just re-implementing what is provided for free by the GCP cloud healthchecker, as long as you can provide an http endpoint that is accessible to the healthchecker.
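As a rough sketch of that separation (not kube-proxy's actual wiring; the loopback pprof port below is an arbitrary illustrative choice):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers pprof handlers on http.DefaultServeMux
)

func main() {
	// Health endpoint bound to 0.0.0.0 so external health checkers can
	// reach it; 10249 is the port discussed above.
	healthz := http.NewServeMux()
	healthz.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	go func() {
		log.Fatal(http.ListenAndServe("0.0.0.0:10249", healthz))
	}()

	// pprof stays bound to loopback only, so profiling data is never
	// exposed to the internet.
	log.Fatal(http.ListenAndServe("127.0.0.1:6060", http.DefaultServeMux))
}
```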
See #8673
I'm going to close the dup (I dup'ed myself!). This one has more info, so keeping it.
Why health check nodes instead of going straight to the network endpoint? If we do that, isn't the 40s eternity enough for node health checks?
Today we load-balance to nodes. If a node dies, I want my cloud LB to stop sending traffic to it.
Yes, we can easily implement a short-term solution where the service/ingress controller runs a goroutine that just polls health check daemons on the node (a sketch of such a polling loop is below). "Bouncy" makes that harder because kube-proxy will still continue to think the endpoint is ready. Even if we don't go to the extent of reporting utilization information per request, I think we should come up with something that re-uses an existing health check idiom (either nodecontroller health or liveness/readiness) in the long run. The risk of having some sort of instantaneous and binary feedback loop is oscillation or flapping. The "right" way to solve this, IMO, is to use backend weights.
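A hypothetical sketch of that short-term polling goroutine; the function name, channel protocol, port, and timeouts are all assumptions, not the controller's actual code:

```go
package nodehealth

import (
	"fmt"
	"net/http"
	"time"
)

// pollNodes polls each node's health endpoint on a fixed interval and
// sends the set of currently healthy nodes to the controller, which can
// then diff that set against the LB target pool.
func pollNodes(nodeIPs []string, port int, interval time.Duration, healthy chan<- []string) {
	client := &http.Client{Timeout: 2 * time.Second}
	for {
		var up []string
		for _, ip := range nodeIPs {
			resp, err := client.Get(fmt.Sprintf("http://%s:%d/healthz", ip, port))
			if err == nil {
				if resp.StatusCode == http.StatusOK {
					up = append(up, ip)
				}
				resp.Body.Close()
			}
		}
		healthy <- up
		time.Sleep(interval)
	}
}
```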
@bprashanth @thockin what is the status of this issue? I would like to add GCE LB health checking. Did you solve it by making the kubelet port available, or how?
We're actually using cloud healthchecks for something different now, so we …
Ok @thockin, can you point me to the documentation/issue/whatever to follow the new direction? I'd appreciate any hint that helps me find out more about this topic. Thanks
thanks!
This is still an issue, of course, meaning that if a node disappears due to partition we will take 40s to kill endpoints. If we had a health check endpoint, it would take O(5s). There are different ways to tackle it.

One way is to make kube-proxy health checking much smarter (i.e. use IPVS). We need to confirm that IPVS will auto-failover if it notices that, e.g., a SYN got blackholed once. I wasn't able to confirm this with some preliminary tinkering: #30134 (comment)

Another is to make our health checking smarter. The problem right now is that the kubelet is responsible for reporting both endpoint health and node status. We could come up with a system that reports endpoint health much more frequently, and doesn't actually run on the node: #28442

And yet another is to actually do what this issue proposes, and add a secondary health check endpoint to nodes. Or find some clever way to leverage the same "healthcheck-nodeport" logic in a way that works for both "onlyLocal" and "Global" services.
I convinced myself that it DOES NOT do what we want. It does exactly …
Re-opening. I do think we should have a node-level healthcheck for LBs that are not using the OnlyLocal annotation. |
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)

Make GCE load-balancers create health checks for nodes

From #14661. Proposal on kubernetes/community#552. Fixes #46313.

Bullet points:
- Create nodes health check and firewall (for health checking) for non-OnlyLocal services.
- Create local traffic health check and firewall (for health checking) for OnlyLocal services.
- Version skew:
  - Don't create nodes health check if any node has version < 1.7.0.
  - Don't backfill nodes health check on existing LBs unless users explicitly trigger it.

**Release note**:
```release-note
GCE Cloud Provider: Newly created LoadBalancer type Services now have health checks for nodes by default. An existing LoadBalancer will have a health check attached to it when:
- Service.Spec.Type is changed from LoadBalancer to something else and flipped back.
- There is any effective change to Service.Spec.ExternalTrafficPolicy.
```
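A hedged sketch of the version-skew gate the PR describes, assuming only that kubelet versions are reported in NodeSystemInfo as roughly "vMAJOR.MINOR.PATCH"; the helper name and simplified parsing are illustrative, not the PR's actual code:

```go
package nodehealth

import (
	"strconv"
	"strings"

	v1 "k8s.io/api/core/v1"
)

// allNodesAtLeast17 reports whether every node's kubelet is at version
// 1.7.0 or newer; the controller would skip creating the nodes health
// check when this is false.
func allNodesAtLeast17(nodes []v1.Node) bool {
	for _, n := range nodes {
		ver := strings.TrimPrefix(n.Status.NodeInfo.KubeletVersion, "v")
		parts := strings.SplitN(ver, ".", 3)
		if len(parts) < 2 {
			return false // unparsable version: be conservative
		}
		major, _ := strconv.Atoi(parts[0])
		minor, _ := strconv.Atoi(strings.TrimSuffix(parts[1], "+"))
		if major < 1 || (major == 1 && minor < 7) {
			return false
		}
	}
	return true
}
```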
Unassigning as the GCE part was done.
/unassign
/remove-area platform/gce
/area cloudprovider
Can you update what you think still needs to be done, and who should be doing it?
Sure, the real work for each cloud provider is to attach (or configure) a health check pointing to the kube-proxy healthz endpoint on each node. I'd expect in-tree cloud provider owners to follow up on this. Looping in the ones I found in OWNERS files (omitting those that don't have a real load-balancer implementation).
@ngtuna for CloudStack
@MrHohn This requires at least v1.7.2, correct?
@jhorwit2 Thanks for mentioning it; that is correct. v1.7.2 is the earliest k8s version in which kube-proxy properly serves the healthz port. We should probably make it a global const somewhere.
Should we be doing this at the Service Controller level, or should this simply be an implementation detail: if a cloud wants to do it, they should do it, but it should not be part of the Service Controller? Especially as we move towards Cloud Controller Manager ...
I think they're sort of the same thing: the SC would just be calling into the cloud provider and asking for this health check, or it would be part of the normal behavior of the SC calling the CP to create/update a load balancer. It seems like it would be helpful if Kubernetes were prescriptive here, one way or the other, and that per-cloud-provider issues were opened to track adding this health check. Questions: …
I'd prefer the latter; making "ensure health check" another interface may place more constraints on cloud provider implementations, given that even how a health check is attached to an LB may happen very differently, and could be coupled with LB management.
Adding a health check retroactively seems risky if adding the health check itself is service-disruptive. Another concern is varying node versions: some nodes expose the health check while others (on older versions) don't.
To clarify, this "/healthz" is served by kube-proxy, not kubelet. The proxy healthz port is defined in the ports package (kubernetes/pkg/master/ports/ports.go). If someone chooses to use a different healthz port in kube-proxy (via flag or config), they will need to do the same for the service controller (or cloud controller manager).
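For concreteness, a small sketch of the probe a cloud health checker (or anyone emulating one) would perform against that port; the package and helper names are hypothetical, and the constant simply mirrors the default noted below (10256), hard-coded here only to keep the sketch self-contained:

```go
package healthprobe

import (
	"fmt"
	"net/http"
	"time"
)

// defaultProxyHealthzPort mirrors the default from the ports package.
const defaultProxyHealthzPort = 10256

// nodeHealthy probes kube-proxy's healthz endpoint the way a cloud LB
// health check would: any timeout or non-200 response marks the node
// unhealthy.
func nodeHealthy(nodeIP string, port int) bool {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(fmt.Sprintf("http://%s:%d/healthz", nodeIP, port))
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}
```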
I don't recall the stance on version skew, but if it has been there for at least a few revisions, then I'm not worried about the presence of the endpoint.
Got it, thanks.
Hm, to my knowledge, most of the cloud providers don't really even take configuration today (or if they do, it's inferred configuration discovered via their metadata servers). I guess it wouldn't be the worst thing in the world if it defaulted to the default port.
The user doesn't really know why any of this is happening until they find this issue, find the corresponding PR for their cloud provider, and find where the new flag/config field is.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle frozen
@justinsb Do we have health checks for AWS ELBs? We want to keep adding/removing nodes in the ELB based on health. What is the right way to do it? We are not using a k8s-provisioned ELB; it is an ELB we created ourselves. We want to keep healthy Kubernetes nodes under it.
@alok87 You'd want to emulate what Kubernetes does in that scenario, which is either to use the kube-proxy healthz endpoint (by default on port 10256), or to use the healthcheck port on the service if its traffic policy is Local.
@jhorwit2 What does Kubernetes do? On what basis does it mark a node healthy or unhealthy? We have two system pods, weave and kube-proxy. We want to check that both are good, and also that node_ip:port is working, before marking a node healthy and attaching it to the LB. Do I need to write a custom controller to do this?
Curious what is remaining here? Also, do all health checks (for HTTPS, network, and container-native) go to kube-proxy's port 10256? Why do they not go to kubelet?
kube-proxy exposes "healthiness" as "was able to update the heartbeat recently". We may want to go further with "readiness" vs "liveness" (I have an idea brewing), but it's unlikely that it will be totally user-defined. Instead, I am focusing on things like node schedulability. If the node is unschedulable, it should probably not be considered for new LB traffic. That allows users to define arbitrary rules for what makes a node unschedulable, which is something we want anyway. Closing this. Thanks for the reminder :)
Given the state of cloud LBs today, most LB implementations (GCE, AWS, OpenStack) target nodes indiscriminately. We should ensure that the cloud load-balancer only targets healthy nodes.
We should health-check the nodes' kubelet or kube-proxy, or add a new "do nothing but answer node health" daemon.