AWS NLB with externalTrafficPolicy: Local: Health check defaults take 90s to remove node #73362
Comments
@kubernetes/sig-aws-bugs
@kellycampbell: Reiterating the mentions to trigger a notification. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I'm also curious if anyone has good recommendations for how to schedule pods with
/assign @M00nF1sh
One solution I figured out to avoid dropping new connections during deployment updates was to set up two deployments that can be updated in an A/B fashion. This works because local traffic still goes through kube-proxy, which can load-balance across two pods on the same node. For a small cluster I use a DaemonSet; for a larger cluster, the scheduling policy would need to ensure that the A and B pods end up on the same physical nodes. The Service's selector then selects all pods from both A and B, while each deployment or DaemonSet selects only its own pods. To roll out an update, you create set B, let those pods start up fully, then stop set A.

Example of the DaemonSet (replace `-a` with `-b` for the B set):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ambassador-dev-a
spec:
  selector:
    matchLabels:
      ab-ambassador: ambassador-dev-a
  template:
    metadata:
      labels:
        k8s-app: ambassador-dev
        ab-ambassador: ambassador-dev-a
```

And the Service selector looks like this:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ambassador-dev
spec:
  selector:
    k8s-app: ambassador-dev
```
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
/sig cloud-provider
I am trying to solve the same problem using a solution along those lines. One can use pod affinity rules to ensure the new pods are scheduled on the same nodes as the old pods, and one can set the deployment strategy to surge so that a rolling update creates new pods before deleting old ones. The remaining piece is to ensure the old pods are selected for deletion in the desired order (i.e., the scale-down of the old pods at each step should delete a pod that is on the same node as a new pod). To that end I made #80004, which changes the ReplicaSet controller to prefer pods that are colocated with pods from a related ReplicaSet when choosing pods for deletion during scale-down. I'd appreciate any feedback on that PR.
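A rough sketch of the surge-plus-affinity setup described above, with hypothetical names and labels (the deletion-ordering piece still needs the linked PR, and `preferredDuringScheduling` is only a hint to the scheduler):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ambassador-dev   # hypothetical name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 100%        # create all new pods before deleting any old pod
      maxUnavailable: 0
  selector:
    matchLabels:
      k8s-app: ambassador-dev
  template:
    metadata:
      labels:
        k8s-app: ambassador-dev
    spec:
      affinity:
        podAffinity:
          # Prefer scheduling new pods onto nodes that already run a pod of
          # this app, so each node keeps a local endpoint during the rollout.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  k8s-app: ambassador-dev
      containers:
      - name: ambassador
        image: example/ambassador:dev   # hypothetical image
```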
Your solution sounds similar to the A/B strategy above, except automated instead of manual. #73362 (comment)

I'm wondering if it would be good to formalize the requirements a bit more, e.g. adding some specific metadata or an option to Deployments to provide this kind of A/B strategy when replacing existing pods. Or maybe it could be done with affinity, or PDBs and the scheduler, such that if a node already has a healthy pod, it must keep at least one healthy pod on that node unless a full drain is started.

Another thought: what if kube-proxy's health check for that service started returning failure before the last pod on a node is completely drained, so that leftover traffic could still be handled for some amount of time before the pod is killed?

Edit: one issue I can think of with your solution (actually, it would be a problem for any A/B on the same node) is that it makes re-balancing a cluster harder, e.g. if you want a rolling update to schedule onto lightly loaded nodes first.
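For context on the kube-proxy health check mentioned above: with `externalTrafficPolicy: Local`, a `LoadBalancer` Service gets a `healthCheckNodePort`, and kube-proxy answers HTTP `GET /healthz` on that port with 200 only while the node has at least one ready local endpoint; the NLB probes this port rather than the service port. A sketch (names hypothetical, matching the earlier example):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ambassador-dev   # hypothetical name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  # Allocated automatically if omitted; pinned here for illustration.
  # kube-proxy returns 200 on /healthz on this port only while a ready
  # local endpoint exists, so the load balancer health check should
  # target it.
  healthCheckNodePort: 30200
  selector:
    k8s-app: ambassador-dev
  ports:
  - port: 80
    targetPort: 8080
```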
…Local This is a fix for issue kubernetes#73362
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
…Local This is a fix for issue kubernetes#73362
What happened:
A Service with `externalTrafficPolicy: Local` and an AWS NLB load balancer is very slow to recognize that a node fails health checks. I created a DaemonSet and a Service with `externalTrafficPolicy: Local`.
When one of the pods is restarted, the NLB still routes traffic to it (which is dropped, causing connect timeouts) for a minimum of 3 × 30s = 90s with the default health check settings.
What you expected to happen:
Preferably, the load balancer could recognize the failed node or pod sooner.
The minimum settings for NLB are documented here: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-health-checks.html
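With the defaults (30s interval × 3 failed checks) the NLB needs about 90s to mark a target unhealthy; the minimums documented at the link above (10s interval × 2 failed checks, at the time of this issue) would cut that to about 20s. The in-tree AWS cloud provider exposes health-check annotations along these lines; whether the NLB code path actually honors them is essentially what this issue tracks, so treat this as a sketch rather than a confirmed fix:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ambassador-dev   # hypothetical name
  annotations:
    # Legacy in-tree AWS cloud-provider annotations; values are the
    # documented NLB minimums at the time of this issue.
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "2"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "2"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    k8s-app: ambassador-dev
  ports:
  - port: 80
    targetPort: 8080
```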
Environment:
- Kubernetes version (use `kubectl version`): 1.11.7
- Kernel (e.g. `uname -a`): 4.9.0-7-amd64