Allow to disable loadbalancer health probes #1394

Open
tarioch opened this issue Jan 17, 2020 · 24 comments
Labels: action-required · feature-request · Feedback

Comments

@tarioch commented Jan 17, 2020

Currently there is no way to disable the load balancer health probes. It would be good if an annotation could be added to disable them, either globally or for specific ports.

@jnoller (Contributor) commented Jan 17, 2020

Can you provide a use case/business justification for completely disabling all health probes?

@tarioch (Author) commented Jan 17, 2020

Sure. Our use case is that we're exposing Jenkins JNLP. On the one hand, the health probes flood the log files with unnecessary entries; on the other, and more importantly, the probes seem to be "too aggressive" and drop connections that should still be fine, connections Jenkins is otherwise able to keep open.

Right now we applied the workaround from https://stackoverflow.com/a/54257960: basically changing externalTrafficPolicy to Local and adding an explicit healthCheckNodePort.

Since we made that change the connection has stayed very stable, where before it got interrupted every couple of hours.
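
For illustration, a minimal sketch of that workaround (the service name, ports, and the healthCheckNodePort value are assumptions for illustration, not taken from the original post):

apiVersion: v1
kind: Service
metadata:
  name: jenkins-jnlp
spec:
  type: LoadBalancer
  # With Local traffic policy, the Azure LB probes healthCheckNodePort
  # (answered by kube-proxy) instead of the service port itself.
  externalTrafficPolicy: Local
  healthCheckNodePort: 32000   # assumed value; must fall in the cluster's NodePort range
  selector:
    app: jenkins
  ports:
  - name: jnlp
    protocol: TCP
    port: 50000
    targetPort: 50000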

@przemolb commented Feb 9, 2020

I would also like to be able to disable all health probes. In our case it is log flooding (and our developers hate it when analysing logs during incidents), but also just to have the option. What is the use case/business justification for enforcing health probes?

@pag08007

We also ran into a similar issue. We were deploying an application that listened for TCP connections on a specific port and triggered an event whenever a connection was made. The health probes were triggering our events and as a result were spamming our logs with fake errors.

@alex-doerfler

For us this annotation would be helpful as well. We forward the TCP traffic to an outgoing connection that is charged by bandwidth, so the health probes cause significant costs here. Therefore we had to use the workaround mentioned by @tarioch.

@przemolb

Any progress on this?

@github-actions

Action required from @Azure/aks-pm


@ghost added the Needs Attention 👋 label Jul 26, 2020
@TomGeske added the feature-request label and removed the Needs Attention 👋 and action-required labels Jul 27, 2020
@TomGeske

+@palma21

@ghost added the action-required label Jan 23, 2021
@antonmatsiuk

Another use case is the Bitnami Helm chart for MySQL. Health probes flood the log with "Got an error reading communication packets" messages.

@motmot80

Another use case: using the load balancer for UDP services with no HTTP or TCP endpoint.

@FanerYedermann

> Another use case: using the load balancer for UDP services with no HTTP or TCP endpoint.

Almost the same case here. I have a raw socket that I don't want spammed.

@dnovvak commented May 25, 2021

Any update on this?
Our use case is connection quality measurement using TCP/UDP sockets. Health probes from the load balancer disrupt the measurements.

@ghost removed the action-required label Jul 18, 2021
@palma21 added the Feedback label Jul 18, 2021
@BobClaerhout

We are experiencing this issue as well. We have an MQTT port behind a load balancer. The port requires authentication, and the health probe provides neither the authentication (of course) nor the correct protocol, which results in the business application logging a faulty incoming request.
Since this was updated 4 days ago, is it being worked on now? If so, what would be the release timeline?

@vishalsawale9 commented Aug 3, 2021

I have a similar requirement. I'm hosting an HTTPS application on an AKS cluster with gunicorn as the WSGI gateway running Flask. I'm continuously getting socket errors in the pods even though the app is up and running. I suspect the health probes are occupying a port, causing those errors almost every 2-3 seconds.

@TomasTokaMrazek commented Dec 3, 2021

This is currently a severe blocker for our deployment. We have a service exposing non-traditional protocols, like WebSockets and a custom communication protocol over TCP. The health probe sends some data rather than an empty netcat-style connection, so every few seconds there is an exception and a stack trace in our logs.

I understand that disabling the health probe for ports is not best practice, but it's a fast solution to the issue discussed here. Another solution would be to let us specify a custom probe, just as Kubernetes allows via the readinessProbe and livenessProbe configuration.

I propose a simple LB annotation, service.beta.kubernetes.io/azure-load-balancer-disable-health-probe-for-port-names. It's not ideal, but since we do not have health probes for UDP anyway, it shouldn't matter that much.

Example

apiVersion: v1
kind: Service
metadata:
  name: app
  namespace: default
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "SomeSubnet"
    service.beta.kubernetes.io/azure-load-balancer-disable-health-probe-for-port-names: "binary,binary-secure,jms-tcp"
spec:
  selector:
    app: app
  type: LoadBalancer
  ports:
  - name: servlet-http
    protocol: TCP
    port: 9763
    targetPort: 9763
  - name: servlet-https
    protocol: TCP
    port: 9443
    targetPort: 9443
  - name: binary
    protocol: TCP
    port: 9611
    targetPort: 9611
  - name: binary-secure
    protocol: TCP
    port: 9711
    targetPort: 9711
  - name: jms-tcp
    protocol: TCP
    port: 5672
    targetPort: 5672

I dug up some other annotations related to health probes, but they don't seem to work, or I don't understand what they do.

service.beta.kubernetes.io/azure-load-balancer-health-probe-protocol
service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path
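
For reference, a minimal sketch of how those two annotations appear to be intended, based on the cloud-provider-azure load balancer docs: they switch the probe to HTTP(S) and set its request path rather than disabling it (the /healthz path and port below are assumptions for illustration, not from this thread):

apiVersion: v1
kind: Service
metadata:
  name: app
  annotations:
    # Probe the backends over HTTP instead of raw TCP (values assumed)
    service.beta.kubernetes.io/azure-load-balancer-health-probe-protocol: "http"
    service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: "/healthz"
spec:
  type: LoadBalancer
  selector:
    app: app
  ports:
  - name: servlet-http
    protocol: TCP
    port: 9763
    targetPort: 9763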

@mplachter

Would love to have a way to disable load balancer health checks for specific ports. One example use case: gRPC ports don't like being probed over plain TCP by clients that never send the gRPC preface. This causes a ton of log flooding that is just noise.

Another valid option would be to allow configuring a different health check for a specific port, e.g. an HTTP health check against the downstream service instead of probing the gRPC TCP port for availability; see the sketch below.
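
A sketch of what that could look like with the per-port probe annotations described in the cloud-provider-azure docs (the gRPC port number and health path are assumptions for illustration):

apiVersion: v1
kind: Service
metadata:
  name: grpc-app
  annotations:
    # Probe this one port over HTTP against a health endpoint,
    # instead of opening raw TCP connections to the gRPC port (assumed values)
    service.beta.kubernetes.io/port_50051_health-probe_protocol: "http"
    service.beta.kubernetes.io/port_50051_health-probe_request-path: "/healthz"
spec:
  type: LoadBalancer
  selector:
    app: grpc-app
  ports:
  - name: grpc
    protocol: TCP
    port: 50051
    targetPort: 50051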

@Wuzyn commented Sep 21, 2022

Hi all, I'm having the same issue. I'm hosting an SFTP server on AKS.

@hterik commented Oct 17, 2022

If you put a LoadBalancer in front of an HTTP server, as many have done above, you need to be aware of the following.
The LoadBalancer health probe runs from each node in the cluster. It opens a TCP connection, holds it open, sends nothing, and waits 15 seconds before closing it. I don't know whether it's the responsibility of the server or the prober to close it sooner, but on most servers I've seen it simply occupies one thread. That means your server must be able to hold at least one connection open per node, concurrently. Switching to some kind of asyncio server helps a lot; otherwise you need to increase the number of threads to at least the number of nodes in your cluster.

A better solution is to consider an Ingress controller when dealing with HTTP.
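
As a concrete illustration of the thread-count point above, a sketch of sizing a gunicorn deployment for per-node probes (the image name and numbers are assumptions, not from this thread):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flask-app
  template:
    metadata:
      labels:
        app: flask-app
    spec:
      containers:
      - name: app
        image: registry.example.com/flask-app:latest  # hypothetical image
        # --threads should be at least the node count, since each node's
        # LB probe can hold one connection open for up to 15 seconds
        command: ["gunicorn", "--workers", "2", "--threads", "32", "--bind", "0.0.0.0:8000", "app:app"]
        ports:
        - containerPort: 8000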

@solacens

solacens commented Feb 1, 2023

My infrastructure requires multiple rules across different ports, so if I need multiple copies of my microservices, I would need multiple copies of the ingress controller for TCP forwarding. As a result, I turned to the Azure CNI provided LoadBalancer type.

After that, clients somehow experienced intermittent 502 BAD GATEWAY responses, and I strongly suspect it is related to health probe misdetection by the kubernetes or kubernetes-internal load balancer underneath. I would like to rule out this possibility by disabling the probes.

@fethullahmisir

fethullahmisir commented Feb 23, 2023

I had the same problem and was able to disable the health probe for my SFTP server port with this annotation:

service.beta.kubernetes.io/port_{port}_no_probe_rule: "true"

where {port} must be replaced by the service port, e.g. service.beta.kubernetes.io/port_22_no_probe_rule.

From the docs: https://cloud-provider-azure.sigs.k8s.io/topics/loadbalancer/#loadbalancer-annotations

I think this issue can be closed, as disabling health probes is already supported.
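
A minimal sketch of that annotation applied to an SFTP service on port 22 (the service name and selector are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: sftp
  annotations:
    # Keep the LB rule for port 22 but create no health probe for it
    service.beta.kubernetes.io/port_22_no_probe_rule: "true"
spec:
  type: LoadBalancer
  selector:
    app: sftp
  ports:
  - name: sftp
    protocol: TCP
    port: 22
    targetPort: 22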

@TomasTokaMrazek

Was this always possible, or was it recently added as a new function to the AKS LB?

@fethullahmisir

The docs state that it's possible since AKS version v1.24. I don't know when v1.24 was released.
