Kubernetes Service not distributing traffic equally; seeing traffic imbalance #125013
Comments
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the `triage/accepted` label and provide further guidance.
/sig network
Are your pods equally spread across your nodes? We noticed a similar problem, and our issue was that some nodes had more ingress-nginx pods than others, so each node would distribute the traffic it received only among the pods hosted on itself. (A quick way to check the spread is sketched below.)
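For reference, a quick way to check the per-node spread (a sketch; the label selector `app=ingress-nginx-external` is taken from the Service description later in this issue, so adjust it to your release):

```sh
# Count ingress-nginx pods per node; with `-o wide --no-headers`,
# the node name is the 7th column of kubectl's output.
kubectl get pods -l app=ingress-nginx-external -o wide --no-headers \
  | awk '{print $7}' | sort | uniq -c
```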
Hi @adrianmoisey, yes, I can see it is spread over different nodes. Each node has one replica pod running.
And just to confirm: when you are scaled up (to 40 pods), you have an equal spread of pods to nodes?
Yes, correct. It is spread equally across nodes; each node has one replica running.
You need to test from inside the cluster and from outside, to determine whether it is a load balancer problem or a Kubernetes problem.
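A minimal in-cluster test might look like the following (a sketch: the Service DNS name assumes the Service is called `ingress-nginx-external-nlb` in the `default` namespace, which may not match the reporter's setup):

```sh
# Run a throwaway curl pod and hit the Service's cluster DNS name repeatedly.
# Hitting the Service from inside the cluster bypasses the NLB entirely, so an
# even spread here (e.g. compared via per-replica access-log counts) would
# point at the load balancer rather than at Kubernetes.
kubectl run lb-test --rm -it --restart=Never --image=curlimages/curl -- \
  sh -c 'for i in $(seq 1 1000); do
           curl -sk -o /dev/null https://ingress-nginx-external-nlb.default.svc.cluster.local/;
         done'
```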
What does the Service look like? Can you paste a YAML representation of it here?
Service looks okay:

```yaml
apiVersion: v1
```
I agree with @aojea's suggestion of running a test inside the cluster. That will help rule out either the cluster or the load balancer.
/remove-kind bug
We are discussing this in the SIG Cloud Provider meeting today. We aren't quite sure this is specific to the cloud controller manager rather than a configuration issue with the load balancer in AWS; we would like to see more data related to the questions asked earlier. cc @kmala
/assign @shaneutt
We discussed this one in the SIG Network meeting today, and it seems we have several open questions. I've assigned myself to this just to try to help shepherd it forward, but @uttam-phygitalz there are some open questions above, including whether this might be happening outside the cluster. Please let us know. /triage needs-information
Seems like this is getting stale. /lifecycle stale Let us know your thoughts on some of the above questions, @uttam-phygitalz, or whether you need any help or support with this.
We talked about this again at SIG Cloud Provider today; we are deferring acceptance on triage while we wait for more information.
/close The last comment from the reporter was in May; this can always be reopened if there is more information.
@aojea: Closing this issue.
What happened?
We are seeing that traffic is not balanced among ingress controller replicas when the replica count gets higher.
We have set the HPA with a maximum of 40 replicas. When a load test runs, the HPA is triggered and spawns new replicas, but the load is not evenly distributed even though resources are available. PFB the screenshot.
It is deployed behind an AWS NLB. There are no long-lived connections present; all hits are new connections. (A sketch for quantifying the per-replica imbalance follows.)
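One way to quantify the imbalance (a sketch; it counts access-log lines per replica over a fixed window, uses the selector from the Service description below, and assumes the controller logs one line per request):

```sh
# Rough per-replica request counts from the last 10 minutes of logs.
for pod in $(kubectl get pods -l app=ingress-nginx-external -o name); do
  echo -n "$pod: "
  kubectl logs "$pod" --since=10m | wc -l
done
```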
Description of the ingress controller Service:
```
Labels:                   app=ingress-nginx-external-nlb
                          app.kubernetes.io/managed-by=Helm
Annotations:              helm.sh/resource-policy: keep
                          service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags:
                          service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
                          service.beta.kubernetes.io/aws-load-balancer-connection-draining-enabled: true
                          service.beta.kubernetes.io/aws-load-balancer-connection-draining-timeout: 60
                          service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: 300
                          service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: true
                          service.beta.kubernetes.io/aws-load-balancer-extra-security-groups: sg-0116assa519f2f2aa1fe8c
                          service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
                          service.beta.kubernetes.io/aws-load-balancer-type: nlb
Selector:                 app=ingress-nginx-external
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.20.189.13
IPs:                      172.20.189.13
LoadBalancer Ingress:     a47c0fada1425caa057592-76e4445441da70fa.elb.us-west-2.amazonaws.com
Port:                     https  443/TCP
TargetPort:               443/TCP
NodePort:                 https  31411/TCP
Endpoints:                100.64.165.237:443,100.65.173.35:443,100.64.244.118:443
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     31286
Events:
  Type    Reason               Age                    From                Message
  ----    ------               ----                   ----                -------
  Normal  UpdatedLoadBalancer  16m (x163 over 2d17h)  service-controller  Updated load balancer with new hosts
```
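For context, a manifest roughly equivalent to the description above would look like this (a reconstruction from the `describe` output, not the reporter's actual YAML; the metadata name is an assumption derived from the labels):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-external-nlb   # assumed name, derived from the labels above
  labels:
    app: ingress-nginx-external-nlb
    app.kubernetes.io/managed-by: Helm
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "300"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # per the describe output above
  selector:
    app: ingress-nginx-external
  ports:
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
```

Note that `externalTrafficPolicy: Local` makes each node forward traffic only to the ingress pods running on that same node, which is why the earlier questions about pod-to-node spread are relevant here.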
What did you expect to happen?
The traffic should be distributed evenly among all replicas, or close to it, not in a totally imbalanced way.
How can we reproduce it (as minimally and precisely as possible)?
1. Deploy the ingress controllers.
2. Set the HPA for the ingress controller, e.g. min 3 and max 40 replicas (see the sketch after this list).
3. Perform the load test.
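A minimal HPA matching step 2 might look like this (a sketch; the Deployment name and the CPU target are assumptions, since the issue does not include the actual HPA manifest):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-nginx-external
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingress-nginx-external   # assumed Deployment name
  minReplicas: 3
  maxReplicas: 40
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # assumed threshold; not stated in the issue
```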
Anything else we need to know?
No response
Kubernetes version
Client Version: v1.29.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.8-eks-adc7111
Cloud provider
AWS
OS version
Rocky linux / alpine
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)