AWS - Randomly unhealthy nodes in target groups #9990
Comments
This issue is currently awaiting triage. If Ingress contributors determine this is a relevant issue, they will accept it by applying the `triage/accepted` label. The instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-kind bug
Is this related? #9367
Thanks for the reply @longwuyuan. The linked issue is different from mine: in my case the EC2 instances are registered in the target group, but they are unhealthy. I checked whether it was a network issue, but nodes in the same subnet had two different statuses (healthy and unhealthy).
Please show.
Here is the content:
service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "2"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "20"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: traffic-port
service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: TCP
service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: "5"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "3"
service.beta.kubernetes.io/aws-load-balancer-type: nlb
- hostname: a8e842bcf9d14473ea8460a067058c46-f7c4d42e3047f41b.elb.eu-south-1.amazonaws.com
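For reference, annotations like these live under `metadata.annotations` of the controller's `LoadBalancer` Service. A minimal sketch of where they fit (the Service name, namespace, and port are illustrative assumptions for a default chart install, not taken from this cluster):

```yaml
# Hypothetical controller Service showing where the reported annotations sit.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller   # assumption: default chart release name
  namespace: ingress-nginx         # assumption: install namespace
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: TCP
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: traffic-port
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "20"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: "5"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "2"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "3"
spec:
  type: LoadBalancer
  ports:
    - name: http
      port: 80
      targetPort: http
```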
Is it possible to set externalTrafficPolicy to Local? Just to add context, the problem is the same as reported in this post.
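As background (not stated in this thread): with `externalTrafficPolicy: Cluster`, every node proxies the NodePort, so all nodes should pass a TCP health check; with `Local`, only nodes actually running a controller pod answer, so the remaining nodes being marked unhealthy is expected behavior. A sketch of setting it through the ingress-nginx Helm chart values:

```yaml
# values.yaml fragment for the ingress-nginx Helm chart (sketch).
controller:
  service:
    externalTrafficPolicy: Local   # only nodes hosting a controller pod pass the NLB check
```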
I think there is a healthz-path-related annotation required. Can you check the docs?
But a TCP health check doesn't have a path, am I wrong?
I am not sure. I think I have seen some comment about path. I am checking |
Sorry, it was about AKS and not EKS |
If you can edit your issue description and improve it, maybe more useful data will be available for debugging.
|
I'm not able to provide you any other info. It seems something goes down in Kubernetes, so that the port is unavailable on the host.
@longwuyuan the issue is the same as reported here: #8312
I am wondering if this is related: #9367
Hi, I have the same problem: sometimes I have 0 healthy nodes in the target group, and a few minutes later one or two nodes are up. Have you found a solution to this problem, or do you still have it, @bjtox? Thanks
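For what it's worth, the minutes-long flapping described above is consistent with the health-check timing from the annotations reported earlier in the thread: with a 20-second interval, a target needs 3 consecutive failures to be marked unhealthy and 2 consecutive successes to recover. The arithmetic as a quick sketch:

```shell
# Health-check timing implied by the annotations reported above.
interval=20        # aws-load-balancer-healthcheck-interval (seconds)
unhealthy=3        # ...-healthcheck-unhealthy-threshold (consecutive failures)
healthy=2          # ...-healthcheck-healthy-threshold (consecutive successes)

echo "seconds until a target is marked unhealthy: $(( interval * unhealthy ))"
echo "seconds until it is marked healthy again:   $(( interval * healthy ))"
```

So even a brief NodePort outage takes roughly a minute to surface as "unhealthy" and another ~40 seconds to clear, which matches the "a few minutes later" observation.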
@sebastienrospars Yeah, I'm facing the same problem. I have no idea what makes the difference; is it a mistake? @longwuyuan
Hi, I'm trying to deploy the ingress-nginx controller on my EKS installation.
I'm trying to move to a fresh installation on AWS. I'm able to provision the NLB and the target group, but it seems not all nodes pass the health check; they appear to fail randomly. Currently only 2 of the 5 available nodes in my cluster are healthy.
The issue is the same as #8312.
We moved our application from Kubernetes 1.22 to 1.26. We use chart version 4.6.1 and we expected all nodes to go healthy.
It seems the node port is unavailable on some nodes, for a reason I can't understand.
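One way to narrow this down (a debugging sketch, not from the thread; the Service name and namespace are assumptions for a default chart install) is to look up the NodePort the NLB health-checks and probe it directly on each node's internal IP from inside the VPC:

```shell
# Assumed names for a default ingress-nginx chart install.
NS=ingress-nginx
SVC=ingress-nginx-controller

# Look up the NodePort that the NLB's traffic-port health check targets:
PORT=$(kubectl -n "$NS" get svc "$SVC" \
  -o jsonpath='{.spec.ports[?(@.name=="http")].nodePort}')

# From a machine inside the VPC, probe each node's internal IP on that port
# (IPs taken from the node list reported below):
for ip in 10.176.0.227 10.176.0.77 10.176.1.124 10.176.1.68; do
  nc -z -w 5 "$ip" "$PORT" && echo "$ip OK" || echo "$ip UNREACHABLE"
done
```

If some nodes are unreachable here while others answer, the problem is on the node/kube-proxy side rather than in the NLB configuration.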
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):
NGINX Ingress controller
Release: v1.7.1
Build: f48b03b
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.21.6
Kubernetes version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"26+", GitVersion:"v1.26.4-eks-0a21954", GitCommit:"4a3479673cb6d9b63f1c69a67b57de30a4d9b781", GitTreeState:"clean", BuildDate:"2023-04-15T00:33:09Z", GoVersion:"go1.19.8", Compiler:"gc", Platform:"linux/amd64"}
Environment:
QA
Cloud provider or hardware configuration:
AWS
OS (e.g. from /etc/os-release):
Install tools:
Basic cluster related info:
kubectl version
kubectl get nodes -o wide
ip-10-176-0-227.eu-south-1.compute.internal Ready 3d23h v1.26.4-eks-0a21954 10.176.0.227 Amazon Linux 2 5.10.178-162.673.amzn2.x86_64 containerd://1.6.19
ip-10-176-0-77.eu-south-1.compute.internal Ready 3d23h v1.26.4-eks-0a21954 10.176.0.77 Amazon Linux 2 5.10.178-162.673.amzn2.x86_64 containerd://1.6.19
ip-10-176-1-124.eu-south-1.compute.internal Ready 3d23h v1.26.4-eks-0a21954 10.176.1.124 Amazon Linux 2 5.10.178-162.673.amzn2.x86_64 containerd://1.6.19
ip-10-176-1-68.eu-south-1.compute.internal Ready 3d23h v1.26.4-eks-0a21954 10.176.1.68 Amazon Linux 2 5.10.178-162.673.amzn2.x86_64 containerd://1.6.19
How was the ingress-nginx-controller installed:
helm ls -A | grep -i ingress
helm -n <ingresscontrollernamepspace> get values <helmreleasename>
Current State of the controller:
kubectl describe ingressclasses
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=s-oms-ingress
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.7.1
helm.sh/chart=ingress-nginx-4.6.1
Annotations: meta.helm.sh/release-name: s-oms-ingress
meta.helm.sh/release-namespace: s-oms
Controller: k8s.io/ingress-nginx
Events:
kubectl -n <ingresscontrollernamespace> get all -A -o wide
kubectl -n <ingresscontrollernamespace> describe po <ingresscontrollerpodname>
kubectl -n <ingresscontrollernamespace> describe svc <ingresscontrollerservicename>
Current state of ingress object, if applicable:
kubectl -n <appnnamespace> get all,ing -o wide
kubectl -n <appnamespace> describe ing <ingressname>
Others:
How to reproduce this issue:
Anything else we need to know:
No other information is available.
Thanks in advance, best regards