Closed
Description
I am running a self-managed (kOps-based) k8s cluster.
- Restarted deployment 1. The controller seems to take forever to detect the new pod IPs and update the target group with them, so the targets stay unhealthy. I am testing this now: after 20+ minutes the targets still point to stale IPs and remain unhealthy.
Admin:~/environment $ k get pods -n kops-sample-webapp-1 sample-read-endpoint-1-79db8686dd-fls6q -owide
NAME                                      READY   STATUS    RESTARTS   AGE   IP             NODE                  NOMINATED NODE   READINESS GATES
sample-read-endpoint-1-79db8686dd-fls6q   1/1     Running   0          62m   10.100.12.44   i-09c46490ebd9d7a7a   <none>           <none>
Admin:~/environment $ k get pods -n kops-sample-webapp-1 sample-read-endpoint-1-79db8686dd-7gr4t -owide
NAME                                      READY   STATUS    RESTARTS   AGE   IP             NODE                  NOMINATED NODE   READINESS GATES
sample-read-endpoint-1-79db8686dd-7gr4t   1/1     Running   0          63m   10.100.11.72   i-06976711a4a8d5bda   <none>           <none>
Admin:~/environment $
Admin:~/environment $ aws vpc-lattice list-targets --target-group-identifier tg-0f64ed0c67e307b45
{
    "items": [
        {
            "id": "10.100.11.6",
            "port": 80,
            "reasonCode": "ConnectionTimeout",
            "status": "UNHEALTHY"
        },
        {
            "id": "10.100.12.108",
            "port": 80,
            "reasonCode": "ConnectionTimeout",
            "status": "UNHEALTHY"
        }
    ]
}
Admin:~/environment $
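The manual comparison above (pod IPs versus registered targets) can be sketched as a small script. This is only an illustration, not part of the controller: `diff_targets` is a hypothetical helper, and the inputs are copied from the outputs above, with the JSON trimmed to the fields the script reads.

```python
import json

# Pod IPs copied from the `k get pods -owide` output above.
POD_IPS = ["10.100.12.44", "10.100.11.72"]

# `aws vpc-lattice list-targets` output from above, trimmed to the fields used here.
LIST_TARGETS_OUTPUT = """
{"items": [
  {"id": "10.100.11.6", "port": 80, "status": "UNHEALTHY"},
  {"id": "10.100.12.108", "port": 80, "status": "UNHEALTHY"}
]}
"""


def diff_targets(pod_ips, list_targets_output):
    """Hypothetical helper: compare live pod IPs with registered target IDs."""
    registered = {item["id"] for item in json.loads(list_targets_output)["items"]}
    live = set(pod_ips)
    stale = sorted(registered - live)    # registered, but no current pod has this IP
    missing = sorted(live - registered)  # pod is running, but its IP is not registered
    return stale, missing


stale, missing = diff_targets(POD_IPS, LIST_TARGETS_OUTPUT)
print("stale targets:", stale)
print("unregistered pod IPs:", missing)
```

With the data above, both registered targets come out as stale and both running pod IPs come out as unregistered, which matches what I see by hand.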
- Restarted another deployment, deployment 2. The new pods for deployment 2 claimed the old, stale IPs of the write endpoint, and the deployment 1 targets are now marked healthy: they were never removed from the target group, and the health check against them now succeeds. As a result, traffic intended for the write endpoint is being routed to and served by the read endpoint, while traffic to the read endpoint hangs:
Admin:~/environment $ k get pods -n kops-sample-webapp-1 sample-write-endpoint-1-84cfb6f8bd-b29pn -owide
NAME                                       READY   STATUS    RESTARTS   AGE   IP              NODE                  NOMINATED NODE   READINESS GATES
sample-write-endpoint-1-84cfb6f8bd-b29pn   1/1     Running   0          88m   10.100.12.252   i-09c46490ebd9d7a7a   <none>           <none>
Admin:~/environment $ k get pods -n kops-sample-webapp-1 sample-write-endpoint-1-84cfb6f8bd-vxfhn -owide
NAME                                       READY   STATUS    RESTARTS   AGE   IP             NODE                  NOMINATED NODE   READINESS GATES
sample-write-endpoint-1-84cfb6f8bd-vxfhn   1/1     Running   0          88m   10.100.11.57   i-06976711a4a8d5bda   <none>           <none>
Admin:~/environment $
Admin:~/environment $ aws vpc-lattice list-targets --target-group-identifier tg-095f618fb72e3c199
{
    "items": [
        {
            "id": "10.100.12.44",
            "port": 80,
            "status": "HEALTHY"
        },
        {
            "id": "10.100.11.72",
            "port": 80,
            "status": "HEALTHY"
        }
    ]
}
Admin:~/environment $
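The misrouting can be made explicit by mapping each registered target IP back to the pod that currently owns it. The sketch below is illustrative only: `misrouted_targets` is a hypothetical helper, the workload names are shortened from the pod names above, and it assumes that `tg-095f618fb72e3c199` is the write endpoint's target group, which the outputs suggest but do not state.

```python
# Current pod IPs and their owning workload, copied from the `k get pods` outputs above.
POD_OWNER_BY_IP = {
    "10.100.12.44": "sample-read-endpoint-1",
    "10.100.11.72": "sample-read-endpoint-1",
    "10.100.12.252": "sample-write-endpoint-1",
    "10.100.11.57": "sample-write-endpoint-1",
}

# Targets registered in tg-095f618fb72e3c199 (assumed to be the write target group).
WRITE_TG_TARGETS = ["10.100.12.44", "10.100.11.72"]


def misrouted_targets(target_ips, pod_owner_by_ip, expected_owner):
    """Hypothetical helper: targets whose IP now belongs to a different workload."""
    return {
        ip: pod_owner_by_ip[ip]
        for ip in target_ips
        if ip in pod_owner_by_ip and pod_owner_by_ip[ip] != expected_owner
    }


wrong = misrouted_targets(WRITE_TG_TARGETS, POD_OWNER_BY_IP, "sample-write-endpoint-1")
for ip, owner in wrong.items():
    print(f"{ip} is registered for the write endpoint but belongs to {owner}")
```

Under that assumption, both registered targets resolve to read endpoint pods, so a passing health check here says nothing about whether the target is the right workload.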