Status inconsistencies between deployment and its pods #82405
Comments
@kubernetes/sig-scheduling
/sig apps
Can you reproduce this issue on a more recent version of Kubernetes? v1.5.0 is a few years old now and is no longer supported; the current supported versions are v1.13, v1.14, and v1.15.
@Joseph-Irving sorry, it's a typo; the version is 1.15.
Can you show the full output of the deployment?
@Joseph-Irving here is the output for another deployment with the same problem:
Interesting, there are a few things that don't look right there: the Pod's Ready condition, the Deployment saying it has 1 unavailable replica, and the ReplicaSet's status field, which seems incomplete; I would expect it to have more fields set.
@Joseph-Irving do you have a clue where to dig further?
So the reason the Deployment/ReplicaSet don't appear to have the correct status is that the Pod's Ready condition is False in the apiserver.
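For context (this is added background, not from the original thread): a Deployment only counts a replica as available when the corresponding Pod's Ready condition is True in the apiserver, so a stale Ready=False there is enough to keep the Deployment reporting an unavailable replica even while the kubelet considers the pod healthy. A minimal sketch of that kind of check, using the k8s.io/api/core/v1 types; the helper name isPodReady is made up for illustration, and the real controllers also take minReadySeconds into account:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// isPodReady reports whether the Pod's Ready condition, as stored in the
// apiserver, is True. Controllers such as the Deployment/ReplicaSet
// controllers rely on this condition when counting available replicas.
func isPodReady(pod *corev1.Pod) bool {
	for _, cond := range pod.Status.Conditions {
		if cond.Type == corev1.PodReady {
			return cond.Status == corev1.ConditionTrue
		}
	}
	return false
}

func main() {
	pod := &corev1.Pod{
		Status: corev1.PodStatus{
			Phase: corev1.PodRunning, // the pod is Running...
			Conditions: []corev1.PodCondition{
				// ...but the apiserver still holds a stale Ready=False,
				// which is what the Deployment controller sees.
				{Type: corev1.PodReady, Status: corev1.ConditionFalse},
			},
		},
	}
	fmt.Println("counted as available:", isPodReady(pod)) // false
}
```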
@Joseph-Irving there were issues when the kubelet tried to connect to the api-server, but they were related to another pod, and that pod and its corresponding deployment are fine now. I suppose the issue is related to a kube-apiserver restart. https://gist.github.com/kayrus/eac4891efdf1b7817e40d0bf15c0a277
UPD: I restarted the kubelet and it fixed the consistency issue. I suppose that once the kubelet can't connect to the server, it stops retrying.
@Joseph-Irving I was able to reproduce this case by using an iptables rule that temporarily blocks the kubelet's connection to the kube-apiserver.
UPD: there is a race condition somewhere, because sometimes after the same operation the pod gets the proper status update within a short period.
I added more debug into the status manager (pkg/kubelet/status/status_manager.go): status.status contains Type:Ready Status:True, while the actual kube-apiserver status is Type:Ready Status:False.
Then I added more debug into pkg/kubelet/config/config.go, line 448 (at commit 7a5929d): existing and ref both contain Ready False.
So far I suspect this func: pkg/kubelet/config/config.go, line 252 (at commit 7a5929d).
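To make the symptom described above easier to follow, here is a simplified, illustrative sketch (not the actual kubelet code; the names podStatus, statusCache, and setPodStatus are invented): the status manager keeps a per-pod cache of the last status it generated and only pushes an update when the newly generated status differs from the cached one. If the cache already says Ready=True while the apiserver actually holds Ready=False, the kubelet concludes nothing changed and never corrects the apiserver.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// podStatus is a simplified stand-in for v1.PodStatus.
type podStatus struct {
	Ready bool
}

// statusCache mimics the per-pod status cache kept by the kubelet's
// status manager: updates are only sent when the new status differs
// from the last cached value.
type statusCache struct {
	mu       sync.Mutex
	statuses map[string]podStatus // keyed by pod UID
}

// setPodStatus returns true if an apiserver update should be sent.
func (c *statusCache) setPodStatus(uid string, newStatus podStatus) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	old, ok := c.statuses[uid]
	if ok && reflect.DeepEqual(old, newStatus) {
		// Cached and new status look identical, so no update is sent --
		// even if the apiserver's copy has meanwhile diverged.
		return false
	}
	c.statuses[uid] = newStatus
	return true
}

func main() {
	c := &statusCache{statuses: map[string]podStatus{}}
	c.setPodStatus("pod-uid-1", podStatus{Ready: true})

	// The apiserver copy is Ready=False (stale), but the kubelet keeps
	// regenerating Ready=True, which matches its own cache, so no
	// correcting update is ever sent.
	fmt.Println("update sent:", c.setPodStatus("pod-uid-1", podStatus{Ready: true})) // false
}
```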
The patch below, adapted to v1.15.x from PR #84951, solved the issue. Tested multiple times.
Fixed in k8s 1.15.8 (1.15.9).
What happened:
I'm facing issues with the deployment status. Its pods are ready and in Running status; however, the deployment status doesn't show readiness.
What you expected to happen:
I expect to see the Ready 3/3 deployment status.
How to reproduce it (as minimally and precisely as possible):
This happens when a kubelet reestablishes a connection to a kube-apiserver. Not always, but I'm able to reproduce the issue with about a 50% chance.
Some further debugging showed that the pod status cache map in pkg/kubelet/status/status_manager.go is stuck with Ready True as both the old and the new value, so reconciliation is not triggered. The cache map in pkg/kubelet/config/config.go is stuck with Ready False for both the old and new pod statuses, so reconciliation is not triggered there either. For some reason the reconciliation loops don't merge these two values, and they persist until you restart the kubelet. I'm still trying to understand what exactly is wrong (probably a one-line fix is needed, where a pointer is used instead of a DeepCopy, or perhaps a missing mutex lock).
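As an aside, the pointer-vs-DeepCopy suspicion mentioned above can be illustrated with a small, hypothetical example (this is not the actual kubelet code): if a cache stores a pointer to the live status object instead of a deep copy, the cached "old" value silently mutates together with the live object, so any later old-vs-new comparison finds them equal and change detection (and thus reconciliation) never fires.

```go
package main

import (
	"fmt"
	"reflect"
)

type podStatus struct {
	Ready bool
}

func main() {
	live := &podStatus{Ready: false}

	// Buggy cache: stores the pointer, so the cached "old" value is the
	// same object that later gets mutated.
	cachedPtr := live

	// Correct cache: stores a copy (the equivalent of DeepCopy for this
	// flat struct), so the old value is frozen at caching time.
	cachedCopy := *live

	// The live status changes.
	live.Ready = true

	// Pointer-based cache: old and new are literally the same object,
	// so they always compare equal and no change is detected.
	fmt.Println("change detected (pointer cache):", !reflect.DeepEqual(*cachedPtr, *live)) // false

	// Copy-based cache: the difference is visible and reconciliation
	// would be triggered.
	fmt.Println("change detected (copy cache):", !reflect.DeepEqual(cachedCopy, *live)) // true
}
```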
Environment:
Kubernetes version (use kubectl version): 1.15.4
OS (e.g.: cat /etc/os-release): coreos stable
@kubernetes/sig-scheduling