Pods remaining in NotReady state #84931

Open
krzysied opened this issue Nov 7, 2019 · 17 comments · May be fixed by #84951

@krzysied (Contributor) commented Nov 7, 2019

What happened:
Pods remain in a Running but not Ready state after the node becomes unready for a few seconds.

What you expected to happen:
Pods should become Ready again once the node is Ready.

How to reproduce it (as minimally and precisely as possible):
Create a large deployment. Make a node unready for a few seconds. The pods will remain not Ready.
However, it requires the node lifecycle controller to be in an eviction pass, so my guess is that it will not be reproducible in 100% of cases.

Anything else we need to know?:
I reached this stuck state with a kubemark run. There is a chance that vanilla k8s is unaffected.

Environment:

  • Kubernetes version (use kubectl version):
    1.18
  • Cloud provider or hardware configuration:
    gce - kubemark
  • OS (e.g: cat /etc/os-release):
    linux
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:

@krzysied (Contributor, Author) commented Nov 7, 2019

My guess of what happened is:

  1. A few nodes become not ready.
  2. The node lifecycle controller stores the node statuses.
  3. The nodes become ready again.
  4. The eviction pass (based on the stored statuses) marks the pods not ready even though the nodes were ready at that time (a rough sketch of this race follows below).
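
A minimal, runnable sketch of the suspected race, using hypothetical types rather than the real node lifecycle controller code: the controller caches a node's Ready condition in one pass, the node recovers, and a later eviction pass still acts on the cached value.

package main

import (
	"fmt"
	"time"
)

// cachedNodeHealth mimics the controller's per-node cache of the last
// observed status (hypothetical type, for illustration only).
type cachedNodeHealth struct {
	ready      bool
	observedAt time.Time
}

func main() {
	cache := map[string]cachedNodeHealth{}

	// 1. The node is observed NotReady and its status is cached.
	cache["hollow-node-pz68p"] = cachedNodeHealth{ready: false, observedAt: time.Now()}

	// 2. The node recovers, but the cache has not been refreshed yet.
	liveReady := true

	// 3. The eviction pass consults the cached status, not the live one,
	//    and so still marks the node's pods as not ready.
	if cached := cache["hollow-node-pz68p"]; !cached.ready {
		fmt.Printf("marking pods on hollow-node-pz68p NotReady (cached ready=%v, live ready=%v)\n",
			cached.ready, liveReady)
	}
}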

@krzysied (Contributor, Author) commented Nov 7, 2019

kube-controller-manager log:

I1107 13:28:27.310469       1 controller_utils.go:169] Recording status change NodeNotReady event message for node hollow-node-pz68p
I1107 13:28:27.310607       1 event.go:281] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"hollow-node-pz68p", UID:"4296a241-d0f4-47e5-8a2c-79c3040130fe", APIVersion:"", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'NodeNotReady' Node hollow-node-pz68p status is now: NodeNotReady
E1107 13:28:27.387889       1 node_lifecycle_controller.go:171] error checking if node hollow-node-pz68p exists: error splitting providerID
I1107 13:28:31.234461       1 controller_utils.go:116] Update ready status of pods on node [hollow-node-pz68p]

The error was created here:

exists, err := ensureNodeExistsByProviderID(context.TODO(), instances, node)

Assuming that the node still existed at that time, it didn't change the execution of the logic.
OTOH, support for the kubemark provider would be nice.
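
For reference, a minimal sketch of why the existence check can fail here: the GCE path expects a providerID of the form gce://<project>/<zone>/<instance>, and hollow nodes appear not to register one in that form, so splitting it fails and the check returns an error instead of a yes/no answer. The parser below is an illustrative assumption, not the real cloud-provider code.

package main

import (
	"fmt"
	"strings"
)

// splitProviderID is an illustrative stand-in for the cloud provider's
// providerID parsing; it assumes the "gce://<project>/<zone>/<instance>" form.
func splitProviderID(providerID string) (project, zone, instance string, err error) {
	const prefix = "gce://"
	if !strings.HasPrefix(providerID, prefix) {
		return "", "", "", fmt.Errorf("error splitting providerID %q", providerID)
	}
	parts := strings.Split(strings.TrimPrefix(providerID, prefix), "/")
	if len(parts) != 3 {
		return "", "", "", fmt.Errorf("error splitting providerID %q", providerID)
	}
	return parts[0], parts[1], parts[2], nil
}

func main() {
	// A hollow node created by kubemark has no GCE-style providerID,
	// so the node existence check errors out rather than returning true/false.
	if _, _, _, err := splitProviderID(""); err != nil {
		fmt.Println("node existence check failed:", err)
	}
}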

@krzysied (Contributor, Author) commented Nov 7, 2019

There is a possibility that the node lifecycle controller didn't spot the node updates.
I'm bumping up the log level for this case: #84933

@krzysied (Contributor, Author) commented Nov 7, 2019

Apiserver logs (regarding node status):

I1107 13:28:23.766320       1 httplog.go:90] PUT /apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/hollow-node-pz68p?timeout=10s: (2.898907ms) 200 [kubemark/v1.18.0 (linux/amd64) kubernetes/de158c0 35.237.0.154:32966]
I1107 13:28:27.310188       1 httplog.go:90] PUT /api/v1/nodes/hollow-node-pz68p/status: (2.453214ms) 200 [kube-controller-manager/v1.18.0 (linux/amd64) kubernetes/de158c0/system:serviceaccount:kube-system:node-controller [::1]:37018]
I1107 13:28:27.364569       1 httplog.go:90] PATCH /api/v1/nodes/hollow-node-pz68p: (52.809816ms) 200 [kube-controller-manager/v1.18.0 (linux/amd64) kubernetes/de158c0/system:serviceaccount:kube-system:node-controller [::1]:37018]
I1107 13:28:30.110484       1 httplog.go:90] PATCH /api/v1/nodes/hollow-node-pz68p/status?timeout=10s: (31.375022ms) 200 [kubemark/v1.18.0 (linux/amd64) kubernetes/de158c0 35.237.0.154:32966]
I1107 13:28:30.172280       1 httplog.go:90] PATCH /api/v1/nodes/hollow-node-pz68p: (52.819529ms) 200 [kube-controller-manager/v1.18.0 (linux/amd64) kubernetes/de158c0/system:serviceaccount:kube-system:node-controller [::1]:37018]
I1107 13:28:33.773756       1 httplog.go:90] PUT /apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/hollow-node-pz68p?timeout=10s: (1.927627ms) 200 [kubemark/v1.18.0 (linux/amd64) kubernetes/de158c0 35.237.0.154:32966]
I1107 13:28:43.791654       1 httplog.go:90] PUT /apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/hollow-node-pz68p?timeout=10s: (11.188822ms) 200 [kubemark/v1.18.0 (linux/amd64) kubernetes/de158c0 35.237.0.154:32966]

Pod status update (for a pod on node hollow-node-pz68p):

I1107 13:28:31.239215       1 httplog.go:90] PUT /api/v1/namespaces/test-8u54hi-21/pods/saturation-deployment-0-7dd8d95bd7-rtxn9/status: (2.209386ms) 200 [kube-controller-manager/v1.18.0 (linux/amd64) kubernetes/de158c0/system:serviceaccount:kube-system:node-controller [::1]:37018]

@krzysied (Contributor, Author) commented Nov 7, 2019

Can someone give me a hint about what sets the pod Ready status back to true when the node is Ready again?

@renjian52 commented Nov 7, 2019

I also hit this bug, and it was introduced by #77661. When I roll back that PR, the bug is fixed.

@tedyu (Contributor) commented Nov 8, 2019

@renjian52
If you have a chance, can you see if the following fixes the issue you faced (without reverting #77661)?

diff --git a/pkg/kubelet/kubelet_getters.go b/pkg/kubelet/kubelet_getters.go
index fb5c85dc06..e61687b1ac 100644
--- a/pkg/kubelet/kubelet_getters.go
+++ b/pkg/kubelet/kubelet_getters.go
@@ -31,6 +31,7 @@ import (
        "k8s.io/kubernetes/pkg/kubelet/cm"
        "k8s.io/kubernetes/pkg/kubelet/config"
        kubecontainer "k8s.io/kubernetes/pkg/kubelet/container"
+       kubelettypes "k8s.io/kubernetes/pkg/kubelet/types"
        "k8s.io/kubernetes/pkg/util/mount"
        utilnode "k8s.io/kubernetes/pkg/util/node"
        utilpath "k8s.io/utils/path"
@@ -167,8 +168,10 @@ func (kl *Kubelet) GetPods() []*v1.Pod {
        // a kubelet running without apiserver requires an additional
        // update of the static pod status. See #57106
        for _, p := range pods {
-               if status, ok := kl.statusManager.GetPodStatus(p.UID); ok {
-                       p.Status = status
+               if kubelettypes.IsStaticPod(p) {
+                       if status, ok := kl.statusManager.GetPodStatus(p.UID); ok {
+                               p.Status = status
+                       }
                }
        }
        return pods

@krzysied (Contributor, Author) commented Nov 8, 2019

If you have a chance, can you see if the following fixes the issue you faced (without reverting #77661) ?

I haven't found a way to reproduce this issue every time, so it is hard for me to say whether it fixes the issue.

Can you say more about what issue this patch addresses? Is there a problem with the statusManager having a wrong/stale pod status?

@krzysied (Contributor, Author) commented Nov 8, 2019

I've re-run the test. There was also a problem with the hollow-nodes. This time the test didn't have a problem with pods. The k8s build included the patch posted above.
@tedyu It looks like your patch helps. I would still like to hear some backstory about what happens there.

@tedyu (Contributor) commented Nov 8, 2019

My patch refines the code to achieve what #77661 intended to do.

Can you say more about the problem with the hollow-nodes?
Please post a log snippet if you have one.

Thanks

@renjian52 commented Nov 8, 2019

Hi @tedyu, I have already tested your fix in kubelet_getters.go and it does solve this bug. Still, I have a suggestion: could you use the UpdatePod() function in the podManager instead of directly changing the pod status in the podManager via p.Status = status? If other code also uses this direct way to modify pod info in the podManager, it will be very hard to trace which code modified it, and there is no log showing that it has been changed.
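
A minimal sketch of this suggestion, using hypothetical stand-ins for the kubelet's pod manager types (the real UpdatePod signature and locking may differ): routing the status change through an update method gives one traceable, logged code path instead of an in-place mutation.

package main

import "fmt"

// Pod and podManager are hypothetical stand-ins for the kubelet types,
// used only to illustrate the suggestion.
type Pod struct {
	UID    string
	Status string
}

type podManager struct {
	pods map[string]*Pod
}

// UpdatePod replaces the stored pod and logs the status transition, so every
// change to pod info goes through a single observable code path.
func (pm *podManager) UpdatePod(updated *Pod) {
	if old, ok := pm.pods[updated.UID]; ok && old.Status != updated.Status {
		fmt.Printf("pod %s status: %q -> %q\n", updated.UID, old.Status, updated.Status)
	}
	pm.pods[updated.UID] = updated
}

func main() {
	pm := &podManager{pods: map[string]*Pod{
		"uid-1": {UID: "uid-1", Status: "Ready"},
	}}

	// Instead of mutating p.Status directly, build the updated copy and
	// hand it to UpdatePod so the change shows up in the logs.
	pm.UpdatePod(&Pod{UID: "uid-1", Status: "NotReady"})
}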

@k8s-ci-robot added sig/node and removed needs-sig labels Nov 8, 2019

@tedyu (Contributor) commented Nov 8, 2019

@renjian52
I have added a log for the status update in #84951.

W.r.t. using the UpdatePod() function in the podManager, the review/test may take more time. I will think about it in the meantime.

@tedyu referenced a pull request that will close this issue Nov 8, 2019

@jmcmeek (Contributor) commented Nov 11, 2019

Is this issue the same problem? It sure looks like it: #82346

I've been following that issue in the hope of a fix, as we see this happen a few times a day (across several large clusters). Very happy to see progress.

@tedyu (Contributor) commented Nov 12, 2019

@renjian52
Can you attach the kubelet log (without the patch) somewhere and share the URL?

Thanks
