node is not ready because kubelet reports a meaningful conflict error #58002
@kubernetes/sig-node-bugs
@foxyriver: Reiterating the mentions to trigger a notification: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I've seen several of these bug reports, and am having one myself, but they all get ignored.
@SleepyBrett Have you solved this issue? Do you have any solution for it?
It seems to me that this is caused by:
I think we shouldn't treat the conflict as a conflict in this case... Details from when I observed it:
Here is the work-around to restore the node:
I still think this is a bug in the kubelet though, I'm going to investigate that code.
two things stand out to me:
Seems @foxyriver is running v1.6.9 with etcd v3.0.17. For clusters older than 1.9, is there any possibility of hitting this problem?
Yes, running against an etcd cluster without that flag set to true is known to lead to correctness issues.
Using the default value of the --etcd-quorum-read flag; in v1.6.9 the default is false.
Using the etcd v3 schema; no migration.
thx :), I will try setting --etcd-quorum-read=true on the kube-apiserver
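For reference, a sketch of what that could look like in a static-pod manifest for the apiserver (the file path, image, and surrounding flags are placeholders for a typical deployment, not taken from this cluster; the flag's default was flipped to true in 1.9, which is why the question above is scoped to older clusters):

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml -- path and all values
# other than --etcd-quorum-read are placeholders
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    image: gcr.io/google_containers/kube-apiserver:v1.6.9  # placeholder image
    command:
    - kube-apiserver
    - --etcd-servers=https://127.0.0.1:2379                # placeholder endpoint
    - --etcd-quorum-read=true   # force linearizable (quorum) reads from etcd
```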
@liggitt in my case it was a single etcd server (rpi cluster); does that flag still apply?
@brendandburns |
the server-side patch application logic that could lead to this error was removed by #63146 |
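To illustrate the idea behind the removed logic (a simplified toy sketch only, not the actual apiserver code): the patch handler computed two deltas, diff1 between the version the patch was computed against and the current object, and diff2 for the incoming patch itself, and rejected the patch when they disagreed on a field both touched. With a persistently stale watch cache, diff1 could be persistently wrong, so valid kubelet status updates kept getting rejected:

```go
package main

import (
	"fmt"
	"reflect"
)

// meaningfulConflict is a toy version of the removed check: it reports
// whether two deltas disagree on any top-level key they both modify.
func meaningfulConflict(diff1, diff2 map[string]interface{}) bool {
	for k, v1 := range diff1 {
		if v2, ok := diff2[k]; ok && !reflect.DeepEqual(v1, v2) {
			return true // both deltas touch k with different values
		}
	}
	return false
}

func main() {
	// diff1: what the (stale) cached object thinks changed -- here the
	// Ready condition stuck at "Unknown", as in the log below.
	diff1 := map[string]interface{}{"Ready": "Unknown"}
	// diff2: the kubelet's patch, reporting the node healthy again.
	diff2 := map[string]interface{}{"Ready": "True"}

	// The deltas overlap and differ, so the patch was rejected as a
	// "meaningful conflict" even though the kubelet's update was valid.
	fmt.Println(meaningfulConflict(diff1, diff2)) // prints: true
}
```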
Automatic merge from submit-queue.

Remove patch retry conflict detection

Minimal backport of #63146
Fixes #58002
Fixes spurious patch errors for CRDs
Fixes patch errors for nodes when the watch cache has a persistently stale version of an object

```release-note
fixes spurious "meaningful conflict" error encountered by nodes attempting to update status, which could cause them to be considered unready
```
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
The kubelet reports an error, and `kubectl get node` shows the node as not ready:
```
there is a meaningful conflict (firstResourceVersion: "104201", currentResourceVersion: "4293"):
diff1={"metadata":{"resourceVersion":"4293"},"status":{"conditions":[{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"DiskPressure"},{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"MemoryPressure"},{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"OutOfDisk"},{"lastHeartbeatTime":"2018-01-03T07:38:24Z","lastTransitionTime":"2018-01-03T07:42:59Z","message":"Kubelet stopped posting node status.","reason":"NodeStatusUnknown","status":"Unknown","type":"Ready"}]}}
, diff2={"status":{"conditions":[{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet has no disk pressure","reason":"KubeletHasNoDiskPressure","status":"False","type":"DiskPressure"},{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet has sufficient memory available","reason":"KubeletHasSufficientMemory","status":"False","type":"MemoryPressure"},{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet has sufficient disk space available","reason":"KubeletHasSufficientDisk","status":"False","type":"OutOfDisk"},{"lastHeartbeatTime":"2018-01-04T09:31:09Z","lastTransitionTime":"2018-01-04T09:31:09Z","message":"kubelet is posting ready status","reason":"KubeletReady","status":"True","type":"Ready"}],"nodeInfo":{"gpus":[]}}}
E0104 17:31:09.779522 7223 kubelet_node_status.go:318] Unable to update node status: update node status exceeds retry count
```
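For context, the symptom shows up like this from the control plane (node name and output are illustrative, matching the conditions in the log above):

```sh
kubectl get nodes
# NAME      STATUS     AGE    VERSION
# node-1    NotReady   120d   v1.6.9     <- illustrative output

# The Ready condition is stuck at Unknown with reason NodeStatusUnknown
# ("Kubelet stopped posting node status."), as in the log above.
kubectl describe node node-1
```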
What you expected to happen:
I found a PR about this issue, #44788; it has been cherry-picked to 1.6. I want to know why this issue still happens.
How to reproduce it (as minimally and precisely as possible):
This issue is sporadic, but I found another issue reporting the same problem:
#52498
He solved it by using etcd 3.1.10 instead of 3.2.7.
When I change the leader of the etcd cluster, the issue goes away.
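A minimal sketch of forcing that leader change by hand (the endpoints are hypothetical, and the etcdctl output format may differ on older etcd releases):

```sh
# Find the current leader (the IS LEADER column in the table output).
ETCDCTL_API=3 etcdctl \
  --endpoints=https://10.0.0.1:2379,https://10.0.0.2:2379,https://10.0.0.3:2379 \
  endpoint status --write-out=table

# Restarting etcd on the leader host triggers a new election; once
# leadership moves, the apiserver stops reading the stale node object.
systemctl restart etcd
```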
Is this a known incompatibility between Kubernetes and etcd, or an etcd bug?
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): 1.6.9