Kube-scheduler dies with "Schedulercache is corrupted" #46347

Closed
pizzarabe opened this issue May 24, 2017 · 25 comments

Labels
kind/bug Categorizes issue or PR as related to a bug. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.

@pizzarabe
BUG REPORT

Kubernetes version (use kubectl version):

kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T20:44:38Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.3+coreos.1", GitCommit:"bc000d3336a0b11155ac222193e6f24b6dcb5cd1", GitTreeState:"clean", BuildDate:"2017-05-19T00:19:21Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Self-hosted:
  • OS
    Container Linux by CoreOS 1353.7.0 (Master + most nodes)
    and
    Container Linux by CoreOS 1298.5.0 (some nodes)
  • Kernel
    1353.7.0: (4.9.24-coreos #1 SMP Wed Apr 26 21:44:23 UTC 2017 x86_64 Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz GenuineIntel GNU/Linux)
    1298.5.0: (Linux alien8 4.9.9-coreos-r1 #1 SMP Tue Feb 28 00:06:10 UTC 2017 x86_64 Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz GenuineIntel GNU/Linux)
  • Install tools:
    Hyperkube (https://coreos.com/kubernetes/docs/latest/getting-started.html)
  • Others:
  • The underlying Hypervisor is ESXi 5.5
  • Kubelet is running in rkt; other Kubernetes components (incl. the scheduler) are running in Docker
  • The etcd cluster (3 nodes) is v3.0.10, running inside rkt containers (using the CoreOS etcd-member.service)

What happened:

After some testing with my private image registry (using the docs at https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/), I ran into some problems with the kube-scheduler.
Creating a new pod, e.g.

apiVersion: v1
kind: Pod
metadata:
  name: privateimage-test
spec:
  containers:
    - name: private-container
      image: server:4567/namespace/container:tag
  imagePullSecrets:
    - name: privateregkey

kills the kube-scheduler:

I0524 13:22:55.458837       1 event.go:217] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"privateimage-test", UID:"ff287e49-4083-11e7-8c95-005056b70482", APIVersion:"v1", ResourceVersion:"4460349", FieldPath:""}): type: 'Normal' reason: 'Scheduled' Successfully assigned privateimage-test to alien6
E0524 13:29:33.440433       1 cache.go:290] Pod default/privateimage-test removed from a different node than previously added to.
F0524 13:29:33.440454       1 cache.go:291] Schedulercache is corrupted and can badly affect scheduling decisions

The kube-scheduler is restarted by the kubelet, and the pod is started after that.

Some output from the kubelet:

May 24 15:29:34 alien1 kubelet-wrapper[1634]: E0524 13:29:34.085188    1634 event.go:259] Could not construct reference to: '&v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"kube-scheduler-alien1", UID:"cd159dc8e8cca207146c55f953a24533", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{kube-scheduler}"}' due to: 'object does not implement the List interfaces'. Will not report event: 'Warning' 'Unhealthy' 'Liveness probe failed: Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: getsockopt: connection refused'
May 24 15:29:34 alien1 kubelet-wrapper[1634]: I0524 13:29:34.407447    1634 kuberuntime_manager.go:458] Container {Name:kube-scheduler Image:quay.io/coreos/hyperkube:v1.6.3_coreos.1 Command:[/hyperkube scheduler --master=http://127.0.0.1:8080 --leader-elect=true] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:10251,Host:127.0.0.1,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:1,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
May 24 15:29:34 alien1 kubelet-wrapper[1634]: I0524 13:29:34.407623    1634 kuberuntime_manager.go:742] checking backoff for container "kube-scheduler" in pod "kube-scheduler-alien1_kube-system(cd159dc8e8cca207146c55f953a24533)"
May 24 15:29:34 alien1 kubelet-wrapper[1634]: I0524 13:29:34.407815    1634 kuberuntime_manager.go:752] Back-off 1m20s restarting failed container=kube-scheduler pod=kube-scheduler-alien1_kube-system(cd159dc8e8cca207146c55f953a24533)
May 24 15:29:34 alien1 kubelet-wrapper[1634]: E0524 13:29:34.407895    1634 pod_workers.go:182] Error syncing pod cd159dc8e8cca207146c55f953a24533 ("kube-scheduler-alien1_kube-system(cd159dc8e8cca207146c55f953a24533)"), skipping: failed to "StartContainer" for "kube-scheduler" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=kube-scheduler pod=kube-scheduler-alien1_kube-system(cd159dc8e8cca207146c55f953a24533)"
May 24 15:29:44 alien1 kubelet-wrapper[1634]: I0524 13:29:44.085354    1634 kuberuntime_manager.go:458] Container {Name:kube-scheduler Image:quay.io/coreos/hyperkube:v1.6.3_coreos.1 Command:[/hyperkube scheduler --master=http://127.0.0.1:8080 --leader-elect=true] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:10251,Host:127.0.0.1,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:1,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
May 24 15:29:44 alien1 kubelet-wrapper[1634]: I0524 13:29:44.085464    1634 kuberuntime_manager.go:742] checking backoff for container "kube-scheduler" in pod "kube-scheduler-alien1_kube-system(cd159dc8e8cca207146c55f953a24533)"
May 24 15:29:44 alien1 kubelet-wrapper[1634]: I0524 13:29:44.085588    1634 kuberuntime_manager.go:752] Back-off 1m20s restarting failed container=kube-scheduler pod=kube-scheduler-alien1_kube-system(cd159dc8e8cca207146c55f953a24533)
May 24 15:29:44 alien1 kubelet-wrapper[1634]: E0524 13:29:44.085627    1634 pod_workers.go:182] Error syncing pod cd159dc8e8cca207146c55f953a24533 ("kube-scheduler-alien1_kube-system(cd159dc8e8cca207146c55f953a24533)"), skipping: failed to "StartContainer" for "kube-scheduler" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=kube-scheduler pod=kube-scheduler-alien1_kube-system(cd159dc8e8cca207146c55f953a24533)"

The journal entry right before that shows:
May 24 15:29:33 alien1 dockerd[1757]: time="2017-05-24T15:29:33.460151779+02:00" level=error msg="Error closing logger: invalid argument"

Creating a new pod with something like kubectl run busybox --image busybox /bin/sh does not kill the scheduler!

What you expected to happen:
The Scheduler should not die :)

How to reproduce it:

Not sure; the cluster worked for ~30 days without problems. The problem started after working with my private image registry. Before that I updated k8s from 1.6.1 to 1.6.3.

Anything else we need to know:

I restarted the master after the problem first occurred (that had helped before with some other problems).

@k82cn
Member

k82cn commented May 24, 2017

@kubernetes/sig-scheduling-bugs

k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label May 24, 2017
@davidopp
Member

@wojtek-t @bsalamat

@timothysc
Member

There are some PRs around this that came in recently. /cc @smarterclayton

@wojtek-t
Member

@pizzarabe - can you please provide the whole scheduler log? This is definitely some kind of race, and without logs it will be hard to debug (I think we fixed some issue with it in 1.5.x, but it seems there is something more here...)

@smarterclayton
Contributor

We were mutating the shared cache from a threaded environment, so my changes should in theory fix that.
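For context, here is a minimal Go sketch of the general hazard being described: handing out shared pointers from a cache, so that a caller's mutation silently changes the cached object seen by everyone else, versus returning copies. This is only an illustration of the concept, not the code from the referenced PR; all names here are made up.

package main

import (
	"fmt"
	"sync"
)

// fakePod stands in for a cached API object; purely illustrative.
type fakePod struct {
	Name     string
	NodeName string
}

// store hands out pointers to its cached objects.
type store struct {
	mu   sync.Mutex
	pods map[string]*fakePod
}

// GetUnsafe returns the shared pointer (the hazardous pattern): any caller
// that mutates the result changes the shared cache for all other readers.
func (s *store) GetUnsafe(name string) *fakePod {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.pods[name]
}

// GetCopy returns a copy, so callers can mutate freely without touching the cache.
func (s *store) GetCopy(name string) *fakePod {
	s.mu.Lock()
	defer s.mu.Unlock()
	if p, ok := s.pods[name]; ok {
		cp := *p
		return &cp
	}
	return nil
}

func main() {
	s := &store{pods: map[string]*fakePod{"privateimage-test": {Name: "privateimage-test"}}}

	// Mutating the shared pointer corrupts the cache for every other reader.
	s.GetUnsafe("privateimage-test").NodeName = "alien6"
	fmt.Println("cached NodeName after unsafe mutation:", s.GetUnsafe("privateimage-test").NodeName)

	// Mutating a copy leaves the cache untouched.
	s.GetCopy("privateimage-test").NodeName = "alien8"
	fmt.Println("cached NodeName after copy mutation:  ", s.GetUnsafe("privateimage-test").NodeName)
}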

@timothysc
Member

xref: #46223

timothysc added the kind/bug Categorizes issue or PR as related to a bug. label May 24, 2017
@bsalamat
Member

Given this issue, it is important to backport Clayton's fix to 1.6 and 1.5.

@smarterclayton
Contributor

smarterclayton commented May 24, 2017 via email

@wojtek-t
Member

@smarterclayton @timothysc - can you clarify why you think this PR can help with this problem? I looked into it again, and I don't see any place where we weren't making a copy before and are doing so now. I agree it fixes watch semantics, but that is unrelated to this issue.

@smarterclayton
Contributor

smarterclayton commented May 25, 2017 via email

@wojtek-t
Member

@wojtek-t
Member

The problem I see is that if it's just about status (and we are not mutating the spec), we are not using Status in the cache...

@k82cn
Member

k82cn commented May 26, 2017

According to the log E0524 13:29:33.440433 1 cache.go:290] Pod default/privateimage-test removed from a different node than previously added to., the issue is triggered by the "assigned non-terminated" Pods, which are read-only in the scheduler.

And it was said:

Not sure; the cluster worked for ~30 days without problems. The problem started after working with my private image registry. Before that I updated k8s from 1.6.1 to 1.6.3.

So it may be related to #45453, but there is no direct evidence :(.

@wojtek-t
Member

@k82cn in the past we used to have a problem (that would cause exactly this behavior) where:

  • the pod failed scheduling the first time
  • when we tried to schedule it again, we managed to find a node
  • we assumed it on that node and sent the binding to the apiserver
  • in the meantime, because of some race, we tried to schedule it one more time
  • we found a different node for it and sent another binding
  • the second binding actually succeeded
  • and the pod ended up being assumed on a different node than the one it was finally bound to

I thought this was already fixed, but maybe we should revisit whether it really is...
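To make the failure mode concrete, here is a minimal Go sketch of the kind of bookkeeping that produces the "removed from a different node than previously added to" error seen in the logs: the cache remembers which node a pod was added to, and a removal that names a different node is treated as corruption. This is a simplified illustration, not the actual scheduler cache code; all names are illustrative.

package main

import (
	"fmt"
	"sync"
)

// nodeCache is an illustrative stand-in for the scheduler cache's pod bookkeeping:
// it remembers which node each pod was added (or assumed) to.
type nodeCache struct {
	mu        sync.Mutex
	podToNode map[string]string // pod key -> node the pod was added to
}

func newNodeCache() *nodeCache {
	return &nodeCache{podToNode: map[string]string{}}
}

// AddPod records the node a pod is assumed/bound to.
func (c *nodeCache) AddPod(key, node string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.podToNode[key] = node
}

// RemovePod removes a pod. If the caller's view of the pod's node disagrees
// with what was recorded at add time, the cache is treated as corrupted; this
// mirrors the check behind the error reported from cache.go in the logs above.
func (c *nodeCache) RemovePod(key, node string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	recorded, ok := c.podToNode[key]
	if !ok {
		return fmt.Errorf("pod %s is not in the cache", key)
	}
	if recorded != node {
		return fmt.Errorf("pod %s removed from node %s, but was previously added to node %s",
			key, node, recorded)
	}
	delete(c.podToNode, key)
	return nil
}

func main() {
	c := newNodeCache()

	// The race described above: the pod is assumed on one node...
	c.AddPod("default/privateimage-test", "alien6")

	// ...but a second, racing scheduling attempt binds it elsewhere, so the
	// eventual removal arrives with a different node name.
	if err := c.RemovePod("default/privateimage-test", "alien8"); err != nil {
		fmt.Println("cache corrupted:", err) // the real scheduler exits fatally here
	}
}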

@k82cn
Member

k82cn commented May 26, 2017

@wojtek-t, got the case :) Let me revisit it to see whether there's any clue here.

@pizzarabe
Author

@wojtek-t
sure, here is the output from kube-scheduler --v=4 (but I am not sure if this is more useful than the previous log)

$ docker ps -a
3d358c583a73        quay.io/coreos/hyperkube@sha256:30afab2ccb3d48e77d253249fd085023e68633731a2f0b62f0b89dcd227836bd        "/hyperkube scheduler"   44 seconds ago       Up 44 seconds                                     k8s_kube-scheduler_kube-scheduler-alien1_kube-system_75cd12d030f068637ff8094530e04049_1
02518c502df1        quay.io/coreos/hyperkube@sha256:30afab2ccb3d48e77d253249fd085023e68633731a2f0b62f0b89dcd227836bd        "/hyperkube scheduler"   About a minute ago   Exited (255) 44 seconds ago                   k8s_kube-scheduler_kube-scheduler-alien1_kube-system_75cd12d030f068637ff8094530e04049_0

[The dead scheduler]:

$ docker logs 02518c502df1
I0526 07:42:16.897427       1 factory.go:300] Creating scheduler from algorithm provider 'DefaultProvider'
I0526 07:42:16.897522       1 factory.go:346] Creating scheduler with fit predicates 'map[MaxAzureDiskVolumeCount:{} NoDiskConflict:{} GeneralPredicates:{} CheckNodeMemoryPressure:{} CheckNodeDiskPressure:{} NoVolumeZoneConflict:{} MaxEBSVolumeCount:{} MaxGCEPDVolumeCount:{} MatchInterPodAffinity:{} PodToleratesNodeTaints:{}]' and priority functions 'map[BalancedResourceAllocation:{} NodePreferAvoidPodsPriority:{} NodeAffinityPriority:{} TaintTolerationPriority:{} SelectorSpreadPriority:{} InterPodAffinityPriority:{} LeastRequestedPriority:{}]
I0526 07:42:16.897705       1 reflector.go:198] Starting reflector *v1.Pod (0s) from k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:451
I0526 07:42:16.897817       1 reflector.go:236] Listing and watching *v1.Pod from k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:451
I0526 07:42:16.897920       1 leaderelection.go:179] attempting to acquire leader lease...
I0526 07:42:16.898283       1 reflector.go:198] Starting reflector *v1.Pod (0s) from k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:454
I0526 07:42:16.898327       1 reflector.go:236] Listing and watching *v1.Pod from k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:454
I0526 07:42:16.898440       1 reflector.go:198] Starting reflector *v1.Service (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.898958       1 reflector.go:198] Starting reflector *v1.Node (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.899033       1 reflector.go:236] Listing and watching *v1.Service from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.899161       1 reflector.go:236] Listing and watching *v1.Node from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.900623       1 reflector.go:198] Starting reflector *v1.PersistentVolumeClaim (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.900661       1 reflector.go:236] Listing and watching *v1.PersistentVolumeClaim from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.900838       1 reflector.go:198] Starting reflector *v1beta1.StatefulSet (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.900925       1 reflector.go:236] Listing and watching *v1beta1.StatefulSet from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.901626       1 reflector.go:198] Starting reflector *v1beta1.ReplicaSet (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.901663       1 reflector.go:236] Listing and watching *v1beta1.ReplicaSet from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.901764       1 reflector.go:198] Starting reflector *v1.PersistentVolume (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.901802       1 reflector.go:236] Listing and watching *v1.PersistentVolume from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.903348       1 reflector.go:198] Starting reflector *v1.ReplicationController (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.903384       1 reflector.go:236] Listing and watching *v1.ReplicationController from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:16.908259       1 leaderelection.go:189] successfully acquired lease kube-system/kube-scheduler
I0526 07:42:16.908463       1 event.go:217] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"kube-scheduler", UID:"404f58d4-40d6-11e7-8c95-005056b70482", APIVersion:"v1", ResourceVersion:"4732494", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' alien1 became leader
I0526 07:42:18.914381       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:20.921037       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:22.927073       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:24.933276       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:26.939491       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:28.945485       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:30.951875       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:32.958256       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:34.964190       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:36.970310       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:38.976780       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:40.983164       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:42.989471       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:44.995575       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:47.001928       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:49.008236       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:51.014911       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:53.021414       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:55.027536       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:42:56.335172       1 factory.go:461] About to try and schedule pod privateimage-test
E0526 07:42:56.335185       1 cache.go:290] Pod default/privateimage-test removed from a different node than previously added to.
I0526 07:42:56.335194       1 scheduler.go:156] Attempting to schedule pod: default/privateimage-test
F0526 07:42:56.335199       1 cache.go:291] Schedulercache is corrupted and can badly affect scheduling decisions

[The restarted scheduler]:

$ docker logs 3d358c583a73
I0526 07:42:57.111661       1 factory.go:300] Creating scheduler from algorithm provider 'DefaultProvider'
I0526 07:42:57.111769       1 factory.go:346] Creating scheduler with fit predicates 'map[NoVolumeZoneConflict:{} MaxEBSVolumeCount:{} GeneralPredicates:{} PodToleratesNodeTaints:{} CheckNodeMemoryPressure:{} CheckNodeDiskPressure:{} MaxGCEPDVolumeCount:{} MaxAzureDiskVolumeCount:{} MatchInterPodAffinity:{} NoDiskConflict:{}]' and priority functions 'map[SelectorSpreadPriority:{} InterPodAffinityPriority:{} LeastRequestedPriority:{} BalancedResourceAllocation:{} NodePreferAvoidPodsPriority:{} NodeAffinityPriority:{} TaintTolerationPriority:{}]
I0526 07:42:57.111919       1 reflector.go:198] Starting reflector *v1.Pod (0s) from k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:451
I0526 07:42:57.112005       1 reflector.go:236] Listing and watching *v1.Pod from k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:451
I0526 07:42:57.112123       1 leaderelection.go:179] attempting to acquire leader lease...
I0526 07:42:57.112603       1 reflector.go:198] Starting reflector *v1.Service (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.112671       1 reflector.go:198] Starting reflector *v1beta1.ReplicaSet (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.112671       1 reflector.go:236] Listing and watching *v1.Service from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.112748       1 reflector.go:236] Listing and watching *v1beta1.ReplicaSet from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.112787       1 reflector.go:198] Starting reflector *v1beta1.StatefulSet (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.112832       1 reflector.go:236] Listing and watching *v1beta1.StatefulSet from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.112757       1 reflector.go:198] Starting reflector *v1.Pod (0s) from k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:454
I0526 07:42:57.113230       1 reflector.go:236] Listing and watching *v1.Pod from k8s.io/kubernetes/plugin/pkg/scheduler/factory/factory.go:454
I0526 07:42:57.113581       1 reflector.go:198] Starting reflector *v1.PersistentVolume (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.113638       1 reflector.go:236] Listing and watching *v1.PersistentVolume from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.114138       1 reflector.go:198] Starting reflector *v1.Node (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.114311       1 reflector.go:236] Listing and watching *v1.Node from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.115015       1 reflector.go:198] Starting reflector *v1.PersistentVolumeClaim (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.115343       1 reflector.go:236] Listing and watching *v1.PersistentVolumeClaim from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.116031       1 reflector.go:198] Starting reflector *v1.ReplicationController (0s) from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.116071       1 reflector.go:236] Listing and watching *v1.ReplicationController from k8s.io/kubernetes/pkg/client/informers/informers_generated/externalversions/factory.go:70
I0526 07:42:57.122205       1 leaderelection.go:189] successfully acquired lease kube-system/kube-scheduler
I0526 07:42:57.122482       1 factory.go:461] About to try and schedule pod privateimage-test
I0526 07:42:57.122530       1 scheduler.go:156] Attempting to schedule pod: default/privateimage-test
I0526 07:42:57.122569       1 factory.go:492] Ignoring node alien1 since it is unschedulable
I0526 07:42:57.122527       1 event.go:217] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"kube-scheduler", UID:"404f58d4-40d6-11e7-8c95-005056b70482", APIVersion:"v1", ResourceVersion:"4732579", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' alien1 became leader
I0526 07:42:57.123942       1 factory.go:584] Attempting to bind privateimage-test to alien8
I0526 07:42:57.128634       1 event.go:217] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"privateimage-test", UID:"eed52c47-41e6-11e7-8c95-005056b70482", APIVersion:"v1", ResourceVersion:"4732574", FieldPath:""}): type: 'Normal' reason: 'Scheduled' Successfully assigned privateimage-test to alien8
I0526 07:42:59.128393       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler
I0526 07:43:01.137046       1 leaderelection.go:204] succesfully renewed lease kube-system/kube-scheduler

Not sure if this helps anybody, but here is the kube-scheduler manifest:

apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-scheduler
    image: quay.io/coreos/hyperkube:v1.6.3_coreos.1
    command:
    - /hyperkube
    - scheduler
    - --master=http://127.0.0.1:8080
    - --leader-elect=true
    - --v=4
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10251
      initialDelaySeconds: 15
      timeoutSeconds: 1

@wojtek-t
Member

@pizzarabe - thanks a lot for those logs.

Looking into the ones from the dead scheduler, they look extremely strange to me.
It seems that the pod that caused problems was the first pod we even tried to schedule (since the scheduler started). Also, the fact that we are removing a pod from the cache before it was even scheduled seems extremely strange:

I0526 07:42:56.335172       1 factory.go:461] About to try and schedule pod privateimage-test
E0526 07:42:56.335185       1 cache.go:290] Pod default/privateimage-test removed from a different node than previously added to.
I0526 07:42:56.335194       1 scheduler.go:156] Attempting to schedule pod: default/privateimage-test
F0526 07:42:56.335199       1 cache.go:291] Schedulercache is corrupted and can badly affect scheduling decisions

@wojtek-t
Member

Or maybe it's because of flushing? The logs from cache.go are Error/Fatal, so they are flushed immediately; the others are Info, so they are not. So maybe the order is a red herring?

@pizzarabe
Author

pizzarabe commented May 26, 2017

@wojtek-t

It seems that the pod that caused problems was the first pod we even tried to schedule (since the scheduler started).

Yeah, this is my test cluster; I had reproduced the error before that. You can see the scheduler was started ~1 min before I killed it again with kubectl create -f my.yaml.

Also, the fact that we are removing a pod from the cache before it was even scheduled seems extremely strange

I removed the pod with kubectl delete pod default/privateimage-test before creating it again with kubectl create -f.
Edit:
E0526 07:42:56.335185 1 cache.go:290] Pod default/privateimage-test removed from a different node than previously added to.

This was not caused by kubectl delete pod (the timestamp does not match the delete command).

@k82cn
Member

k82cn commented May 26, 2017

It seems both the "assigned non-terminated" ListerWatcher and the "un-assigned non-terminated" ListerWatcher got Pod default/privateimage-test. @wojtek-t, do you think the apiserver's log will help to see what happened to this Pod?

@tinselspoon

I noticed you said your etcd version was v3.0.10 - could you be running into #45506, where the etcd version is too old to work correctly with Kubernetes? I had that issue occur on my cluster, and it caused all sorts of strange behaviours, as pod deletions were not correctly observed by the Kubernetes master components since they weren't getting the DELETED events.

If it is this issue, you should see lines in your apiserver logs like expected type *api.Pod, but watch event object had type <nil>. The solution was to update the version of etcd3.
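If it helps, a quick way to confirm which etcd server version the cluster is actually running is etcd's plain HTTP GET /version endpoint on the client port. A minimal sketch follows; the endpoint address is an assumption (localhost on the default client port 2379) and should be adjusted to the cluster's actual etcd client URL.

package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Query the etcd /version endpoint; adjust the address for your cluster.
	resp, err := http.Get("http://127.0.0.1:2379/version")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	// Expected output is JSON like {"etcdserver":"3.0.10","etcdcluster":"3.0.0"};
	// per the discussion above, anything older than 3.0.12 is suspect.
	fmt.Println(string(body))
}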

@bsalamat
Member

@pizzarabe Are you saying that you know the reproduction steps? If so, it would help a lot if you could give us the steps so that we can investigate more and see if the issue still exists in the presence of recent fixes.

@pizzarabe
Author

@tinselspoon Yes, my kube-apiserver log says things like reflector.go:363] k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/storage/cacher.go:215: expected type *api.Pod, but watch event object had type <nil>
I run etcd 3.0.10 because it is the current version in CoreOS stable :/

@bsalamat I can reproduce the scheduler dying (by creating a pod with the manifest I mentioned before).
I don't know what I did to cause the bug in the first place. I could reinstall the whole cluster (incl. etcd), but I am not sure I could get the server into the same state :(

@wojtek-t
Member

@pizzarabe - if you are using etcd version 3.0.10, this is definitely possible. You should upgrade to at least 3.0.12 (preferably to 3.0.17).
With 3.0.10, a bunch of different things may not work in your cluster (not only the scheduler).

We should wait to see if this is also reproducible with etcd 3.0.17. I suspect it's not, and that this is an etcd issue we are aware of (one that was fixed in later versions).

@pizzarabe
Author

@wojtek-t upgrading to 3.0.17 or 3.0.12 seems to fix the problem; after downgrading to 3.0.10 it occurs again.
So I guess this is a duplicate of #45506. I will close this issue, feel free to re-open it :)
