
Kube scheduler panic #544

Closed
machine424 opened this issue Jul 31, 2022 · 8 comments
Labels: bug, stale

@machine424

Hello,

Describe the bug
When trying to schedule a Pod (for the first time) with a high priority class (https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/) while no node has enough free disk for it, the kube-scheduler crashes.

Environments

To Reproduce

  • kubectl apply the following manifest:
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx"
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 10
      priorityClassName: HIGH-PRIORITY-CLASS
      nodeSelector:
        node-role.kubernetes.io/topolvm: 'true'
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "topolvm-provisioner"
      resources:
        requests:
          storage: 30000Gi # make sure no node can fulfil this

  • The kube-scheduler then panics with the following trace:

E0731 12:32:37.742749       1 runtime.go:78] Observed a panic: runtime.boundsError{x:0, y:0, signed:true, code:0x0} (runtime error: index out of range [0] with length 0)
goroutine 313527 [running]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1e2e2a0, 0xc0117e75d8)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x95
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x86
panic(0x1e2e2a0, 0xc0117e75d8)
	/usr/local/go/src/runtime/panic.go:965 +0x1b9
k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption.pickOneNodeForPreemption(0xc0075f1bf0, 0x6, 0x8)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go:495 +0xd55
k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption.SelectCandidate(0xc01397ff00, 0x6, 0x8, 0xc00de14000, 0x2218e88)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go:435 +0x89
k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption.(*DefaultPreemption).preempt(0xc00053b1a0, 0x2218c90, 0xc014a377c0, 0xc009a3ab70, 0xc00de14000, 0xc012275d10, 0xc0070b9820, 0x40d84c, 0x7fd7d0f37ff0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go:163 +0x6eb
k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption.(*DefaultPreemption).PostFilter(0xc00053b1a0, 0x2218c90, 0xc014a377c0, 0xc009a3ab70, 0xc00de14000, 0xc012275d10, 0x0, 0x0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go:95 +0xad
k8s.io/kubernetes/pkg/scheduler/framework/runtime.(*frameworkImpl).runPostFilterPlugin(0xc00059bc00, 0x2218c90, 0xc014a377c0, 0x7fd7d1ddcee0, 0xc00053b1a0, 0xc009a3ab70, 0xc00de14000, 0xc012275d10, 0xc0137525e8, 0xc0070b9928)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/runtime/framework.go:632 +0x87
k8s.io/kubernetes/pkg/scheduler/framework/runtime.(*frameworkImpl).RunPostFilterPlugins(0xc00059bc00, 0x2218c90, 0xc014a377c0, 0xc009a3ab70, 0xc00de14000, 0xc012275d10, 0x0, 0x0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/runtime/framework.go:617 +0x1e5
k8s.io/kubernetes/pkg/scheduler.(*Scheduler).scheduleOne(0xc00039a120, 0x2218c90, 0xc014a377c0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/scheduler.go:479 +0x842
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x37
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0086bff20)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0070b9f20, 0x21d8140, 0xc009a3a5a0, 0xc014a37701, 0xc003fe8000)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0086bff20, 0x0, 0x0, 0x19dc401, 0xc003fe8000)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x2218c90, 0xc014a377c0, 0xc0086bff80, 0x0, 0x0, 0xc0086bff01)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0xa6
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.UntilWithContext(...)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99
k8s.io/kubernetes/pkg/scheduler.(*Scheduler).Run(0xc00039a120, 0x2218c90, 0xc014a377c0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/scheduler.go:316 +0x92
k8s.io/kubernetes/cmd/kube-scheduler/app.Run.func2(0x2218c90, 0xc014a377c0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:202 +0x55
created by k8s.io/kubernetes/vendor/k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:207 +0x11b
panic: runtime error: index out of range [0] with length 0 [recovered]
	panic: runtime error: index out of range [0] with length 0

goroutine 313527 [running]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x109
panic(0x1e2e2a0, 0xc0117e75d8)
	/usr/local/go/src/runtime/panic.go:965 +0x1b9
k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption.pickOneNodeForPreemption(0xc0075f1bf0, 0x6, 0x8)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go:495 +0xd55
k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption.SelectCandidate(0xc01397ff00, 0x6, 0x8, 0xc00de14000, 0x2218e88)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go:435 +0x89
k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption.(*DefaultPreemption).preempt(0xc00053b1a0, 0x2218c90, 0xc014a377c0, 0xc009a3ab70, 0xc00de14000, 0xc012275d10, 0xc0070b9820, 0x40d84c, 0x7fd7d0f37ff0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go:163 +0x6eb
k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption.(*DefaultPreemption).PostFilter(0xc00053b1a0, 0x2218c90, 0xc014a377c0, 0xc009a3ab70, 0xc00de14000, 0xc012275d10, 0x0, 0x0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go:95 +0xad
k8s.io/kubernetes/pkg/scheduler/framework/runtime.(*frameworkImpl).runPostFilterPlugin(0xc00059bc00, 0x2218c90, 0xc014a377c0, 0x7fd7d1ddcee0, 0xc00053b1a0, 0xc009a3ab70, 0xc00de14000, 0xc012275d10, 0xc0137525e8, 0xc0070b9928)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/runtime/framework.go:632 +0x87
k8s.io/kubernetes/pkg/scheduler/framework/runtime.(*frameworkImpl).RunPostFilterPlugins(0xc00059bc00, 0x2218c90, 0xc014a377c0, 0xc009a3ab70, 0xc00de14000, 0xc012275d10, 0x0, 0x0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/framework/runtime/framework.go:617 +0x1e5
k8s.io/kubernetes/pkg/scheduler.(*Scheduler).scheduleOne(0xc00039a120, 0x2218c90, 0xc014a377c0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/scheduler.go:479 +0x842
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0x37
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc0086bff20)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0070b9f20, 0x21d8140, 0xc009a3a5a0, 0xc014a37701, 0xc003fe8000)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0086bff20, 0x0, 0x0, 0x19dc401, 0xc003fe8000)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x2218c90, 0xc014a377c0, 0xc0086bff80, 0x0, 0x0, 0xc0086bff01)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185 +0xa6
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.UntilWithContext(...)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99
k8s.io/kubernetes/pkg/scheduler.(*Scheduler).Run(0xc00039a120, 0x2218c90, 0xc014a377c0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/scheduler/scheduler.go:316 +0x92
k8s.io/kubernetes/cmd/kube-scheduler/app.Run.func2(0x2218c90, 0xc014a377c0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:202 +0x55
created by k8s.io/kubernetes/vendor/k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:207 +0x11b

Expected behavior

The scheduler extender should not crash the kube-scheduler; this looks the same as kubernetes/kubernetes#101548.

The kube-scheduler can protect itself against this, but only starting from v1.22: kubernetes/kubernetes#101560.

The scheduler should not evict any pod, as that would not free up disk.

The pod should (and does) stay in the Pending state, because no node can fulfil its requests.


I didn't test the case where a node has enough free disk but not enough CPU/RAM, so I don't know how the preemption code reacts in that situation.

machine424 added the bug label on Jul 31, 2022
@llamerada-jp
Contributor

llamerada-jp commented Aug 1, 2022

Thank you for your report. I tried to reproduce it as shown below, but I couldn't.

  • Could you please tell us how to reproduce it using TopoLVM's e2e environment?
  • Even if we can reproduce it, the root cause may be in the kube-scheduler, so there may not be much we can do beyond alerting users (e.g. via Slack).
$ git checkout v0.11.1
$ make -C e2e setup
$ KUBERNETES_VERSION="1.21.2" make -C e2e start-lvmd
$ KUBERNETES_VERSION="1.21.2" make -C e2e test

I applied the manifest below:

apiVersion: v1
kind: Pod
metadata:
  name: huge-pod
  labels:
    app.kubernetes.io/name: ubuntu
spec:
  containers:
    - name: ubuntu
      image: quay.io/cybozu/ubuntu:20.04
      command: ["/usr/local/bin/pause"]
      volumeMounts:
        - mountPath: /test1
          name: my-volume
  volumes:
    - name: my-volume
      persistentVolumeClaim:
        claimName: huge-pvc
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: huge-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30000Gi
  storageClassName: topolvm-provisioner

I got the following result, but the issue was not reproduced:

$ kubectl get pod
NAME       READY   STATUS    RESTARTS   AGE
huge-pod   0/1     Pending   0          6m3s

$ kubectl  get pod -n kube-system
NAME                                                READY   STATUS    RESTARTS   AGE
...
kube-scheduler-topolvm-e2e-control-plane            1/1     Running   0          38m # looks healthy

@machine424
Author

Hello @llamerada-jp,

Thanks for your quick reply.

I should have emphasized it more: the panic occurs when the Pod to be scheduled has a priorityClassName (a high-priority one, see my manifest), since the panic is apparently triggered in the preemption code.

Your Pod manifest doesn’t specify a priority class.

@machine424
Author

Hello @llamerada-jp,

You can use the built-in priority class system-cluster-critical for huge-pod.
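
For reference, the change to the huge-pod manifest from your earlier comment would look something like this (only the priorityClassName line is new; everything else is unchanged):

apiVersion: v1
kind: Pod
metadata:
  name: huge-pod
  labels:
    app.kubernetes.io/name: ubuntu
spec:
  priorityClassName: system-cluster-critical  # high built-in priority class, so preemption is attempted
  containers:
    - name: ubuntu
      image: quay.io/cybozu/ubuntu:20.04
      command: ["/usr/local/bin/pause"]
      volumeMounts:
        - mountPath: /test1
          name: my-volume
  volumes:
    - name: my-volume
      persistentVolumeClaim:
        claimName: huge-pvc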

@llamerada-jp
Contributor

@machine424
I still can't reproduce it. According to your first log, I guess some preemption did occur. Are there any other pods that could be preempted? Could you reproduce this issue using another CSI driver such as local-path-provisioner?

@machine424
Author

machine424 commented Aug 29, 2022

Hello @llamerada-jp

Yes, as I've already mentioned, it seems to be related to preemption. I reproduce this with:

  • a node N with an LVM disk of size S
  • node N hosting one or more Pods that use S1 of the disk; these Pods should have a low priority class (the default one, which is 0 for me)
  • trying to deploy a Pod with the priority class system-cluster-critical that needs more than S - S1 of disk on node N (a sketch follows below)

The preemption code then runs and fails because of the topolvm-scheduler input.

Hope this helps.
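
To make the second bullet concrete, a low-priority Pod occupying part of the node's capacity could look like this (a minimal sketch; the names, image, and the 10Gi size are placeholders, so adjust S1 to your node's capacity):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: low-priority-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi          # S1: part of node N's LVM capacity
  storageClassName: topolvm-provisioner
---
apiVersion: v1
kind: Pod
metadata:
  name: low-priority-pod
spec:
  # no priorityClassName, so the pod gets the default priority (0)
  containers:
    - name: ubuntu
      image: quay.io/cybozu/ubuntu:20.04
      command: ["/usr/local/bin/pause"]
      volumeMounts:
        - mountPath: /test1
          name: my-volume
  volumes:
    - name: my-volume
      persistentVolumeClaim:
        claimName: low-priority-pvc

Once this Pod is running, applying the huge-pod manifest with priorityClassName: system-cluster-critical and a storage request larger than the remaining capacity should exercise the preemption path.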

@llamerada-jp
Contributor

@machine424
I have not been able to reproduce the problem, but here is my thinking. Pods using PVs created by TopoLVM should not be subject to preemption, because the underlying LVM volume cannot be moved. To ensure this, I suggest using a priority class as described in the document below:
https://github.com/topolvm/topolvm/blob/main/docs/user-manual.md#pod-priority

Also, existing pods using TopoLVM and new pods should preferably use the same priority class; if the priority class of a new pod is higher, existing pods using TopoLVM may become candidates for preemption. As for the kube-scheduler panic, I understand that it is a kube-scheduler bug, so the fix is to upgrade to a patched Kubernetes. If you run into this problem before upgrading, you may be able to work around it by removing the pod manually so that scheduling can proceed.

I think it is better to alert via Slack or document this as a known issue. What do you think?
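
For example, the approach from the linked document could look roughly like this (a minimal sketch; the name topolvm and the value shown are illustrative assumptions, not taken from this thread, so please follow the user manual for the recommended settings):

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: topolvm              # illustrative name
value: 1000000               # illustrative value; pick one at least as high as any preempting workload
globalDefault: false
description: "Priority class for pods that use TopoLVM volumes"
---
# Every pod (or pod template) that mounts a TopoLVM PV would then set:
#   spec:
#     priorityClassName: topolvm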

@github-actions
Contributor

github-actions bot commented Oct 5, 2022

This issue has been automatically marked as stale because it has not had any activity for 30 days. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions bot added the stale label on Oct 5, 2022
@github-actions
Contributor

This issue has been automatically closed due to inactivity. Please feel free to reopen this issue (or open a new one) if this still requires investigation. Thank you for your contribution.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Oct 13, 2022