There are 3 nodes in my k8s cluster, running Hwameistor v0.14.1:
root@pw-k8s01:~# kubectl get node
NAME STATUS ROLES AGE VERSION
pw-k8s01 Ready control-plane,controlplane,etcd,master 58d v1.28.3+rke2r2
pw-k8s02 Ready control-plane,controlplane,etcd,master 56d v1.28.3+rke2r2
pw-k8s03 Ready control-plane,controlplane,etcd,master 56d v1.28.3+rke2r2
Every node has 20G of LVM capacity. Then I apply a test yaml file (it contains 4 pods and 4 PVCs, each PVC requesting 6G of storage):
root@pw-k8s01:~/pangwei/yaml# kubectl apply -f local-pvc-test.yaml
persistentvolumeclaim/pw-pvc1 created
persistentvolumeclaim/pw-pvc2 created
persistentvolumeclaim/pw-pvc3 created
persistentvolumeclaim/pw-pvc4 created
pod/pw-pod-1 created
pod/pw-pod-2 created
pod/pw-pod-3 created
pod/pw-pod-4 created
You can see:
root@pw-k8s01:~/pangwei/yaml# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pw-pod-1 1/1 Running 0 6h15m 100.65.76.184 pw-k8s03 <none> <none>
pw-pod-2 1/1 Running 0 6h15m 100.65.76.185 pw-k8s03 <none> <none>
pw-pod-3 1/1 Running 0 6h15m 100.65.76.183 pw-k8s03 <none> <none>
pw-pod-4 0/1 Pending 0 6h15m <none> <none> <none> <none>
You can find the error log:
time="2024-03-19T07:03:49Z" level=debug msg="Filtered out the node" error="can't schedule the LVM volume to node pw-k8s03" node=pw-k8s03 pod=pw-pod-4
I0319 07:03:49.095941 1 scheduler.go:351] "Unable to schedule pod; no fit; waiting" pod="default/pw-pod-4" err="0/3 nodes are available: 1 can't schedule the LVM volume to node pw-k8s03, 2 node(s) didn't find available persistent volumes to bind. preemption: 0/3 nodes are available: 1 No preemption victims found for incoming pod, 2 Preemption is not helpful for scheduling."
PVC pw-pvc4 looks like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"pw-pvc4","namespace":"default"},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"6Gi"}},"storageClassName":"hwameistor-storage-lvm-hdd","volumeMode":"Block"}}
    volume.beta.kubernetes.io/storage-provisioner: lvm.hwameistor.io
    # As you can see, the pvc has been scheduled to pw-k8s03
    volume.kubernetes.io/selected-node: pw-k8s03
    volume.kubernetes.io/storage-provisioner: lvm.hwameistor.io
  creationTimestamp: "2024-03-19T03:22:17Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: pw-pvc4
  namespace: default
  resourceVersion: "56041423"
  uid: eea7fed7-fff1-4304-b4ce-fa1b06e4c942
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 6Gi
  storageClassName: hwameistor-storage-lvm-hdd
  volumeMode: Block
status:
  phase: Pending
Pod pw-pod-4 and PVC pw-pvc4 stay Pending even though there is enough capacity on pw-k8s01/pw-k8s02. (pw-k8s03 already holds 3 × 6G = 18G of its 20G, so pw-pvc4's 6G cannot fit there, while the other two nodes are empty.)
I've studied the hwameistor scheduler source code. In my opinion, there are some problems in the hwameistor scheduler:
Lack of a reservation mechanism. Currently there is a window between an LV being scheduled to a node and the LV actually being created and recorded in the lsn, so resource accounting lags behind scheduling. If a batch of pods is created inside that window, two things go wrong: (1) nodes that no longer have enough free capacity still pass the scheduler's Filter function, and (2) there is no meaningful difference between node scores. Together these push all the pods onto the same node, where creation then fails for lack of resources. (A sketch of a fix follows this paragraph.)
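To illustrate the first point, here is a minimal sketch of such a reservation table in Go. Everything here (ReservationTable, CanSchedule, lsnFreeBytes, and so on) is hypothetical, not Hwameistor's actual API; the idea is only that the Filter step should subtract not-yet-recorded reservations from the free capacity the lsn reports:

package scheduler

import "sync"

// ReservationTable tracks capacity that has been promised to scheduled
// volumes but is not yet recorded in the lsn status.
type ReservationTable struct {
	mu       sync.Mutex
	reserved map[string]int64 // node name -> bytes reserved, pending lsn update
}

func NewReservationTable() *ReservationTable {
	return &ReservationTable{reserved: make(map[string]int64)}
}

// CanSchedule is what the Filter function would consult: the lsn's reported
// free capacity minus pending reservations must still cover the request.
func (t *ReservationTable) CanSchedule(node string, lsnFreeBytes, requestBytes int64) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return lsnFreeBytes-t.reserved[node] >= requestBytes
}

// Reserve records a pending allocation for the window between scheduling
// and the LV actually being created and accounted for.
func (t *ReservationTable) Reserve(node string, bytes int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.reserved[node] += bytes
}

// Release drops the reservation once the lsn reflects the allocation,
// or once the volume creation fails.
func (t *ReservationTable) Release(node string, bytes int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.reserved[node] -= bytes; t.reserved[node] <= 0 {
		delete(t.reserved, node)
	}
}

The key point is that Reserve has to run synchronously in the scheduling cycle, so that concurrent pods scheduled in the same window see each other's pending allocations instead of all passing Filter against the same stale lsn numbers.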
Lack of a reschedule mechanism. After an LV is scheduled to a node and creation fails there for lack of resources, nothing moves it elsewhere. The CSI interface in hwameistor should be implemented correctly so that the PVC can be rescheduled; see the sketch after this paragraph.
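To illustrate the second point, a minimal sketch (assuming the standard CSI Go bindings; the Driver type and the freeCapacityFor helper are hypothetical stand-ins) of a CreateVolume handler that reports capacity exhaustion with codes.ResourceExhausted. The CSI spec documents RESOURCE_EXHAUSTED on CreateVolume as "unable to provision in the requested topology", which allows the external-provisioner to clear the volume.kubernetes.io/selected-node annotation so the pod and PVC go back through scheduling:

package driver

import (
	"context"

	csi "github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// Driver is a stand-in for the plugin's controller server.
type Driver struct{}

// freeCapacityFor is a hypothetical helper returning the free VG capacity
// on the node named by the topology requirement.
func freeCapacityFor(req *csi.TopologyRequirement) int64 {
	// ... look up the lsn record for the preferred topology ...
	return 0
}

func (d *Driver) CreateVolume(ctx context.Context, req *csi.CreateVolumeRequest) (*csi.CreateVolumeResponse, error) {
	need := req.GetCapacityRange().GetRequiredBytes()
	if free := freeCapacityFor(req.GetAccessibilityRequirements()); free < need {
		// ResourceExhausted tells the CO this topology cannot hold the
		// volume; rather than retrying the same node forever, the
		// provisioner can drop the selected-node pin and reschedule.
		return nil, status.Errorf(codes.ResourceExhausted,
			"insufficient LVM capacity on selected node: need %d bytes, free %d bytes", need, free)
	}
	// ... normal path: create the LogicalVolume and wait for it ...
	return &csi.CreateVolumeResponse{}, nil
}

Without something like this, pw-pvc4 stays pinned to pw-k8s03 by its selected-node annotation, which is exactly why the log above shows the other two nodes failing with "didn't find available persistent volumes to bind".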