Is this a bug report or feature request?
Bug report

Deviation from expected behavior:
Keeps looping with message:
op-osd: waiting... 2 of 3 OSD prepare jobs have finished processing and 3 of 3 OSDs have been updated
Expected behavior:
Should finish reconciliation.
How to reproduce it (minimal and precise):
It occurred when upgrading a v1.6.7-based chart to the same v1.6.7-based chart, which restarted the operator.
We tried restarting the operator and deleting the configmap rook-ceph-osd-nodename-3-status; what finally fixed it was manually deleting the prepare job for nodename-3 and then restarting the operator, after which it went to "reconciliation complete".
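For reference, a sketch of the workaround as we applied it, assuming the operator deployment is named rook-ceph-operator (the cluster runs in kube-system, per the outputs below):

$ kubectl delete job rook-ceph-osd-prepare-nodename-3 -n kube-system
$ kubectl rollout restart deployment rook-ceph-operator -n kube-system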
Deleting the status configmap and OSD prepare job is the right thing to do when the operator gets stuck like this. It has been a rare issue to track down, so do let us know if you see it again.
Thanks for the logs. Do you still have the OSD status configmaps in that state, or did you already work around the issue and delete them? I can probably just work from the logs you provided if you don't have the configmaps.
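For anyone else who hits this, a quick generic check of which prepare job the operator still counts as unfinished, assuming the rook-ceph-osd-prepare-<node> job naming seen in this cluster:

$ kubectl get jobs -n kube-system | grep rook-ceph-osd-prepare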
File(s) to submit:
$ kubectl get cm rook-ceph-osd-nodename-3-status -n kube-system -o yaml --> https://pastebin.com/LbxsVTFJ
$ kubectl get cephclusters.ceph.rook.io -A -o yaml --> https://pastebin.com/evNz5EFT
$ kubectl get pods -n kube-system --> https://pastebin.com/An64qYhU
$ kubectl logs rook-ceph-operator-5d684ccc5c-jlngj -n kube-system --> https://pastebin.com/YHwzTspv
$ kubectl logs rook-ceph-osd-prepare-nodename-3-4tq6w -n kube-system --> https://pastebin.com/fnAfJWBa
$ kubectl get job -n kube-system rook-ceph-osd-prepare-nodename-3 -o yaml --> https://pastebin.com/4jsx6vwy
Environment:
- Kernel (uname -a): 3.10.0-1160.36.2.el7.x86_64
- Rook version (rook version inside of a Rook Pod): v1.6.7
- Storage backend version (ceph -v): 16.2.2 (e8f22dde28889481f4dda2beb8a07788204821d3)
- Kubernetes version (kubectl version): GitVersion:"v1.21.1"
- Storage backend status (ceph health in the Rook Ceph toolbox): HEALTH_WARN noout flag(s) set
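Side note: the noout flag in the health output is typically set deliberately during upgrades or maintenance; once the cluster has settled, it can be cleared from the toolbox with the standard Ceph command (not specific to this issue):

$ ceph osd unset noout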