
Monitor reboots after creating StorageClass #11081

Closed
lincate opened this issue Sep 29, 2022 · 4 comments

Comments

@lincate

lincate commented Sep 29, 2022

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:
After creating the StorageClass rook-ceph-block (provisioner: rook-ceph.rbd.csi.ceph.com), the monitors keep rebooting again and again.
The PVC that references rook-ceph-block then stays in Pending status (a minimal example of such a PVC is sketched below).
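
For reference only (not part of the original report), a PVC of the kind described above would look roughly like this; the name and size are hypothetical, and only storageClassName: rook-ceph-block is taken from the report:

```yaml
# Hypothetical PVC referencing the rook-ceph-block StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi         # hypothetical size
  storageClassName: rook-ceph-block
```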

Expected behavior:
The StorageClass is created and the PVC becomes Bound.

How to reproduce it (minimal and precise):
kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster.yaml
After that, the Ceph cluster status is HEALTH_OK.
Then kubectl create -f storageclass.yaml (a sketch of a typical storageclass.yaml is shown below).
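
For context, the storageclass.yaml shipped in the Rook examples is roughly as follows (abbreviated sketch; the reporter's actual file may differ, the pool name and replication settings are illustrative, and the CSI secret parameters are omitted for brevity):

```yaml
# Abbreviated sketch of a typical Rook RBD StorageClass manifest; values are illustrative
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph        # namespace of the Rook operator/cluster
  pool: replicapool
  imageFormat: "2"
  csi.storage.k8s.io/fstype: ext4
  # ...provisioner/node secret parameters omitted for brevity
reclaimPolicy: Delete
```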

File(s) to submit:

  • Cluster CR (custom resource), typically called cluster.yaml, if necessary

Logs to submit:
Info: Checking mon quorum and ceph health details
HEALTH_OK

After creating the StorageClass, one of the monitors logs:
debug 2022-09-29T09:10:21.318+0000 7f937291a700 0 log_channel(cluster) log [DBG] : fsmap
debug 2022-09-29T09:10:21.318+0000 7f937291a700 0 log_channel(cluster) log [DBG] : osdmap e25: 3 total, 3 up, 3 in
debug 2022-09-29T09:10:21.319+0000 7f937291a700 0 log_channel(cluster) log [DBG] : mgrmap e30: no daemons active (since 18s)
debug 2022-09-29T09:10:21.320+0000 7f937291a700 0 log_channel(cluster) log [WRN] : Health check update: 1/3 mons down, quorum a,c (MON_DOWN)
debug 2022-09-29T09:10:59.737+0000 7f9371117700 1 mon.a@0(leader) e3 handle_auth_request failed to assign global_id
debug 2022-09-29T09:11:15.942+0000 7f9370916700 1 mon.a@0(leader) e3 handle_auth_request failed to assign global_id
debug 2022-09-29T09:11:25.176+0000 7f9378125700 -1 received signal: Terminated from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
debug 2022-09-29T09:11:25.176+0000 7f9378125700 -1 mon.a@0(leader) e3 *** Got Signal Terminated ***
debug 2022-09-29T09:11:25.176+0000 7f9378125700 1 mon.a@0(leader) e3 shutdown

  • Operator's logs, if necessary

  • Crashing pod(s) logs, if necessary

    To get logs, use kubectl -n <namespace> logs <pod name>
    When pasting logs, always surround them with backticks or use the insert code button from the GitHub UI.
    Read GitHub documentation if you need help.

Cluster Status to submit:

  • Output of krew commands, if necessary

    To get the health of the cluster, use kubectl rook-ceph health
    To get the status of the cluster, use kubectl rook-ceph ceph status
    For more details, see the Rook Krew Plugin

Environment:

  • OS (e.g. from /etc/os-release): Centos 7
  • Kernel (e.g. uname -a): Linux 4.18.0-147.5.1.6.h579.x86_64
  • Cloud provider or hardware configuration: on-prem
  • Rook version (use rook version inside of a Rook Pod): 1.10.1 & 1.10.0
  • Storage backend version (e.g. for ceph do ceph -v): 17.2.3
  • Kubernetes version (use kubectl version): 1.25.0 & 1.24.2
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): bare metal
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):
@lincate lincate added the bug label Sep 29, 2022
@travisn
Member

travisn commented Sep 29, 2022

Some questions:

  • Is this a default install, or did you modify cluster.yaml? If you added resource limits on the mons, they may need to be increased.
  • The liveness probes may be failing for the mons. You could try disabling the liveness probes to see if that helps (see the sketch after this list for where both settings live in the CephCluster CR).
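
For illustration only (not part of the original comment), the two settings mentioned above sit in the CephCluster CR (cluster.yaml) roughly as follows; the resource numbers are hypothetical:

```yaml
# Hypothetical excerpt of a CephCluster CR; numbers are illustrative
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  # Raise (or remove) mon resource limits if the mons are being throttled or OOM-killed
  resources:
    mon:
      requests:
        cpu: "500m"
        memory: "1Gi"
      limits:
        memory: "2Gi"
  # Disable the mon liveness probe to rule out probe-driven restarts
  healthCheck:
    livenessProbe:
      mon:
        disabled: true
```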

@shangjin92

It may be similar to #10110.

@github-actions

github-actions bot commented Dec 9, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@github-actions

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

@github-actions github-actions bot closed this as not planned (Won't fix, can't repro, duplicate, stale) on Dec 17, 2022