[rook-ceph] cannot connect OSDs after disaster-recovery #7557
-
https://github.com/rook/rook/blob/master/Documentation/ceph-disaster-recovery.md

I am having a very hard day because of this. I have been trying to recover an old rook-ceph cluster from a new cluster. I thought it would be easy because I had prepared properly beforehand, but it has caused me a lot of pain. I am a beginner with Ceph, so here are some clues to go on. My cluster consists of 3 nodes with 11 OSDs per node. `ceph -s`:
I have run the disaster recovery more than 3 times, but every attempt ends with the same result: a SLOW_OPS warning on the mon. In fact, nothing is actually in progress; the mon's uptime and the elapsed time of the SLOW_OPS are exactly the same. In addition, the IP in the kubernetes logs of the osd.1 pod:
It seems that the mon and the OSDs are not connected, and this seems related to the fact that the pods' actual IPs and the IPs recorded in the cluster are not the same. How can I solve this?
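One way to confirm that suspicion is to compare the address each OSD registered with the mons against the pod IP Kubernetes currently assigns it. A hedged sketch (the `ceph`/`kubectl` commands are standard rook-ceph tooling, but the label selector and the exact `osd dump` line format may differ per Rook/Ceph version; the sample line below is illustrative):

```shell
# Address the mons believe osd.1 lives at (run inside the rook-ceph-tools pod):
#   ceph osd dump | grep '^osd.1 '
# Pod IP as Kubernetes currently sees it (label name per Rook defaults):
#   kubectl -n rook-ceph get pod -l ceph-osd-id=1 -o wide
#
# A sample `ceph osd dump` line, to illustrate pulling the registered IP out:
sample='osd.1 up in weight 1 up_from 100 up_thru 120 down_at 99 last_clean_interval [90,98) 10.244.1.15:6801/12345 10.244.1.15:6802/12345 exists,up abcdef'
echo "$sample" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n1   # -> 10.244.1.15
```

If the two addresses disagree, that matches the symptom described above: the OSD is registered under a stale address and the mons cannot reach it.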
Replies: 2 comments 3 replies
-
`osd: 33 osds: 33 up` -> After starting, the OSDs show as `up` for a while, but soon change to `down`.
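An OSD that briefly reports `up` and then flips to `down` is typically one that managed to register with the mons but then stopped heartbeating, which fits an address mismatch. A small sketch for watching the up count (the `ceph osd stat` one-line format is an assumption and may vary slightly between Ceph releases):

```shell
# Poll the summary inside the rook-ceph-tools pod:
#   watch -n 5 'ceph osd stat'
# The up count can be extracted from the one-line summary, e.g. from this sample:
stat='33 osds: 33 up (since 2m), 33 in (since 5m); epoch: e123'
echo "$stat" | sed -E 's/.*: ([0-9]+) up.*/\1/'   # -> 33
```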
-
It's good to see that the mon is healthy. The OSD should see the pod IP and report it to the mons when it starts, so I'm not sure why the OSDs would be stuck with the old IP address.

Which section of the disaster recovery guide did you go through? This one? This is certainly a difficult process, unfortunately.
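One step of that guide worth double-checking is the mon endpoints: after recovery, the address the daemons use to reach the mons comes from the `rook-ceph-mon-endpoints` ConfigMap, and a stale entry there would match these symptoms. A hedged sketch (ConfigMap name, `data` key, and toolbox deployment name are Rook defaults; the sample value is illustrative):

```shell
# What the Rook operator hands to the daemons:
#   kubectl -n rook-ceph get configmap rook-ceph-mon-endpoints -o jsonpath='{.data.data}'
# What the mons themselves advertise:
#   kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph mon dump
#
# The ConfigMap's data key looks like "a=IP:PORT,b=IP:PORT,..."; the mon IPs
# can be listed from a sample value like so:
eps='a=10.96.12.34:6789,b=10.96.56.78:6789,c=10.96.90.12:6789'
echo "$eps" | tr ',' '\n' | cut -d= -f2 | cut -d: -f1
```

If the two listings disagree, updating the ConfigMap (and restarting the daemons) so they converge is the direction the disaster-recovery guide's monmap steps point at.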