WIP: Document for cluster recovery when running on PVCs #4838
Conversation
The PVCs behind OSDs are not easily identifiable. This adds a label to the OSDs in order to query on app=rook-ceph-osd. Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
When the cluster CR is deleted, all the resources are also deleted. To prevent accidental removal of critical data, we don't want to remove PVCs behind MONs or OSDs automatically. The PVCs behind OSDs already do not have owner references for this reason. Now this change removes the owner reference from the MONs as well. Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
When catastrophe strikes and an entire Kubernetes cluster is destroyed, it is still possible to restore Rook in a new Kubernetes cluster as long as the PVs underneath the MONs and OSDs are still available. This guide walks through the restoration of a cluster. Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
## Scenario

1. The Kubernetes environment underlying a running Rook Ceph cluster failed catastrophically, requiring a new Kubernetes environment in which the user wishes to recover the previous Rook Ceph cluster.
2. The underlying PVs with the Ceph data (OSDs) and metadata (MONs) are still available in the cloud environment.
Suggested change:
- 2. The underlying PVs with the Ceph data (OSDs) and metadata (MONs) are still available in the cloud environment.
+ 2. The underlying PVs with the Ceph data (OSDs) and metadata (Monitors) are still available in the cloud environment.
### Exporting Critical Info

Critical keys and info about the mons must be exported from the original cluster. This info is not stored on the PVs by either the mons or osds. This info is necessary to restore the cluster in case of disaster.
Suggested change:
- or osds. This info is necessary to restore the cluster in case of disaster.
+ or OSDs. This info is necessary to restore the cluster in case of disaster.
kubectl -n ${namespace} get cm rook-ceph-mon-endpoints -o yaml > critical/rook-ceph-mon-endpoints.yaml
kubectl -n ${namespace} get svc -l app=rook-ceph-mon -o yaml > critical/rook-ceph-mon-svc.yaml
# information about PVCs and PVs to help reconstruct them later
# TODO: Can we just export these as yamls and import them again directly? At a minimum we would need to filter the PV list since more than Rook PVs would be included.
This would mean using `--export`, if the user's kubectl still supports it.
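Regarding the TODO about filtering the PV list: since PVs are cluster-scoped, an export would pick up volumes unrelated to Rook. A minimal sketch of that filtering step, assuming PV JSON as produced by `kubectl get pv -o json` (the function name and the sample data below are illustrative, not part of Rook):

```python
import json

def rook_pvs(pv_list_json, namespace="rook-ceph"):
    """Filter a PersistentVolume list down to volumes whose claims
    live in the Rook namespace (PVs are cluster-scoped, so the raw
    list contains more than just Rook volumes)."""
    pvs = json.loads(pv_list_json)["items"]
    return [
        pv for pv in pvs
        if pv.get("spec", {}).get("claimRef", {}).get("namespace") == namespace
    ]

# Fabricated sample standing in for `kubectl get pv -o json` output
sample = json.dumps({"items": [
    {"metadata": {"name": "pv-a"}, "spec": {"claimRef": {"namespace": "rook-ceph"}}},
    {"metadata": {"name": "pv-b"}, "spec": {"claimRef": {"namespace": "default"}}},
]})
print([pv["metadata"]["name"] for pv in rook_pvs(sample)])  # ['pv-a']
```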
1. Start the new Kubernetes cluster

2. Modify the critical resources before creating them
Fields to trim:
- `creationTimestamp`
- `namespace` (if different)
- `resourceVersion`
- `uid`
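A minimal sketch of trimming those fields from an exported manifest, using `grep` on top-level metadata lines; the function name is illustrative, and a YAML-aware tool such as `yq` would handle nested fields more robustly. `namespace` is deliberately left alone here since it should only be changed when the target namespace differs:

```shell
# Strip cluster-specific metadata fields before re-creating a resource
# in the new cluster. Assumes the fields appear as indented top-level
# metadata keys, as in typical `kubectl get -o yaml` output.
trim_metadata() {
  grep -vE '^[[:space:]]+(creationTimestamp|resourceVersion|uid|selfLink):'
}

printf 'metadata:\n  name: rook-ceph-mon-endpoints\n  uid: 1234\n  resourceVersion: "99"\n' | trim_metadata
```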
<TODO: Commands to create the PVs>

<TODO: How do we know which PVs belonged to the MONs or OSDs? The volumes just have random names. Do we need to rely on the PV size to indicate
yes!
On a running cluster, you might see these PVs:

```console
$ oc get pvc -l ceph.rook.io/DeviceSet=set1
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
```
The PVC name is `set1-data-0-w4bgt`; that's the new format now.
<TODO: Commands to bind the PVCs to the PVs>

7. Create PVCs for the OSD volumes.
   - The PVCs must follow the Rook naming convention `<device-set-name>-<index>-<type>-<suffix>` where
The PVC name is `set1-data-0-w4bgt`; that's the new format now.
So maybe we should present both.
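To make the "present both" idea concrete, here is a sketch that classifies a PVC name against the two conventions discussed: the older `<device-set-name>-<index>-<type>-<suffix>` form and the newer `<device-set-name>-<type>-<index>-<suffix>` form seen in `set1-data-0-w4bgt`. The regexes and character classes are illustrative assumptions, not taken from Rook source:

```python
import re

# Illustrative patterns for the two OSD PVC naming formats; the exact
# character classes and type names are assumptions for this sketch.
OLD_FORMAT = re.compile(
    r"^(?P<set>[a-z0-9-]+?)-(?P<index>\d+)-(?P<type>data|metadata|wal)-(?P<suffix>[a-z0-9]+)$")
NEW_FORMAT = re.compile(
    r"^(?P<set>[a-z0-9-]+?)-(?P<type>data|metadata|wal)-(?P<index>\d+)-(?P<suffix>[a-z0-9]+)$")

def classify_pvc_name(name):
    """Return which naming convention a PVC name matches, if any."""
    if OLD_FORMAT.match(name):
        return "old"
    if NEW_FORMAT.match(name):
        return "new"
    return None

print(classify_pvc_name("set1-data-0-w4bgt"))  # new
print(classify_pvc_name("set1-0-data-w4bgt"))  # old
```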
incorrect from the services that were imported from the previous cluster. The mon endpoints are part of their identity and cannot change. If they do need to change, see the section above on [restoring mon quorum](#restoring-mon-quorum).

12. Verify that the cluster is working. You should see three MONs, some number of OSDs, and one MGR daemon running.
Suggested change:
- 12. Verify that the cluster is working. You should see three MONs, some number of OSDs, and one MGR daemon running.
+ 12. Verify that the cluster is working. You should see three Monitors, some number of OSDs, and one MGR daemon running.
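As a sanity check for that verification step, the daemon counts can be tallied from the plain-text output of `kubectl -n rook-ceph get pods`. A sketch with fabricated sample output (the pod names below are made up for illustration):

```python
import re

def count_daemons(pods_output):
    """Count mon/osd/mgr pods from `kubectl get pods` text output."""
    counts = {"mon": 0, "osd": 0, "mgr": 0}
    for line in pods_output.splitlines():
        m = re.match(r"rook-ceph-(mon|osd|mgr)-", line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Fabricated sample standing in for real `kubectl -n rook-ceph get pods` output
sample = """\
rook-ceph-mon-a-6f8b9c-abcde   1/1  Running
rook-ceph-mon-b-7d9c0d-fghij   1/1  Running
rook-ceph-mon-c-8e0d1e-klmno   1/1  Running
rook-ceph-osd-0-9f1e2f-pqrst   1/1  Running
rook-ceph-mgr-a-0a2f3a-uvwxy   1/1  Running
"""
print(count_daemons(sample))  # {'mon': 3, 'osd': 1, 'mgr': 1}
```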
@travisn please rebase. |
This pull request has merge conflicts that must be resolved before it can be merged. @travisn please rebase it. https://rook.io/docs/rook/master/development-flow.html#updating-your-fork |
@travisn any updates on this one? |
This still needs testing |
There are too many open questions when assuming that the entire K8s cluster is lost. Closing this in favor of #6452, which requires the backup of the critical resources to be restored later in the new cluster. |
Description of your changes:
When catastrophe strikes and an entire Kubernetes cluster is destroyed, it is still possible to restore Rook in a new Kubernetes cluster as long as the PVs underneath the MONs and OSDs are still available and some critical metadata was backed up before the loss. This guide walks through the restoration of such a cluster.
My testing has not yet included loss of an entire cluster. Thus far it has only been tested on a cluster where the cluster CR was removed and the PVCs and PVs remained intact.
Checklist:
- Code generation (`make codegen`) has been run to update object specifications, if necessary.

[test ceph]