doc: description and workaround for known issue #5841
Signed-off-by: Webber Huang <webber.huang@suse.com>
WebberHuang1118 committed May 23, 2024
1 parent ad44f90 commit bd81131
Showing 2 changed files with 74 additions and 0 deletions.
Binary file added static/img/v1.2/vm/vm_backup_fail.png
74 changes: 74 additions & 0 deletions versioned_docs/version-v1.2/vm/backup-restore.md
@@ -187,3 +187,77 @@ Example:
| `b-c` | `a` | `a-b-c.cfg` |

Harvester v1.3.0 fixes this issue by changing the metadata file path to `<storage-path>/harvester/vmbackups/<vmbackup-namespace>/<vmbackup-name>.cfg`. If you are using an earlier version, however, ensure that VM backup names do not cause the described file naming conflicts.

### Backup Fails for a Stopped VM

When you take a backup of a stopped VM and the VM backup fails as shown below, you may have encountered a known issue.

![](/img/v1.2/vm/vm_backup_fail.png)

To determine whether you have encountered this issue, locate the VM backup on the Dashboard screen and perform the following steps:

1. Obtain the names of the problematic **VolumeSnapshots** related to the VM backup.

```
$ kubectl get virtualmachinebackups.harvesterhci.io <VM backup name> -o json | jq '.status.volumeBackups[] | select(.readyToUse == false) | .name '
```

Example:

```
$ kubectl get virtualmachinebackups.harvesterhci.io extra-default.off -o json | jq '.status.volumeBackups[] | select(.readyToUse == false) | .name '
extra-default.off-volume-vm-extra-default-rootdisk-vp3py
extra-default.off-volume-vm-extra-default-disk-1-oohjf
```

2. Obtain the name of the **VolumeSnapshotContent** bound to each problematic VolumeSnapshot.

```
$ SNAPSHOT_CONTENT=$(kubectl get volumesnapshot <VolumeSnapshot Name> -o json | jq -r '.status.boundVolumeSnapshotContentName')
```

Example:
```
$ SNAPSHOT_CONTENT=$(kubectl get volumesnapshot extra-default.off-volume-vm-extra-default-rootdisk-vp3py -o json | jq -r '.status.boundVolumeSnapshotContentName')
```

3. Obtain the name of the related **Longhorn Snapshot**.

```
$ LH_SNAPSHOT=snapshot-$(echo "$SNAPSHOT_CONTENT" | sed 's/^snapcontent-//')
```
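
Example (the VolumeSnapshotContent name below is hypothetical; substitute the value you obtained in step 2):
```
$ SNAPSHOT_CONTENT=snapcontent-00000000-1111-2222-3333-444444444444
$ LH_SNAPSHOT=snapshot-$(echo "$SNAPSHOT_CONTENT" | sed 's/^snapcontent-//')
$ echo "$LH_SNAPSHOT"
snapshot-00000000-1111-2222-3333-444444444444
```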

4. Check the **READYTOUSE** status of the related **Longhorn Snapshot**.

```
$ kubectl -n longhorn-system get snapshots.longhorn.io $LH_SNAPSHOT -o json | jq '.status.readyToUse'
```

Example:
```
$ kubectl -n longhorn-system get snapshots.longhorn.io $LH_SNAPSHOT -o json | jq '.status.readyToUse'
true
```

5. Check the **State** of the related **Longhorn Backup**.

```
$ kubectl -n longhorn-system get backups.longhorn.io -o json | jq --arg snapshot "$LH_SNAPSHOT" '.items[] | select(.spec.snapshotName == $snapshot) | .status.state'
```

Example:
```
$ kubectl -n longhorn-system get backups.longhorn.io -o json | jq --arg snapshot "$LH_SNAPSHOT" '.items[] | select(.spec.snapshotName == $snapshot) | .status.state'
Completed
```

Apply steps 2 through 5 to every VolumeSnapshot identified in step 1; a combined sketch covering all of them is shown below.
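
The following is a minimal shell sketch that chains the commands from steps 1 through 5 for every problematic VolumeSnapshot. It assumes `kubectl` has access to the cluster, `jq` is installed, and `<VM backup name>` is replaced with the name of the failed VM backup; adjust namespaces as needed for your environment.

```
# Sketch only: loops over the not-ready volume backups and prints the
# Longhorn snapshot readiness and backup state for each one.
VM_BACKUP="<VM backup name>"

for VS in $(kubectl get virtualmachinebackups.harvesterhci.io "$VM_BACKUP" -o json \
    | jq -r '.status.volumeBackups[] | select(.readyToUse == false) | .name'); do
  # Step 2: VolumeSnapshotContent bound to the problematic VolumeSnapshot
  SNAPSHOT_CONTENT=$(kubectl get volumesnapshot "$VS" -o json \
    | jq -r '.status.boundVolumeSnapshotContentName')

  # Step 3: derive the Longhorn snapshot name from the content name
  LH_SNAPSHOT=snapshot-$(echo "$SNAPSHOT_CONTENT" | sed 's/^snapcontent-//')

  # Step 4: readiness of the Longhorn snapshot
  READY=$(kubectl -n longhorn-system get snapshots.longhorn.io "$LH_SNAPSHOT" -o json \
    | jq '.status.readyToUse')

  # Step 5: state of the Longhorn backup created from that snapshot
  STATE=$(kubectl -n longhorn-system get backups.longhorn.io -o json \
    | jq -r --arg snapshot "$LH_SNAPSHOT" \
        '.items[] | select(.spec.snapshotName == $snapshot) | .status.state')

  echo "$VS: snapshot readyToUse=$READY, backup state=$STATE"
done
```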

If all of the related **Longhorn Snapshots** have **READYTOUSE** set to **true** and all of the related **Longhorn Backups** have **State** set to **Completed**, you have likely encountered the known issue [`#5841`](https://github.com/harvester/harvester/issues/5841). Follow the workaround below to bring the VM backup into a ready state.

- Related issue:
  - [[BUG] Fail to backup a Stopped/Off VM due to volume error state](https://github.com/harvester/harvester/issues/5841)
- Workaround:

  ```
  $ kubectl -n longhorn-system rollout restart deployment csi-snapshotter
  ```
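
As an additional check after restarting the `csi-snapshotter` deployment, you can re-run the command from step 1; once the VM backup returns to a ready state, it should no longer list any volume backups with `readyToUse` set to `false`:

```
$ kubectl get virtualmachinebackups.harvesterhci.io <VM backup name> -o json | jq '.status.volumeBackups[] | select(.readyToUse == false) | .name '
```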
