47 changes: 47 additions & 0 deletions modules/virt-retain-storage-checkup-resources.adoc
@@ -0,0 +1,47 @@
// Module included in the following assemblies:
//
// * virt/monitoring/virt-running-cluster-checkups.adoc

:_mod-docs-content-type: PROCEDURE
[id="virt-retain-storage-checkup-resources_{context}"]
= Retaining resources for troubleshooting storage checkups

[role="_abstract"]
The predefined storage checkup includes a `skipTeardown` configuration option, which controls resource cleanup after a storage checkup runs.
By default, the `skipTeardown` field value is `Never`, which means that the checkup always performs teardown steps and deletes all resources after the checkup runs.

To retain resources for further inspection if a failure occurs, set the `skipTeardown` field to `onfailure`.

.Prerequisites

* You have installed the {oc-first}.

.Procedure

. Run the following command to edit the `storage-checkup-config` config map:
+
[source,terminal]
----
$ oc edit configmap storage-checkup-config -n <checkup_namespace>
----

. Configure the `skipTeardown` field to use the `onfailure` value. You can do this by modifying the `storage-checkup-config` config map, stored in the `storage_checkup.yaml` file:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: storage-checkup-config
  namespace: <checkup_namespace>
data:
  spec.param.skipTeardown: onfailure
# ...
----

. Reapply the `storage-checkup-config` config map by running the following command:
+
[source,terminal]
----
$ oc apply -f storage_checkup.yaml -n <checkup_namespace>
----
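As a non-interactive alternative to the edit-and-reapply steps above, the same field could be set with a JSON merge patch. The following is a minimal sketch, not part of the documented procedure: the `skip_teardown_patch` helper and the `CHECKUP_NAMESPACE` variable are hypothetical names introduced for illustration, and the `oc patch` call runs only when `oc` is installed and the variable is set.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON merge-patch body that sets the
# "spec.param.skipTeardown" key in the config map data to the value in $1.
skip_teardown_patch() {
  printf '{"data":{"spec.param.skipTeardown":"%s"}}\n' "$1"
}

# Apply the patch only when oc is available and a namespace was provided.
# CHECKUP_NAMESPACE is an assumed variable name, not one the product defines.
if command -v oc >/dev/null 2>&1 && [ -n "${CHECKUP_NAMESPACE:-}" ]; then
  oc patch configmap storage-checkup-config \
    -n "$CHECKUP_NAMESPACE" \
    --type merge \
    -p "$(skip_teardown_patch onfailure)"
fi
```

A merge patch changes only the named key, so other fields in the config map are left as they are.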
41 changes: 41 additions & 0 deletions modules/virt-troubleshoot-storage-checkup-error-codes.adoc
@@ -0,0 +1,41 @@
// Module included in the following assemblies:
//
// * virt/monitoring/virt-running-cluster-checkups.adoc

:_mod-docs-content-type: REFERENCE
[id="virt-troubleshoot-storage-checkup-error-codes_{context}"]
= Storage checkup error codes

[role="_abstract"]
The following error codes might appear in the `storage-checkup-config` config map after a storage checkup fails.

[options="header"]
|===

|Error code |Meaning

|`ErrNoDefaultStorageClass`
|No default storage class is configured.

|`ErrPvcNotBound`
|One or more persistent volume claims (PVCs) failed to bind.

|`ErrMultipleDefaultStorageClasses`
|Multiple default storage classes are configured.

|`ErrEmptyClaimPropertySets`
|There are `StorageProfile` objects containing empty `ClaimPropertySets` specs.

|`ErrVMsWithUnsetEfsStorageClass`
|There are VMs using elastic file system (EFS) storage classes, where the GID and UID are not set in the `StorageClass` object.

|`ErrGoldenImagesNotUpToDate`
|One or more golden images have a `DataImportCron` object that is either not up to date or has a `DataSource` object that is not ready.

|`ErrGoldenImageNoDataSource`
|The `DataSource` object of the golden image has either no PVC or no snapshot source configured.

|`ErrBootFailedOnSomeVMs`
|Some VMs failed to boot within the expected time.

|===
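For scripts that post-process checkup results, the table above could be encoded as a simple lookup. This is only an illustrative sketch: the `describe_storage_checkup_error` function name is hypothetical, and the messages are copied from the table rather than taken from the checkup itself.

```shell
#!/bin/sh
# Hypothetical helper: print the documented meaning of a storage checkup
# error code passed as $1. Returns 1 for codes not listed in the table.
describe_storage_checkup_error() {
  case "$1" in
    ErrNoDefaultStorageClass)
      echo "No default storage class is configured." ;;
    ErrPvcNotBound)
      echo "One or more persistent volume claims (PVCs) failed to bind." ;;
    ErrMultipleDefaultStorageClasses)
      echo "Multiple default storage classes are configured." ;;
    ErrEmptyClaimPropertySets)
      echo "There are StorageProfile objects containing empty ClaimPropertySets specs." ;;
    ErrVMsWithUnsetEfsStorageClass)
      echo "There are VMs using EFS storage classes where the GID and UID are not set in the StorageClass object." ;;
    ErrGoldenImagesNotUpToDate)
      echo "One or more golden images have a DataImportCron object that is not up to date or a DataSource object that is not ready." ;;
    ErrGoldenImageNoDataSource)
      echo "The DataSource object of the golden image has no PVC or snapshot source configured." ;;
    ErrBootFailedOnSomeVMs)
      echo "Some VMs failed to boot within the expected time." ;;
    *)
      echo "Unknown error code: $1"
      return 1 ;;
  esac
}
```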
44 changes: 44 additions & 0 deletions modules/virt-troubleshoot-storage-checkup.adoc
@@ -0,0 +1,44 @@
// Module included in the following assemblies:
//
// * virt/monitoring/virt-running-cluster-checkups.adoc

:_mod-docs-content-type: PROCEDURE
[id="virt-troubleshoot-storage-checkup_{context}"]
= Troubleshooting a failed storage checkup

[role="_abstract"]
If a storage checkup fails, you can take the following steps to identify the reason for the failure.

.Prerequisites

* You have installed the {oc-first}.
* You have downloaded the directory provided by the `must-gather` tool.

.Procedure

. Review the `status.failureReason` field in the `storage-checkup-config` config map by running the following command and observing the output:
+
[source,terminal]
----
$ oc get configmap storage-checkup-config -n <namespace> -o yaml
----
+
.Example output config map
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: storage-checkup-config
  labels:
    kiagnose/checkup-type: kubevirt-storage
data:
  spec.timeout: 10m
  status.succeeded: "false" # <1>
  status.failureReason: "ErrNoDefaultStorageClass" # <2>
# ...
----
<1> If the checkup has failed, the `status.succeeded` value is `false`.
<2> If the checkup has failed, the `status.failureReason` field contains an error code. In this example output, the `ErrNoDefaultStorageClass` error code means that no default storage class is configured.

. Search the directory provided by the `must-gather` tool for logs, events, or terms related to the error in the `data.status.failureReason` field value.
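The two steps above could be combined in a small script. The following is a rough sketch under stated assumptions: the `failure_reason` and `search_must_gather` helpers and the `CHECKUP_NAMESPACE` and `MUST_GATHER_DIR` variables are hypothetical names, and the extraction assumes that the failure reason is a single token without spaces, as in the example output.

```shell
#!/bin/sh
# Hypothetical helper: read config map YAML on stdin and print the value of
# status.failureReason, without surrounding quotes or trailing comments.
# Assumes the value is a single token with no spaces.
failure_reason() {
  sed -n 's/^[[:space:]]*status\.failureReason:[[:space:]]*"\{0,1\}\([^"#[:space:]]*\).*$/\1/p'
}

# Hypothetical helper: list files under directory $2 that contain string $1.
search_must_gather() {
  grep -rl -- "$1" "$2"
}

# Run against a cluster only when oc, a namespace, and a must-gather
# directory are all available. Both variable names are assumptions.
if command -v oc >/dev/null 2>&1 \
  && [ -n "${CHECKUP_NAMESPACE:-}" ] \
  && [ -d "${MUST_GATHER_DIR:-}" ]; then
  reason=$(oc get configmap storage-checkup-config \
    -n "$CHECKUP_NAMESPACE" -o yaml | failure_reason)
  if [ -n "$reason" ]; then
    echo "Failure reason: $reason"
    search_must_gather "$reason" "$MUST_GATHER_DIR"
  fi
fi
```

Listing only the matching file names keeps the output short; you can then open the individual files to read the surrounding logs and events.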
14 changes: 13 additions & 1 deletion virt/monitoring/virt-running-cluster-checkups.adoc
@@ -1,11 +1,12 @@
:_mod-docs-content-type: ASSEMBLY
include::_attributes/common-attributes.adoc[]
[id="virt-running-cluster-checkups"]
= {VirtProductName} cluster checkup framework
include::_attributes/common-attributes.adoc[]
:context: virt-running-cluster-checkups

toc::[]

[role="_abstract"]
A _checkup_ is an automated test workload that allows you to verify if a specific cluster functionality works as expected. The cluster checkup framework uses native Kubernetes resources to configure and execute the checkup.

:FeatureName: The {VirtProductName} cluster checkup framework
@@ -36,10 +37,21 @@ You can use a storage checkup to verify that the cluster storage is optimally co

include::snippets/virt-about-running-checkups.adoc[]

include::modules/virt-retain-storage-checkup-resources.adoc[leveloffset=+2]

include::modules/virt-storage-checkup-web-console.adoc[leveloffset=+2]

include::modules/virt-checking-storage-configuration.adoc[leveloffset=+2]

include::modules/virt-troubleshoot-storage-checkup.adoc[leveloffset=+2]

[role="_additional-resources"]
.Additional resources
* xref:../../virt/support/virt-collecting-virt-data.adoc#virt-collecting-virt-data[Collecting data for Red{nbsp}Hat Support]
* xref:../../virt/support/virt-collecting-virt-data.adoc#virt-using-virt-must-gather_virt-collecting-virt-data[Using the `must-gather` tool for {VirtProductName}]

include::modules/virt-troubleshoot-storage-checkup-error-codes.adoc[leveloffset=+2]

[id="virt-running-cluster-checkups-dpdk"]
== Running predefined DPDK checkups

5 changes: 3 additions & 2 deletions virt/support/virt-collecting-virt-data.adoc
@@ -1,12 +1,13 @@
:_mod-docs-content-type: ASSEMBLY
include::_attributes/common-attributes.adoc[]
[id="virt-collecting-virt-data"]
= Collecting data for Red Hat Support
= Collecting data for Red{nbsp}Hat Support
:context: virt-collecting-virt-data

toc::[]

When you submit a xref:../../support/getting-support.adoc#support-submitting-a-case_getting-support[support case] to Red Hat Support, it is helpful to provide debugging information for {product-title} and {VirtProductName} by using the following tools:
[role="_abstract"]
When you submit a xref:../../support/getting-support.adoc#support-submitting-a-case_getting-support[support case] to Red{nbsp}Hat Support, it is helpful to provide debugging information for {product-title} and {VirtProductName} by using the following tools:

// must-gather not supported for ROSA/OSD, per Dustin Row
ifndef::openshift-rosa,openshift-dedicated,openshift-rosa-hcp[]