165 changes: 96 additions & 69 deletions modules/virt-creating-and-exposing-mediated-devices.adoc
@@ -1,32 +1,101 @@
// Module included in the following assemblies:
//
// * virt/virtual_machines/advanced_vm_management/virt-configuring-virtual-gpus.adoc
// * virt/managing_vms/advanced_vm_management/virt-configuring-virtual-gpus.adoc

:_mod-docs-content-type: PROCEDURE
[id="virt-creating-exposing-mediated-devices_{context}"]
= Creating and exposing mediated devices

As an administrator, you can create mediated devices and expose them to the cluster by editing the `HyperConverged` custom resource (CR).
As an administrator, you can create mediated devices and expose them to the cluster by editing the `HyperConverged` custom resource (CR). Before you edit the CR, explore a worker node to find the configuration values that are specific to your hardware devices.

.Prerequisites

* You have installed the {oc-first}.
* You installed the {oc-first}.
* You enabled the Input-Output Memory Management Unit (IOMMU) driver.
* If your hardware vendor provides drivers, you installed them on the nodes where you want to create mediated devices.
** If you use NVIDIA cards, you link:https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/openshift-virtualization.html[installed the NVIDIA GRID driver].

// [IMPORTANT]
// ====
// Before {VirtProductName} 4.14, the `mediatedDeviceTypes` field was named `mediatedDevicesTypes`. Ensure that you use the correct field name when configuring mediated devices.
// ====

.Procedure

. Identify the name selector and resource name values for the mediated devices by exploring a worker node:

.. Start a debugging session with the worker node by using the `oc debug` command. For example:
+
[source,terminal]
----
$ oc debug node/node-11.redhat.com
----

.. Change the root directory of the shell process to the file system of the host node by running the following command:
+
[source,terminal]
----
# chroot /host
----

.. Navigate to the `mdev_bus` directory and view its contents. Each subdirectory name is a PCI address of a physical GPU. For example:
+
[source,terminal]
----
# cd /sys/class/mdev_bus && ls
----
+
Example output:
+
[source,terminal]
----
0000:4b:00.4
----

.. Go to the directory for your physical device and list the supported mediated device types as defined by the hardware vendor. For example:
+
[source,terminal]
----
# cd 0000:4b:00.4 && ls mdev_supported_types
----
+
Example output:
+
[source,terminal]
----
nvidia-742 nvidia-744 nvidia-746 nvidia-748 nvidia-750 nvidia-752
nvidia-743 nvidia-745 nvidia-747 nvidia-749 nvidia-751 nvidia-753
----

.. Select the mediated device type that you want to use and identify its name selector value by viewing the contents of its `name` file. For example:
+
[source,terminal]
----
# cat mdev_supported_types/nvidia-745/name
----
+
Example output:
+
[source,terminal]
----
NVIDIA A2-2Q
----
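+
The manual substeps above can also be sketched as a single loop. The following is a minimal sketch, not vendor tooling: `list_mdev_types` is a hypothetical helper name, and it assumes the sysfs layout shown in the previous substeps (`/sys/class/mdev_bus/<pci_address>/mdev_supported_types/<type>/name`).
+
```shell
# Hypothetical helper: print "<pci_address> <type_id> <name>" (tab-separated)
# for every supported mediated device type found under the given root.
# The root defaults to /sys/class/mdev_bus; the optional argument exists
# only so that the function can be pointed at a different directory tree.
list_mdev_types() {
  mdev_root="${1:-/sys/class/mdev_bus}"
  for name_file in "$mdev_root"/*/mdev_supported_types/*/name; do
    # Skip the unexpanded glob pattern when no mediated device types exist.
    [ -r "$name_file" ] || continue
    type_dir=$(dirname "$name_file")
    pci_dir=$(dirname "$(dirname "$type_dir")")
    printf '%s\t%s\t%s\n' \
      "$(basename "$pci_dir")" \
      "$(basename "$type_dir")" \
      "$(cat "$name_file")"
  done
}
```
+
If you define the function in the debug shell after running `chroot /host`, calling `list_mdev_types` with no arguments prints one line per supported type, which might help you compare name selector values across several physical GPUs.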

. Open the `HyperConverged` CR in your default editor by running the following command:
+
[source,terminal,subs="attributes+"]
----
$ oc edit hyperconverged kubevirt-hyperconverged -n {CNVNamespace}
----

. Create and expose the mediated devices by updating the configuration:

.. Create mediated devices by adding them to the `spec.mediatedDevicesConfiguration` stanza.

.. Expose the mediated devices to the cluster by adding the `mdevNameSelector` and `resourceName` values to the `spec.permittedHostDevices.mediatedDevices` stanza. To form the `resourceName` value, replace the spaces in the `mdevNameSelector` value with underscores and add the vendor prefix. For example, `NVIDIA A2-2Q` becomes `nvidia.com/NVIDIA_A2-2Q`.
+
Example `HyperConverged` CR:
+
.Example configuration file with mediated devices configured
[%collapsible]
====
[source,yaml,subs="attributes+"]
----
apiVersion: hco.kubevirt.io/v1
@@ -37,87 +106,45 @@ metadata:
spec:
mediatedDevicesConfiguration:
mediatedDeviceTypes:
- nvidia-231
- nvidia-745
nodeMediatedDeviceTypes:
- mediatedDeviceTypes:
- nvidia-233
- nvidia-746
nodeSelector:
kubernetes.io/hostname: node-11.redhat.com
permittedHostDevices:
mediatedDevices:
- mdevNameSelector: GRID T4-2Q
resourceName: nvidia.com/GRID_T4-2Q
- mdevNameSelector: GRID T4-8Q
resourceName: nvidia.com/GRID_T4-8Q
# ...
----
====

. Create mediated devices by adding them to the `spec.mediatedDevicesConfiguration` stanza:
+
.Example YAML snippet
[source,yaml]
----
# ...
spec:
mediatedDevicesConfiguration:
mediatedDeviceTypes: <1>
- <device_type>
nodeMediatedDeviceTypes: <2>
- mediatedDeviceTypes: <3>
- <device_type>
nodeSelector: <4>
<node_selector_key>: <node_selector_value>
- mdevNameSelector: NVIDIA A2-2Q
resourceName: nvidia.com/NVIDIA_A2-2Q
- mdevNameSelector: NVIDIA A2-4Q
resourceName: nvidia.com/NVIDIA_A2-4Q
# ...
----
<1> Required: Configures global settings for the cluster.
<2> Optional: Overrides the global configuration for a specific node or group of nodes. Must be used with the global `mediatedDeviceTypes` configuration.
<3> Required if you use `nodeMediatedDeviceTypes`. Overrides the global `mediatedDeviceTypes` configuration for the specified nodes.
<4> Required if you use `nodeMediatedDeviceTypes`. Must include a `key:value` pair.
+
[IMPORTANT]
====
Before {VirtProductName} 4.14, the `mediatedDeviceTypes` field was named `mediatedDevicesTypes`. Ensure that you use the correct field name when configuring mediated devices.
====
where:

. Identify the name selector and resource name values for the devices that you want to expose to the cluster. You will add these values to the `HyperConverged` CR in the next step.
.. Find the `resourceName` value by running the following command:
+
[source,terminal]
----
$ oc get $NODE -o json \
| jq '.status.allocatable \
| with_entries(select(.key | startswith("nvidia.com/"))) \
| with_entries(select(.value != "0"))'
----
`mediatedDeviceTypes`:: Specifies global settings for the cluster and is required.

.. Find the `mdevNameSelector` value by viewing the contents of `/sys/bus/pci/devices/<domain>:<bus>:<slot>.<function>/mdev_supported_types/<type>/name`, substituting the correct values for your system.
+
For example, the name file for the `nvidia-231` type contains the selector string `GRID T4-2Q`. Using `GRID T4-2Q` as the `mdevNameSelector` value allows nodes to use the `nvidia-231` type.
`nodeMediatedDeviceTypes`:: Optional. Specifies overrides to the global configuration for a specific node or group of nodes. Must be used with the global `mediatedDeviceTypes` configuration.

. Expose the mediated devices to the cluster by adding the `mdevNameSelector` and `resourceName` values to the
`spec.permittedHostDevices.mediatedDevices` stanza of the `HyperConverged` CR:
+
.Example YAML snippet
[source,yaml]
----
# ...
permittedHostDevices:
mediatedDevices:
- mdevNameSelector: GRID T4-2Q <1>
resourceName: nvidia.com/GRID_T4-2Q <2>
# ...
----
<1> Exposes the mediated devices that map to this value on the host.
<2> Matches the resource name that is allocated on the node.
`nodeMediatedDeviceTypes.mediatedDeviceTypes`:: Specifies an override to the global `mediatedDeviceTypes` configuration for the specified nodes. Required if you use `nodeMediatedDeviceTypes`.

`nodeSelector`:: Specifies the node selector and must include a `key:value` pair. Required if you use `nodeMediatedDeviceTypes`.

`mdevNameSelector`:: Specifies the mediated devices that map to this value on the host.

`resourceName`:: Specifies the matching resource name that is allocated on the node.
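+
The relationship between the two values can be sketched as a small shell function. This is an illustration only: `mdev_resource_name` is a hypothetical helper name, and the default `nvidia.com` prefix is an assumption taken from the NVIDIA examples in this module. Other vendors might use a different prefix.
+
```shell
# Hypothetical helper: derive a resourceName value from an mdevNameSelector value.
#   $1  name selector string, as read from the type's "name" file
#   $2  vendor prefix (assumed default: nvidia.com, as in the examples)
mdev_resource_name() {
  selector="$1"
  prefix="${2:-nvidia.com}"
  # Replace every space in the selector with an underscore.
  printf '%s/%s\n' "$prefix" "$(printf '%s' "$selector" | tr ' ' '_')"
}
```
+
For example, `mdev_resource_name "NVIDIA A2-2Q"` prints `nvidia.com/NVIDIA_A2-2Q`, which matches the pair used in the example configuration.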

. Save your changes and exit the editor.

.Verification

* Optional: Confirm that a device was added to a specific node by running the following command:
* Confirm that the virtual GPU is attached to the node by running the following command:
+
[source,terminal]
----
$ oc describe node <node_name>
$ oc get node <node_name> -o json \
| jq '.status.allocatable \
| with_entries(select(.key | startswith("nvidia.com/"))) \
| with_entries(select(.value != "0"))'
----