Documentation Improvements for NVIDIA GPU Operator on Red Hat OCP #314

@jensmueller-com

This is a follow-up to an earlier issue. The changes proposed below improve automation, reduce manual edits, and make the documentation less error-prone.

The documentation for the NVIDIA GPU Operator on Red Hat OpenShift Container Platform includes a chapter titled Installing the NVIDIA GPU Operator on OpenShift. I have two suggestions for improvement.

1. Suggestion for section: Installing the NVIDIA GPU Operator using the CLI

Current behavior: Steps 3–5 require manual edits for channel and starting CSV.

Suggested improvements:

  • Step 3: Set CHANNEL dynamically (similar to step 9):

    $ CHANNEL=$(oc get packagemanifest gpu-operator-certified -n openshift-marketplace -o jsonpath='{.status.defaultChannel}')

    Omit the example output and the redundant command in step 4.

  • Step 4: Store the starting CSV in a variable:

    $ STARTING_CSV=$(oc get packagemanifests/gpu-operator-certified -n openshift-marketplace -ojson | jq -r --arg channel "$CHANNEL" '.status.channels[] | select(.name == $channel) | .currentCSV')

  • Step 5: Use these variables when creating the subscription file:

    $ cat <<EOF > nvidia-gpu-sub.yaml
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: gpu-operator-certified
      namespace: nvidia-gpu-operator
    spec:
      channel: $CHANNEL
      installPlanApproval: Manual
      name: gpu-operator-certified
      source: certified-operators
      sourceNamespace: openshift-marketplace
      startingCSV: $STARTING_CSV
    EOF

    Omit the note about manually editing channel and startingCSV.
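
One risk with the variable-based approach is that a failed lookup leaves CHANNEL or STARTING_CSV empty, and the heredoc then silently writes an invalid subscription file. A minimal sketch of a guard, assuming a hypothetical helper named render_subscription (not part of the current documentation) that takes both values as arguments and prints the manifest to stdout:

```shell
# Hypothetical helper (illustration only): print the Subscription manifest,
# refusing to run if either value is empty, e.g. because an oc lookup failed.
render_subscription() {
  channel="$1"
  starting_csv="$2"
  if [ -z "$channel" ] || [ -z "$starting_csv" ]; then
    echo "error: channel and starting CSV must both be non-empty" >&2
    return 1
  fi
  cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: gpu-operator-certified
  namespace: nvidia-gpu-operator
spec:
  channel: $channel
  installPlanApproval: Manual
  name: gpu-operator-certified
  source: certified-operators
  sourceNamespace: openshift-marketplace
  startingCSV: $starting_csv
EOF
}
```

It could then be called as render_subscription "$CHANNEL" "$STARTING_CSV" > nvidia-gpu-sub.yaml. Whether the documentation should carry such a guard or keep the plain heredoc is a judgment call; the sketch only shows that the check is cheap.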

2. Suggestion for section: Create the cluster policy using the CLI (below "Create the ClusterPolicy instance")

Current behavior: No note about replacing the hardcoded starting CSV (gpu-operator-certified.v22.9.0).

Suggested improvement: If STARTING_CSV is stored as above, update the command:

$ oc get csv -n nvidia-gpu-operator $STARTING_CSV -ojsonpath={.metadata.annotations.alm-examples} | jq '.[0]' > clusterpolicy.json

Note that .[0] must be quoted for the command to work in zsh, which otherwise tries to expand the square brackets as a glob pattern.

Apply the same change in the other Create the cluster policy using the CLI section (below "Create the ClusterPolicy instance with NVIDIA vGPU").
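
Because STARTING_CSV is only a shell variable, it is lost if the reader opens a new terminal between the installation and the cluster policy sections. A small sketch of a check that could precede the command, assuming a hypothetical function name check_starting_csv:

```shell
# Hypothetical guard (illustration only): fail with a hint when STARTING_CSV
# is unset or empty, for example in a fresh terminal session.
check_starting_csv() {
  if [ -z "${STARTING_CSV:-}" ]; then
    echo "STARTING_CSV is not set; repeat the lookup from step 4 first" >&2
    return 1
  fi
}
```

Chaining it as check_starting_csv && oc get csv ... would stop before oc is called with an empty CSV name.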
