GPU operator 22.9.2 installation is failing #541

@likku123

1. Quick Debug Checklist

  • Are you running on an Ubuntu 18.04 node? No
  • Are you running Kubernetes v1.13+? No
  • Are you running Docker (>= 18.06) or CRIO (>= 1.13+)? Docker version 20.10.21, build 20.10.21-0ubuntu1~22.04.3

1. Issue or feature description

I am trying to install a specific version of the GPU operator (22.9.2) via the helm chart using ansible. Previously I did not specify a version number and installed the latest. To be on the safe side, I have now pinned the version to deploy.
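For reference, pinning the chart version with plain helm might look roughly like the sketch below. It only prints the command so it can be reviewed before running. The repo alias `nvidia`, the namespace `gpu-operator`, the `v` prefix on the version tag, and the helper name `gpu_operator_install_cmd` are assumptions for illustration and are not taken from my ansible playbook.

```shell
#!/usr/bin/env bash
# Sketch only: pin the GPU Operator chart version instead of taking the latest.
# Assumptions (not from the playbook): repo alias "nvidia", namespace
# "gpu-operator", and the chart version tag "v22.9.2".

CHART_VERSION="${CHART_VERSION:-v22.9.2}"
NAMESPACE="${NAMESPACE:-gpu-operator}"

# Hypothetical helper: print the helm command for the chart version given in $1.
gpu_operator_install_cmd() {
  echo "helm install --wait --generate-name -n ${NAMESPACE} --create-namespace nvidia/gpu-operator --version $1"
}

# Print the command for review; nothing is installed by this script.
gpu_operator_install_cmd "${CHART_VERSION}"

# To actually install, add the chart repo first and then run the printed command:
#   helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
#   helm repo update
```

Running `helm search repo nvidia/gpu-operator --versions` after adding the repo should list the chart versions that are actually available, which is worth checking before pinning one.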

image

I have collected logs based on the below instructions.

-->curl -o must-gather.sh -L https://raw.githubusercontent.com/NVIDIA/gpu-operator/master/hack/must-gather.sh
-->chmod +x must-gather.sh
-->./must-gather.sh

gpu_operand_pod_nvidia-container-toolkit-daemonset-tx9zk.zip
[gpu_operand_pod_gpu-feature-discovery-57ds2.log](https://github.com/NVIDIA/gpu-operator/files/11784844/gpu_operand_pod_gpu-feature-discovery-57ds2.log)
gpu_operand_pod_nvidia-operator-validator-5zwwb.zip
gpu_operand_pod_nvidia-dcgm-exporter-5fl25.log

Please let me know if any more logs are required from my side.
