Decide on must-gather.sh inclusion in Dockerfile after distroless migration #2436

@rajathagasthya

Description

Part of NVIDIA/cloud-native-team#299. Deferred from PR #2434.

hack/must-gather.sh is a 286-line bash script that docker/Dockerfile:103
copies into the image as /usr/bin/gather, the entrypoint for the OpenShift
oc adm must-gather --image=… plugin. It shells out to kubectl/oc,
neither of which ships in the distroless base today. The script is
therefore already largely unusable from inside the pod, and once the
image base drops -dev (and with it the shell), it cannot run at all.

Decide and implement one of:

  1. Rewrite the script in Go, either as a subcommand of nvidia-validator
    or as a new dedicated binary, using a vendored Go Kubernetes client.
    must-gather stays inside the distroless gpu-operator image.
  2. Split must-gather into its own image: a small RHEL/UBI-based
    image that vendors oc and the bash script. Document the separate
    image tag for the must-gather plugin.
  3. Remove /usr/bin/gather from the gpu-operator image entirely
    and document that customers should run must-gather.sh directly
    from outside the cluster. Customers report that this is what they
    already do in practice.

Acceptance:

  • Decision is made and recorded
  • Customer-facing must-gather workflow is documented

Metadata

Assignees

No one assigned

    Labels

    enhancement: Improvements to existing features, performance, or usability (not bug fixes or new features).

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
