Part of NVIDIA/cloud-native-team#299. Deferred from PR #2434.
hack/must-gather.sh is a 286-line bash script copied into the image
at docker/Dockerfile:103 as /usr/bin/gather, the OpenShift
oc adm must-gather --image=… plugin entrypoint. It shells out to
kubectl/oc, neither of which ships in the distroless base today.
The script is therefore largely unusable from inside the pod already,
and once the image base drops -dev (no shell), it cannot run at all.
Decide and implement one of:
- Rewrite the script in Go as a subcommand of nvidia-validator or as
a new dedicated binary. Use a vendored Go Kubernetes client.
Stays inside the gpu-operator distroless image.
- Split must-gather into its own image — a small RHEL/UBI-based
image that vendors oc and the bash script. Document a separate
image tag for the must-gather plugin.
- Remove /usr/bin/gather from the gpu-operator image entirely
and document that customers should run must-gather.sh directly
from outside the cluster. Customers report this is what they
already do in practice.
Acceptance: