Note
This is cluster configuration, so it applies only to administrators. Normal users do not need to follow this step.
Enable nvidia-container-toolkit (previously nvidia-docker) by editing /etc/docker/daemon.json so that nvidia is the default runtime. This is needed because Kubernetes does not support the docker --gpus option:
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
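Docker reads daemon.json only at startup, so the daemon has to be restarted before the new default runtime takes effect. The sketch below is one possible way to guard that restart. Both function names are hypothetical helpers, not part of Docker or the NVIDIA tooling, and the restart command assumes a systemd-based host.

```shell
# Hypothetical helper: succeeds only if the given daemon.json
# sets "nvidia" as Docker's default runtime.
check_default_runtime() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

# Hypothetical helper: restart Docker only when the config looks right.
# Assumes systemd. Set DRY_RUN=1 to print the restart command instead of running it.
apply_docker_config() {
    conf="${1:-/etc/docker/daemon.json}"
    if check_default_runtime "$conf"; then
        ${DRY_RUN:+echo} sudo systemctl restart docker
    else
        echo "nvidia is not the default runtime in $conf" >&2
        return 1
    fi
}
```

Calling apply_docker_config with no argument checks /etc/docker/daemon.json. After the restart, `docker info | grep -i 'default runtime'` should report nvidia.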
Then install the NVIDIA device plugin for Kubernetes:
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin \
&& helm repo update \
&& helm install --generate-name nvdp/nvidia-device-plugin
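Once the device plugin pods are running, GPU nodes should advertise the nvidia.com/gpu resource. A minimal sketch for checking this follows. The function name gpu_nodes and the KUBECTL override variable are assumptions made for illustration, and the column names are arbitrary.

```shell
# Hypothetical helper: list every node with its allocatable nvidia.com/gpu count.
# KUBECTL may be overridden (for example "microk8s kubectl"); it defaults to kubectl.
gpu_nodes() {
    ${KUBECTL:-kubectl} get nodes \
        -o 'custom-columns=NODE:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
}
```

Nodes without a working device plugin show `<none>` in the GPUS column, which usually means the runtime configuration or the plugin installation on that node needs another look.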
Test pod for CUDA functionality
Example: ../../examples/cuda-add.yaml
Test job for nvidia-smi
Example: ../../examples/nvidia-smi-job.yaml