NVIDIA GPU support #2079
Comments
I would like to work on this. I've tested a deployment with kubespray and (manually) added Nvidia GPU support. This functions well.
some notes
This will install the drivers without installing Xorg.
Hello,
Thank you in advance.
@AtzeDeVries as I am playing with mining setups now, I can confirm that installing via the runfile is the best thing to do; I had too much pain with apt-get installations. I haven't had any problems with the license. Here is part of the code I am using (it's not very clean yet, but it gives the general idea):
An X server is needed if you need access to nvidia-settings, as nvidia-smi doesn't expose all GPU characteristics. Generally, I am not sure driver installation should be part of the code. Perhaps kubespray should handle just the kubernetes and docker configuration and leave driver installation to the user, since it is done on the host itself and is not directly related to kubernetes?
Thank you very much for the help!!!
xref #3438
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This would make it easier to use kubespray in conjunction with kubeflow.
I think this is what it would take:
- Add `accelerators_enabled` and `device_plugins_enabled` to `inventory/group_vars/k8s-cluster.yml`
- Update the `kube_feature_gates` list var in `main.yml` to set `Accelerators={{ accelerators_enabled|string }}` and `DevicePlugins={{ device_plugins_enabled|string }}`
- Add `contrib/nvidia_gpu/nvidia_gpu_ubuntu.yml` that does the following on every node:
  a) Install a user-configurable `nvidia-XYZ` driver apt package. (Perhaps default to `nvidia-384`)
  c) Install the `nvidia-docker2` package following the instructions here: https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)
  d) Update `/etc/docker/daemon.json` to set `"default-runtime": "nvidia"`
  e) Run `pkill -SIGHUP dockerd`
  f) Install the k8s-device-plugin daemonset via `kubectl`
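To make the feature-gate item and step d) a bit more concrete, here is a minimal Python sketch of the two pieces of logic involved. It is not the proposed playbook itself. The function names `render_feature_gates` and `set_default_runtime` are hypothetical, and the sample `runtimes` entry in the demo is only an illustrative guess at what an existing `daemon.json` might contain. The sketch assumes that Jinja's `string` filter renders a boolean as `True` or `False`, which `str()` reproduces in Python.

```python
import json
from typing import List


def render_feature_gates(accelerators_enabled: bool,
                         device_plugins_enabled: bool) -> List[str]:
    """Build the two proposed kube_feature_gates entries.

    Assumption: `{{ var|string }}` on a boolean yields "True"/"False",
    so str() is used here as a stand-in for the Jinja filter.
    """
    return [
        "Accelerators=" + str(accelerators_enabled),
        "DevicePlugins=" + str(device_plugins_enabled),
    ]


def set_default_runtime(daemon_json_text: str, runtime: str = "nvidia") -> str:
    """Return daemon.json content with "default-runtime" set.

    All other keys (for example an existing "runtimes" section) are kept.
    An empty or whitespace-only input is treated as an empty config.
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    config["default-runtime"] = runtime
    return json.dumps(config, indent=4, sort_keys=True)


if __name__ == "__main__":
    # Feature gates as they might be joined for a command-line flag.
    print(",".join(render_feature_gates(True, True)))

    # Hypothetical existing daemon.json content, for illustration only.
    existing = '{"runtimes": {"nvidia": {"path": "nvidia-container-runtime"}}}'
    print(set_default_runtime(existing))
```

Merging into the parsed config rather than overwriting the file keeps whatever settings are already present. In a real playbook the result would still have to be written back to `/etc/docker/daemon.json` before step e) signals `dockerd`.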