Set up GPU device if available #62

ldx · 2019-07-26T23:07:24Z

Two main changes:

- Build AMI with a relatively recent NVIDIA driver and toolchain.
- Set up units for GPU access if an NVIDIA GPU is detected on the host.
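
For reviewers who want a feel for what "an NVIDIA GPU is detected on the host" can mean in practice, here is a minimal sketch. It is not the detection logic in this PR; it assumes a Linux host that exposes PCI devices under sysfs and looks for NVIDIA's PCI vendor ID (0x10de) on a display-class device. The function name and the `sysfs_pci_root` parameter are hypothetical.

```python
import os

NVIDIA_PCI_VENDOR_ID = "0x10de"   # PCI vendor ID assigned to NVIDIA
DISPLAY_CLASS_PREFIX = "0x03"     # PCI base class 0x03: display/3D controllers


def _read(path):
    """Return the stripped, lowercased contents of a sysfs file, or "" on error."""
    try:
        with open(path) as f:
            return f.read().strip().lower()
    except OSError:
        return ""


def has_nvidia_gpu(sysfs_pci_root="/sys/bus/pci/devices"):
    """Return True if any PCI display device under sysfs_pci_root is from NVIDIA.

    Hypothetical helper for illustration; the path is a parameter so the
    function can be exercised against a fake directory tree.
    """
    try:
        devices = os.listdir(sysfs_pci_root)
    except OSError:
        return False  # no sysfs PCI tree, so nothing to detect
    for dev in devices:
        vendor = _read(os.path.join(sysfs_pci_root, dev, "vendor"))
        pci_class = _read(os.path.join(sysfs_pci_root, dev, "class"))
        if vendor == NVIDIA_PCI_VENDOR_ID and pci_class.startswith(DISPLAY_CLASS_PREFIX):
            return True
    return False
```

Checking the class as well as the vendor avoids matching non-GPU NVIDIA functions such as the audio controller that some cards expose.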

I also built a dev AMI but have not promoted it to prod yet. We probably want to pin our excessively large customer base to the current stable itzo release and prod AMI first.

To test it, have server.yml use the milpadev AMI and set the default instance type to e.g. p2.xlarge. Then create a pod using e.g. tensorflow-gpu:

kubectl run tf --image=tensorflow/tensorflow:latest-gpu --overrides='{"apiVersion": "apps/v1", "spec": {"template": {"metadata": {"annotations": {"kubernetes.io/target-runtime":"kiyot"}}, "spec": {"nodeSelector": {"kubernetes.io/role": "milpa-worker"}}}}}' --command=true -- sleep infinity
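
The inline JSON in `--overrides` is easy to break when editing by hand. As a convenience, the sketch below builds the same command from a Python dict and shell-quotes each argument. The function name and its parameters are hypothetical; the annotation, node selector, and image values are the ones from the command above.

```python
import json
import shlex


def build_kubectl_run(name, image, runtime="kiyot", role="milpa-worker"):
    """Return a shell-safe `kubectl run` command line for a test pod.

    Hypothetical helper: it only assembles a string and does not call kubectl.
    """
    overrides = {
        "apiVersion": "apps/v1",
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {"kubernetes.io/target-runtime": runtime}
                },
                "spec": {"nodeSelector": {"kubernetes.io/role": role}},
            }
        },
    }
    args = [
        "kubectl", "run", name,
        "--image=" + image,
        "--overrides=" + json.dumps(overrides),
        "--command=true",
        "--", "sleep", "infinity",
    ]
    # shlex.quote wraps the JSON argument in single quotes for the shell.
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(build_kubectl_run("tf", "tensorflow/tensorflow:latest-gpu"))
```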

Once it's running, exec into the pod and check that TensorFlow can detect and use the GPU:

python -c "import tensorflow as tf; sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))"
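
The one-liner above relies on the TensorFlow 1.x `tf.Session` API and on reading the device placement log by eye. If a scriptable check is preferred, something like the following sketch could be run inside the pod instead. It assumes `tensorflow.python.client.device_lib` is importable in the image; the helper names are hypothetical.

```python
def gpu_device_names(devices):
    """Return the names of all devices whose device_type is "GPU".

    `devices` is any iterable of objects with `name` and `device_type`
    attributes, such as the records returned by
    device_lib.list_local_devices().
    """
    return [d.name for d in devices if d.device_type == "GPU"]


def list_gpus():
    """Ask TensorFlow for its local devices and keep only the GPUs."""
    # Imported lazily so the module loads even where TensorFlow is absent.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())
```

Calling `list_gpus()` inside the pod should return a non-empty list when the GPU is visible to TensorFlow, and an empty list otherwise.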

The next step is to have kiyot request GPU(s) when the pod sets a GPU resource limit, and to implement our device plugin for the kubelet that exposes GPU capabilities.
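
To make that next step concrete, here is a rough sketch of how a GPU request could be read from a pod spec. It is not kiyot code (which this PR does not touch). It assumes the pod uses the `nvidia.com/gpu` resource name advertised by NVIDIA's Kubernetes device plugin; whether our device plugin will use the same name is an open question. The function name is hypothetical.

```python
GPU_RESOURCE = "nvidia.com/gpu"  # assumed resource name, see the note above


def requested_gpus(pod):
    """Sum the GPU limits across all containers of a pod manifest dict.

    Containers without a GPU limit count as zero. GPU limits are whole
    numbers in Kubernetes, so the value is converted with int().
    """
    total = 0
    for container in pod.get("spec", {}).get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        total += int(limits.get(GPU_RESOURCE, 0))
    return total


if __name__ == "__main__":
    pod = {
        "spec": {
            "containers": [
                {
                    "name": "tf",
                    "image": "tensorflow/tensorflow:latest-gpu",
                    "resources": {"limits": {GPU_RESOURCE: 1}},
                }
            ]
        }
    }
    print(requested_gpus(pod))
```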

justnoise

Great job getting this working!

Vilmos Nebehaj added 9 commits July 25, 2019 15:22

Build nvidia driver into image

ee078b5

Provide kernel name to nvidia install script

5b5d981

Update libc link as last step in image build

3966393

Enable nvidia driver

4a1be59

Install nvidia-container-cli in AMI

f378328

Remove packages after installing them

4982d5f

Keep glibc packages in AMI

59817a5

Add comment on how to build nvidia-container-cli

aaae24d

Set up GPU when present

c5093fb

ldx requested a review from justnoise July 26, 2019 23:07

justnoise approved these changes Jul 29, 2019


ldx merged commit 088d616 into master Jul 29, 2019

ldx deleted the vilmos-imagebuild-nvidia branch July 29, 2019 18:23
