- On the Docker host machine, install
nvidia-container-toolkit
and nvidia-container-runtime
packages. See the official install guide.
- On the Docker host machine, type
nvidia-smi
and examine the CUDA version that appears on the right.
- On the Docker host machine, type the following, replacing the CUDA version in the image tag with the one you have installed:
docker run --rm --gpus all nvidia/cuda:10.1-base nvidia-smi
You should get the same output as when you ran nvidia-smi on the host machine.
- In the file
/etc/docker/daemon.json
you should see:
"runtimes": {"nvidia": { "path": "/usr/bin/nvidia-container-runtime","runtimeArgs": [] } }
If not, add it.
- In your docker-compose YML, you should have:
runtime: nvidia
- Reboot to make sure the drivers are all loaded. Alternatively, you can try just running
sudo systemctl restart docker.service
but this isn't guaranteed to work.
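The host check and the container check from the steps above can be tied together with a small helper. This is only a sketch: cuda_version_from_smi is a hypothetical function (not part of the NVIDIA tooling) that pulls the "CUDA Version" value out of the nvidia-smi header, and the image tag pattern simply mirrors the 10.1-base tag used above. Newer nvidia/cuda images may use a different tag scheme, so check the tag exists before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: reads nvidia-smi output on stdin and prints the first
# "CUDA Version: X.Y" value it finds (prints nothing if there is none).
cuda_version_from_smi() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage on a GPU host (commented out so the snippet has no side effects):
#   version="$(nvidia-smi | cuda_version_from_smi)"
#   docker run --rm --gpus all "nvidia/cuda:${version}-base" nvidia-smi
```

If the helper prints nothing, nvidia-smi did not report a CUDA version, which usually points at a driver problem on the host rather than at Docker.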
Run this with
docker-compose up
or manually with
docker run --gpus all -it --rm sam1902/test-tensorflow:latest
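For reference, a minimal compose file matching the commands above might look like the one written by the sketch below. The service name test-tf and the image name are taken from this README. The write_compose_file helper is hypothetical, and the version: "2.3" line is an assumption, chosen because the runtime key is accepted by the 2.3 file format. Adjust it to whatever your docker-compose release expects.

```shell
#!/bin/sh
# Hypothetical helper: writes a minimal docker-compose.yml that requests the
# nvidia runtime. The target path defaults to ./docker-compose.yml.
write_compose_file() {
    target="${1:-docker-compose.yml}"
    cat > "$target" <<'EOF'
version: "2.3"   # assumption: a file format version that accepts "runtime"
services:
  test-tf:
    image: sam1902/test-tensorflow:latest
    runtime: nvidia
EOF
}

# Usage (commented out so the snippet does not overwrite an existing file):
#   write_compose_file docker-compose.yml && docker-compose up
```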
$ docker-compose up
Recreating test-tensorflow_test-tf_1 ... error
ERROR: for test-tensorflow_test-tf_1 Cannot create container for service test-tf: Unknown runtime specified nvidia
ERROR: for test-tf Cannot create container for service test-tf: Unknown runtime specified nvidia
ERROR: Encountered errors while bringing up the project.
If you encounter this, make sure you followed step 4 in the Install section.
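To confirm whether the daemon actually picked up the runtime, you can inspect what docker info reports. The sketch below assumes docker info --format '{{json .Runtimes}}' prints a JSON object keyed by runtime name. The runtime_registered helper is hypothetical and only does a coarse text match on that output.

```shell
#!/bin/sh
# Hypothetical helper: reads the JSON runtime map on stdin and succeeds if a
# key with the given name is present. It does not parse the JSON properly.
runtime_registered() {
    name="$1"
    grep -q "\"$name\"[[:space:]]*:"
}

# Usage (requires a running Docker daemon, so it is commented out here):
#   if docker info --format '{{json .Runtimes}}' | runtime_registered nvidia; then
#       echo "nvidia runtime is registered"
#   else
#       echo "nvidia runtime is missing: check /etc/docker/daemon.json and restart Docker"
#   fi
```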