
"CUDA driver version is insufficient" despite sufficient drivers #24

Closed

wielandbrendel opened this issue Dec 21, 2015 · 9 comments

@wielandbrendel

I am receiving the following error upon running deviceQuery in nvidia/cuda:7.0-cudnn3-devel:

./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

However, I have run containers with CUDA 7 on the same system before. The host is running driver 346.46, which should be sufficient. The container was started with

docker run --device /dev/nvidia-uvm:/dev/nvidia-uvm --device /dev/nvidia0:/dev/nvidia0 --device \
  /dev/nvidia1:/dev/nvidia1 --device /dev/nvidia2:/dev/nvidia2 --device /dev/nvidia3:/dev/nvidia3 --device \
  /dev/nvidiactl:/dev/nvidiactl -it nvidia/cuda:7.0-cudnn3-devel bash

Any idea why that happens or what I should check? A big thanks in advance!

@flx42 (Member) commented Dec 21, 2015

You need to use our nvidia-docker wrapper script.

With our approach, we do not install the driver inside the image. This is the only way to keep CUDA images truly independent of the host's driver version.
As a result, the wrapper mounts the driver libraries from the host into the container when it is started:

NV_LIBS_CUDA="cuda \
nvcuvid \
nvidia-compiler \
nvidia-encode \
nvidia-ml"

If you don't want to use the wrapper script, you could do what TensorFlow does:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/docker#running-the-container
But that is less portable than using our script.
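
For reference, a rough sketch of that manual approach: pass the device nodes and bind-mount the host's driver libraries into the container. The library paths and the 346.46 version below are assumptions based on this report; adjust them to your distribution and driver:

# Sketch only: bind-mount the host driver libraries (paths/version are assumptions).
docker run -it \
  --device /dev/nvidiactl --device /dev/nvidia-uvm --device /dev/nvidia0 \
  -v /usr/lib/x86_64-linux-gnu/libcuda.so.346.46:/usr/lib/x86_64-linux-gnu/libcuda.so.346.46:ro \
  -v /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.346.46:/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.346.46:ro \
  nvidia/cuda:7.0-cudnn3-devel bash

Inside the container you may still need to create the libcuda.so.1 symlink and run ldconfig so the CUDA runtime can find the mounted libraries.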

@wielandbrendel (Author)

Sorry for overlooking the wrapper; it indeed works perfectly with the nvidia/cuda:7.0-cudnn3-devel image. However, if I run the same script on an image that is based on nvidia/cuda, I get

[ NVIDIA ] =INFO= Not a CUDA image, nothing to be done

which seems to be because the script checks the version/name of the image it is applied to. How would you recommend handling images based on your images?

@flx42 (Member) commented Dec 21, 2015

That should not happen; the script checks the label present in the base image:
https://github.com/NVIDIA/nvidia-docker/blob/master/ubuntu-14.04/cuda/7.5/runtime/Dockerfile#L13

So it should work fine with images based on this one. Do you have a small repro where it doesn't work?
Thanks!
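
As a sanity check, you can verify that a derived image inherited the CUDA label from the base image. The label name com.nvidia.cuda.version is an assumption based on the Dockerfile linked above, and myuser/myimage is a placeholder:

# An empty result here would explain the "Not a CUDA image" message.
docker inspect --format '{{index .Config.Labels "com.nvidia.cuda.version"}}' myuser/myimage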

@flx42 (Member) commented Dec 21, 2015

Ah, I think I found the problem: it happens when you don't have the image locally yet:

$ nvidia-docker run -ti nvidia/cuda:7.5
Error: No such image or container: nvidia/cuda:7.5
[ NVIDIA ] =INFO= Not a CUDA image, nothing to be done

Unable to find image 'nvidia/cuda:7.5' locally
7.5: Pulling from nvidia/cuda
0bf056161913: Pull complete 
[...]

$ nvidia-docker run -ti nvidia/cuda:7.5
[ NVIDIA ] =INFO= Driver version: 352.68
[ NVIDIA ] =INFO= CUDA image version: 7.5

The first time, the image is not present locally, so the label check finds nothing. It's a bug in our label detection code.
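
That is consistent with docker itself: docker inspect only works on images that are already present locally, so a label check keyed off it cannot see anything for an image that still has to be pulled. A minimal reproduction of the failing step (the label name, as above, is an assumption):

# Fails on an image that only exists in the registry, so the wrapper sees no label
# and skips mounting the driver volumes.
docker inspect --format '{{index .Config.Labels "com.nvidia.cuda.version"}}' nvidia/cuda:7.5
# Error: No such image or container: nvidia/cuda:7.5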

@flx42 (Member) commented Dec 21, 2015

We will think of a solution; in the meantime, running docker pull user/myimage before nvidia-docker run user/myimage should work.
Or use the other option I described in my first reply.
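
In other words, the interim workaround is just (image name is a placeholder):

# Pull first so the image and its labels exist locally, then run through the wrapper.
docker pull user/myimage
nvidia-docker run user/myimage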

@wielandbrendel (Author)

There is an additional problem, unrelated to non-pulled images: the error also occurs if the number of flags is too large. For example, the following does not work:

GPU=0,1,2,3 ./nvidia-docker run -m 300M -a stdout -a stdin -i -t -d nvidia/cuda:7.0-cudnn3-devel

but

GPU=0,1,2,3 ./nvidia-docker run -m 300M -a stdout -a stdin -itd nvidia/cuda:7.0-cudnn3-devel

does work (aside from some conflicting-flag issues...). The error I get with the first command is

flag provided but not defined: -m0
See 'docker inspect --help'.
[ NVIDIA ] =INFO= Not a CUDA image, nothing to be done

So the current workaround would be to lower the number of flags used.
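
For context, the mangled -m0 flag points at the wrapper merging adjacent arguments while forwarding them to docker. A generic illustration of how a shell wrapper avoids that class of bug (not the actual nvidia-docker code):

#!/bin/sh
# Forward arguments with "$@": each flag and its value stay separate words.
# Unquoted $* or string concatenation is what typically glues them together.
exec docker "$@"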

@flx42 (Member) commented Dec 22, 2015

Ping @3XX0

@3XX0 (Member) commented Dec 22, 2015

Sorry, it was an issue in the argument parsing; it's fixed now.

Regarding the need to pull before running, this is unfortunately a limitation of nvidia-docker: there is no way to inspect an image that is stored remotely.

@flx42 (Member) commented Jan 4, 2016

Closing, since I believe this is fixed.
