Only works once #34

andyneff · 2016-01-13T22:34:46Z

I'm not very familiar with docker volume, but it appears to be ONLY good for one use.

sudo ./nvidia-docker volume setup
```
nvidia_driver_352.55
```

docker volume ls

DRIVER              VOLUME NAME
local               nvidia_driver_352.55

./nvidia-docker run --rm nvidia/cuda nvidia-smi

+------------------------------------------------------+                       
| NVIDIA-SMI 352.55     Driver Version: 352.55         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 680     Off  | 0000:01:00.0     N/A |                  N/A |
| 34%   54C    P8    N/A /  N/A |    653MiB /  4093MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 580     Off  | 0000:02:00.0     N/A |                  N/A |
| 46%   54C   P12    N/A /  N/A |      7MiB /  3071MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K20c          Off  | 0000:03:00.0     Off |                  Off |
| 37%   49C    P0    48W / 225W |     96MiB /  5119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+

./nvidia-docker run --rm nvidia/cuda nvidia-smi

Error response from daemon: Error looking up volume plugin nvidia-docker: Plugin not found

docker volume ls
```
DRIVER              VOLUME NAME
```

Tested on Ubuntu 14.04 running Docker 1.9.1 and CentOS 7 running Docker 1.9.0.

It just seems that if sudo nvidia-docker volume setup is in the "Initial setup" section, then it shouldn't need to be run every time I create a new container. Or am I missing something?


3XX0 · 2016-01-14T01:38:33Z

Yes, this is one of our limitations, documented here.
It is caused by --rm removing the volumes attached to a container (equivalent to docker rm -v).
Here is the corresponding Docker issue: moby/moby#17907

A workaround would be to change volume setup to create a data container referencing the volume.
I'm not thrilled by this solution though...
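
On Docker versions affected by this behavior, one possible stopgap is to check whether the driver volume is still listed before each run and to recreate it if it is gone. A minimal sketch, assuming `docker volume ls -q` prints one volume name per line; the helper names `volume_listed` and `ensure_driver_volume` are hypothetical and not part of nvidia-docker:

```sh
#!/bin/sh
# Hypothetical helpers; not part of nvidia-docker.

# volume_listed NAME: succeed if NAME appears as a whole line on stdin.
# Intended input: the output of `docker volume ls -q`.
volume_listed() {
    grep -Fqx -- "$1"
}

# ensure_driver_volume NAME: recreate the driver volume when it is missing.
# `sudo nvidia-docker volume setup` is the command used earlier in this thread.
ensure_driver_volume() {
    if ! docker volume ls -q | volume_listed "$1"; then
        sudo nvidia-docker volume setup
    fi
}
```

It could be used as `ensure_driver_volume nvidia_driver_352.55 && nvidia-docker run --rm nvidia/cuda nvidia-smi`, where the volume name is the one printed by the setup command.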

andyneff · 2016-01-14T05:25:07Z

I'm actually a little bit of a fan of the data container idea.

I'm not 100% sure how the volumes work: are they actually mounts to /var/lib/docker/volumes/foo/_data, or more mount magic? But if you are doing hard links in there, that suggests to me it's more of a normal directory. At any rate, I remember hearing that, from a security standpoint, it's better to rely on a data container than on direct mounts of your host devices. Some people may care about that.
If the driver files are copied to a data container, that should alleviate this.
You no longer need root to set it up; you just need docker group permissions.

I was already playing with this idea when you mentioned it, it seems to work well to me :)

I used a Makefile with

install:
        docker build -t nvidia_driver -f Dockerfile_nvidia_driver .
        if docker inspect nvidia_driver_${NVIDIA_VERSION} > /dev/null 2>&1; then \
          docker rm nvidia_driver_${NVIDIA_VERSION}; \
        fi
        docker run -v /usr/bin:/hostbin:ro -v /usr/lib64/nvidia:/hostlib64 --name nvidia_driver_${NVIDIA_VERSION} nvidia_driver

run:
        docker run -it --rm \
                   --volumes-from nvidia_driver_${NVIDIA_VERSION}:ro \
                   $$(ls /dev/nvidia* | sed 's|^|--device |') \
                   cuda_example

And a Dockerfile_nvidia_driver of

FROM centos:7

VOLUME /usr/local/nvidia

CMD mkdir -p /usr/local/nvidia/bin && \
    cp -a /hostbin/nvidia* /usr/local/nvidia/bin/ && \
    cp -ra /hostlib64 /usr/local/nvidia/lib64

Sorry it's a little messy, but it was just a quick PoC to prove to myself it would work.
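
The `$$(ls /dev/nvidia* | sed 's|^|--device |')` line in the Makefile above expands to one `--device` flag per NVIDIA device node (the doubled `$$` is Make's escape for a literal `$`). The same idea as a standalone shell function, as a sketch; `device_flags` and its optional directory argument are hypothetical and only there to make the expansion easy to inspect:

```sh
#!/bin/sh
# device_flags [DIR]: print "--device PATH" for every DIR/nvidia* entry.
# DIR defaults to /dev; the parameter is a hypothetical addition for testing.
device_flags() {
    dir="${1:-/dev}"
    for node in "$dir"/nvidia*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$node" ] || continue
        printf '%s %s\n' --device "$node"
    done
}
```

Used unquoted so the shell splits it into separate arguments, for example `docker run -it --rm $(device_flags) cuda_example`. This assumes the device paths contain no whitespace.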

3XX0 · 2016-01-14T06:53:29Z

Data containers and volumes are exactly the same thing under the hood. Using volumes directly makes more sense because that's where Docker is headed with persistent volumes and the new volume CLI. It also keeps things unified between the standalone version and the plugin version (i.e. nvidia-docker standalone uses a local driver).

Creating an image and a container for the sake of having a volume referenced is not ideal. Besides, you still have to make sure that the container is not deleted.

This really is a Docker issue and will be fixed upstream eventually. In the meantime, I suggest you run your container without --rm, or use nvidia-docker-plugin.
If you really want to lock the volume with a data container, it's just a matter of doing:

volume="$(sudo nvidia-docker volume setup)"
nvidia-docker create --name=LOCK -v $volume:/data:ro tianon/true

nvidia-docker run --rm nvidia/cuda nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi

Regarding copy vs. hardlink, we chose hardlinks to keep the ecosystem as light as possible. Copying megabytes of driver files around in order to launch a container is not an option.
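
The difference between copying and hard-linking can be seen with a plain file: a hard link is a second directory entry for the same inode, so no file data is duplicated, whereas `cp` writes a new inode. Hard links only work within a single filesystem. A small sketch using temporary files (the file names and the helper `inode_of` are illustrative, not the actual driver paths or nvidia-docker code):

```sh
#!/bin/sh
# inode_of FILE: print the inode number of FILE (hypothetical helper).
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

workdir=$(mktemp -d)
printf 'pretend driver library\n' > "$workdir/libexample.so"

# Hard link: a new name for the same inode; no data is copied.
ln "$workdir/libexample.so" "$workdir/linked.so"

# Copy: a new inode with its own copy of the data.
cp "$workdir/libexample.so" "$workdir/copied.so"

src_inode=$(inode_of "$workdir/libexample.so")
link_inode=$(inode_of "$workdir/linked.so")
copy_inode=$(inode_of "$workdir/copied.so")

echo "source=$src_inode link=$link_inode copy=$copy_inode"
```

The source and the link report the same inode number, while the copy reports a different one.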

3XX0 · 2016-01-18T21:52:20Z

Closing since it's an issue with upstream Docker.
I updated the documentation accordingly.

3XX0 · 2016-02-05T00:25:58Z

Fixed in Docker 1.10, the documentation has been updated.

orian · 2016-04-13T10:47:32Z

Just a note: I ran the install instructions from the README and tried to test, but it failed with this error:

docker: Error response from daemon: create nvidia_driver_361.28: create nvidia_driver_361.28: Error looking up volume plugin nvidia-docker: plugin not found.

The solution was:

sudo ./nvidia-docker volume setup

Installed version: nvidia-docker_1.0.0.beta.3-1_amd64.deb

3XX0 · 2016-04-13T16:47:04Z

Are you running Ubuntu? If so, can you show me the output of:

cat /var/log/upstart/nvidia-docker.log

guoquan · 2016-04-25T12:47:37Z

Hi @3XX0, I have a similar problem to @orian's.
I am running Ubuntu 14.04.
I installed following the instructions on the wiki:

# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.0-beta.3/nvidia-docker_1.0.0.beta.3-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker_1.0.0.beta.3-1_amd64.deb && rm /tmp/nvidia-docker*.deb

and when I test it,

# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi

it gives the error (which brought me to this issue):

Error response from daemon: Error looking up volume plugin nvidia-docker: Plugin Error: Plugin.Activate, 400 Bad Request: malformed Host header

My nvidia-docker.log looks like this

$ sudo cat /var/log/upstart/nvidia-docker.log
/usr/bin/nvidia-docker-plugin | 2016/04/25 15:36:41 Loading NVIDIA management library
/usr/bin/nvidia-docker-plugin | 2016/04/25 15:36:41 Loading NVIDIA unified memory
/usr/bin/nvidia-docker-plugin | 2016/04/25 15:36:41 Discovering GPU devices
/usr/bin/nvidia-docker-plugin | 2016/04/25 15:36:43 Provisioning volumes at /var/lib/nvidia-docker/volumes
/usr/bin/nvidia-docker-plugin | 2016/04/25 15:36:43 Serving plugin API at /var/lib/nvidia-docker
/usr/bin/nvidia-docker-plugin | 2016/04/25 15:36:43 Serving remote API at localhost:3476
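
A quick way to check a log like the one above for the line showing that the plugin is serving its API is sketched below. `plugin_is_serving` is a hypothetical helper, and the matched string is taken verbatim from the log lines quoted in this thread rather than from any documented log format:

```sh
#!/bin/sh
# plugin_is_serving: read a plugin log on stdin and succeed if it contains
# the "Serving plugin API at" line seen in the logs quoted in this thread.
plugin_is_serving() {
    grep -Fq 'Serving plugin API at'
}
```

For example: `sudo cat /var/log/upstart/nvidia-docker.log | plugin_is_serving || echo "plugin does not appear to be serving"`. This only confirms that the plugin started; it does not rule out request-time errors such as the `malformed Host header` response shown above.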

3XX0 · 2016-04-25T16:53:57Z

@guoquan see #83

oneklc · 2016-09-07T18:31:55Z

I had this error after upgrading my NVIDIA driver to the latest version (I wanted to use CUDA 8):

nvidia-docker run --rm nvidia/cuda nvidia-smi

docker: Error response from daemon: create nvidia_driver_367.44: create nvidia_driver_367.44: Error looking up volume plugin nvidia-docker: plugin not found.
See 'docker run --help'.

Running on CentOS 7.
After a reboot, upgrading docker and nvidia-docker-plugin, and another reboot, I realised that the plugin wasn't running.

sudo systemctl start nvidia-docker

fixed my issue.

abelatnvidia · 2016-11-09T23:51:00Z

Running on AWS AMI Linux, nvidia-docker fails to launch the nvidia/cuda:7.5-devel container on the first attempt:

nvidia-docker run --rm nvidia/cuda:7.5-devel nvidia-smi
Error response from daemon: create nvidia_driver_352.99: Post http://%2Frun%2Fdocker%2Fplugins%2Fnvidia-docker.sock/VolumeDriver.Create: http: ContentLength=44 with Body length 0.

nvidia-docker volume ls
DRIVER VOLUME NAME
nvidia-docker nvidia_driver_352.99

Then when I try to launch the container again it succeeds.

Currently using docker version 1.11.2, build b9f10c9/1.11.2

cat /tmp/nvidia-docker.log
nvidia-docker-plugin | 2016/11/09 23:25:35 Loading NVIDIA unified memory
nvidia-docker-plugin | 2016/11/09 23:25:36 Loading NVIDIA management library
nvidia-docker-plugin | 2016/11/09 23:25:36 Discovering GPU devices
nvidia-docker-plugin | 2016/11/09 23:25:40 Provisioning volumes at /var/lib/nvidia-docker/volumes
nvidia-docker-plugin | 2016/11/09 23:25:40 Serving plugin API at /run/docker/plugins
nvidia-docker-plugin | 2016/11/09 23:25:40 Serving remote API at localhost:3476
nvidia-docker-plugin | 2016/11/09 23:32:06 Received activate request
nvidia-docker-plugin | 2016/11/09 23:32:06 Plugins activated [VolumeDriver]
nvidia-docker-plugin | 2016/11/09 23:32:07 Received create request for volume 'nvidia_driver_352.99'
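
Since the second attempt succeeds, a retry wrapper is one possible stopgap while the underlying failure is investigated. A generic sketch; `retry` is a hypothetical helper and not an nvidia-docker feature:

```sh
#!/bin/sh
# retry MAX CMD [ARGS...]: run CMD up to MAX times, stopping at the first
# success. Returns 0 on success, 1 if every attempt failed.
retry() {
    max="$1"
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
```

For example: `retry 2 nvidia-docker run --rm nvidia/cuda:7.5-devel nvidia-smi`.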

3XX0 added the wontfix label Jan 18, 2016

3XX0 closed this as completed Jan 18, 2016

3XX0 mentioned this issue Feb 10, 2016

Volume setup fails across device boundaries #47

Closed

daveselinger mentioned this issue Jul 15, 2016

Hard power loss killed nvidia-docker #140

Closed

HoloSound mentioned this issue Sep 2, 2016

VolumeDriver.Create: internal error #188

Closed

convneato mentioned this issue Nov 10, 2019

Docker issue: error response from daemon: OCI runtime create failed #1121

Closed

8 tasks

RenaudWasTaken mentioned this issue Nov 24, 2019

Failed to fetch http://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64/InRelease #1126

Closed

9 tasks

This was referenced Mar 12, 2020

docker: Error response from daemon #1217

Closed

Unable to create container #1218

Closed

maghrebi84 mentioned this issue Mar 30, 2020

Strange behavior with running the container #1232

Closed

9 tasks

PriamX mentioned this issue Sep 19, 2022

Fedora installation procedure #706

Closed
